Eccentric Developments


Simple examples of why it's important to know a bit about algorithms

At work and online, there have been many instances where I have heard (or read) programmers complain that algorithm analysis is not an important part of the day-to-day job; they have their reasons, but I disagree. Here I'll show you a couple of instances where knowing about this topic can make your life better as a programmer.

For the first example, we have two lists and we need to update the content of one with the other. For the sake of the example, let's say the first one is a list of computer parts, where each entry has an id and a description, and the other one is a list of prices, where each entry has a partId and the updated price amount. The examples are in JavaScript, because I like it.

First we need to retrieve the data that we'll be using for the examples; nothing relevant here, just to give some context.

import axios from 'axios';

// axios wraps the response body in `.data`, so we unwrap both responses
const [parts, prices] = (await Promise.all([axios.get('/parts'), axios.get('/prices')])).map(res => res.data);

Having both arrays, we now need to merge them; the most common implementation would look like this:

parts.forEach(part => 
    part.price = prices.find(price => 
        price.partId === part.id
    )
);

Looks pretty innocent, until you realize that the 'find' and 'forEach' methods are linear and your runtime ends up looking like this:

parts.forEach(part => 
    //this part runs O(n) times
    part.price = prices.find(price =>
        //this part runs O(m) times
        price.partId === part.id
    )
);

With those small annotations we see that the whole operation takes O(n * m). Does that look fine? Not at all if you have many entries in each array; it looks an awful lot like O(n²) once you take into account that you'll probably have about as many prices as computer parts.

Can we do better? Of course! Otherwise I would not be writing this...

let pricesMap = new Map();
prices.forEach(price => pricesMap.set(price.partId, price));
parts.forEach(part => part.price = pricesMap.get(part.id));

We are now using a JavaScript Map, which is effectively a hash map, so the annotations would look like this:

let pricesMap = new Map();
prices.forEach(price => 
    //this runs O(m) times
    pricesMap.set(price.partId, price) //this insertion takes O(1)
);
parts.forEach(part => 
    //this runs O(n) times
    part.price = pricesMap.get(part.id) //this lookup takes O(1)
);

I've now specified how long each part takes: putting all the prices inside the Map takes O(m * 1), or simply O(m), and iterating over the parts and setting each price after looking it up takes O(n * 1), or simply O(n). As you can see, looking up an element inside a Map takes constant time, O(1) (on average); I recommend you investigate why that is.
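As a side note, the same Map can be built in a single expression, since the Map constructor accepts an iterable of [key, value] pairs; this is purely a stylistic alternative and the complexity is unchanged:

const pricesMap = new Map(prices.map(price => [price.partId, price]));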

The final cost for this implementation ends up as O(m + n). Now, which do you think is faster, O(m * n) or O(m + n)? If you guessed the latter, then you are right! Let's say m = 1000 and n = 1000: the first implementation would do 1,000,000 operations while the second would only do about 2,000, so roughly 500x faster!
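If you want to see this for yourself, here is a minimal, unscientific benchmark sketch; the synthetic data and timing labels are my own invention, and the exact numbers will vary by machine and JavaScript engine:

const n = 1000;
const parts = Array.from({ length: n }, (_, i) => ({ id: i, description: 'part ' + i }));
const prices = Array.from({ length: n }, (_, i) => ({ partId: i, amount: i }));

console.time('find merge, O(n*m)');
parts.forEach(part => part.price = prices.find(price => price.partId === part.id));
console.timeEnd('find merge, O(n*m)');

console.time('map merge, O(n+m)');
const pricesMap = new Map();
prices.forEach(price => pricesMap.set(price.partId, price));
parts.forEach(part => part.price = pricesMap.get(part.id));
console.timeEnd('map merge, O(n+m)');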

Second argument: making your code more readable

Many times I've seen code that tries to cram as much work as possible into a single loop in order to "make it faster" by avoiding extra loops. For this example, let's say we have our computer parts array ready with prices, but now we need to do some filtering: first we remove all parts with prices above $500, then we keep only those that are compatible with windows or linux, and finally only the ones that have an inventory of over 10 units.

const filteredParts = parts.filter(part => {
    if (part.price.amount > 500)
        return false;
    if (part.compatibility !== 'windows' && part.compatibility !== 'linux')
        return false;
    if (part.available <= 10)
        return false;
    return true;
});

Talk about cramming stuff into a loop... three different 'if' statements are run on each iteration. Naturally we would think this is the most efficient way to filter the list, and perhaps it is, but readability is another matter.

Personally I find the following much more readable:

const filteredParts = parts
    .filter(part => part.price.amount <= 500)
    .filter(part => part.compatibility === 'windows' || part.compatibility === 'linux')
    .filter(part => part.available > 10);

Now each filter is separated, and each condition tells the filter which parts to keep rather than which to remove. And if you like destructuring, it can look even better.
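For the curious, here is a rough sketch of what a destructured version might look like; same conditions, just pulling each field out in the parameter list:

const filteredParts = parts
    .filter(({ price }) => price.amount <= 500)
    .filter(({ compatibility }) => compatibility === 'windows' || compatibility === 'linux')
    .filter(({ available }) => available > 10);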

Which of the two versions is faster? Well, let's annotate them using the usual asymptotic notation:

const filteredParts = parts.filter(part => {
    //this condition is executed O(n) times
    if (part.price.amount > 500)
        return false;
    //this other condition is executed O(n) times
    if (part.compatibility !== 'windows' && part.compatibility !== 'linux')
        return false;
    //this is O(n) as well
    if (part.available <= 10)
        return false;
    return true;
});

If we only look at the work done by the if statements, we end up with a runtime of O(n + n + n), or O(3n). Now for the second version:

const filteredParts = parts
    //this filter is executed O(n) times
    .filter(part => part.price.amount <= 500)
    //this is also O(n)
    .filter(part => part.compatibility === 'windows' || part.compatibility === 'linux')
    //and this is O(n) as well
    .filter(part => part.available > 10);

So, how much work is this new version doing? Answer: O(n + n + n), or O(3n), which, since constant factors are dropped in asymptotic notation, is just O(n). Meaning that both versions are exactly the same! Didn't see that one coming, did you? Granted, the second version has a bit of overhead from running two extra filter passes (and allocating intermediate arrays), but that is usually negligible, so no worries there.

If you appreciate composition and a functional programming style, the latter version is much more readable, compact, and maintainable, and by doing a simple analysis of the algorithm we can see that we are not incurring any meaningful performance penalty.
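Taking the readability point one step further, each condition can be extracted into a named predicate; the names below are my own, just for illustration:

const isAffordable = part => part.price.amount <= 500;
const isCompatible = part => part.compatibility === 'windows' || part.compatibility === 'linux';
const isInStock = part => part.available > 10;

const filteredParts = parts
    .filter(isAffordable)
    .filter(isCompatible)
    .filter(isInStock);

The work done is still O(3n); the only thing that has changed is that the pipeline now reads like a list of the business rules.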

Conclusion

Don't ditch your Introduction to Algorithms book just yet.