top | item 17050798

JavaScript: Lodash vs. JS functions vs. for vs. forEach

63 points | denomer | 7 years ago | github.com | reply

62 comments

[+] zawerf|7 years ago|reply
There's no description of what's actually being tested but you can find it in the formulas.js file: https://github.com/dg92/Performance-analysis-es6/blob/master...

Even on one of the first lines, there was already a mistake with how reduce is called:

    let avg = 0;
    console.time('js reduce');
    avg = posts.reduce(p => avg+= (+p.downvotes+ +p.upvotes+ +p.commentCount)/3,0);
    avg = avg/posts.length;
    console.timeEnd('js reduce')
The way he's using it makes it no different from a forEach. It should be `posts.reduce((accumulator, p) => accumulator + blah, 0)` instead; note that the accumulator is the callback's first parameter.
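A minimal corrected sketch, using hypothetical sample data in the shape the benchmark appears to iterate over (the field names come from the quoted snippet):

```javascript
// Hypothetical sample data mirroring the shape used in the benchmark.
const posts = [
  { upvotes: "10", downvotes: "2", commentCount: "3" },
  { upvotes: "4", downvotes: "1", commentCount: "7" },
];

// reduce's callback takes the accumulator first, then the current item;
// no outer mutable variable is needed.
const total = posts.reduce(
  (acc, p) => acc + (+p.downvotes + +p.upvotes + +p.commentCount) / 3,
  0
);
const avg = total / posts.length; // 4.5 for this sample
```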

Also the timing isn't very sophisticated. Doing microbenchmarking right is basically black magic but the mistakes he's making are really basic (only running each version of the code once, not preventing dead code elimination, etc). Someone mentioned elsewhere that the timing for the large case is faster than the small and my guess is that the entire calculation got JIT'd out since it's unused.

[+] denomer|7 years ago|reply
Thanks for pointing out a mistake on reduce, I missed that.

The data wasn't collected from a single run; the results are averages of running the same code at least 15 times.

[+] pier25|7 years ago|reply
Maybe, but OTOH if you need to know about microbenchmarking intricacies just to use some native methods, maybe there is something wrong with the native methods.

The difference with a simple for() is quite significant.

[+] venning|7 years ago|reply
Not mentioned here, but worth considering: Lodash's forEach works on "array-likes" not just proper Arrays.

It works on `arguments`, it works on strings (iterating through characters), it works on objects (passing the key as the iteratee's second parameter in place of an index value), it works on HTMLCollection pseudo-arrays. It also doesn't throw on null or undefined.
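A rough sketch of that tolerant behavior — `eachLike` below is a hypothetical helper illustrating the semantics described, not lodash's actual implementation:

```javascript
// Hypothetical eachLike: iterates anything with a numeric length by
// index (arrays, strings, arguments, HTMLCollection), plain objects by
// key, and tolerates null/undefined instead of throwing.
function eachLike(collection, iteratee) {
  if (collection == null) return collection; // no throw on null/undefined
  if (typeof collection.length === "number") {
    for (let i = 0; i < collection.length; i++) {
      iteratee(collection[i], i, collection);
    }
  } else {
    for (const key of Object.keys(collection)) {
      iteratee(collection[key], key, collection); // key in place of index
    }
  }
  return collection;
}

const chars = [];
eachLike("abc", (c) => chars.push(c)); // strings: characters

const entries = [];
eachLike({ a: 1, b: 2 }, (v, k) => entries.push([k, v])); // objects: keys

eachLike(null, () => {}); // no throw
```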

[+] TheAceOfHearts|7 years ago|reply
As of ES2015 you can convert an iterable like HTMLCollection into a regular array by using the spread syntax:

    [...iterable]
[+] tedeh|7 years ago|reply
A simple Object.keys(str|obj|etc...) call on your iterating object makes the other functions work on those data types too.

Lodash may be fast, but recently I've been avoiding the basic "lodash function with native js equivalent" for one particular reason: stepping into js native functions when debugging (node inspect) is a breeze, and a complete nightmare when using lodash.
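For instance, since Object.keys() returns a plain array, the native array methods apply to an object's values through its key list (the `scores` object here is just a made-up example):

```javascript
// Object.keys() returns an array of keys, so native array methods like
// reduce work on plain objects via the key list.
const scores = { alice: 3, bob: 5, carol: 7 };
const total = Object.keys(scores).reduce((acc, k) => acc + scores[k], 0);
console.log(total); // 15
```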

[+] n8agrin|7 years ago|reply
In the context of a large client application, I often advise engineers that if we're optimizing things like the types of `for` loops we use, we've won the performance lottery. That is, I've never found a critical performance issue that is the result of using `forEach` instead of `for`.
[+] untog|7 years ago|reply
Agreed. I know that in some small use cases these differences are crucial, but in 95% of situations arguing these differences just feels like a waste of time.
[+] fvdessen|7 years ago|reply
I had the opposite experience. We had a large client application that was too slow, with no obvious bottleneck on the flame graph. I replaced all the functional iterators by for loops among other similar optimisations, and improved the performance by a factor of 50. If you use programming constructs that are 50 times slower on average, your program will be 50 times slower on average.
[+] piaste|7 years ago|reply
I've had it happen once, sort of, on a relatively small collection. Several seconds' delay due to the use of foreach, that were actually annoying production users.

The issue was that the collection being iterated over was a non-generic .NET 1.0 DataTable. Using a foreach loop would implicitly box and then re-cast each object, while the for loop directly accessed the correctly typed .Item() and did not need to do that.

Ironically, the body of the loop was a fairly tricky logistics algorithm I had just written, so I had every reason to assume the problem was on the inside. Imagine my surprise when I changed it to a for loop - strictly to access the index and print out some timings - and watched the procedure suddenly become instant...

[+] superfrank|7 years ago|reply
I haven't had a chance to dig through the code yet, but some of these results seem a bit off, especially surrounding the for loop.

For example, here are the results for the for loop 'reduce' on the small data set:

100 items - 0.030

500 items - 0.574

1000 items - 0.074

That doesn't make sense to me. How can a reduce over 1000 items take drastically less time than over 500 items? Unless I'm misunderstanding something, I can only conclude that it's either A) a typo or B) they only ran this test once and the random data for the 500 test was exceptionally tough to deal with.

Either way, I would love a little more detail with the data before I trust it.

[+] Matthias247|7 years ago|reply
I've seen that too, but I think that measurement is probably just garbage. Either the garbage collector worked during that period, or the whole PC had something else to do. It's just way too far off to make sense.
[+] uryga|7 years ago|reply
Haven't tested it (or even read the article :) ), but maybe some kind of JIT optimization kicks in for the 1000-element case (i.e. after the "loop" ran enough times) but not in the others?
[+] partycoder|7 years ago|reply
It does not specify a VM or version. I assume it's V8 on Node, but there's no way to infer which version was used.

If using node, use process.hrtime() rather than console.time()/timeEnd(): https://nodejs.org/api/process.html#process_process_hrtime_t...
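A sketch of higher-resolution timing on Node, assuming a version that supports process.hrtime.bigint() (the `timeIt` helper name is made up for illustration):

```javascript
// process.hrtime.bigint() gives nanosecond-resolution timestamps,
// avoiding console.time's millisecond rounding.
function timeIt(fn) {
  const start = process.hrtime.bigint();
  fn();
  const end = process.hrtime.bigint();
  return Number(end - start) / 1e6; // elapsed milliseconds
}

const ms = timeIt(() => {
  let sum = 0;
  for (let i = 0; i < 1e6; i++) sum += i;
});
console.log(`took ${ms.toFixed(3)} ms`);
```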

Then, computing the length multiple times is not a good idea. You should save the length in a variable:

    // no: posts.length is re-read on every iteration
    for(let i=0; i<posts.length; i++) { /* ... */ }

    // yes: the length is read once
    for(let i=0, n=posts.length; i<n; i++) { /* ... */ }
Finally, it is not recommended to analyze performance in this manner. A slight change elsewhere in your program can affect performance quite abruptly.

This is because the gatekeepers of performance are: inline caching, hidden classes, deoptimizations, garbage collection, pretenuring, etc.

[+] denomer|7 years ago|reply
So, after working on this for some time and reading a lot, I realized that this example is more of a practical analysis of the day-to-day JS code we write, so the results mostly speak to which of those three you should choose in practice.

However, you are right about the performance benchmarking factors. Good news: I have analyzed the inline cache and warm cache, and I'm working on accounting for GC and hidden classes to get better results.

[+] masswerk|7 years ago|reply
While caching array.length used to be important, it probably does little with modern engines. I remember tests from a few years ago that actually favored the first variant. (Probably because engines could more easily identify the local context and optimize on it. Also, in terms of runtime optimization there's much to win here, so you'd want to tackle this issue as one of the very first things.)

That said, it's still a good idea, even if it's just for pointing out that the constraint on the loop won't change.

[+] ryanpetrich|7 years ago|reply
Caching the length property appears to have only a small impact on performance across Chrome, Firefox and Safari (caching is faster in Firefox, slower in Chrome, and about the same in Safari). Perhaps it's better to recommend the non-cached loop iteration instead?

The quick microbenchmark I checked this on: https://jsperf.com/for-to-length/1

[+] fenomas|7 years ago|reply

    // yes
    for(let i=0, n=posts.length; i<n; i++) {
There's usually no need to cache the length of the array this way. Modern JS VMs are plenty smart enough to do it automatically (unless there's code in the loop that looks like it might change the array length).
[+] jgord|7 years ago|reply
Performance aside, also consider Ramda.js

Although Ramda has forEach, I augment it with a version of each(func, data) where data can be an array or map, and func(val, key) where key is the key of the map item, or the index of the item in the array.

I feel this abuse of notation makes for more readable / smaller / uniform code [ having no explicit for loops ]. Also takes less conceptual space.
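A sketch of such a helper as described — this is a hypothetical stand-in for the augmentation, not Ramda's actual API:

```javascript
// Hypothetical each(func, data): data may be an array or a plain object
// ("map"); func receives (value, key), where key is the object key or
// the array index.
function each(func, data) {
  if (Array.isArray(data)) {
    data.forEach((val, i) => func(val, i));
  } else {
    Object.keys(data).forEach((key) => func(data[key], key));
  }
}

const seen = [];
each((v, k) => seen.push([k, v]), ["x", "y"]); // index as key
each((v, k) => seen.push([k, v]), { a: 1 });   // object key as key
```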

[+] jrs95|7 years ago|reply
+1 for Ramda -- I've been using it for around 2 years now, and once you get comfortable with how to compose its various functions (and some of your own) it's super powerful. It might not be the best choice if your team is allergic to FP, though; some people have a difficult time wrapping their head around it (or just getting used to the syntax). I've gone out of my way to document everything thoroughly, knowing people who are mostly unfamiliar with FP will be looking at it, and that's kept everyone happy.
[+] tzs|7 years ago|reply
In the results shown there, why is the "For loop" row highlighted in each of the tables?
[+] dawnerd|7 years ago|reply
I was wondering why it was red when it was the fastest. Red commonly implies slow, right?
[+] denomer|7 years ago|reply
It's just for reference. I will update soon; thanks for pointing out the issue. :)
[+] pier25|7 years ago|reply
It's really confusing that it's in red.
[+] fenaer|7 years ago|reply
It seems to be the most performant out of the methods tested.
[+] vonseel|7 years ago|reply
What is going on with the highlighting here? The first few tables have the for loop highlighted red, and red scores the lowest time (best score). After that it's all over the place, but it's always the for loop that is highlighted red. This makes it look like the for loop always wins, but that is not the case. What the hell?
[+] GordonS|7 years ago|reply
This confused the hell out of me too - I've no idea what it's meant to show.
[+] denomer|7 years ago|reply
It's just for reference. I will update soon; thanks for pointing out the issue. :)
[+] FrozenVoid|7 years ago|reply
JIT compilation cannot perform expensive optimizations by design; it has to be low-overhead. Functional iteration can only be fast when it's optimized into a simpler form, which is hard with JIT and dynamic variables everywhere. A for loop is already the simpler form and is easier for the JIT compiler to reason about. Sure, the imperative code is often longer and less cool than a functional one-liner that chains everything together, but it has no hidden performance costs and is obvious to read and debug.
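The contrast, sketched on a toy example (sum of squared evens): the chained form allocates an intermediate array per step, while the loop does one pass.

```javascript
const nums = [1, 2, 3, 4, 5, 6];

// Chained functional style: filter and map each allocate a new array
// before reduce runs.
const fnSum = nums
  .filter((n) => n % 2 === 0)
  .map((n) => n * n)
  .reduce((a, b) => a + b, 0);

// Imperative form: a single pass with no intermediate allocations.
let loopSum = 0;
for (let i = 0; i < nums.length; i++) {
  const n = nums[i];
  if (n % 2 === 0) loopSum += n * n;
}

console.log(fnSum, loopSum); // both 56
```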
[+] bearjaws|7 years ago|reply
The real lesson is to avoid iterating over large result sets in JS by filtering them in your DB.
[+] 25563765|7 years ago|reply
This often isn’t possible.
[+] vortico|7 years ago|reply
Isn't there a jsperf of this?
[+] denomer|7 years ago|reply
Not yet, but I will add one soon and share :)
[+] hestefisk|7 years ago|reply
Missing the Exec Summary.
[+] denomer|7 years ago|reply
I am still analyzing the results. As mentioned in the comments above, I need to consider more things before drawing a conclusion; I am working on it and will update soon.
[+] 25563765|7 years ago|reply
His for loop modifies the data in-place, compared to map, which returns a new array. This is just one of many things wrong with this test.
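The distinction, sketched: an in-place loop mutates its input, while map leaves it intact and pays for a fresh allocation — so the two variants aren't doing equivalent work.

```javascript
// In-place update mutates the existing array...
const a = [1, 2, 3];
for (let i = 0; i < a.length; i++) a[i] = a[i] * 2;

// ...while map allocates and returns a new array, leaving its input intact.
const b = [1, 2, 3];
const doubled = b.map((x) => x * 2);

console.log(a, doubled, b); // [2,4,6] [2,4,6] [1,2,3]
```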
[+] masswerk|7 years ago|reply
However, this is a thing you can do with a for loop. Why cripple it artificially? Isn't this also introducing a bias?