theanomaly | 9 years ago

Thanks for the analysis -- it is good that people have this context in their heads when designing systems. What is missing from the article's conversation is that some people conflate scalability with performance. They are different, and you absolutely trade one for the other. At large scale you end up getting performance simply from being able to throw more hardware at the problem, but it takes quite a while to catch up to where you would have been on a single machine.
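
To make the "catch up" point concrete, here is a rough sketch of a toy model, not a measurement of any real system. It assumes the scalable version pays a fixed coordination overhead and then splits its work perfectly across workers. The function names and all the numbers are made up for illustration.

```python
import math


def cluster_seconds(workers, one_worker_seconds, fixed_overhead_seconds=0.0):
    """Toy model: fixed coordination overhead plus perfectly divided work."""
    return fixed_overhead_seconds + one_worker_seconds / workers


def workers_to_break_even(single_machine_seconds, one_worker_seconds,
                          fixed_overhead_seconds=0.0):
    """Smallest worker count at which the scalable system is no slower
    than the single-machine implementation, or None if it never is."""
    budget = single_machine_seconds - fixed_overhead_seconds
    if budget <= 0:
        # The fixed overhead alone already exceeds the single-machine run time.
        return None
    return math.ceil(one_worker_seconds / budget)


# Hypothetical numbers: the tuned single-machine version takes 100 s,
# the scalable version takes 800 s on one worker plus 20 s of overhead.
print(workers_to_break_even(100, 800, 20))
```

Under these assumed numbers you need a double-digit number of workers just to get back to where the single machine already was, and real systems rarely scale as cleanly as this model does.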

This is true not just for computing algorithms, but for developer time/brain space as well. Single-threaded applications are far simpler to understand.

The takeaway shouldn't be "test it on a single laptop first", but rather "will the volume/velocity of data, now or in the future, absolutely preclude doing this on a single laptop?" At my work, we process probably a hundred TB in a few-hour batch processing window at night, terabytes of which remain in memory for fast access. There is no choice there but to pay the overhead.
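
A back-of-the-envelope version of that question can be sketched as below. The 3-hour window is my stand-in for "a few hours", and the 2 GB/s sustained single-machine throughput is a made-up figure, not a benchmark. The check only looks at throughput and ignores the terabytes that have to stay resident in memory.

```python
def required_throughput_gb_per_s(total_tb, window_hours):
    """Sustained throughput needed to finish the batch inside the window."""
    total_gb = total_tb * 1000  # decimal units: 1 TB = 1000 GB
    return total_gb / (window_hours * 3600)


def fits_on_one_machine(total_tb, window_hours, machine_gb_per_s):
    """True if one machine at the assumed sustained rate meets the window."""
    return machine_gb_per_s >= required_throughput_gb_per_s(total_tb, window_hours)


# Assumed: 100 TB, a 3-hour window, and a hypothetical 2 GB/s machine.
needed = required_throughput_gb_per_s(100, 3)
print(f"need about {needed:.2f} GB/s sustained")
print(fits_on_one_machine(100, 3, machine_gb_per_s=2.0))
```

If the required rate is several times what one box can sustain, the question answers itself and the distributed overhead is simply the price of admission.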
