top | item 39877595

ignoreusernames | 1 year ago

100% agree. MapReduce hype always seemed strange to me because it's basically the Volcano paper from the 90s, but with custom user-defined operators instead of the prebaked ones in a more traditional engine. To make things worse, Hadoop came along, ignoring every industry advance of the past 40 years with its "one tuple at a time" iterator-based model in a garbage-collected language. I realize it's very easy for me to say those things in hindsight, but it's not like vectorized execution was a weird obscure secret by the time these things came out.

On a side note, it finally looks like the industry is moving towards saner tools that implement a lot of the things this article says MapReduce was missing.
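For anyone unfamiliar with the distinction, here is a minimal sketch of "one tuple at a time" versus batch-at-a-time execution. The class and function names (`Scan`, `Filter`, `scan_batches`, `run_vectorized`) are made up for illustration and are not taken from Hadoop or any real engine. Pure Python also can't show the actual speedup: in a real vectorized engine the per-batch work would be a tight compiled loop over columnar arrays, which is only assumed here.

```python
from typing import Callable, Iterator, List, Optional

Row = int  # a one-column "tuple", to keep the sketch small


class Scan:
    """Volcano-style leaf operator: each next() call returns a single row."""

    def __init__(self, rows: List[Row]) -> None:
        self._it = iter(rows)

    def next(self) -> Optional[Row]:
        return next(self._it, None)  # None signals end of stream


class Filter:
    """Volcano-style filter: pulls from its child one row per call."""

    def __init__(self, child: Scan, pred: Callable[[Row], bool]) -> None:
        self.child = child
        self.pred = pred

    def next(self) -> Optional[Row]:
        while (row := self.child.next()) is not None:
            if self.pred(row):
                return row
        return None


def run_tuple_at_a_time(rows: List[Row]) -> int:
    """Sum the even rows, paying one operator call per row per operator."""
    op = Filter(Scan(rows), lambda r: r % 2 == 0)
    total = 0
    while (row := op.next()) is not None:
        total += row
    return total


def scan_batches(rows: List[Row], batch_size: int) -> Iterator[List[Row]]:
    """Batch-oriented scan: yields a chunk of rows per call instead of one."""
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]


def run_vectorized(rows: List[Row], batch_size: int = 1024) -> int:
    """Same query, but the call overhead is paid once per batch, not per row."""
    total = 0
    for batch in scan_batches(rows, batch_size):
        selected = [r for r in batch if r % 2 == 0]  # filter the whole batch
        total += sum(selected)                       # aggregate the whole batch
    return total
```

Both functions compute the same answer. The only difference is how many times control passes between operators: once per row in the first version, once per batch in the second.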

rossjudson|1 year ago

How good is your vectorized execution engine at dealing with a handful of storage nodes going down for an hour or two? Or figuring out when bit flips have randomly happened? Or at sharing resources with latency-sensitive serving jobs?

"Custom user defined operators" did a lot of heavy lifting at Google over decades.

The set of appropriate use cases is getting smaller, mostly due to SQL-ish systems scaling up towards what could actually (and frequently, only) be done with MapReduce/Flume.