dmayle | 6 months ago

As to MapReduce, I think you're fundamentally mistaken. You can talk about map and reduce in the lambda-calculus sense of the terms, but in terms of high-performance distributed computation, MapReduce was definitely invented at Google (by Jeff Dean and Sanjay Ghemawat in 2004).

jonathaneunice | 6 months ago

Not quite. Google brilliantly rebranded the work of John McCarthy, C.A.R. Hoare, Guy Steele, _et al._ from 1960 onward; see e.g. https://dl.acm.org/doi/10.1145/367177.367199

Dean, Ghemawat, and Google at large deserve credit not for inventing map and reduce—those were already canonical in programming languages and parallel algorithm theory—but for reframing them in the early 2000s against the reality of extraordinarily large, scale-out distributed networks.
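
To make that concrete, here's a minimal sketch (Python, purely illustrative, not from either paper): the canonical word-count computation written with nothing but the language-level map and reduce primitives. The primitives themselves do all the work here; Google's contribution was running this same shape across thousands of machines.

    # Illustrative sketch: word count via the built-in map/reduce primitives.
    from functools import reduce
    from collections import Counter

    documents = [
        "the quick brown fox",
        "the lazy dog",
        "the fox and the dog",
    ]

    # map phase: each document becomes a bag of per-word counts
    mapped = map(lambda doc: Counter(doc.split()), documents)

    # reduce phase: merge the per-document counts into one result
    word_counts = reduce(lambda a, b: a + b, mapped, Counter())

    print(word_counts)  # Counter({'the': 4, 'fox': 2, 'dog': 2, ...})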

Earlier takes on these primitives had been about generalizing symbolic computation or squeezing algorithms into environments of extreme resource scarcity. The 2004 MapReduce paper was also about scarcity—but scarcity redefined, at the scale of global workloads and thousands of commodity machines. That reframing was the true innovation.

dekhn | 6 months ago

CERN was doing the equivalent of MapReduce before Google existed.

wbl | 6 months ago

Got a link? HEP data processing is really about triggers and widely distributed screening that's application-specific, AFAIK.

mr_toad | 6 months ago

SPMD (single program, multiple data) was definitely a thing before MapReduce.