
Joel Spolsky: Can your programming language do this?

222 points | krat0sprakhar | 15 years ago | joelonsoftware.com | reply

110 comments

[+] onan_barbarian|15 years ago|reply
I think there's some reasonable stuff buried in here, I really do.

But... having actually spent some time in the trenches dealing with a hard problem on a massively parallel machine - more than once - I find it hard to believe that something like map/reduce or the like - or any given small-scale language feature is going to be particularly significant in terms of parallelizing any goddamn thing that's actually hard to do. I see a lot of fatuous claims that language feature X is the missing link in parallelism for the everyday working programmer but I don't see a lot of new solutions for anything hard as a proof of concept.

We've only had OpenMP and all sorts of other kludgy extensions for Fortran and C for what, about 15 years? I'm not saying that they're great or elegant or anything, but so many of the things that are hard about parallel programming are NOT FUCKING SOLVED BY MAP-REDUCE. Oops, sorry, shouty. But anything that can be solved by map-reduce wasn't enormously hard to begin with. Map-reduce was itself initially publicized in terms of 'less useful but easier for mere mortals than generalized parallel prefix' which made sense to me.

What doesn't make sense to me is all this frenzied enthusiasm for dragging parallel programming into the never-ending programming language abstraction wars; at least when the topics being discussed only touch on the very shallowest things needed by parallel programming. You want some respect? Solve something hard.

Yes, you can do the same thing to each element of an array. Whaddya want, a cookie?

[+] jxcole|15 years ago|reply
I agree that map reduce does not solve parallel computing. However, that is just one example of how closures/lambdas can help you design software. It helps in almost any system in which you have two different functions that do almost the same thing, but a little different.

For example, say you are writing a double linked list. The push to head and push to tail functions may be completely the same, except one uses the head and one uses the tail, so you could just write one function and pass in the head or tail, right?

But you can't. The problem is that when you add something to one side, the links are the opposite: a new head goes to the right of the old head, while a new tail goes to the left of the old tail. In order to really do this, you also have to pass in a function.

Now, I've written a few doubly linked lists in Java (don't get mad, there _are_ good reasons to do this), and in the end I just copy the function and make the inline changes. Trying to create the right interface and then pass in anonymous classes is almost impossible. And even if in my supreme cleverness I could figure it out, the code would be so confusing as to be unmaintainable.

Anyway, I would say that closures are immensely useful, not _just_ because of map reduce, but because of a whole class of examples that work in this manner.
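For what it's worth, the trick jxcole describes is less painful than it sounds once the linking step is passed in as a function. Here's a rough sketch in Java (class and method names are mine, and it uses lambdas for brevity; an anonymous inner class would be more verbose but has the same shape):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Illustrative sketch: one push routine shared by pushHead and pushTail.
// The parts that differ (which end, and which direction to link) are
// passed in as arguments instead of being copy-pasted into two methods.
public class DList<T> {
    static final class Node<T> {
        T value;
        Node<T> prev, next;
        Node(T value) { this.value = value; }
    }

    private Node<T> head, tail;

    // 'attach' wires the new node to the old end node; the caller decides
    // the direction of the links.
    private Node<T> push(T value, Node<T> end, BiConsumer<Node<T>, Node<T>> attach) {
        Node<T> node = new Node<>(value);
        if (end == null) {
            head = tail = node;   // empty list: both ends are the new node
        } else {
            attach.accept(node, end);
        }
        return node;
    }

    public void pushHead(T v) {
        head = push(v, head, (n, old) -> { n.next = old; old.prev = n; });
    }

    public void pushTail(T v) {
        tail = push(v, tail, (n, old) -> { n.prev = old; old.next = n; });
    }

    public List<T> toList() {
        List<T> out = new ArrayList<>();
        for (Node<T> n = head; n != null; n = n.next) out.add(n.value);
        return out;
    }

    public static void main(String[] args) {
        DList<Integer> d = new DList<>();
        d.pushTail(2);
        d.pushTail(3);
        d.pushHead(1);
        System.out.println(d.toList()); // [1, 2, 3]
    }
}
```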

[+] Peaker|15 years ago|reply
Here's the main Haskell hero, Simon Peyton Jones, giving a talk about Data Parallel Haskell, which I think is a promising language abstraction for doing hard things with parallelism relatively easily:

http://www.youtube.com/watch?v=NWSZ4c9yqW8

As for techniques that can be used now: Haskell's `par` and `pseq` combinators allow adding parallelization to "hard" code after-the-fact pretty easily.

[+] praptak|15 years ago|reply
"I find it hard to believe that something like map/reduce or the like - or any given small-scale language feature is going to be particularly significant in terms of parallelizing any goddamn thing that's actually hard to do."

I don't believe it either, but that wasn't the point of the article. It was not about magically solving hard problems, but about getting parallelism for free where it is easy.

OTOH it is not apparent that those language features are actually indispensable for that. In most cases where 'map' would apply, it is quite possible for a compiler to detect that the loop body does not rely on its own previous executions and can thus be parallelized automatically.

[+] kragen|15 years ago|reply
It's a little misleading to link MapReduce to language support for anonymous functions. The original implementation of MapReduce was in C++!
[+] Rusky|15 years ago|reply
Joel's point is that these things are easy now because of techniques or features like the ones he brings up. Try to do something like that in assembly or FORTRAN or C and it's not all that easy any more.

You're working a level above: map/reduce is easy there, and things that would be completely unthinkable to maintain in assembly/FORTRAN/C are the real hard problems.

[+] yid|15 years ago|reply
> But anything that can be solved by map-reduce wasn't enormously hard to begin with.

Depends on your definition of hard. Algorithmically, yes the tasks are relatively simple, but try programming an 80,000 node MPI/PVM cluster to run general jobs and you'll get a very different definition of "hard".

[+] 6ren|15 years ago|reply
Also, Fortran has long been used for parallel programs (on mainframes). Guy Steele is working on Fortress, and he has some interesting insights on the issue.

But I do think that a practical solution will have to rethink not only programming languages, but the concept of programs. It will be made by a young and naive tinkerer, trying to solve a specific problem on actual parallel hardware, not by guys who have been entrenched in the field for decades. So it won't happen til that concrete technology (eg. 100 cores) is widely available. While there may be similarities with functional and object oriented programming (Alan Kay's version), it won't be based on them.

[+] earl|15 years ago|reply
In practice, as someone who has spent the last 18 months writing MR code at Quantcast, the difficult parts of writing map reduce programs are all about how one takes a communication-heavy/global-knowledge algorithm and either (1) changes the algorithm to require only a handful of passes, each with partial knowledge, or (2) creates a probabilistic algorithm that solves an approximation of the problem and can be computed with only partial knowledge. No one says otherwise.

But Map Reduce does, IMO, do one thing that MPI doesn't: the easy things are easy. Hard stuff is still hard, but easy things are easy. If you have a low communication algorithm that requires only local knowledge and fits into the MR paradigm well, hadoop + hdfs makes it really simple to write and reliably execute, including automatically handling things like (i) worker nodes dying, (ii) sorters dying, (iii) storage nodes dropping out, (iv) whole steps disappearing and having to be rerun.

[+] kragen|15 years ago|reply
> Correction: The last time I used FORTRAN was 27 years ago. Apparently it got functions.

FORTRAN had user-defined functions since FORTRAN II in 1958; see http://archive.computerhistory.org/resources/text/Fortran/10... on page numbers 5, 14, and 15.

Joel unfortunately completely misses the point of why C and Java suck at this stuff: you can use functions as values in them (anonymous inner classes in Java) but they aren't closures. And his comment about automatically parallelizing "map" is a little off-base; if you take some random piece of code and stick it into a transparently parallel "map", you're very likely to discover that it isn't safe to run multiple copies of it concurrently, which is why languages like Erlang have a different name for the "parallel map" function. The "map" in MapReduce is inspired by the function of that name in Lisp and other functional languages; it isn't a drop-in replacement for it.

As usual, while Joel's overall point is reasonably accurate, most of his supporting points are actually false to the point of being ignorant nonsense. I think someone could tell as good a story in as entertaining a way without stuffing it full of lies, although admittedly my own efforts fall pretty far short.
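kragen's warning about a transparently parallel "map" is easy to demonstrate. A small sketch using Java streams (which postdate this thread, but make the point compactly; names are mine): the same map body is fine or broken depending on whether it touches shared state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Sticking "random" code into a parallel map is only safe if the body has
// no side effects; otherwise concurrent invocations race on shared state.
public class ParallelMapDemo {
    // UNSAFE: every parallel invocation mutates the same unsynchronized
    // list, so elements may be lost and the order is arbitrary.
    static List<Integer> unsafeDoubled(int n) {
        List<Integer> out = new ArrayList<>();
        IntStream.rangeClosed(1, n).parallel().forEach(i -> out.add(i * 2));
        return out; // size may be < n under contention
    }

    // SAFE: the map body is pure; the framework splits, maps, and
    // recombines, and even preserves the encounter order.
    static List<Integer> safeDoubled(int n) {
        return IntStream.rangeClosed(1, n).parallel()
                .boxed().map(i -> i * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(safeDoubled(5)); // [2, 4, 6, 8, 10]
    }
}
```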

[+] statictype|15 years ago|reply
You're nitpicking about details that are not relevant to the article at all.

His point was that languages that treat functions as first-class citizens enable a class of abstractions that are tedious/impossible in languages which don't support them.

Whether the naive implementation of map can be easily parallelized across multiple machines/threads/processes is not relevant. In fact, he even says that you need a 'super genius' to write the scaling map and reduce operations.

[+] Raphael_Amiard|15 years ago|reply
> Joel unfortunately completely misses the point of why C and Java suck at this stuff: you can use functions as values in them (anonymous inner classes in Java) but they aren't closures

I don't at all think he completely misses the point.

First, anonymous inner classes do capture their lexical environment, and you can access outer variables provided they are declared as final.

Second, even if closures enable a whole other set of programming techniques, and even if they seem a natural way of programming with anonymous functions, they are not needed for everything related to passing functions as arguments, at all. Notably, every example Joel gives in his article works without functions capturing their outer lexical environment: mapping an add function over an array doesn't need it, and is still a very useful programming technique.

Admittedly though, I really prefer my languages to support full lexical scoping and closures. But I don't think the article was about that, and I feel more like you missed what Joel had to say because you think he should have introduced the full notion of what a first-class function is.

[+] edanm|15 years ago|reply
"Joel unfortunately completely misses the point of why C and Java suck at this stuff: you can define anonymous functions in them (anonymous inner classes in Java) but they aren't closures."

You can't define anonymous functions in C.

And I wouldn't "attack" Joel for "stuffing the article full of lies". It's more like "abstracting away the details". When people talk of the benefits of some practice or programming paradigm, they don't always mention all the work going into it - they just explain the concept. That's what a good teacher does, IMO. He takes complex concepts and explains the important parts.

[+] cynicalkane|15 years ago|reply
Anonymous inner classes in Java actually are closures. The compiler will force you to declare any closed-over local variables as final. Like everything else in Java, it's verbose but it works. The problems with using closures in Java are that the type system doesn't play well with function objects and that the standard library doesn't care about functional programming, and it's really only the former that stops you from functional programming in Java.
[+] gaius|15 years ago|reply
He also misses the point that FORTRAN compilers are bloody good at finding and exploiting opportunities for parallelization. You have much less need of these higher level language constructs if the compiler is doing it for you under the hood! On VAX this was as simple as the /PARALLEL flag at compile time...
[+] schmylan|15 years ago|reply
You're missing it. Here's what Joel is saying without saying it: C# 4 is "that language". Both his companies are built on the MS stack.
[+] sid0|15 years ago|reply
While you need a lot of syntactic bullshit to get there, Java does have closures.
[+] grav1tas|15 years ago|reply
I think it might be important to note that while the terms map and reduce do come from Lisp, they're not one-to-one with what these functions do in Lisp. The original MapReduce paper mentions the borrowing, but doesn't really go into specifics. There's a good paper by Ralf Lämmel that describes the relation that MapReduce has to "map" and "reduce" at http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.104.... . I liked this paper much better and found it the most informative functional explanation to MapReduce (note, it's in Haskell).

I think MapReduce is really part of a more general pattern where you have an (in more Haskell-y terms) unfold (anamorphism) to a foldr (catamorphism). If your operations on the items in your intermediate set of data in the MapReduce are associative/commutative, you can work out parallelization more or less for free. It's pretty cool stuff, and really not that complicated when you sit down and think about it.

[+] Dn_Ab|15 years ago|reply
Wow. Thank you. I wish I could upvote you more. I have never looked into Google's "MapReduce" but if what you say is true then my assumptions about it were completely wrong. I assumed it was 1-1 with map and reduce. But if it is generating a set of unfolds into 'essentially' a fold (is it? I only skimmed the first pages due to time) then that is a very important detail and makes the name a bit of a misnomer. Why don't people make a bigger deal about this? Unfold/fold is not harder, but it requires a different mindset and approach than map/fold.

If you are approaching it thinking it is just a map and fold then you are going in disadvantaged and will have to unlearn that to properly leverage something that is actually even more impressive/useful/powerful than a mere fold o map.

[+] sthatipamala|15 years ago|reply
This article shows that Javascript is truly the common man's functional programming language. Despite its ugliness, it got lambdas/anonymous functions right.
[+] JonnieCache|15 years ago|reply
Something that I only realised the other day which made me feel kinda embarrassed: in ruby, the reduce method is called inject.

For years I've been doing MapReduce functions, without realising it. MapReduce was in my mental pile of "genius things that cleverer people than me do, must be looked into when there is time."

For info on inject: http://blog.jayfields.com/2008/03/ruby-inject.html

[+] rivalis|15 years ago|reply
Even when I'm working in a language that doesn't have first class functions, I find it easier to lay out my code by writing functional pseudocode and then "unrolling" maps into loops, closures into structs/objects, compositions into a sequence of calls, etc. It probably leads to idiomatically awful Java, but I find it easier to read and write, and nobody else needs to deal with my code. So...
[+] tybris|15 years ago|reply
Yup, any language I've worked with, including Java and C, can do that just fine. They just spread the verbosity differently. Organizing large projects is a pain in JavaScript, trivial in Java. Using anonymous functions is a pain in Java, trivial in JavaScript.

(Not so) fun fact: The public MapReduce services by Google and Amazon do not (directly) support JavaScript.

[+] ajays|15 years ago|reply
He gives reduce as an example of "purely functional programs have no side effects and are thus trivially parallelizable", but reduce by definition is not trivially parallelizable.
[+] chuhnk|15 years ago|reply
Has anyone else just read this and realised they need to go off and learn some form of functional programming? I ignored it for such a long time because I felt it wasn't relevant to my current situation, but I was wrong. You gain some incredible fundamental knowledge that you would otherwise be completely oblivious to.

Is lisp really the way to go though?

[+] merijnv|15 years ago|reply
I would recommend Haskell over Lisp, because Haskell will force you to do functional programming, whereas Lisp is less rigid in that regard. In addition, the #haskell IRC channel on FreeNode is filled with people who have infinite patience with Haskell newbies, and it has an excellent introductory text at http://learnyouahaskell.com/

About two years ago I decided to learn Haskell. Had about 4 false starts and didn't "get" it until my 5th attempt, at which point I figured out what the type system meant and had a bit of a Matrix "I know kung fu!"-moment. Two years of hanging around in #haskell later, I know a billion more things about programming/programming languages and learned type theory (although I still don't grok all conversations there :p).

Some other comments on the usefulness of Haskell:

http://news.ycombinator.com/item?id=1145743

[+] troels|15 years ago|reply
Javascript is just as good for learning the concepts of functional programming. It's often (jokingly) called Scheme with C-like syntax. Unlike lisps though, it doesn't give you macros, which are even more mind-boggling than first-class functions.

Ruby also has decent fp-semantics, but they are not as clearly cut as in Javascript.

[+] yesbabyyes|15 years ago|reply
Try javascript and/or coffeescript. Valuable skills!
[+] JonnieCache|15 years ago|reply
I hear lisp is pretty sweet. Haskell as well; there's a lot going on there. You can program in a very functional style in ruby or javascript if you want something with a more active community, it's just that the resources you find on those languages won't be so targeted at FP, because it's not their defining feature.
[+] gaius|15 years ago|reply
I would recommend OCaml over Lisp, once you've had type inference, there's no going back.
[+] mike_esspe|15 years ago|reply
You can try Scala (especially if you know Java).
[+] gaius|15 years ago|reply
FTA:

The very fact that Google invented MapReduce, and Microsoft didn't, says something about why Microsoft is still playing catch up trying to get basic search features to work

I don't believe this is true, and that's easy to prove: there was parallelism of SELECTs in SQL Server 2000. So there is a part of MS that is perfectly happy with the idea, even if another bit of MS isn't. They just need to talk more...

[+] justwrote|15 years ago|reply
Yes, it can! Scala:

  def Cook(i1: String, i2: String, f: String => Unit) {
    println("get the " + i1)
    f(i1)
    f(i2)
  }
  
  Cook("lobster", "water", x => println("pot " + x))
  Cook("chicken", "coconut", x => println("boom " + x))

  List(1,2,3).sum
  List(1,2,3).mkString
  List(1,2,3).map(println) // or more verbose List(1,2,3).map(x => println(x))
[+] pmr_|15 years ago|reply
Today I tried to explain to someone what exactly boost::mpl::fold does and how it is supposed to be used. (For those unfamiliar: boost::mpl is a collection of compile-time metaprogramming mechanisms for C++.)

It took me a while to realize that the person I was explaining it to had only minor problems with the templates and compile-time part, but close to no idea what a fold or a lambda is.

Not knowing some basics of functional programming can keep a person from understanding so many different things, and I have encountered those basics in different fields (e.g. explicitly in formal semantics, or implicitly in different theories of morphology).

I think the real point here is that different paradigms offer you new views onto the world and enhance your understanding, all the programming language things aside.

[+] becomevocal|15 years ago|reply
I think this could also be called 'can your brain think like this?'... Many programmers stray from thinking in a massive way and tend toward problems with similar, familiar codebases.
[+] svrocks|15 years ago|reply
Does anyone else think it's a travesty that the AP Computer Science curriculum is taught in Java? Java was my first programming language and I've spent the past 8 years trying to unlearn most of it.
[+] hdragomir|15 years ago|reply
I remember my days as a CS student.

The single most mind-opening course I took was functional programming, where I learned LISP and Prolog.

That knowledge today is crucial as it deeply changed my mindset when tackling most any problem.

[+] bluehavana|15 years ago|reply
It's funny that he mentions Google as an example of a company that gets the paradigm, but most of Google is C++ and Java. C# has better functional paradigm support than both of those.
[+] pragmatic|15 years ago|reply
Yes, the additions made to support LINQ make C# (imho) the most advanced production language today. By advanced I mean it's fully supported and everything works as intended. It's not academic. I can use advanced features today for real work and get full IDE support and good documentation.

I just built a search engine. I was able to make it massively parallel with some trivial parallel LINQ code. The power of LINQ and the parallel extensions is just amazing.

[+] bad_user|15 years ago|reply
The Google Search engine was at some point written in Python.
[+] cincinnatus|15 years ago|reply
I don't like the way inline functions hurt the readability of code. Is there anything out there that solves that issue?

Also I haven't had an excuse to use it yet but F# seems to have great syntactic sugar for parallelizing things in a more natural way than the typical map reduce.

[+] ericf|15 years ago|reply
I implemented these examples in Ruby 1.9, would love to know if there are more efficient ways of doing some of these:

    def cook(p1, p2, f)
      puts "get the " + p1.to_s
      f.call(p1)
      f.call(p2)
    end

    cook( "lobster", "water", lambda {|x| puts "pot " + x })
    cook( "chicken", "coconut", lambda {|x| puts "boom " + x })

    @a = [1,2,3]
    @a.map {|x| puts x*2}
    @a.map {|x| puts x}

    def sum(a)
      a.reduce(0) do |acc, x|
        acc + x
      end
    end

    def join(a)
      a.reduce("") do |acc, x|
        acc.to_s + x.to_s
      end
    end

    puts "sum " + sum(@a).to_s
    puts "join " + join(@a)
[+] flatwhatson|15 years ago|reply
Here's one more efficient way... use perl!

  #!/usr/bin/perl
  
  use Modern::Perl;
  use List::Util 'reduce';
  
  sub cook {
    my ($i1, $i2, $f) = @_;
    say "get the $i1";
    $f->($i1);
    $f->($i2);
  }
  
  cook "lobster", "water",   sub { say "pot "  . shift };
  cook "chicken", "coconut", sub { say "boom " . shift };
  
  my @a = (1, 2, 3);
  
  map { say $_ * 2 } @a;
  map { say $_     } @a;
  
  sub my_sum {
    reduce { $a + $b } 0, @_;
  }
  
  sub my_join {
    reduce { $a . $b } "", @_;
  }
  
  say "sum "  . my_sum(@a);
  say "join " . my_join(@a);
[+] mncolinlee|15 years ago|reply
The moment I read this, I immediately thought of the work I performed on Cray's Chapel parallel language. Chapel has an elegant way of expressing functional parallel code like this that is much more difficult to write in Unified Parallel C and High Performance Fortran. In fact, one Google search later and I found a student's presentation on Chapel and MapReduce.

http://www.slidefinder.net/l/l22_parallel_programming_langua...

[+] buddydvd|15 years ago|reply
Can Xcode 4 compile code using Objective-C blocks into iOS 3.x compatible binaries? This article made me realize how much I miss anonymous functions/lambda expressions from C# and javascript.