mdirolf | 14 years ago

That's a great question, and to be honest it's a bit of a pain point so I probably should've talked about it in the post.

When developing a new M/R job from scratch I start by mirroring (at least part of) the data to a local database. Then I can iterate locally on the M/R, using print() and printjson() to debug the map() and reduce() functions; their output goes directly to the database log.
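
For what it's worth, here is a rough sketch of the mirroring step in Python. It is not taken from the post: the function name `mirror_sample` and its parameters are made up for illustration, and it only assumes that `source` and `dest` behave like PyMongo collections, i.e. that `find()` returns something iterable and `insert_many()` accepts a list of documents.

```python
from itertools import islice


def mirror_sample(source, dest, limit=1000, batch_size=500):
    """Copy up to `limit` documents from `source` into `dest`.

    Hypothetical helper: `source` and `dest` are assumed to expose
    find() and insert_many() the way PyMongo collections do.
    Returns the number of documents copied.
    """
    cursor = iter(source.find())
    copied = 0
    while copied < limit:
        # Pull the next batch, never exceeding the overall limit.
        batch = list(islice(cursor, min(batch_size, limit - copied)))
        if not batch:
            break  # source ran out of documents
        dest.insert_many(batch)
        copied += len(batch)
    return copied
```

With real collections this would be called as something like `mirror_sample(remote_db.posts, local_db.posts, limit=5000)`, where both database handles and the collection name are placeholders.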

I tend to just embed the map() & reduce() functions as Python strings like you see in the post. I'm confident that there are better ways to handle this, though. One interesting approach is to do development from the shell; that way you can write and debug the map() & reduce() in an actual JS environment. Once you're happy with them you can just drop them in as strings with the rest of your application code. Would love to hear how other people are approaching this stuff, too.
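
To make the "drop them in as strings" part concrete, here is a minimal sketch, not the code from the post. The document shape (a `tags` array), the names `MAP_JS`, `REDUCE_JS` and `build_map_reduce_command` are all assumptions for illustration. It builds the document for the server's `mapReduce` command rather than calling a driver helper, so the JS stays in two module-level strings that can be pasted straight from a shell session.

```python
from collections import OrderedDict

# JS developed and debugged in the shell, then pasted in as strings.
# Assumed document shape: each document has a `tags` array.
MAP_JS = """
function () {
    this.tags.forEach(function (tag) {
        emit(tag, 1);
    });
}
"""

REDUCE_JS = """
function (key, values) {
    var total = 0;
    for (var i = 0; i < values.length; i++) {
        total += values[i];
    }
    return total;
}
"""


def build_map_reduce_command(collection_name, out_name):
    """Build a mapReduce command document (hypothetical helper).

    The command name has to be the first key, hence the ordered mapping.
    """
    return OrderedDict([
        ("mapReduce", collection_name),
        ("map", MAP_JS),
        ("reduce", REDUCE_JS),
        ("out", out_name),
    ])
```

Assuming a PyMongo database handle `db`, the result could then be passed to `db.command(...)`; the collection and output names you pass in are up to you.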

All of that said, I expect that the tooling here will improve over time.
