bkirwi | 11 years ago
- Any monoid operation is trivially parallelizable: take the dataset, split it into chunks, combine all the elements in each chunk together, then combine those results together as a final step.
- If I'm updating a row in my database with a monoid operation, I can always 'pre-aggregate' a batch of values together on the client side before taking that result and combining it with the stored value.
- If I store some statistics for every day of the year, I can calculate the monthly statistics very cheaply -- as long as those stats are a monoid.
The monoid abstraction seems weird at first, because it's so dang general, but it ends up hitting a bit of a sweet spot: the rules are just strong enough to be useful for a bunch of things, but simple enough to be applied in all kinds of different situations. You can think of it kind of like a hub-and-spokes thing -- this one interface connects all kinds of different data types to all kinds of different cool situations, so you get a lot of functionality with a lot less typing and thinking.