
dsies | 4 years ago

The "reimplementing" part resonates with me 100%. I've also reimplemented this solution many times and every time it has been a pain in the ass.

I built https://batch.sh specifically to address this - not having to reinvent the wheel for storage, search and replay.

For some use cases, storing your events in a Postgres database is probably good enough - but if you plan on storing billions of complex records AND fetching particular groups of them every now and then, it'll get rough and you'll need a more sophisticated storage system.

What storage mechanism did you use? And how did you choose which events to replay?


acjohnson55 | 4 years ago

In our case, we used Postgres. Our event volume was quite small, and we needed strict consistency (i.e. all participants in the auction see the same state). So we stored the log of events for an auction lot as a JSON blob. A new command was processed by taking a row-level lock (SELECT FOR UPDATE) on the lot's row, validating the command, and then persisting the events. Then we'd broadcast the new derived state to all interested parties.
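A minimal sketch of that lock-validate-append loop, using Python's stdlib sqlite3 in place of Postgres (the lot/bid schema and event names are invented for illustration; sqlite has no SELECT ... FOR UPDATE, so BEGIN IMMEDIATE stands in for the row-level lock):

```python
import json
import sqlite3

# One row per auction lot; the lot's event log is stored as a JSON blob.
# isolation_level=None puts sqlite3 in autocommit mode so the transaction
# can be managed explicitly, mirroring the row-lock pattern described above.
db = sqlite3.connect(":memory:", isolation_level=None)
db.execute("CREATE TABLE lots (id TEXT PRIMARY KEY, events TEXT NOT NULL)")
db.execute("INSERT INTO lots VALUES ('lot-1', '[]')")

def derive_state(events):
    """Fold the whole event log into the current state -- cheap when a
    lot only ever sees a couple dozen events."""
    state = {"high_bid": 0}
    for e in events:
        if e["type"] == "bid_placed":
            state["high_bid"] = max(state["high_bid"], e["amount"])
    return state

def process_command(lot_id, command):
    """Lock the lot's row, validate the command against the derived
    state, append the resulting event, and persist the new log.
    In Postgres the lock would be SELECT ... FOR UPDATE on the lot's row."""
    db.execute("BEGIN IMMEDIATE")
    try:
        (raw,) = db.execute(
            "SELECT events FROM lots WHERE id = ?", (lot_id,)
        ).fetchone()
        events = json.loads(raw)
        state = derive_state(events)
        if command["type"] == "bid" and command["amount"] <= state["high_bid"]:
            raise ValueError("bid too low")
        events.append({"type": "bid_placed", "amount": command["amount"]})
        db.execute(
            "UPDATE lots SET events = ? WHERE id = ?",
            (json.dumps(events), lot_id),
        )
        db.execute("COMMIT")
    except Exception:
        db.execute("ROLLBACK")
        raise
    return derive_state(events)  # the new state, ready to broadcast
```

Because the row stays locked for the duration of the transaction, concurrent commands against the same lot serialize, which is what gives every participant the same view of the auction.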

All command processing and state queries required us to read and process the whole event log. But this was fairly cheap, because we're talking maybe a couple dozen events per lot. To optimize, we might have considered serializing core aspects of the derived state, to use as a snapshot. But this wasn't necessary.
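For what it's worth, the snapshot optimization they decided against could look roughly like this (a hypothetical sketch, reusing the bid-event shape from above): persist the folded state together with the index of the last event it covers, then replay only the tail of the log on later reads.

```python
def derive_state(events, snapshot=None):
    """Fold bid events into the current state, optionally starting
    from a snapshot instead of replaying the log from scratch."""
    if snapshot is None:
        state, start = {"high_bid": 0}, 0
    else:
        state, start = dict(snapshot["state"]), snapshot["upto"]
    for e in events[start:]:  # only the events the snapshot doesn't cover
        if e["type"] == "bid_placed":
            state["high_bid"] = max(state["high_bid"], e["amount"])
    return state

def take_snapshot(events):
    """Serialize the derived state as of the current log length."""
    return {"state": derive_state(events), "upto": len(events)}
```

Starting from a snapshot must give the same answer as a full replay; with only a couple dozen events per lot, the full replay was evidently cheap enough that this wasn't worth the extra bookkeeping.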

Batch looks pretty cool! I'll keep that in mind next time I'm considering reinventing the wheel :)