sfg75's comments
sfg75 | 8 years ago | on: Building Real Time Analytics APIs at Scale
It's very possible we could have done the same with Redshift, but it wasn't obvious how. With Citus offering extensions like topn and hll, however, we quickly saw how that could work for us.
Thanks for the link btw!
sfg75 | 8 years ago | on: Building Real Time Analytics APIs at Scale
After building the aggregation and ingestion services in Go, sticking with the same language for the API seemed like a good idea as well: Go makes it trivial to build an HTTP server, and the API’s logic is simple enough that we didn’t see the need for a web framework.
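To illustrate the point about the standard library, here is a minimal sketch of an HTTP API in plain Go with no framework. The route, port, and payload are invented for illustration; the comment doesn't describe the real API's endpoints.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// statsBody is a placeholder payload; the real API's routes and
// responses are not described in the comment.
func statsBody() string {
	return `{"status":"ok"}`
}

func main() {
	// The standard library alone is enough for a small JSON API:
	// register a handler, then start listening.
	http.HandleFunc("/stats", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, statsBody())
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```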
sfg75 | 8 years ago | on: Building Real Time Analytics APIs at Scale
We decided not to go with ClickHouse because we were mostly looking for a SaaS solution. That's also pretty much why we didn't spend much time on Druid.
Choosing Citus meant we could leverage a technology we already had some experience with (Postgres) without having to care much about the infrastructure underneath it. We're still a fairly small team, and those are meaningful factors for us.
At the end of the day, I'm sure any of those systems (ClickHouse, Druid) would have done the job fine; we just went with what seemed easiest to implement and scale.
sfg75 | 8 years ago | on: TopN for your Postgres database
We rely heavily on this to power our analytics API. We use it to precompute tops for billions of daily events, and we can then fetch tops across a specific time range on the fly, usually on the order of milliseconds. This was a game changer for us :)
sfg75 | 8 years ago | on: Building real-time analytics dashboards with Postgres and Citus
The article mentions HLL, but there are even more useful extensions (e.g. topn, which handles tops through the jsonb format).
sfg75 | 6 years ago | on: In praise of S3
Note that those 9s are for durability.
From https://aws.amazon.com/s3/faqs/#How_durable_is_Amazon_S3:
"This durability level corresponds to an average annual expected loss of 0.000000001% of objects. For example, if you store 10,000,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000 years"