sfg75's comments

sfg75 | 6 years ago | on: In praise of S3

> Plus you get 11x9's of availability (99.999999999% availability).

Note that those 9s are for durability, not availability.

From https://aws.amazon.com/s3/faqs/#How_durable_is_Amazon_S3:

"This durability level corresponds to an average annual expected loss of 0.000000001% of objects. For example, if you store 10,000,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000 years"

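The arithmetic behind that FAQ example can be checked with a short sketch. The function name `expectedYearsPerLoss` is made up for illustration; the only inputs are the two figures quoted above (an annual loss rate of 0.000000001% and 10,000,000 stored objects).

```go
package main

import "fmt"

// expectedYearsPerLoss returns how many years pass, on average, between
// single-object losses, given an annual per-object loss rate (as a fraction)
// and the number of stored objects. Hypothetical helper, for illustration.
func expectedYearsPerLoss(annualLossRate, objects float64) float64 {
	// Expected objects lost per year = rate * objects; invert to get years.
	return 1 / (annualLossRate * objects)
}

func main() {
	// 0.000000001% expressed as a fraction is 1e-11.
	const annualLossRate = 1e-11
	const objects = 10_000_000

	fmt.Printf("%.0f years\n", expectedYearsPerLoss(annualLossRate, objects))
}
```

That is 1e-11 × 1e7 = 1e-4 objects lost per year on average, or one object every 10,000 years, which matches the FAQ.
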
sfg75 | 8 years ago | on: Building Real Time Analytics APIs at Scale

That's a fair point. Indeed, we started out looking at doing aggregations across raw events before realizing this was probably ill-fated.

It's very possible we could have done the same with Redshift, but it wasn't obvious how. With Citus offering extensions like topn and hll, however, we quickly saw how that could work for us.

Thanks for the link btw!

sfg75 | 8 years ago | on: Building Real Time Analytics APIs at Scale

Pretty much automatic. With the exception of our search engine, which is in C++ (as performance is paramount there), Go is becoming our language of choice for most of our backend services. We found in Go a great balance of productivity and performance.

After building the aggregation and ingestion services in Go, sticking with this language for the API sounded like a good idea as well, since Go makes it trivial to build an HTTP server and the logic of the API is simple enough that we didn’t see the need for any web framework.
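To give an idea of what "no web framework" can look like, here is a minimal sketch using only the standard library. It is not our actual API: the `/tops` route, the `topEntry` type, the `limit` parameter, and the hard-coded data are all assumptions made for the example. It uses `httptest.NewServer` so the program runs and exits on its own; a long-running service would call `http.ListenAndServe` instead.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// topEntry is a hypothetical response item: a value and how often it was seen.
type topEntry struct {
	Value string `json:"value"`
	Count int    `json:"count"`
}

// sampleTops is hard-coded data standing in for a database query.
var sampleTops = []topEntry{
	{"shoes", 120}, {"socks", 75}, {"hats", 30},
}

// parseLimit converts the raw "limit" query parameter to an int, falling back
// to def when it is missing or invalid and capping it at limitCap.
func parseLimit(raw string, def, limitCap int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return def
	}
	if n > limitCap {
		return limitCap
	}
	return n
}

// topsHandler serves the first `limit` sample entries as JSON.
func topsHandler(w http.ResponseWriter, r *http.Request) {
	limit := parseLimit(r.URL.Query().Get("limit"), len(sampleTops), len(sampleTops))
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(sampleTops[:limit]); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/tops", topsHandler)

	// Bind an ephemeral local port so the sketch terminates by itself.
	// A real service would use http.ListenAndServe(addr, mux) here.
	srv := httptest.NewServer(mux)
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/tops?limit=2")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body))
}
```

Routing, query parsing, and JSON encoding all come from `net/http`, `strconv`, and `encoding/json`, which is most of what a simple read-only API needs.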

sfg75 | 8 years ago | on: Building Real Time Analytics APIs at Scale

Hey, sorry if that wasn't clear enough (author here).

We decided not to go with ClickHouse because we were mostly looking for a SaaS solution. That's pretty much why we didn't spend too much time on Druid either.

Choosing Citus meant we could leverage a technology that we already had a bit of experience with (Postgres) and not really have to care about the infrastructure underneath it. We're still a fairly small team and those are meaningful factors to us.

At the end of the day, I'm sure all those systems (ClickHouse or Druid) would do the job fine; we just went for what seemed the easiest to implement and scale.

sfg75 | 8 years ago | on: TopN for your Postgres database

We've been using this extension for a while now at Algolia, great to see that it's now open sourced!

We rely heavily on this to power our analytics API. We use it to precompute tops for billions of daily events. We can then fetch tops across a specific time range on the fly, usually on the order of milliseconds. This was a game changer for us :)
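The precompute-then-merge idea can be sketched roughly as follows. This is a simplified illustration with exact in-memory counts, not the extension's API or its algorithm; `dailyTops`, `entry`, and `topAcross` are hypothetical names, and the data is made up.

```go
package main

import (
	"fmt"
	"sort"
)

// dailyTops maps a day (YYYY-MM-DD) to per-value counts precomputed at
// ingestion time. Hypothetical in-memory stand-in for stored rows.
type dailyTops map[string]map[string]int

// entry is one value with its merged count.
type entry struct {
	Value string
	Count int
}

// topAcross merges the precomputed counts for every day in [from, to]
// (inclusive, compared as ISO date strings) and returns the n largest.
func topAcross(tops dailyTops, from, to string, n int) []entry {
	merged := make(map[string]int)
	for day, counts := range tops {
		if day < from || day > to {
			continue
		}
		for value, count := range counts {
			merged[value] += count
		}
	}

	out := make([]entry, 0, len(merged))
	for value, count := range merged {
		out = append(out, entry{value, count})
	}
	// Sort by count descending, breaking ties by value for deterministic output.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Count != out[j].Count {
			return out[i].Count > out[j].Count
		}
		return out[i].Value < out[j].Value
	})

	if n < 0 {
		n = 0
	}
	if len(out) > n {
		out = out[:n]
	}
	return out
}

func main() {
	tops := dailyTops{
		"2018-01-01": {"shoes": 5, "socks": 3},
		"2018-01-02": {"shoes": 2, "hats": 4},
		"2018-01-03": {"hats": 9},
	}
	for _, e := range topAcross(tops, "2018-01-01", "2018-01-02", 2) {
		fmt.Println(e.Value, e.Count)
	}
}
```

The point is that a query only touches one small precomputed summary per day in the range rather than the raw events, which is why the read path can stay fast.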
