
temuze | 3 years ago

Back at my old job in ~2016, we built a cheap homegrown data warehouse out of Postgres, SQLite, and Lambda.

Basically, it worked like this:

- All of our data lived in compressed SQLite DBs on S3.

- Upon receiving a query, Postgres would use a custom foreign data wrapper we built.

- This FDW would forward the query to a web service.

- This web service would start one lambda per SQLite file. Each lambda would fetch the file, query it, and return the result to the web service.

- This web service would re-issue lambdas as needed and return the results to the FDW.

- Postgres (hosted on a memory-optimized EC2 instance) would aggregate.
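
The per-file step above can be sketched with only the Python standard library. This is not the original code: the schema, query, and local byte-passing are stand-ins for the real S3 download, but the shape (decompress a gzipped SQLite partition, query it, return rows) is what each lambda would do.

```python
import gzip
import sqlite3
import tempfile

def query_compressed_sqlite(blob: bytes, sql: str) -> list:
    """Simulate one lambda: decompress a gzipped SQLite file and run a query.

    In the real system the bytes would come from S3; here they are passed in.
    """
    with tempfile.NamedTemporaryFile(suffix=".db") as f:
        f.write(gzip.decompress(blob))
        f.flush()
        con = sqlite3.connect(f.name)
        try:
            return con.execute(sql).fetchall()
        finally:
            con.close()

# Build a tiny partition the way the warehouse might store one.
with tempfile.NamedTemporaryFile(suffix=".db") as src:
    con = sqlite3.connect(src.name)
    con.execute("CREATE TABLE events (ts INTEGER, value REAL)")
    con.executemany("INSERT INTO events VALUES (?, ?)",
                    [(i, i * 0.5) for i in range(100)])
    con.commit()
    con.close()
    with open(src.name, "rb") as fh:
        blob = gzip.compress(fh.read())

rows = query_compressed_sqlite(blob, "SELECT COUNT(*), SUM(value) FROM events")
print(rows)  # [(100, 2475.0)]
```

Postgres would then aggregate these partial results (counts, sums) across all lambdas.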

It was straight magic. Separated compute + storage with basically zero cost and better performance than Redshift and Vertica. All of our data was time-series data, so it was extraordinarily easy to partition.

It was also considerably cheaper than Athena. Athena would have cost us ~$5/TB scanned (a price that hasn't changed today!), so most queries would easily run >$100, and we were running thousands of queries per hour.

I still think, to this day, that the inevitable open-source solution for DWs might look like this. Insert your data as SQLite or DuckDB into a bucket, pop in a Postgres extension, create a FDW, and `terraform apply` the lambdas + api gateway. It'll be harder for non-timeseries data, but you could probably make something that partitions along other keys.
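
Part of why time-series data partitions so easily is that fan-out reduces to key arithmetic: a query's time range maps directly to the set of objects to scan. A sketch (the bucket layout and naming scheme here are assumptions, not the original design):

```python
from datetime import date, timedelta

def partition_keys(start: date, end: date, prefix: str = "warehouse") -> list:
    """List the S3 keys a query over [start, end] must fan out to,
    assuming one gzipped SQLite file per day."""
    days = (end - start).days + 1
    return [
        f"{prefix}/{(start + timedelta(days=i)).isoformat()}.sqlite.gz"
        for i in range(days)
    ]

keys = partition_keys(date(2016, 3, 1), date(2016, 3, 3))
print(keys)
# ['warehouse/2016-03-01.sqlite.gz',
#  'warehouse/2016-03-02.sqlite.gz',
#  'warehouse/2016-03-03.sqlite.gz']
```

The web service would launch one lambda per key; for non-timeseries data you would need a different key scheme (hash buckets, tenant IDs, etc.).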

ignoramous|3 years ago

We do something similar, but:

- instead of S3, we now use R2.

- instead of Postgres+Sqlite3, we use DuckDB+CSV/Parquet.

- instead of Lambda, we use AWS AppRunner (considering moving it to Fly.io or Workers).

It has worked gloriously for a variety of analytical workloads, even if it's slower than Clickhouse/Timescale/Redshift/Elasticsearch would have been.

Cwizard|3 years ago

How has your experience been with DuckDB in production? It is a relatively new project. How is its reliability?

teh|3 years ago

I've looked into this but saw hugely variable throughput, sometimes as little as 20 MB/second. Even at full throughput, I think S3 single-key performance maxes out at ~130 MB/second. How did you get these huge S3 blobs into lambda in a reasonable amount of time?

petethepig|3 years ago

* With larger lambdas you get more predictable performance; 2 GB RAM lambdas should get you ~90 MB/s [0].

* Assuming you can parse faster than you read from S3 (true for most workloads?), that read throughput is your bottleneck.

* Set a target query time, e.g. 1 s. That means for queries to finish in 1 s, each object on S3 has to be 90 MB or smaller.

* Partition your data in such a way that each object on S3 is smaller than 90 MB.

* Forgot to mention: you can also do parallel reads from S3; depending on your data format / parsing speed, that might be something to look into as well.

This is a somewhat simplified guide (e.g. for some workloads merging data takes time, and we're not including that here), but it should be good enough to start with.

[0] - https://bryson3gps.wordpress.com/2021/04/01/a-quick-look-at-...
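
The sizing logic above reduces to simple arithmetic. A sketch (the 90 MB/s figure comes from the comment; the function names are made up for illustration):

```python
import math

def max_partition_size_mb(read_mb_per_s: float, target_seconds: float) -> float:
    """If S3 read throughput is the bottleneck, the largest partition a
    lambda can scan within the target time is throughput * time."""
    return read_mb_per_s * target_seconds

def lambdas_needed(total_data_mb: float, partition_mb: float) -> int:
    """Number of equally sized partitions (one lambda each) to cover the data."""
    return math.ceil(total_data_mb / partition_mb)

# 2 GB lambdas at ~90 MB/s with a 1 s target -> 90 MB partitions.
print(max_partition_size_mb(90, 1.0))      # 90.0
# Scanning 1 TB (~1,000,000 MB) under those constraints:
print(lambdas_needed(1_000_000, 90))       # 11112
```

This also shows why the fan-out layer matters: sub-second scans of terabytes mean tens of thousands of concurrent lambdas, which the web service has to throttle and retry.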

simonw|3 years ago

How large were the SQLite database files you were working with here?

I've been thinking about building systems that store SQLite in S3 and pull them to a lambda for querying, but I'm nervous about how feasible it is based on database file size and how long it would take to perform the fetch.

I honestly hadn't thought about compressing them, but that would obviously be a big win.
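
Compression does tend to be a big win for SQLite files, since pages carry a lot of repetitive structure. A quick stdlib check (toy data, so the exact ratio is illustrative only):

```python
import gzip
import sqlite3
import tempfile

# Build a SQLite file full of repetitive log-like rows.
with tempfile.NamedTemporaryFile(suffix=".db") as f:
    con = sqlite3.connect(f.name)
    con.execute("CREATE TABLE logs (ts INTEGER, host TEXT, msg TEXT)")
    con.executemany(
        "INSERT INTO logs VALUES (?, ?, ?)",
        [(i, "web-01", "GET /index.html 200") for i in range(10_000)],
    )
    con.commit()
    con.close()
    with open(f.name, "rb") as fh:
        raw = fh.read()

packed = gzip.compress(raw)
ratio = len(raw) / len(packed)
print(f"{len(raw)} -> {len(packed)} bytes ({ratio:.1f}x smaller)")
```

The trade-off is that a compressed database can't be range-read or queried in place; the lambda has to download and decompress the whole file first, which is why the fetch-time question matters.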

Cwizard|3 years ago

Do you have a blog post or something similar where you go into more details on this architecture? I’d be very interested in reading it!

dfinninger|3 years ago

This sounds very similar to Trino’s (and by extension, Athena’s) architecture.

The rough mapping:

- SQLite -> Parquet (columnar instead of row storage)

- Lambda -> Worker Tasks

- FDW -> Connector

- Postgres Aggregation -> Worker Stage

We run it in Kubernetes (EKS) with auto-scaling, so that works sort of like lambda.

notmattbark|3 years ago

Quick question: are you on LinkedIn? Please send it my way.

joevandyk|3 years ago

curious: how large was each compressed s3 sqlite db?

temuze|3 years ago

Sorry, it's been a while, I forget :(

We had to balance between making the files too big (which would be slow) and making them too small (too many lambdas to start).

I _think_ they were around ~10 GB each, but I might be off by an order of magnitude.