cmollis | 1 year ago
I can run some pretty hairy scans against a huge S3 Parquet dataset in DuckDB that I would typically have to run in either Spark or Athena. It's a little slower, but not ridiculously slower. And it does all of that from my desktop: no clusters, no memory or task configs, just run the query. Being able to do all of the expensive historical scanning and knit the results back into an ML pipeline with desktop Python is pretty nice.
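For anyone curious what that looks like, here's a rough sketch, not my actual pipeline. The bucket path, the column names (`user_id`, `amount`, `event_date`) and both helper functions are made up for illustration, and S3 credential setup is left out since it depends on your environment.

```python
def build_scan_query(parquet_glob: str) -> str:
    """Return SQL that aggregates over every Parquet file matching the glob.

    The schema (user_id, amount, event_date) is hypothetical.
    """
    # Escape single quotes so the path is safe inside a SQL string literal.
    safe_glob = parquet_glob.replace("'", "''")
    return (
        "SELECT user_id, count(*) AS n_events, avg(amount) AS avg_amount "
        f"FROM read_parquet('{safe_glob}') "
        "WHERE event_date >= ? "
        "GROUP BY user_id"
    )


def scan_to_dataframe(parquet_glob: str, start_date: str):
    """Run the scan in-process and return the result as a pandas DataFrame."""
    import duckdb  # third-party: pip install duckdb pandas

    con = duckdb.connect()  # in-memory database, no server or cluster
    # The httpfs extension is what lets DuckDB read s3:// paths.
    con.execute("INSTALL httpfs")
    con.execute("LOAD httpfs")
    # start_date is bound to the single ? placeholder in the query.
    return con.execute(build_scan_query(parquet_glob), [start_date]).df()
```

Calling something like `scan_to_dataframe("s3://example-bucket/events/*.parquet", "2023-01-01")` hands back an ordinary DataFrame, so the result can go straight into whatever feature or training code is already running in the same Python process.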