Unless I’m missing some big numbers somewhere, you could do that locally on a Pi 5 with efficient code. Nothing heroic required, just a decently fast language like Go.
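For a sense of what I mean, here is a toy single-goroutine dedup loop in Go. The record shape and numbers are made up for illustration, and the map never evicts old keys, which a real version would have to fix:

    // Toy in-memory dedup: drop records whose key was seen within ttl.
    package main

    import (
        "fmt"
        "time"
    )

    type record struct {
        Key   string
        Value []byte
    }

    func dedup(in <-chan record, out chan<- record, ttl time.Duration) {
        seen := make(map[string]time.Time) // toy: unbounded, no eviction
        for r := range in {
            now := time.Now()
            if t, ok := seen[r.Key]; ok && now.Sub(t) < ttl {
                continue // duplicate inside the window, drop it
            }
            seen[r.Key] = now
            out <- r
        }
        close(out)
    }

    func main() {
        in := make(chan record, 4096)
        out := make(chan record, 4096)
        go dedup(in, out, 5*time.Minute)

        go func() {
            for i := 0; i < 1_000_000; i++ {
                in <- record{Key: fmt.Sprintf("k%d", i%100_000)} // ~90% duplicates
            }
            close(in)
        }()

        start := time.Now()
        kept := 0
        for range out {
            kept++
        }
        fmt.Printf("kept %d of 1000000 records in %v\n", kept, time.Since(start))
    }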
Totally fair point. For stable, known workloads, you can get really far with something lightweight on a single machine. The challenge comes when you need fault tolerance, scaling, and delivery guarantees without constantly jumping in to fix things. We often hear this from data teams dealing with traffic peaks they can't easily predict. But yes, a lot of existing tools make you pay a high efficiency cost for that. At GlassFlow we are trying to hit that sweet spot: efficient but still resilient.
super_ar|8 months ago
One of the top questions we received was: “How well does it perform at high throughput?”
We ran a load test and would like to share some results with you.
Summary of the test:
- Tested on 20M records
- Kafka produced 55,000 records/sec
- Processing rate of GlassFlow (deduplication): 9,000+ records/sec
- Measured on a MacBook Pro (M3 Max)
- End-to-end latency: <0.12 ms per request
Here is the blog post with the full test results, including runs with different parameters (rps, number of publishers, etc.): https://www.glassflow.dev/blog/load-test-glass-flow-for-clic...
It was important to us to set up the test so that anyone can reproduce it. Here are the docs: https://docs.glassflow.dev/load-test/setup
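If you just want a quick feel for the producer side before reading the docs, here is a minimal, illustrative stand-in in Go using the sarama Kafka client. The topic name, key scheme, and broker address are hypothetical; the actual harness is the one described in the setup docs above:

    // Illustrative Kafka load producer (github.com/IBM/sarama).
    package main

    import (
        "fmt"

        "github.com/IBM/sarama"
    )

    func main() {
        cfg := sarama.NewConfig()
        cfg.Producer.RequiredAcks = sarama.WaitForLocal

        producer, err := sarama.NewAsyncProducer([]string{"localhost:9092"}, cfg)
        if err != nil {
            panic(err)
        }
        defer producer.Close()

        // AsyncProducer reports failures on a channel; log them in the background.
        go func() {
            for e := range producer.Errors() {
                fmt.Println("produce error:", e)
            }
        }()

        // Fire 20M records with deliberately repeated keys so the
        // deduplication stage has duplicates to drop. Topic is hypothetical.
        for i := 0; i < 20_000_000; i++ {
            producer.Input() <- &sarama.ProducerMessage{
                Topic: "load-test",
                Key:   sarama.StringEncoder(fmt.Sprintf("k%d", i%100_000)),
                Value: sarama.StringEncoder(fmt.Sprintf(`{"id":%d}`, i)),
            }
        }
    }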
We would love to get feedback, especially from folks consuming high-throughput data in ClickHouse.
Thanks for reading!
Ashish and Armend (founders)
secondcoming|8 months ago
Everything was running on the same machine?
kI3RO|8 months ago
super_ar|8 months ago
Setup: https://docs.glassflow.dev/load-test/setup
Results: https://docs.glassflow.dev/load-test/results
sml156|8 months ago
api|8 months ago
My laptop can run 70B LLMs at usable speeds.
I know. Doesn’t scale. No redundancy. No auto redeploy on failures. This is what I mean.
Do we really have to sacrifice this much efficiency for those things, or are we doing it wrong? Does the ability to redeploy on failures, cluster, and scale really require order-of-magnitude performance penalties across the whole stack?
super_ar|8 months ago