Owner of Tonbo here. This critique makes sense in a classic web-app model.
Lovely project. Also @rubenvanwyk mentioned SlateDB. I am not sure if this will fit my use-case but, today, I was looking for data hosting options for a self-hosted LLM+bot for email/calendar.
Sounds very interesting, but the README has me pondering the downsides. Is the latency very high? Are requests not immediately durable? Is it super expensive?
rubenvanwyk|2 months ago
These seem like similar ideas, although SlateDB seems a bit more lightweight, and using Parquet as the primitive (even using Arrow) might mean more compute-heavy work on the client side?
pdyc|2 months ago
>SlateDB is designed for key/value (KV) online transaction processing (OLTP) workloads. It is optimized for lowish-latency, high-throughput writes. It is not optimized for analytical queries that scan large amounts of columnar data. For online analytical processing (OLAP) workloads, we recommend checking out Tonbo.
spwa4|2 months ago
1) Your serverless and edge runtime needs to have internet access, so it can contact anyone.
2) You're obviously not going to be able to write to S3 efficiently while providing guarantees, so it'll be expensive.
3) You're writing in Rust, so you really care about correctness and efficiency.
This seems like a contradiction. Why would you do this as opposed to hosting a redundant Postgres on 2 Hetzner/OVH/... servers and writing to that?
ethegwo|2 months ago
What's shifting is workloads. More and more compute runs in short-lived sandboxes: WASM runtimes (browser, edge), Firecracker, etc. These are edge environments, but not just for web applications.
We're exploring a different architecture for these workloads: ephemeral, stateless compute with storage treated as a format rather than a service.
This also maps to how many AI agent services want per-user or per-workspace isolation at large scale, without operating millions of always-on database servers.
If you're happy running a long-lived Postgres service, Neon or Supabase are great choices.
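To make "storage treated as a format rather than a service" a bit more concrete, here is a minimal sketch, not Tonbo's actual API or file layout. It assumes a hypothetical per-workspace layout of immutable segment files plus a manifest, with a plain dict standing in for an object store bucket and JSON standing in for a columnar format such as Parquet. The function names `write_segment` and `handle_get` are made up for illustration.

```python
import json
from typing import Dict, Optional

# Hypothetical stand-in for an object store bucket (S3, R2, ...): path -> bytes.
ObjectStore = Dict[str, bytes]

EMPTY_MANIFEST = b'{"segments": []}'


def write_segment(store: ObjectStore, workspace: str, rows: Dict[str, str]) -> None:
    """Persist one immutable segment, then publish it in the workspace manifest."""
    manifest_key = f"{workspace}/manifest.json"
    manifest = json.loads(store.get(manifest_key, EMPTY_MANIFEST))
    segment_key = f"{workspace}/segment-{len(manifest['segments']):06d}.json"
    store[segment_key] = json.dumps(rows, sort_keys=True).encode()
    manifest["segments"].append(segment_key)
    # Assumption: a real store would need a conditional write here so that
    # two concurrent writers cannot overwrite each other's manifest update.
    store[manifest_key] = json.dumps(manifest).encode()


def handle_get(store: ObjectStore, workspace: str, key: str) -> Optional[str]:
    """Stateless read: all state is discovered from the files on each call."""
    manifest = json.loads(store.get(f"{workspace}/manifest.json", EMPTY_MANIFEST))
    # The newest segment wins, so scan the segments in reverse order.
    for segment_key in reversed(manifest["segments"]):
        rows = json.loads(store[segment_key])
        if key in rows:
            return rows[key]
    return None
```

Each workspace is only a key prefix, so isolating a user costs nothing while idle, and any short-lived sandbox that can reach the bucket can serve `handle_get` without a database process running in between.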
rglover|2 months ago
brainless|2 months ago
I have a product I tried and stopped before: https://github.com/pixlie/dwata and I want to restart it. The idea is to create a knowledge graph (using GLiNER for NER). Compute would run either on the desktop or in the cloud (instances).
Then store the data on S3, Cloudflare Workers KV, or AWS DynamoDB and access it with cloud functions to hook up to a WhatsApp/Telegram bot. I may stick with the DynamoDB or Cloudflare options eventually though (both have cloud functions support).
I need persistent storage for key/value data (the graph, maybe embeddings) for cloud functions. Completely self-hosted email/calendar bot with LLM, own cloud, own API keys. Super low running cost.
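As a rough illustration of how a knowledge graph could sit on plain key/value storage, here is a minimal sketch. The key layout (`entity/<id>`, `edges/<src>/<relation>`) and the `GraphKV` class are hypothetical, and a dict stands in for whichever backend is chosen (S3, Workers KV, DynamoDB); each of those has its own client API, which is not shown here.

```python
import json
from typing import Dict, List


class GraphKV:
    """Knowledge graph on top of a key/value store; a dict stands in here."""

    def __init__(self, kv: Dict[str, str]):
        self.kv = kv

    def put_entity(self, entity_id: str, label: str, name: str) -> None:
        # One value per entity, e.g. a person or event extracted by NER.
        self.kv[f"entity/{entity_id}"] = json.dumps({"label": label, "name": name})

    def add_edge(self, src: str, relation: str, dst: str) -> None:
        # Adjacency list per (source, relation) pair. This read-modify-write
        # is not atomic; a real backend would need a conditional update.
        key = f"edges/{src}/{relation}"
        targets = json.loads(self.kv.get(key, "[]"))
        if dst not in targets:
            targets.append(dst)
        self.kv[key] = json.dumps(targets)

    def neighbors(self, src: str, relation: str) -> List[dict]:
        # One read for the adjacency list, then one read per neighbor.
        targets = json.loads(self.kv.get(f"edges/{src}/{relation}", "[]"))
        return [json.loads(self.kv[f"entity/{t}"]) for t in targets]
```

A cloud function handling a bot message would then call something like `GraphKV(store).neighbors("p1", "attends")`, which costs a handful of point reads rather than a query against a running database.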
Eikon|2 months ago
[0] https://github.com/Barre/ZeroFS
foodbaby|2 months ago
WilcoKruijer|2 months ago
ethegwo|2 months ago
rubenvanwyk|2 months ago
niek_pas|2 months ago
ethegwo|2 months ago
canadiantim|2 months ago
ethegwo|2 months ago
unknown|2 months ago
[deleted]