top | item 39853183


sujayakar | 1 year ago

this is a really cool idea.

one followup I was thinking of is whether this can generalize to queries other than key-value point lookups. if I'm understanding correctly, the article is suggesting to take a key value store, and for every `(key, value)` in the system, split `value` into fragments that are stored on different shards with some `k`-of-`M` erasure code. then at query time, we can split a query for `key` into `k` subqueries that we send to the relevant shards and reassemble the query results into `value`.
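(as a toy illustration of that split-and-reassemble flow: a minimal 2-of-3 XOR parity code, where any 2 of the 3 fragments recover the value. real systems would use Reed-Solomon for general `k`-of-`M`; all function names here are made up for the sketch.)

```python
# Toy 2-of-3 erasure code: split a value into two data fragments plus
# one XOR parity fragment; any 2 of the 3 fragments reconstruct it.

def encode(value: bytes) -> list:
    """Split `value` into fragments [a, b, a^b], each of equal length."""
    half = (len(value) + 1) // 2
    a = value[:half]
    b = value[half:].ljust(half, b"\x00")  # pad so both halves match
    parity = bytes(x ^ y for x, y in zip(a, b))
    return [a, b, parity]

def decode(fragments: dict, length: int) -> bytes:
    """Reassemble from any 2 fragments. `fragments` maps index -> bytes."""
    if 0 in fragments and 1 in fragments:
        a, b = fragments[0], fragments[1]
    elif 0 in fragments:  # have data fragment a and parity
        a = fragments[0]
        b = bytes(x ^ y for x, y in zip(a, fragments[2]))
    else:                 # have data fragment b and parity
        b = fragments[1]
        a = bytes(x ^ y for x, y in zip(b, fragments[2]))
    return (a + b)[:length]  # drop padding
```

at query time each fragment would live on a different shard, and the client issues 2 of the 3 subqueries and runs `decode` on whatever pair comes back first.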

so, if we were to do the same business for an ordered map with range queries, we'd need to find a way to turn a query for `interval: [start, end]` into some number of subqueries that we could send to the different shards and reassemble into the final result. any ideas?


ddorian43 | 1 year ago

There's https://ydb.tech/, an open source DB that uses erasure coding for replication in a single zone/region.

eivanov89 | 1 year ago

In YDB with block 4+2 erasure coding, you need half the disk space compared to the mirror-3-dc schema. Meanwhile, CPU usage is just a little bit higher, so in high-throughput tests mirror-3-dc wins. Indeed, as mentioned in the post, there might be a tail latency win in latency-focused runs, but if your goal is throughput with reasonable latencies, replication might be a better choice.
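(the disk-space claim follows from back-of-the-envelope overhead arithmetic:)

```python
# Raw-bytes-per-logical-byte overhead for the two schemas above.
# block 4+2: 6 raw blocks stored per 4 logical blocks.
# mirror-3-dc: 3 full copies of every byte.
ec_overhead = (4 + 2) / 4   # 1.5x
mirror_overhead = 3 / 1     # 3.0x
ratio = ec_overhead / mirror_overhead
print(ratio)  # 0.5 -> block 4+2 needs half the disk of mirror-3-dc
```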

ddorian43 | 1 year ago

> so, if we were to do the same business for an ordered map with range queries, we'd need to find a way to turn a query for `interval: [start, end]` into some number of subqueries that we could send to the different shards and reassemble into the final result. any ideas?

For DBs backed by S3-like storage, the storage layer does this for you, but for blocks of, say, 1MB, not per-KV (which would have high overhead).

Think: you use RocksDB in your DB, and erasure-code the SSTables.
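(a minimal sketch of that block-level idea, reusing a toy 2-of-3 XOR parity code over fixed 1MB blocks; block size, the code, and all names here are illustrative, not what any real system uses. a range query for `[start, end]` then maps to the blocks covering that byte range, reconstructs each block from any 2 of its 3 fragments, and scans the decoded bytes.)

```python
# Toy sketch: erasure-code fixed-size blocks of an SSTable-like file
# instead of individual key-value pairs, amortizing the coding overhead.

BLOCK = 1 << 20  # 1MB blocks

def encode_blocks(data: bytes):
    """Yield (block_index, [frag_a, frag_b, parity]) for each block."""
    for off in range(0, len(data), BLOCK):
        block = data[off:off + BLOCK]
        half = (len(block) + 1) // 2
        a = block[:half]
        b = block[half:].ljust(half, b"\x00")  # pad to match a
        parity = bytes(x ^ y for x, y in zip(a, b))
        yield off // BLOCK, [a, b, parity]

def blocks_for_range(start: int, end: int):
    """Block indexes a [start, end) byte-range query must reconstruct."""
    return list(range(start // BLOCK, (end + BLOCK - 1) // BLOCK))
```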