top | item 39854027

(no title)

In YDB with block 4+2 erasure coding, you need half the disk space compared to mirror-3-dc schema. Meanwhile CPU usage is just a little bit higher, thus in high throughput tests mirror-3-dc wins. Indeed as mentioned in the post there might be a tail latency win in latency runs, but if your task is throughput with a reasonable latencies, replication might be a better choice.

discuss

pjdesno|1 year ago

If you only care about throughput, just fetch the data and read should be the same speed as triple replicated.

For writing, triple-rep has to write 2x as much data or more, so it's going to be slower unless your CPUs are horribly slow compared to your drives.

ddorian43|1 year ago

I expect it to save a lot of CPU by only needing 1/3x of compactions. You might want to do a benchmark on that ;). An example is quickwit (building inverted indexes is very expensive).