merb | 2 months ago

> 3.4 Lazy fsync by Default

Why? Why do some databases do that? To have better performance in benchmarks? It would be fine to offer this as an option if the default were safer, or at least if it were documented prominently. But especially when you run things in a small cluster, you get bitten by defaults like that.

aaronbwebber|2 months ago

It's not just better performance on latency benchmarks, it likely improves throughput as well because the writes will be batched together.

Many applications do not require true durability and it is likely that many applications benefit from lazy fsync. Whether it should be the default is a lot more questionable though.

johncolanduoni|2 months ago

It’s like using a non-cryptographically secure RNG: if you don’t know enough to go looking for the flag that turns fsync off yourself, it’s unlikely you know enough to evaluate the impact of durability on your application.

semiquaver|2 months ago

You can batch writes while at the same time not acknowledging them to clients until they are flushed; it just takes more bookkeeping.

tybit|2 months ago

I also think fsync before acking writes is a better default. That aside, if you were to choose async for batching writes, their default value surprises me. 2 minutes seems like an eternity. Would you not get very good batching for throughput even at something like 2 seconds too? Still not safe, but safer.

senderista|2 months ago

For transactional durability, the writes will definitely be batched ("group commit"), because otherwise throughput would collapse.

otabdeveloper4|2 months ago

> Many applications do not require true durability

Pretty much no application requires true durability.

millipede|2 months ago

I always wondered why the fsync has to be lazy. It seems like the fsyncs can be bundled up together, and the notification messages held for a few milliseconds while the write completes. Similar to TCP corking. There doesn't need to be one fsync per consensus.

aphyr|2 months ago

Yes, good call! You can batch up multiple operations into a single call to fsync. You can also tune the number of milliseconds or bytes you're willing to buffer before calling `fsync` to balance latency and throughput. This is how databases like Postgres work by default--see the `commit_delay` option here: https://www.postgresql.org/docs/8.1/runtime-config-wal.html
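
A minimal Python sketch of the batching described above: several writes share one `fsync`, and none of them is acknowledged until that `fsync` returns. The class name `GroupCommitLog`, the `max_batch_bytes` threshold, and the `fsync_calls` counter are all hypothetical names for illustration, not any database's API. A real implementation would presumably also flush on a timer (the "milliseconds" half of the tuning knob), which is left out here to keep the example deterministic.

```python
import os
import tempfile
from concurrent.futures import Future


class GroupCommitLog:
    """Append-only log that acknowledges writes only after a shared fsync."""

    def __init__(self, path, max_batch_bytes=4096):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        self._max_batch_bytes = max_batch_bytes  # assumed tuning knob
        self._pending = []        # (payload, future) pairs not yet durable
        self._pending_bytes = 0
        self.fsync_calls = 0      # kept only so the demo can count fsyncs

    def append(self, payload: bytes) -> Future:
        """Buffer a record; the future resolves once the record is on disk."""
        ack = Future()
        self._pending.append((payload, ack))
        self._pending_bytes += len(payload)
        if self._pending_bytes >= self._max_batch_bytes:
            self.flush()
        return ack

    def flush(self) -> None:
        """Write every buffered record, fsync once, then acknowledge them all."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._pending_bytes = 0
        view = memoryview(b"".join(payload for payload, _ in batch))
        while view:               # os.write may write fewer bytes than asked
            written = os.write(self._fd, view)
            view = view[written:]
        os.fsync(self._fd)        # one fsync covers the whole batch
        self.fsync_calls += 1
        for _, ack in batch:
            ack.set_result(True)  # clients hear back only after the fsync

    def close(self) -> None:
        self.flush()
        os.close(self._fd)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        log = GroupCommitLog(os.path.join(tmp, "wal.log"))
        acks = [log.append(b"record-%d\n" % i) for i in range(3)]
        before = [ack.done() for ack in acks]   # nothing acknowledged yet
        log.flush()
        after = [ack.done() for ack in acks]    # all acknowledged together
        calls = log.fsync_calls
        log.close()
        return before, after, calls
```

In `demo()`, three appends are acknowledged by a single `fsync`; shrinking `max_batch_bytes` trades throughput for lower latency per write.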

kbenson|2 months ago

That was my immediate thought as well, under the assumption the lazy fsync is for performance. I imagine in some situations, delaying the confirmation until the write actually happens is okay (depending on the delay). But it also occurred to me that if the delay is long enough, the system is busy enough, and the time to send the message is short enough, the number of connections you need to keep open can be some small or large multiple of what you would need without delaying the confirmation until the actual write.
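
The connection-count concern can be estimated with Little's law: the average number of requests open at once equals the arrival rate times how long each request stays open. A rough Python sketch follows; the rate and both latencies are made-up numbers for illustration, not measurements from any database.

```python
def in_flight(requests_per_second: float, seconds_open: float) -> float:
    """Little's law: average number of requests open at any moment."""
    return requests_per_second * seconds_open


# Hypothetical workload, chosen only to make the arithmetic easy to follow.
RATE = 10_000          # writes per second
LAZY_ACK = 0.0002      # assumed 0.2 ms to ack without waiting for fsync
BATCHED_ACK = 0.0052   # assumed 5 ms more while waiting for a batched fsync

lazy = in_flight(RATE, LAZY_ACK)
batched = in_flight(RATE, BATCHED_ACK)
multiple = batched / lazy
print(f"lazy: {lazy:.0f} open, batched: {batched:.0f} open, {multiple:.0f}x")
```

With these assumed numbers, holding the acknowledgement for a batched fsync means roughly 52 open requests instead of 2. The multiple depends entirely on the ratio of the two latencies.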

senderista|2 months ago

In practice, there must be a delay (from batching) if you fsync every transaction before acknowledging commit. The database would be unusably slow otherwise.

mrkeen|2 months ago

One of the perks of being distributed, I guess.

The kind of failure that a system can tolerate with strict fsync but can't tolerate with lazy fsync (i.e. the software 'confirms' a write to its caller but then crashes) is probably not the kind of failure you'd expect to encounter on a majority of your nodes all at the same time.

johncolanduoni|2 months ago

It is if they’re in the same physical datacenter. Usually the way this is done is to wait for at least M replicas to fsync, but only require the data to be in memory for the rest. It smooths out the tail latencies, which are quite high for SSDs.
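
A minimal Python sketch of that acknowledgement rule, assuming each replica reports whether it has received the write and whether it has fsynced it. `ReplicaState`, `can_acknowledge`, `copies_needed`, and `min_durable` are hypothetical names for illustration, not taken from any particular system.

```python
from dataclasses import dataclass


@dataclass
class ReplicaState:
    received: bool   # write is at least in the replica's memory
    fsynced: bool    # write has been flushed to stable storage


def can_acknowledge(replicas, copies_needed, min_durable):
    """Ack once enough replicas hold the write and at least M have fsynced it."""
    received = sum(1 for r in replicas if r.received or r.fsynced)
    durable = sum(1 for r in replicas if r.fsynced)
    return received >= copies_needed and durable >= min_durable


# Three replicas: all have the write in memory, only one has fsynced so far.
replicas = [
    ReplicaState(received=True, fsynced=True),
    ReplicaState(received=True, fsynced=False),
    ReplicaState(received=True, fsynced=False),
]

ack_with_one_durable = can_acknowledge(replicas, copies_needed=3, min_durable=1)
ack_with_two_durable = can_acknowledge(replicas, copies_needed=3, min_durable=2)
```

With `min_durable=1` the write can be acknowledged as soon as the fastest disk finishes, which is where the tail-latency benefit comes from; with `min_durable=2` it has to wait for a second fsync.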

thinkharderdev|2 months ago

> To have better performance in benchmarks

Yes, exactly.

dilyevsky|2 months ago

Massively improves benchmark performance. Like 5-10x

speedgoose|2 months ago

/dev/null is even faster.

cnlwsu|2 months ago

Durability comes from replication and distribution, and a lazy fsync gives better throughput because more writes build up within the window.