marsovo | 11 months ago
Otherwise, is the suggestion that there be an artificial delay to allow other transactions to piggyback before returning success on commit 1?
Should that be a default? (That was the context of this thread)
toast0 | 11 months ago
Yes, the transaction is committed when it is durably written to disk. However, there's not a great API for durably writing to disk: you can write to an FD (or to an mmapped file) and it'll get written eventually, hopefully. fsync asks the OS to confirm that the writes on an FD are durably committed, but it is not without its quirks.
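A minimal sketch of that basic durability dance, using Python's os module (the filename is just a temp file for illustration): write bytes to an FD, then fsync to ask the OS to make them durable. Until fsync returns, the data may only live in the page cache.

```python
import os
import tempfile

path = tempfile.mkstemp()[1]
fd = os.open(path, os.O_WRONLY | os.O_APPEND)

record = b"transaction record\n"
n = os.write(fd, record)  # lands in the page cache; not yet durable
os.fsync(fd)              # ask the OS to flush it to stable storage
os.close(fd)              # closing alone would NOT have guaranteed durability
```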
> Otherwise, is the suggestion that there be an artificial delay to allow other transactions to piggyback before returning success on commit 1?
Not really an artificial delay. More that if you have multiple transactions waiting to be committed, you shouldn't commit them to disk one at a time.
Instead, write several to disk, then fsync, then send commit notices.
> A responsible database engine writes transaction data to an FD, then does an fsync, then signals completion to the client, then moves on to the next transaction, right?
The suggestion is that because fsync is rate-limited and blocks further writes while it's pending, you can get better throughput by writing several transactions before calling fsync. The database engine still doesn't signal completion for a transaction until an fsync after that transaction is written, but you get more data written per fsync. There is a latency penalty for the first transaction in the batch, because you must wait for the whole batch's writes to become durable, but because you're increasing throughput, average latency likely decreases.
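That batching idea (often called group commit) can be sketched like this; the function name and record format are illustrative, not any particular engine's API. All pending transactions are written, one fsync covers them all, and only then is each one acknowledged:

```python
import os
import tempfile

def group_commit(fd, pending):
    """Durably commit a batch of serialized records with a single fsync."""
    for record in pending:
        os.write(fd, record)       # buffered by the OS; not yet durable
    os.fsync(fd)                   # one flush makes the whole batch durable
    return [True] * len(pending)   # ack each transaction only after the fsync

path = tempfile.mkstemp()[1]
fd = os.open(path, os.O_WRONLY | os.O_APPEND)
acks = group_commit(fd, [b"txn1\n", b"txn2\n", b"txn3\n"])
os.close(fd)
```

Three transactions become durable for the price of one fsync, instead of three.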
Really, there's a fundamental mismatch between the capabilities of the system, the requirements of the database engine, and the interface between them. Synchronous fsync meets the requirements, but an asynchronous fsync would be better for throughput. Then the database engine could write transaction 1, call for fsync 1, write transaction 2, call for fsync 2, etc., and once the responses came in, signal commits to the relevant clients. Having more requests in the pipeline is key to throughput in a communicating system.
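A hypothetical sketch of that pipelining, approximated here with a background syncer thread since there's no portable async fsync: the main path issues writes without waiting, and the syncer performs fsyncs and delivers acks as they complete. Names and the ack mechanism are made up for illustration.

```python
import os
import queue
import tempfile
import threading

def start_syncer(fd, acks):
    """Background thread that fsyncs and acks committed transactions in order."""
    q = queue.Queue()

    def loop():
        while True:
            txn_id = q.get()
            if txn_id is None:     # shutdown sentinel
                return
            os.fsync(fd)           # durability point for this transaction
            acks.append(txn_id)    # signal the commit only after the fsync

    t = threading.Thread(target=loop)
    t.start()
    return q, t

acks = []
path = tempfile.mkstemp()[1]
fd = os.open(path, os.O_WRONLY | os.O_APPEND)
q, t = start_syncer(fd, acks)

for txn_id, record in enumerate([b"txn1\n", b"txn2\n"]):
    os.write(fd, record)  # issued immediately, not blocked by a pending fsync
    q.put(txn_id)         # request an fsync + ack for this transaction

q.put(None)
t.join()
os.close(fd)
```

On Linux, io_uring exposes genuinely asynchronous fsync submission, which is closer to the interface being wished for here.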
unknown | 11 months ago
[deleted]
marsovo | 11 months ago
https://techcommunity.microsoft.com/blog/sqlserver/sql-serve...
More from kernel.org: https://www.kernel.org/doc/html/latest/block/writeback_cache...