top | item 40993559

(no title)

seddonm1 | 1 year ago

Thanks.

All good and valid questions.

1. I work mostly in Rust so I'll answer there in terms of async. This library [0] uses queues to manage workload. I run a modified version [1] which creates 1 writer and n reader connections to a WAL backed SQLite and dispatch async transactions against them. The n readers will pull work from a shared common queue.

2. Yes there is not much you can do about file IO but SQLite is still a full database engine with caching. You could use this benchmarking tool to help understand where your limits would be (you can do a run against a ramdisk then against your real storage).

3. As per #1, I keep connections open and distribute transactions across them myself. Checkpointing will only be a problem under considerable sustained write load but you should be able to simulate your load and observe the behavior. The WAL2 branch of SQLite is intended to prevent sustained load problems.

[0]: https://github.com/programatik29/tokio-rusqlite [1]: https://github.com/seddonm1/s3ite/blob/0.5.0/src/database.rs

discuss

leononame|1 year ago

Thanks for your answer.

For 1, what is a good n? More than NUM_CPU probably does not make sense, right? But would I want to keep it lower?

Also, you dispatch transactions in your queue? You define your whole workload upfront, send it to the queue and wait for it to finish?

seddonm1|1 year ago

I went through the same mental process as you and also use num_cpus [0] but this is based only on intuition that is likely wrong. More benchmarking is needed as my benchmarks show that more parallelism only works to a point.

You can see how the transactions work in this example[1]. I have a connection `.write()` or `.read()` which decides which queue to use. I am in the process [2] of trying to do a PR against rusqlite to set the default transaction behavior as a result of this benchmarking so hopefully `write()` will default to IMMEDIATE and `read()` remains DEFERRED.

[0] https://docs.rs/num_cpus/latest/num_cpus/ [1] https://github.com/seddonm1/s3ite/blob/0.5.0/src/s3.rs#L147 [2] https://github.com/rusqlite/rusqlite/pull/1532

pdimitar|1 year ago

Valuable info and links, instant bookmarks, thank you!

If you don't mind me asking, why did you go with rusqlite + a tokio wrapper for it and not go with sqlx?

seddonm1|1 year ago

Whilst I love the idea of SQLX compile-time checked queries it is not always practical to need a database connection to compile the code in my experience. If it works for you then thats great but we had a few tricky edge cases when dealing with migrations etc.

Also, and more fundamentally, your application state is the most valuable thing you have. Do whatever you feel makes you most comfortable to make sure that state (and state transitions) is as well understood as possible. rusqlite is that for me.