top | item 31324556


dsincl12 | 3 years ago

Uhm... experience from a large project that used SQLite was that we where hit with SQLite only allowing one write transaction at a time. That is madness for any web app really.

Why does everyone seem so hyped about this when it can't really work properly IRL? If you have large amounts of data that need to be stored, the app would die instantly, or leave all your users waiting for their changes to be saved.

What am I missing?


masklinn|3 years ago

> Uhm... experience from a large project that used SQLite was that we were hit with SQLite only allowing one write transaction at a time. That is madness for any web app, really.

"Lots of readers, few writers" is an extremely common application property, though. Your average HN thread has significantly more reads than it has writes, especially if you bump the "worthless" writes (e.g. vote counts) out of the DB entirely and only flush them to the durable DB once in a while, for instance.
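As a rough sketch of that flush pattern (the class, schema, and column names here are illustrative, not from any real codebase): increments accumulate in memory and a periodic flush turns many tiny writes into one batched transaction.

```python
import sqlite3
import threading
from collections import Counter

class VoteBuffer:
    """Buffer "worthless" vote writes in memory; flush to SQLite in batches."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.pending: Counter = Counter()
        self.lock = threading.Lock()

    def vote(self, post_id: int) -> None:
        with self.lock:       # in-memory only: no DB write yet
            self.pending[post_id] += 1

    def flush(self) -> None:  # call from a timer, e.g. every 10 seconds
        with self.lock:
            batch, self.pending = self.pending, Counter()
        with self.conn:       # one transaction for the whole batch
            self.conn.executemany(
                "UPDATE posts SET votes = votes + ? WHERE id = ?",
                [(n, pid) for pid, n in batch.items()],
            )
```

The trade-off is explicit: a crash between flushes loses the buffered votes, which is exactly why this only suits writes you've decided are "worthless".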

And SQLite's WAL mode supports this workload even better: while there is still a single writer, the writer no longer blocks the readers, which was the real issue in the exclusive-locking case (there, the single writer would bring read concurrency down to 0).

littlecranky67|3 years ago

Another pattern to avoid "worthless" writes is statistical sampling. I.e. if you have votes in the >100,000 range, generate a random number p in [0, 1] and only perform a write if p < 0.01; when reading votes, multiply by 100, etc. Of course, you have to assess individually whether it's feasible for your operation, hence the "worthless".
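A sketch of the sampling idea (the table and column names are hypothetical): only ~1% of votes ever touch the database, and reads scale the stored count back up.

```python
import random
import sqlite3

SAMPLE_RATE = 0.01  # persist roughly 1 in 100 votes

def record_vote(conn: sqlite3.Connection, post_id: int) -> None:
    """Probabilistically record a vote; ~99% of calls do no DB write at all."""
    if random.random() < SAMPLE_RATE:
        conn.execute(
            "UPDATE posts SET sampled_votes = sampled_votes + 1 WHERE id = ?",
            (post_id,),
        )
        conn.commit()

def read_votes(conn: sqlite3.Connection, post_id: int) -> int:
    """Scale the sampled count back up to an estimate of the true total."""
    (sampled,) = conn.execute(
        "SELECT sampled_votes FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    return round(sampled / SAMPLE_RATE)
```

The estimate's relative error shrinks as counts grow, which is why the parent comment scopes this to the >100,000 range.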

pc86|3 years ago

Is there something you can point to that explains this "flush them to the durable DB once in a while" pattern in more detail?

jbverschoor|3 years ago

- Most transactions are read-only

- "Large" applications can usually be sharded by account. This means one file per account, which can easily be put in the most optimal geolocation for that account

- Locking can be deferred until commit, allowing multiple writers ( https://www.sqlite.org/cgi/src/doc/begin-concurrent/doc/begi... ). This is good enough for most applications anyway.

- SQLite is simple, fast enough for almost anything, supports a good set of features and datatypes, and is very easy to embed.
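The per-account sharding from the second point can be sketched like this (the storage root, schema, and function name are all hypothetical):

```python
import sqlite3
import tempfile
from pathlib import Path

DATA_DIR = Path(tempfile.mkdtemp())  # stand-in for the real storage root

def connection_for_account(account_id: str) -> sqlite3.Connection:
    """Open (creating on first use) the per-account database file.

    One SQLite file per account means the single-writer limit applies
    per account rather than to the whole application, and each file
    can be placed close to that account's users.
    """
    conn = sqlite3.connect(DATA_DIR / f"{account_id}.db")
    conn.execute("PRAGMA journal_mode=WAL")  # readers don't block on the writer
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices (id INTEGER PRIMARY KEY, total REAL)"
    )
    return conn
```

Cross-account queries become harder under this layout, which is the main thing to weigh before committing to it.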

tidenly|3 years ago

Why would I bake all of those assumptions and limitations into my system, though, just on the hope it won't ever become a problem?

samwillis|3 years ago

Quite right, it’s not one-size-fits-all, but for any site that’s mostly read-only it’s a brilliant solution.

Simon Willison has written about it and coined the term “baked data”: https://simonwillison.net/2021/Jul/28/baked-data/

Mozilla.org uses this architecture, Django app running off SQLite with the db rsync’ed to each application server.

quickthrower2|3 years ago

The confusion is probably that a lot of us work at smaller companies that serve a broad solution to a niche customer base, and that kind of app has a lot of reads and writes but doesn't need to scale. This app might be doing the invoicing/shipments/specialist parts of a business, for example.

Whereas there is another, different kind of engineering, which I will probably never be a part of (simply due to the mathematics of available positions), where you are scaling something up for millions of users but the app is much simpler, like a Twitter or Reddit, and the challenge is in the scaling.

phaedrus|3 years ago

The default settings of SQLite are very conservative and essentially enforce serial writes. With tuning and loosening that enforcement, you can go from 50 writes per second to 50,000.

Edit: forgot to mention that yes a major part of that is batching writes into fewer, bigger transactions; AFAIK you can't really get around that.

andai|3 years ago

https://www.sqlite.org/faq.html#q19

>INSERT is really slow - I can only do few dozen INSERTs per second

>Actually, SQLite will easily do 50,000 or more INSERT statements per second on an average desktop computer. But it will only do a few dozen transactions per second.

>By default, each INSERT statement is its own transaction. But if you surround multiple INSERT statements with BEGIN...COMMIT then all the inserts are grouped into a single transaction.

>Another option is to run PRAGMA synchronous=OFF. This command will cause SQLite to not wait on data to reach the disk surface, which will make write operations appear to be much faster. But if you lose power in the middle of a transaction, your database file might go corrupt.
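The difference the FAQ describes, sketched with Python's sqlite3 (the table name is illustrative):

```python
import sqlite3

# One transaction per INSERT: every commit must be made durable before
# the next begins, which is what caps throughput at a few dozen
# transactions per second on real disks.
def insert_one_by_one(conn: sqlite3.Connection, values) -> None:
    for v in values:
        conn.execute("INSERT INTO samples (value) VALUES (?)", (v,))
        conn.commit()

# All INSERTs wrapped in a single BEGIN...COMMIT: one durable commit
# for the whole batch, which is how the 50,000 inserts/second figure
# becomes reachable.
def insert_batched(conn: sqlite3.Connection, values) -> None:
    with conn:  # opens a transaction, commits once on success
        conn.executemany(
            "INSERT INTO samples (value) VALUES (?)", ((v,) for v in values)
        )
```

Unlike `PRAGMA synchronous=OFF`, batching sacrifices no durability for committed data; a crash mid-batch simply rolls the whole batch back.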

dagw|3 years ago

> What am I missing?

Many sites are Read (almost) Only. For sites where users interactively query/view/explore the data, but (almost) never write their own, it works great.

unicornporn|3 years ago

Speaking of this, I really wish there was SQLite support in WordPress...

beberlei|3 years ago

Use more than one SQLite file? We have one per day and per project, for example.

smt88|3 years ago

I don't know if you're joking or not, but this would just be reinventing the Postgres/SQL Server/Oracle/MySQL wheel using duct tape and wishes.

If you're doing something that multiple systems have had millions of hours of development to do, just use one of those.

sph|3 years ago

But why? That seems such an unnecessary hack.