
d1l | 7 months ago

This is strange on so many levels.

SQLite does not even do network I/O.

How does sharing a connection (and transaction scope) in an asyncio environment even work? Won’t you still need a connection per asyncio context?

Does sqlite3_open really take long compared to the inevitable contention for the write lock you’ll see when you have many concurrent contexts?

Does sqlite3_open even register in comparison with the overhead of the Python interpreter?

What is an asyncio SQLite connection anyways? Isn’t it just a regular one that gets hucked into a separate thread?

simonw|7 months ago

If you're talking to a 100KB SQLite database file, this kind of thing is likely unnecessary; just opening and closing a connection for each query is probably fine.

If you're querying a multi-GB SQLite database there are things like per-connection caches that may benefit from a connection pool.
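One way to check which case applies is to time it. The following is a minimal sketch, not a rigorous benchmark: the `kv` table, the helper names `time_queries` and `compare`, and the query count are all made up for illustration, and a small test file will not show the cache effects a multi-GB database might.

```python
import os
import sqlite3
import tempfile
import time


def time_queries(path, n, reuse):
    """Run n point queries on one shared connection, or a fresh one per query."""
    start = time.perf_counter()
    shared = sqlite3.connect(path) if reuse else None
    for i in range(n):
        conn = shared if reuse else sqlite3.connect(path)
        conn.execute("SELECT v FROM kv WHERE k = ?", (i % 100,)).fetchone()
        if not reuse:
            conn.close()
    if reuse:
        shared.close()
    return time.perf_counter() - start


def compare(n=200):
    """Return (seconds with connect-per-query, seconds with a reused connection)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bench.db")
        setup = sqlite3.connect(path)
        setup.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v TEXT)")
        setup.executemany(
            "INSERT INTO kv VALUES (?, ?)", [(i, str(i)) for i in range(100)]
        )
        setup.commit()
        setup.close()
        return time_queries(path, n, reuse=False), time_queries(path, n, reuse=True)
```

Calling `compare()` returns two wall-clock durations; the numbers depend heavily on the machine, the file size, and the filesystem, so treat them as a rough indication only.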

> What is an asyncio SQLite connection anyways? Isn’t it just a regular one that gets hucked into a separate thread?

Basically yes - aiosqlite works by opening each connection in a dedicated thread and then sending async queries to it and waiting for a response that gets sent to a Future.
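To make the pattern concrete, here is a rough standard-library sketch of the idea described above. It is not aiosqlite's actual code: the class name `ThreadedConnection` and its methods are hypothetical, and error handling and cancellation are simplified.

```python
import asyncio
import queue
import sqlite3
import threading


class ThreadedConnection:
    """Hypothetical wrapper: one sqlite3 connection owned by one worker thread."""

    def __init__(self, path):
        self._path = path
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # The connection is created and used only on this worker thread.
        conn = sqlite3.connect(self._path)
        while True:
            job = self._jobs.get()
            if job is None:  # sentinel: shut down
                break
            loop, future, sql, params = job
            try:
                rows = conn.execute(sql, params).fetchall()
                conn.commit()
                # Hand the result back to the event loop's thread.
                loop.call_soon_threadsafe(future.set_result, rows)
            except Exception as exc:
                loop.call_soon_threadsafe(future.set_exception, exc)
        conn.close()

    async def execute(self, sql, params=()):
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._jobs.put((loop, future, sql, params))
        return await future  # the event loop stays free while the thread works

    def close(self):
        self._jobs.put(None)
        self._thread.join()


async def demo():
    db = ThreadedConnection(":memory:")
    await db.execute("CREATE TABLE t (x INTEGER)")
    await db.execute("INSERT INTO t VALUES (?)", (42,))
    rows = await db.execute("SELECT x FROM t")
    db.close()
    return rows
```

Running `asyncio.run(demo())` sends each statement through the queue to the worker thread and awaits the Future that the thread completes.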

https://github.com/omnilib/aiosqlite/blob/895fd9183b43cecce8...

d1l|7 months ago

That's even crazier - so you're using asyncio because you have a ton of slow network-bound stuff - but for your database access you are running every SQLite connection in its own thread and just managing those threads via the asyncio event loop?

crazygringo|7 months ago

> If you're querying a multi-GB SQLite database

In which case SQLite is probably the wrong tool for the job, and you should be using Postgres or MySQL, which are actually designed from the ground up for lots of concurrent connections.

SQLite is amazing. I love SQLite. But I love it for single-user single-machine scenarios. Not multi-user. Not over a network.

Retr0id|7 months ago

My preferred Python wrapper for SQLite is apsw. The maintainer gives a good answer here for why not to use an async interface in most cases: https://github.com/rogerbinns/apsw/discussions/456#discussio...

It really depends on what your workload looks like, but I think synchronous will win most of the time.

charleslmunger|7 months ago

A connection pool is absolutely a best practice. One of the biggest benefits is managing a cache of prepared statements, the page cache, etc. Maybe you have temp tables or temp triggers too.

Even better is to have separate pools for the writer connection and readers in WAL mode. Then you can cache write-relevant statements only once. I am skeptical about a dedicated thread per call because that seems like it would add a bunch of latency.
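A minimal sketch of that writer/reader split, using only Python's standard `sqlite3` module. The class name `SplitPool`, its methods, and the pool sizes are assumptions for illustration, not an established API; `cached_statements` is the standard `sqlite3.connect` parameter controlling the per-connection statement cache.

```python
import os
import queue
import sqlite3
import tempfile
from contextlib import contextmanager


class SplitPool:
    """Hypothetical pool: one writer connection, several read-only readers."""

    def __init__(self, path, readers=2):
        self._writer = queue.Queue()
        self._readers = queue.Queue()
        w = sqlite3.connect(path, check_same_thread=False, cached_statements=256)
        w.execute("PRAGMA journal_mode=WAL")  # readers no longer block the writer
        self._writer.put(w)
        for _ in range(readers):
            r = sqlite3.connect(path, check_same_thread=False, cached_statements=256)
            r.execute("PRAGMA query_only=ON")  # reject accidental writes on readers
            self._readers.put(r)

    @contextmanager
    def write(self):
        conn = self._writer.get()  # blocks until the single writer is free
        try:
            with conn:  # commit on success, roll back on error
                yield conn
        finally:
            self._writer.put(conn)

    @contextmanager
    def read(self):
        conn = self._readers.get()
        try:
            yield conn
        finally:
            self._readers.put(conn)

    def close(self):
        for pool in (self._writer, self._readers):
            while not pool.empty():
                pool.get().close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        pool = SplitPool(os.path.join(tmp, "demo.db"))
        with pool.write() as conn:
            conn.execute("CREATE TABLE t (x INTEGER)")
            conn.execute("INSERT INTO t VALUES (1), (2)")
        with pool.read() as conn:
            total = conn.execute("SELECT sum(x) FROM t").fetchone()[0]
        pool.close()
        return total
```

Because each connection lives for the lifetime of the pool, its statement cache and page cache survive across calls, and funnelling all writes through one connection serializes them in the application rather than through lock contention.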

pjmlp|7 months ago

For some strange reason, some people feel like using SQLite all over the place, even when a proper RDBMS would be the right answer.

9rx|7 months ago

It is not that strange when you consider the history. You see, as we started to move away from generated HTML into rich browser applications, we started to need minimal direct DBMS features to serve the rich application. At first, a few functions were exposed as "REST APIs". But soon enough those few features turned into full-on DBMSes, resulting in a DBMS in front of a DBMS. But then people, rightfully, started asking: "Why are we putting a DBMS in front of a DBMS?"

The trouble is that nobody took a step back and asked: "Can we simply use the backing DBMS?" Instead, they trudged forward with "Let's get rid of the backing DBMS and embed the database engine into our own DBMS!" And since SQLite is a convenient database engine...

fidotron|7 months ago

I recently encountered a shared SQLite db being used for inter-process pub/sub for real-time data... in a safety-critical system.

Wrong on so many levels it's frightening.