top | item 39957867

jgeurts | 1 year ago

How do you manage cron jobs with a setup like that? Does each lambda pull down its own SQLite db, write changes, and then Litestream replicates the changes to the web and other lambda instances?

Similar scenario for multiple web nodes and saving data. Also, do you use sticky sessions so that any routes that write to a db also read from the same node/db so you don’t have to wait for Litestream?

What do you do for BI? Are you able to ETL the data from SQLite to a warehouse? If so, what does that look like?

hruk|1 year ago

Interesting question. Our services currently "own" all of their cron jobs, so the jobs run as background tasks on the host containing the db. I think we used Huey for the Django app, and just some CLI applications fired via systemd timers for the Go services.

Every service is running on a single (somewhat meaty) host, so we get true snapshot isolation on reads and serializable isolation on writes without doing anything extra. We run weekly tests to determine how long AWS takes to spin everything back up after a catastrophic failure, and it's between 6 and 10 minutes, depending on how much data the service has. It does bother me that this can only get longer, but we've never actually had a host crash in production.
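For readers who want to see the isolation behaviour described above, here is a minimal sketch using Python's standard `sqlite3` module. It assumes WAL journal mode, which the comment does not state explicitly; the `events` table and the function name are made up for illustration. A reader inside an open transaction keeps seeing the snapshot it started with, even after a second connection commits a write.

```python
import sqlite3


def demo_snapshot_isolation(db_path):
    """Return row counts seen by a reader before, during, and after a concurrent write."""
    # isolation_level=None puts the connection in autocommit mode, so
    # transactions are controlled only by the explicit BEGIN/COMMIT below.
    writer = sqlite3.connect(db_path, isolation_level=None)
    writer.execute("PRAGMA journal_mode=WAL")  # assumption: WAL mode
    writer.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, note TEXT)")
    writer.execute("INSERT INTO events (note) VALUES ('first')")

    reader = sqlite3.connect(db_path, isolation_level=None)
    reader.execute("BEGIN")  # deferred: the snapshot is fixed at the first read
    before = reader.execute("SELECT COUNT(*) FROM events").fetchone()[0]

    # The writer commits while the reader's transaction is still open.
    writer.execute("INSERT INTO events (note) VALUES ('second')")

    # Same transaction, so the reader still sees its original snapshot.
    during = reader.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    reader.execute("COMMIT")

    # A new read after COMMIT sees the writer's change.
    after = reader.execute("SELECT COUNT(*) FROM events").fetchone()[0]

    reader.close()
    writer.close()
    return before, during, after
```

Writes are serialized because SQLite allows only one writer at a time per database file; with a single host there is nothing further to coordinate.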

Like the other commenter, we did our BI for a while by restoring the Litestream backup to another server. Then that started to get expensive (because the file was very large), so we just added a scraping endpoint for each table. I think there's probably a more elegant way of doing this.
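The comment gives no details of the scraping endpoint, so the following is only a sketch of one way the query behind such an endpoint might look. It assumes ordinary rowid tables and uses keyset pagination so a warehouse job can pull a table in pages; `export_page` and its parameters are hypothetical names, not anything from the services described.

```python
import sqlite3


def export_page(conn, table, after_rowid=0, limit=1000):
    """Return (rows, next_cursor) for one page of `table`, ordered by rowid."""
    # Only allow table names that actually exist, since identifiers
    # cannot be passed as bound parameters.
    known = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    if table not in known:
        raise ValueError(f"unknown table: {table!r}")

    quoted = '"' + table.replace('"', '""') + '"'
    cur = conn.execute(
        f"SELECT rowid AS _rowid, * FROM {quoted} WHERE rowid > ? ORDER BY rowid LIMIT ?",
        (after_rowid, limit),
    )
    columns = [col[0] for col in cur.description]
    rows = [dict(zip(columns, values)) for values in cur]

    # The caller passes next_cursor back as after_rowid; None means no more rows.
    next_cursor = rows[-1]["_rowid"] if rows else None
    return rows, next_cursor
```

A scraper would call this repeatedly, feeding `next_cursor` back in as `after_rowid`, until it gets an empty page. Rows updated in place would not be picked up by this scheme; that would need something like an `updated_at` column, which is outside what the comment describes.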

xomodo|1 year ago

Sounds complicated. What object storage do you use? If on AWS, 's3 sync' might help.

raihansaputra|1 year ago

re: BI and Metabase use case

If it fits the use case, you can use a SQLite VFS so the process will just pull the needed byte ranges from storage.

martinbaun|1 year ago

Not the author of this reply, but what we do is just scp the file to another instance and use Metabase on it. It is pretty sweet, as Metabase doesn't have to run on production and "pollute" the environment. Extremely easy.
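The comment does not say how the file is made consistent before copying. One option, sketched here as an assumption and not as what the commenter does, is to take a snapshot with SQLite's online backup API (exposed in Python as `Connection.backup`) and scp the snapshot instead of the live file. `snapshot_db` and both paths are hypothetical.

```python
import sqlite3


def snapshot_db(src_path, dest_path):
    """Copy a possibly in-use SQLite database to dest_path as a consistent snapshot."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # backup() copies every page of the source database; the result is
        # a standalone file that is safe to scp elsewhere.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

The resulting file at `dest_path` can then be copied to the BI instance and opened by Metabase without touching the production database.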