SEqlite – Minimal Stack Exchange Data Dump in SQLite Format

28 points | JasonPunyon | 1 year ago | seqlite.puny.engineering

11 comments

panqueca|1 year ago

I like the idea. I think SQLite is a very powerful way of storing and sharing data. But this site only covers the Stack Exchange Network.

Do you know if there is a place to host other SQLite dumps, I mean from other websites? I recently dumped the whole Hacker News API and it got me thinking about this.
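For anyone curious what such a dump might look like, here is a minimal sketch, not the code I actually ran. It assumes the public Hacker News Firebase endpoint (`/v0/item/<id>.json`); the table layout and the names `fetch_item` and `store_items` are made up for illustration.

```python
import json
import sqlite3
from urllib.request import urlopen

API = "https://hacker-news.firebaseio.com/v0"


def fetch_item(item_id):
    """Fetch one item from the public HN API; returns None for missing items."""
    with urlopen(f"{API}/item/{item_id}.json") as resp:
        return json.load(resp)


def store_items(conn, item_ids, fetch=fetch_item):
    """Store each item as a row, keeping the raw JSON alongside a few columns.

    The schema is an assumption for this sketch, not an official layout.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id INTEGER PRIMARY KEY, type TEXT, author TEXT, time INTEGER, raw TEXT)"
    )
    stored = 0
    for item_id in item_ids:
        item = fetch(item_id)
        if item is None:  # the API answers null for ids with no item
            continue
        conn.execute(
            "INSERT OR REPLACE INTO items VALUES (?, ?, ?, ?, ?)",
            (item["id"], item.get("type"), item.get("by"),
             item.get("time"), json.dumps(item)),
        )
        stored += 1
    conn.commit()
    return stored
```

Passing `fetch` as a parameter keeps the storage logic testable without touching the network; a real dump would also want batching and rate limiting.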

JasonPunyon|1 year ago

This site is on a Cloudflare R2 bucket because (and only because) they have free egress. While not datacenter-sized, some of these files are large. Just opening up your credit card to 10 cents a gigabyte of egress will be a bad time anywhere else.
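To make the arithmetic concrete, here is a rough sketch. The 10 cents per gigabyte is the figure above; the 50 GB file size and 1,000 downloads are hypothetical numbers chosen only for illustration.

```python
def egress_cost_dollars(file_size_gb, downloads, cents_per_gb=10):
    """Egress bill in dollars for serving one file `downloads` times."""
    # Multiply in whole cents first, then convert to dollars.
    return file_size_gb * downloads * cents_per_gb / 100


# Hypothetical example: a 50 GB file downloaded 1,000 times.
print(egress_cost_dollars(50, 1000))  # 5000.0
```

At that rate a single popular file could cost thousands of dollars, which is why free egress decides where the files live.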

0cf8612b2e1e|1 year ago

Datasette is built to serve SQLite databases. For querying purposes, not bulk download.

black_puppydog|1 year ago

It's always a good reminder how small useful data really is. None of these files is "need a datacenter" sized, and yet they contain just about any question you ever wanted to ask about, plus some answers...

0cf8612b2e1e|1 year ago

I built one of these myself that I keep on my laptop. I've never had a real need to use it, but I'm glad I have it.

I keep meaning to do the same thing with Wikipedia. But the Wikipedia dumps are so inscrutably named and seemingly undocumented that it seems the organization does not want me to pursue the idea.

giantrobot|1 year ago

I've had the same problem with Fandom née Wikia dumps. Just gigabytes of XML with questionable adherence to schemas. Fandom also has a ton of custom-to-Fandom tags which are a further pain to handle.

Pulling useful content out of the dumps has been an exercise in frustration. I'm sure I could figure something out if I had a bunch of time to dedicate to the effort.

If I just had sqlite dumps they'd be trivial to work with and I'd be much happier with them.
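As a starting point, here is a minimal sketch of streaming an XML dump into SQLite. It assumes a MediaWiki-style layout (`<page>` elements containing a `<title>` and a revision `<text>`), which may not match what Fandom actually ships; the `pages` table and the function names are invented for this example, and it ignores all the custom tags inside the wikitext itself.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def load_pages(xml_source, conn):
    """Stream <page> elements from a file path or file object into SQLite."""
    conn.execute("CREATE TABLE IF NOT EXISTS pages (title TEXT, text TEXT)")
    count = 0
    # iterparse reads incrementally, so the whole dump never sits in memory.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title = text = None
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text
        conn.execute("INSERT INTO pages VALUES (?, ?)", (title, text))
        elem.clear()  # free the finished page's children
        count += 1
    conn.commit()
    return count


if __name__ == "__main__":
    # Tiny made-up sample standing in for a real multi-gigabyte dump.
    sample = (b'<mediawiki xmlns="http://example.invalid/export">'
              b"<page><title>A</title><revision><text>hello</text></revision></page>"
              b"</mediawiki>")
    conn = sqlite3.connect(":memory:")
    print(load_pages(io.BytesIO(sample), conn))
    print(conn.execute("SELECT title, text FROM pages").fetchall())
```

Once the rows are in a table, the rest is ordinary SQL, which is the whole appeal of having the dump in SQLite to begin with.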