- decompress (users|posts)
- split into batches of 10,000
- xsltproc the batch into sql statements
- pipe the batches of statements into sqlite in parallel using flocks for coordination
On my M1 Max it takes about 40 minutes for the whole network. Then I compress each database with brotli, which takes about 5 hours.
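For anyone who wants to try the batching idea without the shell tooling, here is a rough Python sketch of the same shape. It is not the script described above: it swaps xsltproc for a streaming XML parse, and it uses a single sequential writer instead of parallel sqlite processes coordinated with flock. It assumes the decompressed dump stores each record as a `<row>` element with `Id` and `DisplayName` attributes; the table layout and the names `iter_rows`, `batched`, and `load_users` are made up for illustration.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET
from itertools import islice

BATCH_SIZE = 10_000  # batch size mentioned in the comment above


def iter_rows(xml_file):
    """Stream the attributes of each <row> element (assumed record format)."""
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        if elem.tag == "row":
            yield dict(elem.attrib)  # copy the attributes before clearing
            elem.clear()  # drop the element's contents once it has been read


def batched(iterable, size):
    """Yield lists of up to `size` items from `iterable`."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch


def load_users(xml_file, conn, batch_size=BATCH_SIZE):
    """Insert rows one batch at a time; return the number of rows inserted."""
    # Hypothetical table layout, not the schema used by the original script.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, display_name TEXT)"
    )
    total = 0
    for batch in batched(iter_rows(xml_file), batch_size):
        params = [(int(row["Id"]), row.get("DisplayName")) for row in batch]
        with conn:  # one transaction per batch
            conn.executemany(
                "INSERT INTO users (id, display_name) VALUES (?, ?)", params
            )
        total += len(params)
    return total
```

Calling `load_users(open("Users.xml", "rb"), sqlite3.connect("users.db"))` would load a file of that assumed shape (both file names are placeholders). One transaction per batch keeps the commit overhead low; the parallel version described above would additionally need some form of locking so that only one writer touches the database at a time.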
wolfgang42|1 year ago
Presumably they have a script that does something similar to that process, and then writes the resulting data into a predefined table structure.
JasonPunyon|1 year ago
Yep, my process is similar. It goes...