noer | 9 months ago

This is just a single-table DB though? At that point, why not just export to a CSV or dataframe or whatever and use analysis packages to analyze whatever you want?

Admittedly, I might just not have or understand the use case, and I haven't thought about how large a Gmail account actually is, so feel free to ignore this if I'm missing something!

hiAndrewQuinn|9 months ago

A couple of reasons that come to mind:

- Searching a plain text data file is O(n). Searching a properly indexed SQLite database is closer to O(log n) per lookup, and full-text indexing is very easy to set up nowadays with FTS5. This doesn't explain why SQLite over a dataframe specifically, but it definitely justifies it over plain text for large email collections.

- SQLite is really easy to write custom views and programs around. Virtually every major programming language can work with it without issue. See also: simonw's wonderful https://datasette.io/.

- SQLite is on the Library of Congress's list of recommended storage formats for datasets, if you ever want to go down the rabbit hole of digital preservation.
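
To make the FTS5 point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `emails` table, its columns, the sample rows, and the `search` helper are all made up for illustration and are not the schema of the tool being discussed. It also assumes your SQLite build has FTS5 compiled in, which is common but not guaranteed.

```python
import sqlite3

# Hypothetical schema and sample rows, invented for illustration only.
conn = sqlite3.connect(":memory:")

# An FTS5 virtual table keeps a full-text index over every listed column.
conn.execute("CREATE VIRTUAL TABLE emails USING fts5(sender, subject, body)")
conn.executemany(
    "INSERT INTO emails (sender, subject, body) VALUES (?, ?, ?)",
    [
        ("alice@example.com", "Quarterly invoice", "Please find the invoice attached."),
        ("bob@example.com", "Lunch on Friday?", "Are you free for lunch this week?"),
        ("carol@example.com", "Re: invoice", "Payment for the invoice went out today."),
    ],
)


def search(conn, query):
    """Return (sender, subject) pairs for rows matching an FTS5 query string."""
    return conn.execute(
        # MATCH consults the full-text index instead of scanning every row;
        # ORDER BY rank sorts by FTS5's built-in relevance score.
        "SELECT sender, subject FROM emails WHERE emails MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()


hits = search(conn, "invoice")
for sender, subject in hits:
    print(sender, "|", subject)
```

For a real mailbox you would point `sqlite3.connect` at a file on disk rather than `:memory:`, and the same query works unchanged from any other language with SQLite bindings.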