I did this for MBTiles, for storing (at the time, raster) map tiles at Mapbox. I was working on the iPad wing of R&D early in the company and we were focusing on offline mapping for the iPad. Problem was, moving lots of tiny map tiles (generally 256px square PNGs) was tedious over USB and network. We had a thing called Maps on a Stick for moving things around by USB, but it just didn’t scale well to the iPad interface & file transfer needs.
Bundled the tiles into SQLite (I was inspired by seeing Dr. Hipp speak at a conference) and voila, things both easy to move and to checksum. Tiles were identified by X & Y offset at a given (Z)oom level, which made for super easy indexing in a relational DB like SQLite. On the iPad, it was then easy to give map bundles an application icon, associated datatype from file extension, metadata in a table, etc. At the time, I was fairly intimidated by the idea of creating a file format, but databases, I knew. And then making some CLI tools for working with the files in any language was trivial after that.
Most application's file formats are structured as a tree, not as flat tables. If your application's data is flat tables or name-value pairs then SQLite is an obvious choice. But if it is tree structured then it is less obvious. You can still save your tree in JSON format as a blob in a SQLite table but in this case the benefits are fewer. But if in addition to the JSON you have images or other binary data then once again SQLite offers benefits, because each of those binary files can be additional rows in the SQLite table. This is far easier to handle than storing them in ZIP format.
Maybe not as obvious for those without formal education in """database normalization""" but it's pretty trivial to convert from a tree structure to a flat table structure using foreign key relations. Recursive queries aren't even that difficult in SQLite, so self-referential data can be represented cleanly too, if not a bit more difficult to write. IME most applications "tree structures" aren't self-referential and are better formalized as distinct entities with one-to-one relationships (ie. a subtree gets a table).
There's always the lazy approach of storing JSON blobs in TEXT fields, but I personally shy away from that because you lose out on a huge part of the benefits of using a SQL DB in the first place, most importantly migrations and querying/indexing.
I am not really classically trained on the subject but I think this is the idea behind relational storage, it is to have better extraction options, you don't have to treat your data as a single document at a time.
Naively, most data looks hierarchical and the instinctive reaction is to make your file format match. But if you think of this as a set of documents stacked on top of each other if you take the data as a bunch of 90 degree slices down through the stack now your data is relational, you loose the nice native hierarchical format, but you gain all sorts of interesting analysis and extraction options.
It is too bad relational data types tend to be so poorly represented in our programming languages, generally everything has to be mapped back to a hierarchical type.
I had some json data that I wanted an annotation interface for. So I asked codex to put it into sqlite and make a little annotation webserver. It worked quickly/easily and without hassle. Sqlite supports queries over json-like objects.
Maybe a very simple document oriented db would have been better?
My biggest gripe is that the sqlite cli is brutally minimal (makes sense given design), but I probably should have been using a nicer cli.
Actually used it for a desktop blogging app a few years ago. It was great! I could set up a blog skeleton, send the file to a family member. They could focus on writing content and hitting deploy.
SQLite is abolutely amazing as an app format! I couldn't list how many tools are available to read SQLite data, or how easy and friendly they are. Even its CLI does wonders when you're dealing with data with it. SQLite has been around for 20+ years and is one of the most heavily tested softwares in the world.
SQLite is very simple, yet very reliable and powerful. Using SQLite as file format might be the best decision an engineer can take when it comes to future-proofing preservation of data.
I have used SQLite file as the application itself. Almost. The tables would store the application features, UIs and logic. A generic kernel would bring up the application from the database.
This approach has really helped me out in my work. I do something very similar using DuckDB to slurp output files anytime I write a custom hierarchical model. The single sql queryable file simplified my storage and analytics pipeline. I imagine SQLite would be especially ideal where long term data preservation is critical.
Bit unrelated rant but I'm still not sure why ZIP has been adopted as an Application File Format rather than anything else. It is a remanent of a DOS era with questionable choices, why would you pick it over anything else?
Because Windows can view and extract them out of the box without installing any additional applications. If it supported anything better out of the box I'd guess people would use that instead.
ZIP isn’t an application format, it’s a container, no? You store files with any format in a .zip, and that’s what applications do - they read files with other formats out of the .zip. What are your goals; what else would you pick, and why? What are the questionable choices you refer to?
We are developing using sqlite to transfer configurations from uat to production environment. Since the configurations are already saved in a postgres table in uat, moving some configs from uat to production an sqlite file is very easy. since it's a binary format, we are also saved from any inadvertent edits by people doing production deployment.
Also, another usecase is to export data from production to uat for testing some scenarios, it can be easily encoded in a sqlite file.
Somehow my first thought from the title was using sqlite as a format for applications. So like a replacement for ELF. I think this idea is both fascinating and horrifying.
I worked @fzakaria on developing that idea. It actually worked surprisingly well. The benefits are mostly in the ability to analyze the binary afterward though rather than any measurable benefit in load time or anything like that though. I don’t have the repo for the musl-based loader handy, but here’s the one for the virtual table plugin for SQLite to read from raw ELF files: https://github.com/fzakaria/sqlelf
Recently reverse engineered the Money Pro backup format, it's a binary file with SQLite with some additional XML information backed in. It feels like they're purposefully making it harder for users to export their data in a useful format, especially after the changes they made to their financial model.
Something to consider when using SQLite as a file format is compression (correct me if I'm wrong!). You might end up with a large file unless you consider this, and can't/won't just gz the entire db. Nothing is compressed by default.
Sure. But if you have reasonably small files just compress the whole file, like MS Office or EPUB files do.
Or if your files are large and composed of lots of blobs, then compress those blobs individually.
Whereas if your files are large and truly database-y made of tabular data like integers and floats and small strings, then compression isn't really very viable. You usually want speed of lookup, which isn't generally compatible with compression.
There seems to be no single software solution "out there" for mounting an SQLite DB (or an SQLite archive) as a file system, with or without per-record relative paths.
I remember someone mentioning the Acorn image editor on Mac uses sql files to store image data. It probably makes backwards compatibility much easier to work with.
It does, here's a schema from an image I just saved with the latest version. Pretty simple.
CREATE TABLE image_attributes ( name text, value blob);
CREATE TABLE layers (id text, parent_id text, sequence integer, uti text, name text, data blob);
CREATE TABLE layer_attributes ( id text, name text, value blob);
Also, document-based apps that use Apple's Core Data framework (kinda ORM) usually use SQLite files for storage.
[+] [-] incanus77|4 months ago|reply
Bundled the tiles into SQLite (I was inspired by seeing Dr. Hipp speak at a conference) and voila, things both easy to move and to checksum. Tiles were identified by X & Y offset at a given (Z)oom level, which made for super easy indexing in a relational DB like SQLite. On the iPad, it was then easy to give map bundles an application icon, associated datatype from file extension, metadata in a table, etc. At the time, I was fairly intimidated by the idea of creating a file format, but databases, I knew. And then making some CLI tools for working with the files in any language was trivial after that.
[+] [-] jeffypoo|4 months ago|reply
[+] [-] out_of_protocol|4 months ago|reply
[+] [-] lateforwork|4 months ago|reply
[+] [-] packetlost|4 months ago|reply
There's always the lazy approach of storing JSON blobs in TEXT fields, but I personally shy away from that because you lose out on a huge part of the benefits of using a SQL DB in the first place, most importantly migrations and querying/indexing.
[+] [-] somat|4 months ago|reply
Naively, most data looks hierarchical and the instinctive reaction is to make your file format match. But if you think of this as a set of documents stacked on top of each other if you take the data as a bunch of 90 degree slices down through the stack now your data is relational, you loose the nice native hierarchical format, but you gain all sorts of interesting analysis and extraction options.
It is too bad relational data types tend to be so poorly represented in our programming languages, generally everything has to be mapped back to a hierarchical type.
[+] [-] robrenaud|4 months ago|reply
Maybe a very simple document oriented db would have been better?
My biggest gripe is that the sqlite cli is brutally minimal (makes sense given design), but I probably should have been using a nicer cli.
[+] [-] unknown|4 months ago|reply
[deleted]
[+] [-] elephantum|4 months ago|reply
[+] [-] renegat0x0|4 months ago|reply
- https://github.com/rumca-js/Internet-Places-Database
For UI I use HTML, because it already provides components with bootrap, and everybody can use it without installation of any software.
All data comes from a single SQLite that is easy read, and returns data.
My database is really big, so it takes time to browse it, I wanted to provide more meaningful way to limit scope of searching
[+] [-] scary-size|4 months ago|reply
https://blog.project-daily.com/pages/file-format_3705.html
[+] [-] stavarotti|4 months ago|reply
[+] [-] gjvc|4 months ago|reply
[deleted]
[+] [-] joelwallis|4 months ago|reply
SQLite is very simple, yet very reliable and powerful. Using SQLite as file format might be the best decision an engineer can take when it comes to future-proofing preservation of data.
[+] [-] zkmon|4 months ago|reply
[+] [-] kianN|4 months ago|reply
[+] [-] spdegabrielle|4 months ago|reply
[+] [-] rtyu1120|4 months ago|reply
[+] [-] amiga386|4 months ago|reply
- files can be individually extracted, in any order, from the archive
- thousands of implementations available, in every language and every architecture. no more than 32KiB RAM needed for decompression
- absolutely no possibility of patent challenges
[+] [-] crazygringo|4 months ago|reply
If you don't need any table/relational data and are always happy to rewrite the entire file on every save, ZIP is a perfectly fine choice.
It's easier than e.g. a SQLite file with a bunch of individually gzipped blobs.
[+] [-] tetraca|4 months ago|reply
[+] [-] dahart|4 months ago|reply
[+] [-] thijson|4 months ago|reply
[+] [-] mikkupikku|4 months ago|reply
[+] [-] gus_massa|4 months ago|reply
[+] [-] abhashanand1501|4 months ago|reply
Also, another usecase is to export data from production to uat for testing some scenarios, it can be easily encoded in a sqlite file.
[+] [-] askl|4 months ago|reply
[+] [-] trws|4 months ago|reply
[+] [-] ejstembler|4 months ago|reply
[+] [-] itopaloglu83|4 months ago|reply
[+] [-] incanus77|4 months ago|reply
https://github.com/ccgus/fmdb
[+] [-] jansommer|4 months ago|reply
[+] [-] crazygringo|4 months ago|reply
Or if your files are large and composed of lots of blobs, then compress those blobs individually.
Whereas if your files are large and truly database-y made of tabular data like integers and floats and small strings, then compression isn't really very viable. You usually want speed of lookup, which isn't generally compatible with compression.
[+] [-] lateforwork|4 months ago|reply
[+] [-] euroderf|4 months ago|reply
[+] [-] psnehanshu|4 months ago|reply
[+] [-] jrochkind1|4 months ago|reply
> and is backwards compatible to its inception in 2004 and which promises to continue to be compatible in decades to come.
That is pretty amazing. You could do a lot worse.
[+] [-] dist-epoch|4 months ago|reply
Doesn't mean that whatever the app stores inside will remain backward compatible which is the harder problem to solve.
[+] [-] seanalltogether|4 months ago|reply
[+] [-] dchest|4 months ago|reply
[+] [-] setr|4 months ago|reply
[+] [-] SoKamil|4 months ago|reply
[+] [-] ianberdin|4 months ago|reply
I always wonder when people can sell ideas or products so effectively.
[+] [-] born-jre|4 months ago|reply
[+] [-] actionfromafar|4 months ago|reply