top | item 22365215

stikypad | 6 years ago

I get where the author is coming from, and I'm often dismayed by how much of programming is just shuffling data from one platform/format/structure/persistence layer to another. It's tedious AF.

Still, there appears to be a lot of conflation of concepts here, and I don't see anything approaching crystallization of a coherent solution.

What do I gain by telling the OS that I want a JPG rather than a sequence of bytes that I know is a JPG? I'm probably already using a library to deserialize this for me, unless I'm a masochist, so the heavy lifting has already been done. And that object/struct/whatever is going to be represented somewhat differently in memory in Java, C++, and Python. There is no way to reconcile those differences without creating even more problems. So a single programming language would have to be written, or an existing one adopted, as the standard.

Most application-specific data is already stored in a database of some sort anyway, which is itself (potentially) platform agnostic. To assert that applications should instead rely on the OS and filesystem to provide this persistence layer is to assert that there is a universally appropriate choice of database -- a bold claim.

Moreover, if structure can (must?) be defined by the application which is interfacing with the OS, then nothing is gained but overhead. Every object would need a description of itself, so we either end up with redundant structural information for every file, or else we have some centralized table of "object types" that the OS has to look at every time we request something.
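To make the two options concrete, here is a minimal sketch. The JSON layout, `SCHEMA`, `TYPE_TABLE`, and `load_tagged` are all made up for illustration and are not from any real OS.

```python
import json

# Hypothetical schema for an application-defined "person" type.
SCHEMA = {"type": "person", "fields": ["name", "email"]}

records = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Bob", "email": "bob@example.com"},
]

# Option 1: every file carries its own structural description (redundant).
self_describing = [json.dumps({"schema": SCHEMA, "data": r}) for r in records]

# Option 2: files carry only a type tag; a central table holds the schema.
TYPE_TABLE = {"person": SCHEMA}
tagged = [json.dumps({"type": "person", "data": r}) for r in records]


def load_tagged(blob):
    """Read a tagged file; note the type-table lookup on every request."""
    obj = json.loads(blob)
    schema = TYPE_TABLE[obj["type"]]
    return schema, obj["data"]


# Compare total bytes stored under each option.
redundant_bytes = sum(len(s) for s in self_describing)
tagged_bytes = sum(len(s) for s in tagged)
```

Either the schema is repeated in every file, or every read goes through `TYPE_TABLE`. That is the trade-off I mean.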

Maybe I'm missing something, but I don't see the appeal here. I understand the desire to reduce overhead, but as far as I can see, this just creates more.

sliken | 6 years ago

Ok, today we have shells like bash that can help you navigate around a file system, find files, run files, search files, view files, etc. Common tools are grep, less, cat, "|", awk, sed, perl, tr, wc, find, etc. The <cr> is a poor man's record separator.

Experienced people can often do non-trivial things in a shell by combining tools, regular expressions, and manually filtering out false positives. E.g. using grep foo | grep bar to find that one email you are looking for.

But as a result, things that need more structure require significant coding and end up in a sandbox that doesn't work well with other systems. Like, say, Thunderbird (an email client).

Now imagine something different that has some higher level of abstraction. Maybe every file gets, by default, a list of functions to help the OS understand it. How about dump (raw byte encoding), list, add, view, delete, and search? Each file type supported by the OS would get those features; one of those types might be JPEG. So if you created an address book, you'd define a record type called person and a list of fields. One of those fields might even be a JPEG for an image of the person. In a GUI (like Thunderbird) you'd just wrap a function called addressbook.insert, much like just about every GUI platform has a file picker.
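A rough sketch of what that could look like, in Python for brevity. `RecordStore`, the `Person` fields, and the method signatures are all hypothetical; only the verb names (dump, list, add, view, delete, search) come from the paragraph above.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Person:
    """Hypothetical 'person' record type; the field names are illustrative."""
    name: str
    email: str
    photo_jpeg: Optional[bytes] = None  # a field can itself hold a JPEG


class RecordStore:
    """Hypothetical OS-provided store with the same verbs for every type."""

    def __init__(self, record_type):
        self.record_type = record_type
        self._records = {}
        self._next_id = 1

    def add(self, record):
        if not isinstance(record, self.record_type):
            raise TypeError(f"expected {self.record_type.__name__}")
        rid = self._next_id
        self._records[rid] = record
        self._next_id += 1
        return rid

    def list(self):
        return sorted(self._records)

    def view(self, rid):
        return asdict(self._records[rid])

    def delete(self, rid):
        del self._records[rid]

    def search(self, **criteria):
        # Exact-match search on any combination of fields.
        return [
            rid for rid, rec in sorted(self._records.items())
            if all(getattr(rec, k) == v for k, v in criteria.items())
        ]

    def dump(self, rid):
        # Stand-in for a raw byte encoding of the record.
        return repr(self._records[rid]).encode("utf-8")
```

An application's "add contact" dialog would then be a thin wrapper around something like `addressbook.add(Person(...))`, where `addressbook = RecordStore(Person)`.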

Ideally every application on this OS would make use of these function calls, so every app that needed an addressbook could share the OS calls to interact with the addressbook. But also, instead of being frustrated at Thunderbird, you could use your addressbook for new and different things. Like, say, a map viewer might put icons up for every home address in your addressbook. Or you could query your addressbook for the home address nearest you... from the command line.

Ideally these files are actually objects that include data and code. Said code could inherit from OS primitives like records and fields. That would enable things like extending the addressbook to handle a new field like your keybase ID, PGP key, 3D representation of your face, or whatever.
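As a sketch of the inheritance idea, assuming a made-up `Record` base class standing in for the OS primitive (none of these class or field names are real):

```python
from dataclasses import dataclass, fields


@dataclass
class Record:
    """Hypothetical OS primitive: any record can enumerate its own fields."""

    def field_names(self):
        return [f.name for f in fields(self)]


@dataclass
class Person(Record):
    """The stock addressbook entry."""
    name: str = ""
    home_address: str = ""


@dataclass
class PersonWithKeys(Person):
    """A user extension that adds new fields without touching Person."""
    keybase_id: str = ""
    pgp_key: str = ""


entry = PersonWithKeys(name="Ada", keybase_id="ada")
```

Tools written against `Record` would keep working on `PersonWithKeys`, since the new fields show up through the same `field_names()` call.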

Similarly any image viewer, or even a pipeline using standard image tools could iterate over all your JPEGs and extract geo tags.

The object aware "find" replacement could access records/fields for all file types so you could find photos within a distance of a long/lat.
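For example, if photos exposed `lat` and `lon` fields, the distance filter is just a great-circle (haversine) calculation. `Photo` and `find_near` are hypothetical names, and the coordinates below are sample data.

```python
import math
from dataclasses import dataclass


@dataclass
class Photo:
    """Hypothetical photo record exposing its geotag as fields."""
    path: str
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, using a mean Earth radius of 6371 km."""
    r = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def find_near(photos, lat, lon, radius_km):
    """Return paths of photos whose geotag lies within radius_km of a point."""
    return [
        p.path for p in photos
        if haversine_km(p.lat, p.lon, lat, lon) <= radius_km
    ]


photos = [
    Photo("paris.jpg", 48.8566, 2.3522),
    Photo("versailles.jpg", 48.8049, 2.1204),
    Photo("london.jpg", 51.5074, -0.1278),
]
nearby = find_near(photos, 48.8566, 2.3522, 25)
```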

By combining the above with a relational database instead of a filesystem, you could mix and match and create virtual folders for things like the top 1000 newest images on your "filesystem". Suddenly replacements for awk, sed, wc, less, etc. would understand email message metadata.
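With SQLite standing in for the database-backed filesystem, a virtual folder could be as simple as a view. The `files` table and its columns are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical metadata table standing in for the filesystem.
conn.execute(
    "CREATE TABLE files (path TEXT PRIMARY KEY, type TEXT, modified INTEGER)"
)
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [
        ("/home/me/a.jpg", "image/jpeg", 100),
        ("/home/me/notes.txt", "text/plain", 300),
        ("/home/me/b.jpg", "image/jpeg", 200),
    ],
)

# The "virtual folder": a view over the newest 1000 images.
conn.execute(
    """
    CREATE VIEW newest_images AS
    SELECT path, modified FROM files
    WHERE type = 'image/jpeg'
    ORDER BY modified DESC
    LIMIT 1000
    """
)

# Listing the folder is just a query against the view.
newest = [
    row[0]
    for row in conn.execute(
        "SELECT path FROM newest_images ORDER BY modified DESC"
    )
]
```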

Or make a directory that contains the newest email messages you haven't replied to. Running ls --fields=From,To,Date,subject would give you a summary of email in a folder.
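Here is roughly what that hypothetical ls --fields would do under the hood, using Python's standard email parser. The `ls_fields` helper and the sample message are made up for illustration.

```python
from email import message_from_string


def ls_fields(raw_messages, field_names):
    """Return one row of header values per message, in the order requested."""
    rows = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        # Header lookup is case-insensitive, so "subject" matches "Subject".
        rows.append([msg.get(name, "") for name in field_names])
    return rows


sample = (
    "From: a@example.com\n"
    "To: b@example.com\n"
    "Date: Mon, 17 Feb 2020 10:00:00 +0000\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)
rows = ls_fields([sample], ["From", "To", "Date", "subject"])
```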

The code duplicated across applications for understanding file formats, serialization, communication, etc. would be greatly reduced and moved largely from application space to OS space, resulting in significantly better compatibility between applications. Imagine that instead of a zillion files under ~/.thunderbird, it was all in a database compatible with any email client that supports the new record/field standard.