top | item 42774758

File Systems: The Original Hypermedia

79 points| pilgrim0 | 1 year ago |jon.work

50 comments

> if we always had hypermedia with directories and files, why hasn't the web evolved into a mesh of interconnected file systems?

It kind of has. URLs have the notion of paths, which are obviously strongly associated with the notion of file system hierarchies. People sometimes (sometimes accidentally) put their file systems directly on the web; see Apache's automatic directory listings (mod_autoindex, used when a directory has no DirectoryIndex file and listings are enabled), for example.
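To make the path/hierarchy link concrete, here is a minimal sketch of how a static server might map a URL path onto a file path. The function name and the `/srv/www` document root are assumptions for illustration, not any particular server's behavior.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def url_to_relative_path(url: str) -> PurePosixPath:
    """Turn the path component of a URL into a relative file path."""
    raw = unquote(urlsplit(url).path)
    parts = []
    for part in PurePosixPath(raw).parts:
        if part in ("/", "."):
            continue
        if part == "..":
            # Never climb above the document root.
            if parts:
                parts.pop()
            continue
        parts.append(part)
    return PurePosixPath(*parts)


DOC_ROOT = PurePosixPath("/srv/www")  # hypothetical document root
print(DOC_ROOT / url_to_relative_path("https://example.com/docs/../notes/a%20b.txt"))
```

The only real work is refusing to let `..` escape the root; everything else is the URL path reused as-is, which is the point being made above.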

> Why isn't a website just a remote directory on someone's computer that we can explore via a file browser?

Well, now we are getting into the meat of it. To be hypermedia, a medium must contain hypermedia controls. Hypermedia controls can be as simple as links, but the web introduced more sophisticated controls such as forms, and allowed HTML authors to specify more significant interactions beyond click-to-link.

IMO the uniform interface is the most interesting aspect of hypermedia, and that really emerged post-Web. I like the author's concept of a file explorer enhanced with hypermedia ideas, though, and would be interested to see more details on it.

grumbel|1 year ago

The directory support you get with HTTP is rather terrible. It works OK for a human browsing a directory manually, but if you want to actually download a directory it becomes a mess, since you can't tell if "index.html" is a file or just something Apache generated on the fly. There is no "list directory" command in HTTP (there is PROPFIND in WebDAV) and there is no "download directory" in your browser; you have to fiddle with wget and friends.
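For comparison, this is roughly what the WebDAV side looks like: a PROPFIND response is an XML "multistatus" document in which directories are marked as collections. A minimal sketch of reading one with the standard library follows; the sample body is made up and trimmed, and `list_entries` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # the WebDAV XML namespace, in ElementTree's {uri}tag form

# Hypothetical 207 Multi-Status body, trimmed to the parts used here.
SAMPLE = """<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/pub/</D:href>
    <D:propstat><D:prop><D:resourcetype><D:collection/></D:resourcetype></D:prop></D:propstat>
  </D:response>
  <D:response>
    <D:href>/pub/index.html</D:href>
    <D:propstat><D:prop><D:resourcetype/></D:prop></D:propstat>
  </D:response>
</D:multistatus>"""


def list_entries(multistatus_xml: str) -> list[tuple[str, bool]]:
    """Return (href, is_directory) pairs from a PROPFIND response body."""
    root = ET.fromstring(multistatus_xml)
    entries = []
    for response in root.findall(f"{DAV}response"):
        href = response.findtext(f"{DAV}href")
        # A resource is a directory if its resourcetype contains <collection/>.
        is_dir = response.find(f".//{DAV}collection") is not None
        entries.append((href, is_dir))
    return entries


for href, is_dir in list_entries(SAMPLE):
    print("dir " if is_dir else "file", href)
```

Here "index.html" is unambiguously a file, which is exactly the distinction plain HTTP directory listings don't give you.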

It's one of the things I love about IPFS, it has native directories along with fuse services, so you can just `cd /ipfs/...` and browse around. That has a lot of beautiful side effects in that you no longer need .zip to package directories and for a lot of things you no longer even need to download anything, you just access them directly via your file system.

Especially with FTP being removed from browsers, we could really use a proper official successor that can act as an online file system.

6510|1 year ago

How many forms do you need? It should be 10, 50 or 5 million standard forms and free-form ones that could be required to request standardization.

I once tried to make a form that works with iPhone autofill, just for name and address fields. Apple insists on putting the house number at the end of the street name. House numbers may have multiple letters, and streets are named after people who may have a single-letter second name. Boulevard Peng Da nr 1 vs Boulevard Peng nr Da 1. This doesn't work.

During corona I needed a document to work my night shifts. I received a PDF with some form fields. After filling those out, the document changed itself into a static permit. I had never seen such a thing before.

When writing blog software I ponder organization a good bit. My conclusion was that, depending on the size, quantity and nature of the writings, different methods would work best. However, as it grows it is hard to tell which one is currently the best formula, and it is very easy to maintain the full set, which I define as: hierarchical tree, categories, tags and search.

How to do the hierarchical tree is obvious.

A limited number of Categories that should be defined in advance (if possible)

You should have as many tags as possible. This comment could be tagged #history #writing #organizing #ideas #search #link #interface etc. The user can be exposed to a small sub set of sufficiently populated tags. The tag pages can be sorted by the amount of tags per word.
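A rough sketch of the tag part, assuming posts are just titles with a set of tags. The sample posts, function names and the `min_posts` threshold are made up for illustration.

```python
from collections import defaultdict

# Hypothetical posts: title -> tags.
POSTS = {
    "File systems as hypermedia": {"history", "interface", "link"},
    "Notes on blog layout": {"writing", "organizing", "interface"},
    "Search that takes an article as the query": {"search", "interface", "writing"},
}


def build_tag_index(posts):
    """Invert title -> tags into tag -> titles."""
    index = defaultdict(set)
    for title, tags in posts.items():
        for tag in tags:
            index[tag].add(title)
    return index


def visible_tags(index, min_posts=2):
    """Tags with at least min_posts posts, most populated first."""
    populated = [(tag, len(titles)) for tag, titles in index.items() if len(titles) >= min_posts]
    return sorted(populated, key=lambda item: (-item[1], item[0]))


print(visible_tags(build_tag_index(POSTS)))
```

Every tag stays in the index; only the sufficiently populated subset is shown to the user, so tagging generously costs nothing.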

Search is just search but could have topical filters and use all of the before mentioned but could also take one (or more) articles as queries (and list the results under the text)

How and when to stitch on an LLM I don't know.

ianburrell|1 year ago

If you think that a file system is a viable approach to a web site, how would you implement Hacker News? Hacker News is one of the old-school static-rendered sites. It may not even use a database.

But it needs to be dynamic. The ordering on the front page is dynamic. The vote counts are dynamic. What does the voting? These are all easier with a database holding values that you query and then render into the page.

The other issue is adding new content. How would someone post a comment? How and where does it get written? How do you make sure they write to the right place? How do you make it as easy to use as typing in a box and hitting a button?

Dynamic responses are the special sauce of the web, they are why it is a success. Without it, it would be good-looking Gopher.

pilgrim0|1 year ago

> If you think that a file system is a viable approach to a web site, how would you implement Hacker News?

Hacker News is a web application with a client-server architecture. Indeed, it would be impractical to replicate with the file system model.

Still, I think it would be useful to have standardized hypermedia documents. It would allow for content that's naturally multimedia to be much more easily handled and distributed. I find it super weird that we need to create a 'website' just to have a multimedia, responsive document. It practically must be hosted on a server because nobody sends HTML around. Mind you that I used the file system just as a metaphor for what an offline-first hypermedia document model could look like.

6510|1 year ago

With torrents, the number of seeds and leeches does something similar. It actually tells you more, as crap is deleted rather than seeded. Most of HN is frozen archives. That you can still vote on things you may not comment on isn't all that useful. HN (while huge) is a collection of things that are easy to do on a centralized platform. It could be much more complicated, but simplicity is valuable.

I would have to think a bit to come up with something that could reasonably match the mature centralized architecture. First thought would be that if I like your comment I could choose to seed your most recent GB of comments. They would load faster for everyone with each person who seeds them.

pilgrim0|1 year ago

Let me give a use case where an offline-first hypermedia document model would be useful. I used to work as an instructional designer, making art and photography courses. This sort of material requires a lot of media, of practically all kinds. You either have to know how to code or use some platform to build that kind of thing. In either case, there'll be friction in reconciling the data source with the final document/volume. If you use a managed solution like a platform, you'll be locking away the content on their servers. Exporting the raw data does nothing because it loses all structure; that's basically what happens when you try to export from Notion, for instance. Yes, you can export, but it's lossy, because available offline formats, even Markdown, cannot express the structure and ergonomics available within the authoring environment. There's a disconnect between source and presentation. You'll be left with a scattered set of files. The other option is not much better: coding it yourself. You'll need a CMS, and your data will be either JSON, MD, XML, or worse, tabular within an SQL database. Then you'll have to develop a build system or make some sort of SPA. And you'll need to set up a web server and configure it to distribute the content. This is absurdly complex. It makes producing such materials very expensive. You just can't distribute your course offline. You can't back it up easily, either.

Compare that with print media. Want to make a memo? Fire whatever word processor, write it down, export to PDF, share anywhere. Done. If hypermedia was easy like this, do you have any doubt that people would put it to good use? Do you realize how PDF might be overused simply because of the durability, simplicity and fidelity traits it has?

It's not a law of nature that hypermedia should exist only in the context of browsers. Nor that it has to use HTML, CSS and JS.

moritz|1 year ago

> You would create and manage content directly from the file explorer application, in the most natural way possible. This version of the web wouldn’t require users to learn advanced computer skills in order to participate.

My students at university (Gen Z) have no concept of the “file system”.

dialup_sounds|1 year ago

That's not even a generational thing. People have been (e.g.) saving everything to the desktop for as long as there have been desktops. "Managing files" has always been a subsidiary task to the things people wanted to use computers for.

pilgrim0|1 year ago

Inline, ordered multimedia is the backbone of all consumer information systems. So your students have internalized the archetypal equivalent of file systems through a different vocabulary, such as tweets (for files) and threads (for directories).

esafak|1 year ago

Do they not save files on their personal computers (phones, laptops)?

andyferris|1 year ago

I found it took a while to get to the point.

In the end, I actually agree with this! I have also been thinking about filesystems, which are trees of dictionaries (directories) and blobs (files), and that there are many other examples of tree-like data. Data structures in our programs are tree shaped, perhaps with references/pointers to other parts of the tree. JSON is a tree of dictionaries (objects), arrays, and data (the primitives: string/number/boolean/null). The arrays are ordered, much like the content inside an HTML/XML block.

I agree that adding an "array" style of directory to our file systems would be really cool. I've been toying with the idea of writing a FUSE driver that holds some structured data (possibly including arrays) and just converts the (integer) index into a string. The idea is that you could e.g. view and edit some JSON tree with the file explorer. And not just JSON - basically any piece of structured data that we have in our programs can be "viewed" as a FS this way (e.g. just convert structs into directories of fields). It could even be a pretty cool and universal debugger - a breakpoint could make the program pause and serve a FUSE driver and later continue when it is unmounted :)
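The path-mapping half of that idea fits in a few lines, without any FUSE involved. This is only a sketch of the mapping a driver like that might expose (objects and arrays become directories, array indices become string names); `flatten` is a made-up name and no real FUSE API is used.

```python
import json


def flatten(node, prefix=""):
    """Yield (path, leaf) pairs, treating objects and arrays as directories."""
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        # The integer array index becomes the file name, as a string.
        children = ((str(i), child) for i, child in enumerate(node))
    else:
        yield prefix or "/", node
        return
    for name, child in children:
        yield from flatten(child, f"{prefix}/{name}")


doc = json.loads('{"title": "Notes", "items": [{"text": "a"}, {"text": "b"}]}')
for path, value in flatten(doc):
    print(path, "->", repr(value))
```

A real driver would also need an answer for keys containing "/" and for empty objects and arrays, which this sketch ignores (empty containers simply yield nothing).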

And yes, exactly what follows from this is some program could be written to open some "directory" and render a "document" based on the contents. The filesystem supports links, so we have the "web" like experience.

The "document" angle does require adding a kind of directory with ordered array semantics rather than dictionary/map semantics. It's the first missing ingredient listed in the article. Though some filesystems use sorted dictionaries (b-trees or whatever) for directory maps so you could maybe hack this ordered semantics in that way.
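The "hack" version of ordered semantics is just zero-padded names, so that a directory sorted by name comes back in array order. A minimal sketch, with `ordered_names` as a hypothetical helper:

```python
def ordered_names(count: int) -> list[str]:
    """File names whose lexicographic order matches their numeric order."""
    width = len(str(max(count - 1, 0)))
    return [str(i).zfill(width) for i in range(count)]


names = ordered_names(12)
print(names[:3], names[-1])

# Without padding, name order and array order disagree: "10" sorts before "2".
print(sorted(str(i) for i in range(12))[:4])
```

The obvious downside is that inserting in the middle means renaming everything after it, which is why a native ordered directory type would be nicer.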

The second missing ingredient listed in the article is the hypermedia part. I mean, my computer is actually OK at inferring if a file is a movie or photo or text document, so we kind of have a way of dealing with that, too. The "blob of bytes is a narrow waist" thing is quite powerful. That said, sum types could be useful to demarcate different kinds of "stuff", and there's no reason an implementation of this idea couldn't support sum types as part of its data model.

pilgrim0|1 year ago

> I found it took a while to get to the point.

You're just a great understandeour

> I agree that adding an "array" style of directory to our file systems would be really cool.

I think this is sort of a low-hanging fruit people have slept on. We've proved list-based systems are extremely versatile for data structures (s-expr) and programming (lisps). What about for media in general? Everywhere I look I just see lists, with very minor stylistic distinctions between them. Of course there're abysmal infrastructural differences between chats, feeds and what not, but it does not invalidate a universal list-based frontend, similar to what you developed in your comment.

andyferris|1 year ago

I should probably have mentioned that there already exist FUSE-based JSON FS mounters; for example, this one is 13 years old:

https://github.com/calebcase/jsonfs

xnx|1 year ago

A usable equivalent of the file system is sorely missing from the web. Every email address should come with a place to publicly share files. It could be as easy as https://user@emaildomain.com/

mongol|1 year ago

That is an http basic auth URL.
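You can see how a URL parser reads it with Python's standard library: the part before the "@" is userinfo (credentials), not part of the host.

```python
from urllib.parse import urlsplit

parts = urlsplit("https://user@emaildomain.com/")
print(parts.username)  # the userinfo part before "@"
print(parts.hostname)  # the host part after "@"
print(parts.password)  # no ":password" was given
```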

Borg3|1 year ago

Oh really? It seems people forgot about userdir.path (or similar). So a user can expose whatever they want via: https://emaildomain.com/~user/
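For anyone who hasn't seen it, the userdir idea is just a path rewrite. A sketch of the mapping follows; the `/home` root and `public_html` directory are assumed defaults, and real servers have their own configuration and safety checks.

```python
from pathlib import PurePosixPath


def userdir_path(url_path, home_root="/home", public_dir="public_html"):
    """Map '/~user/rest' to '<home_root>/user/<public_dir>/rest', else None."""
    if not url_path.startswith("/~"):
        return None
    user, _, rest = url_path[2:].partition("/")
    rest = rest.lstrip("/")  # a leading slash would make the join absolute
    if not user or user in (".", "..") or ".." in PurePosixPath(rest).parts:
        return None  # refuse anything that could leave the user's directory
    return PurePosixPath(home_root) / user / public_dir / rest


print(userdir_path("/~user/notes/a.txt"))
```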

jedi3335|1 year ago

Reading this, I couldn't help but imagine an alternate universe where Gopher won out in the early 90s, but with a more flexible presentation layer. Great writeup.

pilgrim0|1 year ago

What a dream this would have been

p_ing|1 year ago

I struggle to parse this with every paragraph surrounded by a border. It feels unnatural and extremely distracting.

To the author's take of using a file system as an interconnected 'web', we have networked file systems today, typically clustered though.

We've also had the _concept_ of a media-rich file system, like WinFS [as an overlay to NTFS], which was dead before it was alive due to the WWW.

Networking file systems is _complex_. All vendors would need to agree to a common export model on top of their preferred file systems. Or users would need a specialized partition/overlay developed just for this purpose.

FSes are great, but they're not fit for WWW. Without a control plane, they lack any tooling that makes the WWW better -- redirection, access control, programmatic execution of content (ASP.NET, PHP, CGI ...), etc.

Ultimately this would be a complex solution. Just like many don't simply "open up" their web server to any and all traffic to any and all content, a file system would need to be carefully partitioned the same.

The time for file systems as the vehicle for WWW content is long since past. We have better ways to do things, better caching mechanisms, better performance [through CDNs], better security mechanisms, and so on.

...not to mention, I certainly don't want to open my personal computer's file system up to the Internet.

There would have to be a big leap in evolution of file systems across all major operating systems for the author's dream to come true. I would certainly be excited to see it, but we're talking about allocating _talented_ developers to create a new file system and certainly an open source file system. Like many file systems, it would take years to become a trusted file system to host any content of value.

In the meantime, the author can always investigate WebDAV. Slower than dog shit, but it's available with every major web server.

pilgrim0|1 year ago

Sorry for the poor experience with the current design, still experimenting.

I cannot disagree with you, you’re on point on everything when considering the file system as an OS component.

But if we entertain the thought of file system as a document model, or as a transactional data structure, it should come naturally that we can piggyback on the modern infrastructure, at the application level, to achieve the desired qualities.

This very website is an experiment in how this could be done. The main takeaway from my research is that we have much to gain if we leave presentation and layout concerns out of hypermedia documents and let the client software decide on them, as we do with our editors and IDEs: we choose the theme and font we like, and the information is the same regardless. To abandon the fetishism inherited from print media and to transact pure data is to make the web democratic. That’s precisely the recipe used by all social networks: standardized, systemic presentation of schematic payloads following a given ontological model. We need only to copy them with an open model.

mickael-kerjean|1 year ago

> WebDAV. Slower than dog shit, but it's available with every major web server.

You lost me there. WebDAV is nothing more than HTTP calls with some XML data, with a slightly different syntax than the S3 API. Nothing fundamental in the spec forces the protocol to be "slower than dog shit" as a file transfer protocol. Please prove me wrong with an argument other than: "the particular server implementation I tried was dog shit"
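To show how little there is to it, here is a sketch of a directory-listing call built with nothing but Python's standard library. The server URL is hypothetical and the request is only constructed, not sent; passing it to `urllib.request.urlopen` would send it.

```python
import urllib.request

# Ask only for the resource type and size of each entry.
PROPFIND_BODY = b"""<?xml version="1.0" encoding="utf-8"?>
<D:propfind xmlns:D="DAV:">
  <D:prop><D:resourcetype/><D:getcontentlength/></D:prop>
</D:propfind>"""


def propfind_request(url: str, depth: str = "1") -> urllib.request.Request:
    """Build (but do not send) a PROPFIND request; Depth 1 means one directory level."""
    return urllib.request.Request(
        url,
        data=PROPFIND_BODY,
        method="PROPFIND",
        headers={"Depth": depth, "Content-Type": "application/xml"},
    )


req = propfind_request("https://dav.example.com/pub/")  # hypothetical server
print(req.get_method(), req.full_url, req.get_header("Depth"))
```

It is one HTTP round trip with a small XML body, so whatever slowness people see is more likely in the server or client implementation than in the wire format.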

groby_b|1 year ago

> This version of the web wouldn’t require users to learn advanced computer skills in order to participate.

The web doesn't require "advanced computer skills". (Unless you use non-flexbox CSS alignments ;) It is fairly trivial to create basic HTML files. SSG + MD have removed a lot of the remaining obstacles. Most web sites are structured files, just with a "compiler" and possibly a database to store the files.

But what they still do require is the ability to reason about structured data and its best configuration. And that is the truly hard problem, ever since Ted Nelson first talked about it.

It also requires us to reason about how to best make that data consumable for humans. It doesn't just magically "arise from the structure", as much as I wish it did. The web site is a clear example - the lack of understanding how humans consume info, and what helps/hinders, leads to odd boxes around each paragraph.

I still agree with the fundamental idea. The more structure we can encode in an easily graspable way, the easier it becomes to impose structure.

But even then, the fundamental advantage of the web over hierarchical file systems is the non-linearity. And yes, correct, hierarchies matter, but the fundamental point the article misses is that there isn't just _one_ hierarchy. Wikipedia is a great example here - it fundamentally cannot be expressed in a meaningful way as a tree, even though it has many hierarchies.
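A toy version of the "more than one hierarchy" problem: once a node can have several parents, there is no single path to it, so no single tree layout is the right one. The category graph below is invented for illustration and assumed to be acyclic.

```python
# Hypothetical category graph: child -> parents. A node may have several parents.
PARENTS = {
    "Ada Lovelace": ["Mathematicians", "Computing pioneers"],
    "Mathematicians": ["People", "Mathematics"],
    "Computing pioneers": ["People", "Computing"],
}


def paths_to_roots(node, parents=PARENTS):
    """Return every chain from a root category down to node (graph must be acyclic)."""
    ups = parents.get(node, [])
    if not ups:
        return [[node]]
    return [path + [node] for up in ups for path in paths_to_roots(up, parents)]


for path in paths_to_roots("Ada Lovelace"):
    print(" / ".join(path))
```

One article, four equally valid "directories" to file it under; a file system tree forces you to pick one and link the rest.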

And hierarchies alone are insufficient. We've now learned, thoroughly I think, that hierarchical taxonomies always break down. If we're given to snark: Linnaeus took a good stab, but he failed. In more practical terms, the emergence of "tags" has shown that we need a way to have non-hierarchical cross-cutting data.

I think for a discussion of the subject, there's value in separating a few topics:

* Presentation. The author is right, HTML made a grave mistake including that

* Local representation. Again, agreement here: giving a file system structure that allows inferring meaning for later presentation is super helpful. (See point about SSG/MD)

* Organization/Navigation: Any sufficiently complex set of data requires several separate overlaid structures to help humans navigate.

* Human psychology: We're bad at thinking about relation schemes beyond trees & grids. That means our organization schemes need to mirror them at least partially so we don't break our heads. A corollary is that any sufficiently complex set of data needs searchability.

There's probably more. It's a topic that's been brewing in my head for a while, you're getting a very rough first draft, sorry :)

pilgrim0|1 year ago

> It is fairly trivial to create basic HTML files. SSG + MD have removed a lot of the remaining obstacles

These are advanced computer skills IMO

> It doesn't just magically "arise from the structure", as much as I wish it did. The web site is a clear example - the lack of understanding how humans consume info, and what helps/hinders, leads to odd boxes around each paragraph

I appreciate the criticism. It's impossible to please everyone in terms of design, and I think your antipathy towards this particular style agrees with the general premises.

Regarding whether disposition arising from structure is desirable or not, I think it's a matter of culture and habit. The time and complexity savings for authoring and publishing afforded by this model, for me, satisfactorily offset whatever it could be said to miss in the aesthetic or functional department, which can always be patched and improved. The positive feedback I had from interested users, all of them tech-illiterate, is what gave me the confidence to keep investing in the research, and also made me realize that my insecurities about its acceptability, which stemmed from sentiments quite similar to what you put forth as criticism, were mere whims. As far as experience and perceptions can be trusted, I believe serial multimedia has been proved as a viable format.

> And hierarchies alone are insufficient.

Not disputing that. The fact that the document model is hierarchical does not mean the document system has to be. In fact, it was never planned to be. There are many mechanisms in place affording hypernavigation, down to the design of the in-memory representation. They just haven't been implemented for lack of resources.

> you're getting a very rough first draft, sorry :)

I'd love to hear more. Feel free to ping me anytime if this is a subject you find exciting to discuss!