I've sometimes wondered about a system where the URL of a document is an actual hash, like SHA-1, of the document. That'd change the semantics of hyperlinks from "link to document at this internet address" to "link to document with these contents", just like Hashify does, but it could handle arbitrarily large documents.
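For concreteness, here is a minimal sketch of that addressing idea in Python; the `cas://` scheme name is invented purely for illustration.

```python
import hashlib

def content_url(document: bytes) -> str:
    # Address the document by what it is, not where it lives:
    # the URL is just the SHA-1 digest of the document's bytes.
    # The "cas://" scheme is made up for illustration.
    return "cas://" + hashlib.sha1(document).hexdigest()

# Any copy of the same bytes yields the same link:
content_url(b"hello")  # cas://aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d
```

Two documents collide only if their contents are identical (or if SHA-1 itself is broken, which is one reason a modern take would use SHA-256).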
The tricky part with that system would be that you'd also need some new mechanism to retrieve the files. Instead of the regular WWW stack, you'd need something like a massive distributed hash table that could handle querying for and transferring the hashed files at scale. Many P2P file-sharing systems already do this, but a sparse collection of end-user machines containing a few hashed files each isn't a very efficient service cloud. If every ISP had this sort of thing in its service stack, or if Amazon and Google decided to run the service, all of them dynamically caching documents in greater demand on more nodes, things might look very different.
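Collapsed to a single process, the retrieval side might look like this toy store, where the hash key doubles as an integrity check. A real DHT would shard keys across nodes and replicate popular documents; the class and method names here are invented for illustration.

```python
import hashlib

class ToyHashStore:
    """A single-dict stand-in for the distributed hash table
    described above."""

    def __init__(self):
        self._blobs = {}

    def put(self, document: bytes) -> str:
        # The storage key is derived from the content itself.
        key = hashlib.sha1(document).hexdigest()
        self._blobs[key] = document
        return key

    def get(self, key: str) -> bytes:
        document = self._blobs[key]
        # The key doubles as a checksum, so content fetched from an
        # untrusted peer can be verified before it is used.
        if hashlib.sha1(document).hexdigest() != key:
            raise ValueError("content does not match its hash")
        return document
```

The self-verifying property is what makes untrusted caching safe: any node can serve the bytes, and the client can always check them against the URL.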
This would mean that very old hypertext documents would still be trivially readable with working links, as long as a few copies of the page documents were still hashed somewhere, even if the original hosting servers were long gone. It would also make it easy to do distributed page caching, so that pages that get a sudden large influx of traffic wouldn't create massive load on a single server.
On the other hand, any sort of news site where the content behind the URL is expected to change wouldn't work, nor would URLs expected to point to the latest version of a document instead of the one current at the time of linking. Once the hash URL was out, no revision to the hashed document would be visible from following the URL without some additional protocol layer. The URL strings would also be opaque to humans, and too long and random to be committed to memory or typed by hand. The web would probably need to be split somehow into human-readable URLs for dynamic pages and hash URLs for the static pieces of content served by those pages.
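That "additional protocol layer" for mutable content could be as simple as a named pointer that is re-aimed at the newest hash on each publish, while old hash URLs keep resolving to old revisions unchanged. A hypothetical sketch (all names invented):

```python
class NamePointer:
    """A mutable naming layer over immutable hash URLs: a
    human-readable name always resolves to the latest revision's
    hash; previous hashes remain valid links to old versions."""

    def __init__(self):
        self._latest = {}

    def publish(self, name: str, content_hash: str) -> None:
        # Re-point the name; nothing about older content changes.
        self._latest[name] = content_hash

    def resolve(self, name: str) -> str:
        return self._latest[name]
```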
I'm probably reinventing the wheel here, and someone's already worked out a more thought-out version of this idea.
> I've sometimes wondered about a system where the URL of a document is an actual hash, like SHA-1, of the document
Git.
-
It may be of interest to view this duality as an analog of the duality between location addressing (imperative) and value addressing (functional) in the context of memory managers. The general (hand-wavy as of now) idea is a distributed memory system with a functional front-end (e.g. Scala/Haskell).
The creator of Freenet made something called Dijjer, which mirrors HTTP files in a P2P network, accessible by prepending http://dijjer.org/get/ to the URL. But it looks like he's no longer maintaining it.
Very neat idea, but I think the reliance on bit.ly is self-defeating. This kind of approach would allow people to distribute documents using the web without having to entrust them to a particular server, which can be very convenient if your target audience is in a country where access to the server storing your documents can be cut off. For this to work you need to be able to recover the document from the URL locally.
Some years ago a friend and I wrote http://notamap.com, a very similar idea for sharing/storing/embedding geotagged notes fully encoded in a URL, without having to rely on a server. Looking at it now, I wish we had not put in all the crazy animations. Maybe I should revive it and simplify the UI.
I just can't see the gain here. You need a server to distribute the URLs in any case. You are just moving the data from the server that served the document to the server that serves the URLs. It is still the same data, just in a different form.
I looked at URL shortener limits some time ago and found these approximate limits by trial and error:
* TinyURL: 65,536 characters and probably more, but requests timed out; apparently there isn't an explicit limit.
* Bit.ly: 2,000 characters.
* Is.gd: 2,000 characters.
* Twurl.nl: 255 characters.
This was 2.5 years ago; I'm not sure how many of these have changed (other than bit.ly's, which the linked article confirms is now 2048, probably the same as when I tested it).
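That trial-and-error probing can be framed as a binary search over URL lengths. In the sketch below, `accepts` is a stub standing in for an actual request to a shortener (e.g. submitting an n-character URL and checking whether it shortened without error):

```python
def max_accepted_length(accepts, lo=1, hi=1_000_000):
    # Binary-search the largest length for which accepts() is true,
    # assuming accepts() is monotone (true up to some cap, then false).
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if accepts(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo

# Against a simulated service with a 2048-character cap:
max_accepted_length(lambda n: n <= 2048)  # 2048
```

About 20 probes pin down a limit anywhere under a million characters, versus thousands of requests for a linear scan.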
Boiling it down, it's a new file format with a built-in viewer. You still need to find a way to store the data.
Interesting, but I can't think of any practical application, apart from the service provider not having to worry about storage (maybe that's key ... more thinking needed).
The original version of Mr Doob's GLSL Sandbox at http://mrdoob.com/projects/glsl_sandbox/ used the same approach, but increased the maximum possible size of the document by applying LZMA compression before base64 encoding.
The project later moved to http://glsl.heroku.com/ with an app-driven gallery, and that particular feature went away. I think that is a pretty natural evolution of any such idea, so I'm not convinced of hashify's longevity, but hey, sometimes simple really is enough.
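The compress-then-encode trick is easy to reproduce with Python's standard library (the sandbox itself used a JavaScript LZMA port, so this is just the idea, not its code):

```python
import base64
import lzma

def pack(text: str) -> str:
    # Compress first, then make the bytes URL-safe with base64.
    compressed = lzma.compress(text.encode("utf-8"))
    return base64.urlsafe_b64encode(compressed).decode("ascii")

def unpack(fragment: str) -> str:
    return lzma.decompress(base64.urlsafe_b64decode(fragment)).decode("utf-8")

doc = "attribute vec3 position;\n" * 40  # repetitive source compresses well
plain = base64.urlsafe_b64encode(doc.encode("utf-8")).decode("ascii")
packed = pack(doc)
# packed is far shorter than plain, so more document fits per URL
```

The win depends on how compressible the document is; shader source and Markdown, being repetitive text, compress very well.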
Cool to see, but a stupid idea: who in their right mind would use this in production?! By using such a "technology" you lose SEO strength, due to urls-not-being-like-this.html, and even worse, what can stop me from publishing a fake press release on their site, or spamming porn, and getting that URL indexed? And what are the benefits? To also bring SOPA into this: couldn't I share copyrighted material on someone's site like this? How could they control that, besides blocking each URL manually? It just seems dumb. As a concept, cool, but for production... yikes!
Not really a good reply, but I think hashify.me's potential with an IE audience was probably small to start with. But consider this: if this idea took off, wouldn't it press MS into keeping IE more modern?
I took a similar approach to this with http://cueyoutube.com, and recently found Snapbird, which gives extended Twitter search capabilities. The URL contains the playlist and Twitter becomes the database, so I just tweet my playlists and they're "saved". You can see all the lists I've created by searching the account iaindooley with the search term cueyoutube in Snapbird.
What becomes possible? The entire internet could effectively get rid of hosting account providers, with each page in every site being contained in a hashify URL, and with each page linking to other pages using other hashify URLs.
Trouble is, there might be a DNS-like system needed to match hashify URLs to more human-readable strings (or a way for existing DNS to resolve to hashify style URLs).
The data needs to be stored somewhere. In their implementation they in effect use bit.ly as the hosting provider for the data by shortening the URLs, so while it's a fun little experiment, it boils down to a content-addressable system. We already have good examples of content-addressable systems; Git, for example, is built on content-addressable storage.
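Git's storage makes the comparison concrete: a blob's object name is the SHA-1 of a short header plus the raw content, so identical content always gets an identical name, no matter where or when it is stored.

```python
import hashlib

def git_blob_name(content: bytes) -> str:
    # Git hashes "blob <size>\0" followed by the raw bytes.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# The well-known object name of Git's empty blob:
git_blob_name(b"")  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```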
The real trouble is that when you link to a hashified URL, you are actually embedding in your web page (an encoding of) the content of the page you are linking to. Think matryoshka.
In essence, this would be moving away from a model of "large networks of connected pages/sites" to "a large number of single documents with no meaningful mechanism of interconnectedness".
Think about this like a PDF where stuff is embedded instead of in separate files.
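Data URIs (RFC 2397) are an existing small-scale version of this embed-instead-of-reference idea: the resource rides inside the link itself, and no server is consulted when the browser "follows" it.

```python
import base64

def data_uri(content: bytes, mime: str = "text/html") -> str:
    # Inline the whole resource into the link, RFC 2397 style.
    payload = base64.b64encode(content).decode("ascii")
    return f"data:{mime};base64,{payload}"

link = data_uri(b"<h1>hello</h1>")
```

Nesting such links inside documents that are themselves encoded into links is exactly the matryoshka problem described above: each layer of embedding multiplies the size.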
This is an ancient idea. I read a 2600 article back in the early 2000s or possibly late 1990s that did essentially this same thing using a bash script and one of the first URL shortening services available at the time.
ivank|14 years ago
"A New Way to look at Networking" http://www.youtube.com/watch?v=8Z685OF-PS8
antimatter15|14 years ago
http://code.google.com/p/dijjer/
pixelcort|14 years ago
A single link for a file can contain multiple hashes, for multiple means of retrieval.
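In the spirit of magnet URIs, one link can carry several digests of the same bytes, so a client can verify the file via whichever network it can reach. The parameter layout below is illustrative, not a real specification:

```python
import hashlib

def multi_hash_link(document: bytes) -> str:
    # Several fingerprints of the same document in one link; a
    # client uses whichever algorithm its retrieval network speaks.
    digests = [
        ("sha1", hashlib.sha1(document).hexdigest()),
        ("sha256", hashlib.sha256(document).hexdigest()),
    ]
    return "magnet:?" + "&".join(f"xt=urn:{a}:{d}" for a, d in digests)

link = multi_hash_link(b"hello")
```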
rapala|14 years ago
How about saving the document?
mmahemoff|14 years ago
http://softwareas.com/the-url-shortener-as-a-cloud-database
sgdesign|14 years ago
This way the whole tool can be 100% client-side javascript, without a need for any back-end.
cobychapple|14 years ago
Neat idea.
feralchimp|14 years ago
But URL shortening services are a public good, and hacking one to be your personal cloud storage platform is kind of a dick move.