
IPFS Project Roadmap

242 points | robto | 7 years ago | github.com

149 comments

[+] fwip|7 years ago|reply
"2019 Goal: The most used code and binary Package Managers are powered by IPFS."

That's kind of stupid-ambitious for 2019 when another 2019 goal is "a production-ready implementation" and IPFS has been around for 3 years already.

This isn't a roadmap, it's a wishlist. And I'm someone who wants to see IPFS succeed.

[+] gritzko|7 years ago|reply
In theory, there are incremental paths to achieve that objective, while serving clients all the way along.

1. Define a P2P ready data model and protocol (which they have, Merkle forest and everything)

2. Run a single server/cluster, make a very lightweight client library for that

3. Expand the server to a makeshift CDN (start simple, e.g. rsync like mirrors)

4. Federate the CDN (still in a hierarchical fashion, so you always know whom you are talking to). Also, look for peers on the local subnet (broadcast is simple).

5. Expand that to a friend-of-a-friend network, use PEX like peer finding (PEX is shockingly simple https://en.m.wikipedia.org/wiki/Peer_exchange)

6. Go full P2P, talk to strangers on the Net, use DHT or anything.

The only trick is to start with a data model that can go all the way to (6). So backward compatibility/ installed base issues do not stop you at earlier stages. Think globally, act locally.

The dangerous thing is deploying an untested complex codebase to serve the most hardcore use case for live customers on Day#1. That may not work. Because even all these shockingly simple steps will turn quite long and tedious in practice. There will be "issues". Advancing one step a year is quite impressive if done under the full load.
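The peer-exchange step (5) really is that simple. A toy sketch in Python (addresses and the sample size are made up for illustration; real PEX, per BEP 11, rides on the BitTorrent wire protocol):

```python
import random

class Peer:
    """Toy peer that gossips known addresses, PEX-style (illustrative only)."""

    def __init__(self, address):
        self.address = address
        self.known = set()  # addresses of other peers we have heard of

    def exchange(self, other):
        # Each side sends a small random sample of its peer list --
        # that sampling-and-merging is the whole trick behind PEX.
        mine = random.sample(sorted(self.known), min(10, len(self.known)))
        theirs = random.sample(sorted(other.known), min(10, len(other.known)))
        self.known.update(theirs)
        self.known.add(other.address)
        other.known.update(mine)
        other.known.add(self.address)
        # A peer never needs to list itself.
        self.known.discard(self.address)
        other.known.discard(other.address)

a, b, c = Peer("10.0.0.1"), Peer("10.0.0.2"), Peer("10.0.0.3")
a.exchange(b)  # a and b learn about each other
b.exchange(c)  # b and c learn about each other; c also hears of a from b
```

After those two exchanges, c knows a's address without ever having contacted it directly, which is all PEX promises.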

[+] StavrosK|7 years ago|reply
The IPFS client is such an untunable memory hog that I turn it off whenever I'm not using it (which, of course, defeats the entire purpose). I would be ecstatic if we had something like the old uTorrent, but for IPFS. A nice UI, easy configuration, an ultralight implementation. It would be a dream come true.
[+] rnhmjoj|7 years ago|reply
I would love to see IPFS used in projects like Nix, but in its current state that's downright impossible.

I have seen IPNS take 5-10 minutes to resolve a single address, and go-ipfs with a few pinned files take more than 5 GB of memory after running for a day.

[+] sideeffffect|7 years ago|reply
It's a bit sad that Dhall (a programmable configuration language for YAML & co.) used to use IPFS for its source/packages but stopped because of reliability :( (I wonder if there are/were others?)

> Early on in the language history we used IPFS to distribute the Dhall Prelude, but due to reliability issues we’ve switched to using GitHub for hosting Dhall code.

http://www.haskellforall.com/2019/01/dhall-year-in-review-20...

https://github.com/dhall-lang/dhall-lang/issues/162

I wish IPFS the best, because at least in theory this seems like a perfect use case

[+] pdxww|7 years ago|reply
This is not a roadmap, but rather a wishlist. There is a fundamental problem that IPFS needs to solve first: an efficient WebRTC-based DHT. In order to change the web, IPFS needs to become usable in browsers. Since the backbone of IPFS is a DHT, there needs to be an efficient UDP-based solution for "DHT in the web". Right now this isn't possible, and the reasons are not just technical but political: the IPFS team would need to convince all the major players that enabling this DHT scenario is a good idea.
[+] nmca|7 years ago|reply
I actually wrote a DHT that operated over WebRTC itself with in-band signalling for my undergrad thesis, in the application/js layer. Total PITA, but a ... "good?" learning experience.
[+] fiatjaf|7 years ago|reply
IPFS is not usable outside of browsers yet, so I guess you're too optimistic.
[+] Ericson2314|7 years ago|reply
If you want to do package managers, your #1 priority should be Nix. Don't do something more popular where you help less, go with the thing that you can really provide the killer missing feature.

Nix + IPFS has been tried before, but what was missing is the Nix-side hierarchical content addressing of large data (not just plans). With the "intensional store" proposal, this should finally happen. Please push it along.

For maximum interop, data shouldn't be hashed like Nix's NARs or IPFS's UnixFS. Instead, please go with git's model, for all its problems.

Thanks, hope someone can take me up on this because I'm up to my neck with other open source stuff already.

[+] fiatjaf|7 years ago|reply
The vision of an IPFS-powered web is beautiful.

However, I would love to see a reference implementation that at least works minimally and doesn't just drain your computer of every last resource it has. If this is what being "so near" the "production-ready" status of the reference implementations looks like, then I think that goal will never be achieved.

[+] dymk|7 years ago|reply
I see this comment often when IPFS is discussed - but the devil is in the details when it comes to replacing the underlying tech of "the web" with something else.

How does an IPFS powered website do dynamic content? User sessions? Is all the client's session data encoded in the IPFS address itself?

Even if there's no user sessions, but the page content updates, how do you continuously point clients to fetch the right updated page (e.g. how would you implement a Hacker News style aggregator that updates every minute)?

IPFS does static content just fine - content-addressed stores are wonderful for that - but websites are much more than static content.

[+] robto|7 years ago|reply
That's what's got me excited - they've managed to articulate a vision for the future that I'm totally on board with: decentralized, privacy respecting, and user owned. I really want to see that vision become a reality.
[+] techntoke|7 years ago|reply
Would love to see Arch/Alpine Linux repo move to IPFS by default. Would also like to see better integration with Git, and an SCM platform comparable to GitHub (or GitLab). That could really get the developer community heavily involved in the project if it was sponsored by Protocol Labs.
[+] anandology|7 years ago|reply
In addition to apt and npm, I would like to see docker image distribution powered by IPFS. It really feels stupid to pull images from a central registry sitting on the other side of the globe when the image is already present in the next node in your kubernetes cluster.
[+] microcolonel|7 years ago|reply
Discovery performance is the biggest issue I see. If I deliberately load the same file on a couple of peers, it can take hours (or forever) to be able to find a peer with that file to pin it. It is clumsy and difficult to explicitly connect to peers (because you can't just try to discover peers at an address, you need to include the node ID as well), and even if you manage to enter the right information, you won't necessarily succeed at connecting to the peer the first time.
[+] rob-olmos|7 years ago|reply
It's nice to see package managers as #2, something I've been thinking about recently. I haven't looked much into this yet, but I wonder if IPNS could provide a step forward in supply-chain protection, since package signing isn't available yet in certain managers/repos, or isn't commonly utilized.
[+] 0xb100db1ade|7 years ago|reply
I love the idea of IPFS, but I can't think of a use case not covered by torrents.

Would someone mind enlightening me regarding what sets IPFS apart from torrents?

[+] hombre_fatal|7 years ago|reply
One practical difference is that all the parts of a "collection" are individually addressable in IPFS.

For example, unlike torrents, you can seed a collection like "My Web Show (All Seasons)" and add new files as new episodes become available. With torrents, you have to repackage them as new torrent files. IPFS also then encourages file canonicalization instead of everyone seeding their own copy of a file.
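A toy illustration of that difference, assuming nothing about IPFS's real CID format (a bare SHA-256 digest stands in for the address here):

```python
import hashlib

def cid(data: bytes) -> str:
    # Stand-in for a real IPFS CID: just a SHA-256 hex digest.
    return hashlib.sha256(data).hexdigest()

# A "collection" is simply a mapping of names to content addresses.
show = {
    "s01e01.mkv": cid(b"episode one"),
    "s01e02.mkv": cid(b"episode two"),
}

# Adding a new episode leaves every existing address untouched --
# peers already seeding the old files keep seeding them.
show["s01e03.mkv"] = cid(b"episode three")

# Two collections containing the same bytes produce the same address,
# so their peers can serve each other: that's the canonicalization.
best_of = {"s01e02.mkv": cid(b"episode two")}
assert best_of["s01e02.mkv"] == show["s01e02.mkv"]
```

With a torrent, by contrast, the whole collection is baked into one info-hash, so adding a file means a new torrent and a fresh swarm.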

[+] pfranz|7 years ago|reply
I was at a place that tried to distribute their files via bittorrent. I wasn't there for the initial implementation, but have dealt with it after it was used.

The data was immutable, so we didn't have that use-case. The tracker software we were using (one of the often used open source C++ ones) seemed to handle a couple hundred torrents just fine, but couldn't handle tens of thousands. Even if only a few were active. I'm not sure if it was excessive RAM or high CPU, but they built a wrapper tool to expire and re-add torrents as needed. I think technically it was limiting the number of seeds (from the central server) for different torrents.

There was also a lot of time/overhead in initiating a new download. This was exacerbated by the kludge mentioned above: the client would add the torrent, you would trigger a re-seed, then the client would wait a while before checking again and finding the seed. Often this dance took much longer than the download itself.

[+] hiccuphippo|7 years ago|reply
Think of torrents that you can update: you have the magnet link for the version you are downloading, but you also have the magnet link for the current version, so the uploader can update at any time and you would receive the update. And if both versions share some pieces, then people can share them across both torrents - and any other torrent that happens to have a piece with the same hash.
[+] fwip|7 years ago|reply
Main feature is automatic data-sharing between distributions. With torrents, everything is siloed, and data is only exchanged between peers of that torrent. IPFS doesn't care /why/ you're getting information or the link you found it from, just that it can find it by its hash.

Say you distribute "Julie's Webcast Complete Series" and somebody else distributes "Julie's Webcast - Episode 3, with Russian subtitles," peers and seeders from both distributions can share data for the shared content. Similarly, updating a dataset only requires downloading the new data.

This is done automatically, with both per-file hashing and (optionally; I'm not sure of the current state) in-file block hashing.
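A rough sketch of the block-level part (fixed-size chunking for simplicity; IPFS's real chunker and DAG layout differ, and fixed-size chunks only dedupe data that happens to align):

```python
import hashlib

CHUNK = 4  # tiny chunk size for the demo; real systems use ~256 KiB

def chunk_hashes(data: bytes) -> list[str]:
    """Split data into fixed-size chunks and hash each one."""
    return [hashlib.sha256(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)]

full_series = b"ep1-ep2-ep3-"   # "Complete Series" distribution
single_episode = b"ep3-"        # separate "Episode 3" distribution

# The ep3 chunk hashes identically in both distributions, so peers
# seeding either one can serve that block to peers of the other.
shared = set(chunk_hashes(full_series)) & set(chunk_hashes(single_episode))
```

In a torrent, the same bytes in two different swarms get unrelated piece hashes (pieces are scoped to the info-hash), so no such sharing happens.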

[+] viksit|7 years ago|reply
One of the biggest challenges with IPFS in my mind is the lack of a story around how to delete content.

There may be a variety of reasons to delete things,

- Old packages that you simply don't want to version (think npm or pip)

- Content that is pirated or proprietary or offensive that needs to be removed from the system

But in its current avatar, there isn't an easy way for you to delete data from other people's IPFS hosts in case they choose to host your data. You can delete it from your own. There are solutions proposed with IPNS and pinning etc - but they don't really seem feasible to me last I looked around.

This list, as @fwip said, is great as a wishlist - but I would love to see this roadmap also address some of the things needed to make this a much more usable system.

[+] mirimir|7 years ago|reply
> But in its current avatar, there isn't an easy way for you to delete data from other people's IPFS hosts in case they choose to host your data.

If you put it on IPFS, it's not "your data" any longer. If that doesn't work for you, then don't use IPFS.

Edit: I do get why people are concerned about persistence of bad stuff. But it's not at all unique to IPFS. And even IPFS forgets stuff that nobody is serving. I mean, try to find these files that I uploaded a few years ago: https://ipfs.io/ipfs/QmUDV2KHrAgs84oUc7z9zQmZ3whx1NB6YDPv8ZR... and https://ipfs.io/ipfs/QmSp8p6d3Gxxq1mCVG85jFHMax8pSBzdAyBL2jZ.... As far as I can tell, they're just gone.

[+] wehriam|7 years ago|reply
> Old packages that you simply don’t want to version

It’s important to think of IPFS as a way to share using content hashes - essentially file fingerprints - as URLs. Every bit of information added is inherently and permanently versioned.

This is a tremendous asset in many ways, for example de-duplication is free. But once a file has been added and copied to another host, any person with the fingerprint can find it again.

While IPFS systematically exacerbates the meaningful problems around deletion that you describe, they are not unique. Once information is put out in the world, it’s hard to hide it.

[+] Risord|7 years ago|reply
Universal data removal is not solved (or even wanted) at internet scale, so it sounds weird to demand it from a technology that is trying to solve a completely different problem.
[+] gritzko|7 years ago|reply
What you actually want is to break the universe. It is physically impossible to revoke information unless it happens by a strange coincidence. 24x7, you emit information that races away with the speed of light. You can't chase it down. Physically.

On the other hand, if you can decide which information stays, you essentially own the system.

So, given the project's mission, I guess it is a requirement that information stays online as long as someone somewhere is willing to keep it.

[+] tick_tock_tick|7 years ago|reply
I doubt there will ever be a way to delete content, as every legitimate deletion method will be commandeered for censorship. And even if they did add something, you can never really know the other nodes actually deleted it.
[+] zrm|7 years ago|reply
> One of the biggest challenges with IPFS in my mind is the lack of a story around how to delete content.

"One of the biggest challenges with [HTTP|ZFS|TLS|USB|ATX|VHS|USPS] in my mind is the lack of a story around how to delete content."

If you delete your copy of some data, someone else may still have theirs, but then it's them who controls whether to delete it. It's not a challenge for Serial ATA that it doesn't have a function to delete certain data from every hard drive in the world at the same time. Most systems don't work that way, not least because it's inherently dangerous.

[+] peterkelly|7 years ago|reply
The inability to delete things is a feature of IPFS.

Do you want yet another way of serving content that is subject to censorship and ridiculous content takedown policies?

[+] StreamBright|7 years ago|reply
Just watch how fast warez people will start to use IPFS to host all seasons and all episodes of Friends.
[+] pertsix|7 years ago|reply
Why not encrypt it? That way it's junk data to anyone who can't decrypt it.
[+] miguelrochefort|7 years ago|reply
Data should never be deleted. Ever.

Start by accepting that, and everything starts to make sense.

[+] woodandsteel|7 years ago|reply
I have a question for the IPFS people. I am a non-techy who really likes the IPFS idea and wants to see it succeed.

However, whenever this topic comes up here at HN, we get a bunch of people who say they tried to use it but it was basically unworkable, like too much RAM usage and various sorts of failures. And rarely does anyone respond by saying that it is working just fine for them.

So my question to the IPFS people is: when is it going to get really usable? I am asking for something reasonably specific, like 2 or 3 years, or what? And I suppose that would mean a different promise/prediction for each main use case. So how about some answers, not just "We are aware of those problems and are working on them."

[+] momack2|7 years ago|reply
As you seem to foresee, being "really usable" depends on the use case. The one we're focused on this year is package managers - making IPFS work really well for that use case in particular. There is lots of room for improvement on performance and usability, and setting the package managers goal gives us a really specific target to focus on and deliver. This won't solve "all the problems" (there's a lot to solve for package managers alone!), but it will help us take a big step forward in production readiness and hopefully knock out a swath of performance issues experienced by everyone.
[+] paulsutter|7 years ago|reply
So I can store a file in ipfs by its hash, but there’s no way to link to the next version of the file. I can only link to older versions?

I’m a giant advocate for decentralized architectures but so far I’ve never found a use for it that doesn’t rely on a centralized way to find out about new data
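For what it's worth, IPFS's answer to "link to the next version" is IPNS: a stable name, derived from a key pair, that its owner can re-point at the newest content hash. A toy sketch of that mechanism (skipping the real signing, key management, and DHT record publishing entirely):

```python
import hashlib

def cid(data: bytes) -> str:
    # Stand-in for a real content address: bare SHA-256 hex digest.
    return hashlib.sha256(data).hexdigest()

name_records = {}  # stable name -> latest content hash

def publish(name: str, data: bytes) -> str:
    """Re-point a mutable name at new content; return the content hash."""
    latest = cid(data)
    name_records[name] = latest
    return latest

def resolve(name: str) -> str:
    """Follow the mutable name to whatever it currently points at."""
    return name_records[name]

v1 = publish("/ipns/my-site", b"version 1")
v2 = publish("/ipns/my-site", b"version 2")
# The name now resolves to v2, while v1 is still reachable by its own
# hash -- old versions are immutable, only the pointer moves.
```

The catch, as other commenters note, is that resolving such mutable pointers is exactly the slow part of IPNS in practice.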

[+] nullobject|7 years ago|reply
This is exciting. I would love to make it to the 1st IPFS Camp in June.
[+] ezoe|7 years ago|reply
IPFS is a joke. They have a name lookup feature, but it relies on traditional DNS! What are they thinking?

Also, if IPFS's idea of running as a local server were sound, BitTorrent DNA (a browser plugin for streaming video over BitTorrent) should have worked.

It seems to me they suffered from NIH syndrome and tried to reinvent the wheel. P2P file transfer over IP is already covered by BitTorrent. What we need is a nice front end that uses the BitTorrent protocol as a back end and offers the illusion of a web site.

[+] viraptor|7 years ago|reply
> IPFS is a joke. They have name lookup feature but relies on traditional DNS!

This is not a criticism. That's describing a feature. Yes, they do have that. You could implement your own name resolution in a different way if you need that.

[+] miguelrochefort|7 years ago|reply
The real innovation is making files content-addressable.
[+] StreamBright|7 years ago|reply
On top of that, when you ask them about it, the answer usually is:

- but this is decentralized

- it is not a bug / lack of implementation but a feature

[+] caprese|7 years ago|reply
I've been considering the Swarm distributed file system because of its closeness with Ethereum development.

It seems to do the same thing and already works, but it hardly gets any press. IPFS and Protocol Labs' Filecoin sale seemed to generate a lot of marketing, despite it becoming clearer later that Filecoin is for an unrelated incentivized network.

It is hard to understand the pros and cons of choosing IPFS over Swarm, or where they are in their comparative development cycles.

I know many decentralized applications opt for IPFS for their storage component, and I know of the libraries that help different software stacks with that. But I can't tell if it is right for me, versus the state of Swarm.

[+] nonsens3|7 years ago|reply
Swarm and IPFS together with Filecoin try to address the same problem - persistent data storage in a decentralised network.

Swarm is not at all "working already" - the incentivisation layer for nodes to store data for other users is not implemented; it is currently mostly theoretical and a work in progress.

IPFS is more mature in comparison to Swarm, but the underlying architecture is rather different.

[+] theamk|7 years ago|reply
Does Swarm’s closeness with Ethereum mean it will be unsuitable for tasks that do not rely on financial incentives?

For example, if in the future Debian is moved to IPFS, many organizations are likely to run local IPFS servers with Debian repos pinned. But if Debian is moved to Swarm, I do not think many organizations will be incentivized - the money is insignificant in their total spending, while the engineering effort and organizational (finance) overhead is likely to be very big.