Personal data storage is an idea whose time has come

407 points| erlend_sh | 4 months ago |blog.muni.town

279 comments

brendoncarroll|4 months ago

I work on a FOSS project in this space, Blobcache.

https://github.com/blobcache/blobcache

Trusting a server to store an application's state is a different thing from trusting it to author changes or to read the data. Servers should become dumber, and clients should become smarter. When I use an app, I want the app to load E2E encrypted state from storage (possibly on another machine, possibly not owned by me), make whatever changes, and produce new encrypted data to send back to the server. The server should just be trusted for durability, and to prevent unauthorized access, but not to tell the truth about doing either of those things. Blobcache provides an API to facilitate transactions on E2EE state between a dumb storage server and any smart client.
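
A minimal Python sketch of the "trusted for durability, but not to tell the truth" idea described above. It is not Blobcache's API: `DumbStore`, `verified_get`, and `compare_and_swap` are hypothetical names, and encryption is left out (the blob is assumed to be ciphertext the client already produced). It only illustrates how content addressing lets a client check whatever a storage server hands back.

```python
import hashlib


class DumbStore:
    """Hypothetical untrusted server: opaque blobs plus one root pointer."""

    def __init__(self):
        self.blobs = {}
        self.root = None

    def put(self, blob: bytes) -> str:
        # Blobs are addressed by the SHA-256 of their content.
        key = hashlib.sha256(blob).hexdigest()
        self.blobs[key] = blob
        return key

    def get(self, key: str) -> bytes:
        return self.blobs[key]

    def compare_and_swap(self, expected, new) -> bool:
        """Advance the root only if it still has the value the client saw."""
        if self.root != expected:
            return False
        self.root = new
        return True


def verified_get(store: DumbStore, key: str) -> bytes:
    """Client-side check: reject any blob whose hash does not match its key."""
    blob = store.get(key)
    if hashlib.sha256(blob).hexdigest() != key:
        raise ValueError("store returned data that does not match its hash")
    return blob


store = DumbStore()
seen_root = store.root
key = store.put(b"ciphertext produced by the client")
assert store.compare_and_swap(seen_root, key)
assert verified_get(store, store.root) == b"ciphertext produced by the client"

# Simulate a lying or corrupted server: the client notices on read.
store.blobs[key] = b"tampered"
try:
    verified_get(store, key)
except ValueError as err:
    print("rejected:", err)
```

The root pointer itself is not authenticated in this sketch, so a server could still serve a stale root; a real system would need something extra (for example, a client-signed root) to detect that.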

Blobcache can be installed on old hardware along with a VPN like Tailscale and then loaded up with data from other devices. Configuration is like SSH, drop a key in a configuration file to grant access. It removes most of the friction associated with consuming and producing storage as a resource.

I'm using it to build E2EE version control like Git, but for your whole home directory.

https://github.com/gotvc/got

apitman|4 months ago

> Configuration is like SSH, drop a key in a configuration file to grant access. It removes most of the friction associated with consuming and producing storage as a resource.

What's the story for people who don't know what an SSH key is?

gcanyon|4 months ago

Both of these proposals (as far as I've read them, YMMV) fail the evolutionary test. At the scale we're talking about, ideas must proceed as evolution does: not with a far-away goal in mind, but with incremental changes, each of which individually must be an improvement over the status quo.

We are at (near) a significant local maximum, and (again, as far as I've read, which is not all of it for sure) the people pitching this form of information control have given no set of steps from here to there without significant cost/effort.

Of course they don't have to have the whole path in mind. By definition they just need the first step or two. But they must be steps up.

You don't get wings by wanting to fly; first you need feathers to keep warm (I am not an evolutionary biologist, I don't know if that's a valid theory).

jauntywundrkind|4 months ago

99.9% of BlueSky users use only Bluesky services. But BlueSky has a Personal Data Service for each. That means:

Those users have credible exit to take their data off BlueSky's hosting to someplace else (and as of a week or two ago to move back to BlueSky if they want).

Those users can put whatever kind of data they want in their PDS. They can host their git data via https://tangled.org . They can store their music listening scrobbles with https://teal.fm . They can blog on https://leaflet.pub .

And there are rapidly advancing host-it-yourself options. Plenty of folk individually or collectively host a PDS. There are alternate relays that collect & syndicate out everyone's PDS data as it changes. Hosting the aggregation layer is significantly harder, especially if you are trying to fully connect the network, but there are a couple & progress is good.

It feels like a huge improvement over the status quo, and there's extremely visible developer energy building forward & rolling with the concepts. The breakdown on architecture allows for wins and work in various areas. The base seems solid, the core seems coherent & well built, built to scale not as one big thing but as coherent layers. I think it's doing what you are asking for, and the signs of advancement & uptake warm my heart to see.

ineptech|4 months ago

The realistic path off looks like this, I think:

* I use Bluesky to chat as a Twitter replacement, which gets me into the Fediverse and gets me a PDS

* I use my PDS to store my payment details, giving me a (at first client-side) way to submit stored payment details that feels similar to storing it in the browser, but stores it in my "server"

* From there, it's a natural step to giving the retailer a token that can be used to pull payment details from my PDS; early adopter retailers are incentivized to do this because it frees them from the burden of storing and updating PII/PCI

* After some subset of users and retailers do this, users see the benefit of controlling their data as a viable alternative to some of the worst user-hostile patterns, e.g. the New York Times' "we don't have a cancel subscription page, you have to call an 800 number" nonsense.

* To the extent that storing PCI/PII in a PDS is as easy as storing it in the browser but with perceived additional benefits, user demand drives wider adoption

* Once it's technically feasible for sites to maintain their business model without storing any PII/PCI, it is much more realistic to write laws that proscribe it effectively for those users who choose that
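
The "token that can be used to pull payment details" step in the list above could be sketched roughly as follows. This is a toy in Python, not how Bluesky or the AT Protocol handle authorization: `issue_token`, `check_token`, `PDS_SECRET`, and the scope string are all made up for illustration. The idea is that the data owner's server hands out a signed, scoped, expiring grant and checks it on every pull.

```python
import hashlib
import hmac
import json

# Hypothetical secret held only by the user's data server.
PDS_SECRET = b"demo-secret-known-only-to-the-pds"


def issue_token(retailer: str, scope: str, ttl_seconds: int, now: int) -> str:
    """Sign a grant saying which retailer may read which scope, and until when."""
    claims = json.dumps(
        {"retailer": retailer, "scope": scope, "exp": now + ttl_seconds},
        sort_keys=True,
    )
    sig = hmac.new(PDS_SECRET, claims.encode(), hashlib.sha256).hexdigest()
    return claims + "." + sig


def check_token(token: str, scope: str, now: int) -> bool:
    """Accept the token only if it is untampered, in scope, and not expired."""
    claims, _, sig = token.rpartition(".")
    expected = hmac.new(PDS_SECRET, claims.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    data = json.loads(claims)
    return data["scope"] == scope and now < data["exp"]


token = issue_token("shop.example", "payment:read", ttl_seconds=60, now=1000)
print(check_token(token, "payment:read", now=1030))  # within the 60 s window
print(check_token(token, "payment:read", now=2000))  # after expiry
```

Revocation before expiry is not shown; the user's server would also need a list of withdrawn grants for that.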

seandoe|4 months ago

> each of which individually must be an improvement over the status quo

I agree. And looking at the average web user specifically, is "owning your own data" enough of a UX improvement? Maybe paired with less ads and products that optimize for the end-user rather than advertisers? I think... maybe. I hope so. It's going to take a lot of work done for little money, which is concerning, but I'm optimistic.

InMice|4 months ago

Among the first page and 2nd page (top 60) there is always at least 1 post about how we're gonna "take back the web" or make it back into some form of our 90s millennial nostalgia memories, self hosting, federated this or that, etc etc.

Meanwhile - Nothing changes, everything generally gets worse and younger generations come into the world with no memories of the 90s internet or the world before mobile devices or surveillance everywhere.

Applying for a job or apartment or anything today means creating endless pointless copies of your pesonal information in databases across the world that will eventually be neglected, hacked, exploited, sold off etc

I don't know the way out if there is one, I guess we can keep fantasizing and thinking about it. It just feels like it would be easier to get the earth to start spinning the other way sometimes.

pavlov|4 months ago

> “Applying for a job or apartment or anything today means creating endless pointless copies of your pesonal information in databases across the world that will eventually be neglected, hacked, exploited, sold off etc”

This problem is practically fixed in the EU (to the extent that legislation can fix it). Data protection laws have enough teeth that real companies can’t afford to keep or sell customer information illegally.

But people only see the tip of the iceberg and think EU data protection is something to do with annoying cookie banners. We need to do a better job of celebrating Europe’s real achievements in making the digital world better for its citizens. Instant zero-fee bank transfers are another example.

xandrius|4 months ago

If even the people who experienced a different time give up because "nothing changes" then it's truly over.

We need to do what we preach: sure, things are worse in certain things but for sure setting up a local network with top-level open source self-hosted alternatives is the easiest it has ever been ever.

Also I think people forget that the type of people who were online in the 90s are still online; many still do exactly the same things. The Internet just got so much easier to use for the rest of the people, who don't really see the magic of it all. And that's ok.

People who always complain about how bad things currently are do a disservice to all the services and communities still around. Those are not sexy or cool, but they exist.

teeray|4 months ago

> creating endless pointless copies of your pesonal (sic) information in databases across the world that will eventually be neglected, hacked, exploited, sold off etc… I dont know the way out if there is

The data needs to be viewed by the holder of that data as a dangerous liability, not an asset. If there were headlines about “Megabank Files Bankruptcy Over Data Breach, Executives Jailed” instead of the general sentiment of “LOL another data breach, here’s a free trial of LifeLock,” there would be changing attitudes about storing arbitrary user data.

erlend_sh|4 months ago

This is demonstrably not fantasy as the example case is a fully productionized network (Bluesky and the rest of AT-net) that’s having real-world impact to the point where it’s under threat from several authoritarian states.

sholladay|4 months ago

The most compelling and plausible solution to this that I have seen is a set of standards called Solid, made by Tim Berners-Lee, who invented the web.

https://en.wikipedia.org/wiki/Solid_(web_decentralization_pr...

You’d think that if anybody could pull off reshaping how data is stored and shared on the Internet, it would be him. And the technology is, well, solid.

Unfortunately, it doesn’t have as much traction as I would hope. Probably because it requires a new way of thinking about many parts of the tech stack. It’s not as simple as swapping out one library for another one. The existing web has so much momentum, and so many of today’s tools and frameworks have assumptions built into them that aren’t necessarily convenient for building a web where users have true data ownership.

Still, I’m rooting for Solid and the team behind it. They clearly understand these issues. They’ve been building libraries and scaffolding tools to make it easier to adopt Solid. For new projects, it’s pretty easy these days.

abetusk|4 months ago

In general, I think these types of sticky behaviors only change when there's an application that people gravitate towards with the changing behavior embedded.

One such candidate is cryptocurrency and personal finances. The cryptocurrency wallet will necessarily need to be cryptographically secure, so this at least provides an opening for privacy. Tying it to finances means that there's an immediate application, payment processing, that people might want to use and put up with clunky behavior, at least initially.

All this lacks specificity and finances, cryptocurrency or no, bring their own drawbacks, but it does seem like it's possible to me.

The Internet's attention can be fickle and it's easy to forget that sometimes. IBM used to be a titan before Microsoft supplanted it. Proprietary server operating systems, web servers, and databases used to be deeply embedded until they were supplanted by FOSS alternatives. Digg, Friendster, Myspace, Yahoo, etc. used to be fixtures of the Internet until they weren't.

mariusor|4 months ago

> Meanwhile - Nothing changes

Well, TFA, and sibling posts to mine, point out some ways in which federated networks are leading the change in this direction. I would add that alongside SOLID and the AT Protocol, ActivityPub also encourages people taking ownership of their own data.

So probably you need to focus your attention to where the change happens instead of waiting for large, ad filled, for profit networks to act on it. Because indeed they have no incentive.

Frieren|4 months ago

> I guess we can keep fantasizing and thinking about it.

Strong regulation is the answer. To think that big corporations are going to do anything for us out of the goodness of their hearts is naive and dangerous.

If a society wants nice things then it needs to fight for them. Get elected officials that care to fix things, that fight against big corporations, and that help to split their monopolies.

The USA thinks that they can get a better Internet by doing nothing, like by magic. The reality is that government and civil society are going to need to put a lot of effort to rein in the big tech monopolies.

aprilfoo|4 months ago

I think it's about showing that different models are possible for people who do care and are willing to reflect and change the way they operate.

The big majority goes with the comfort of the mainstream, almost by definition.

m463|4 months ago

> Applying for a job or apartment

Let alone actually living in the apartment or working at the job...

A friend's apartment required you to sign up with a third party to get your packages. They made you create an account and accept that they would make pictures and videos of you to access the package room.

Don't even get me started on connected appliances/wifi and app access for doors.

Arthurian|4 months ago

Yep, it’s all totally pointless so why bother thinking and dreaming of a way out, right? Even if the ideas in this post are a little unrealistic in the face of modern convenience, it’s productive to talk about it. Is there something else we should be doing instead?

torginus|4 months ago

The weird thing is that there are still IRC federators - big servers with channels much like Discord, but presumably running on some dude's computer in a basement, and there are tons of people (usually niche interest groups) still using those.

akho|4 months ago

> self hosting, federated this or that

> creating endless pointless copies of your pesonal information in databases across the world

These are completely different, unrelated concerns.

h2zizzle|4 months ago

The way out is mostly antitrust and regulation of the private data market. But too many portfolios depend on the status quo; the way will be opened once the AI bubble pops. The Chrome lawsuit was the jab before an AdX haymaker is thrown just as the arena lights go out.

Workaccount2|4 months ago

Everyone wants "free ad-free no tracking no payment" Internet. Nobody wants to compensate anyone for it, and therefore nobody wants to host it.

Then the people who have not viewed an ad or paid a subscription in 20 years complain that the internet sucks and we need to go back to IRC and chan boards. As if ideologically non-paying customers have a voice worth listening to.

jstummbillig|4 months ago

Nothing changes because the ask is silly and disconnected from the reality of normal people's lives. So what happens if Google has all your data? To the best of my observations over the past 20 years: best in class services, cheap, paired with excellent security and data availability.

neya|4 months ago

But such consistent "nagging" is what gets attention to the problem. In the EU, you have GDPR exactly because of this kind of nagging. Privacy has nothing to do with nostalgia.

lukeschlather|4 months ago

I love the idea of personal data storage and I want it to be the default, but I think there are some possibly insurmountable technical problems. This article doesn't mention schema once, and schemas make seamless data portability virtually impossible. I've spent a week making sure a simple CRUD app could change a string field to a UUID field without causing any outage or bugs.

You can export your data from Google or Facebook today, but then you need to write a copy of the source UI that faithfully replicates the way all those data fields are supposed to display. And tomorrow the source makes a change so what used to be one field is now two fields, oh and they also removed another field entirely so that data is just gone. Well, in future dumps anyway. Are you going to use the old schema or the new schema for your display? Is it possible to do both?

When everything is in data silos, you can freely and safely change data format, which is something that needs to happen a lot as applications evolve. Even in a data silo, doing this is pretty tricky and bugs and data loss are significant risks. If you're trying to sync between an unbounded number of data repositories where each repository has potentially conflicting relationships with the data schema, data loss is practically assured.
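
To make the schema-drift problem above concrete, here is a small Python sketch of one common mitigation: each record carries a version number and readers chain upgrade functions. The field names, version numbers, and the `upgrade` helper are invented for illustration, not taken from any real export format.

```python
def v1_to_v2(rec: dict) -> dict:
    # v2 split the single "name" field into two fields.
    first, _, last = rec["name"].partition(" ")
    out = {k: v for k, v in rec.items() if k != "name"}
    out.update(first_name=first, last_name=last, schema_version=2)
    return out


def v2_to_v3(rec: dict) -> dict:
    # v3 removed "fax" entirely; from here on that data is simply gone.
    out = {k: v for k, v in rec.items() if k != "fax"}
    out["schema_version"] = 3
    return out


MIGRATIONS = {1: v1_to_v2, 2: v2_to_v3}
LATEST = 3


def upgrade(rec: dict) -> dict:
    """Apply migrations one step at a time until the record is current."""
    rec = dict(rec)  # leave the caller's copy untouched
    while rec["schema_version"] < LATEST:
        rec = MIGRATIONS[rec["schema_version"]](rec)
    return rec


old = {"schema_version": 1, "name": "Ada Lovelace", "fax": "555-0100"}
print(upgrade(old))
```

This works when one party owns the migration chain. It does not by itself address the harder case raised here, where many repositories hold different versions and nobody can run the chain everywhere at once, and the `fax` step shows how an upgrade can be lossy with no way back.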

Another big problem is schema permissions and identity. I might have some piece of data that says "person A is allowed to see this set of fields" and another piece that says "person A is blocked from seeing this other set of fields." This gets synced to 3 different servers, one of those servers has no idea that userA is in fact person A. So you fail closed, but then the data on that server practically does not exist if the goal of this data repository is sharing some data with person A. You really can't do any sort of fine-grained access controls in a system where trust/identity/auditing is decentralized.
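
The fail-closed behaviour described above can be shown in a few lines of Python. Everything here is hypothetical (the ACL layout, the `can_read` helper, the account names); the point is only that a server which cannot map `userA` to person A has to refuse, which makes the shared data effectively invisible on that server.

```python
def can_read(local_identities: dict, acl: dict, account: str, field: str) -> bool:
    """Decide access using only what this particular server knows."""
    person = local_identities.get(account)
    if person is None:
        return False  # unknown identity: fail closed
    if field in acl["deny"].get(person, set()):
        return False  # an explicit block wins over any allow
    return field in acl["allow"].get(person, set())


acl = {
    "allow": {"person A": {"email"}},
    "deny": {"person A": {"phone"}},
}

server_1 = {"userA": "person A"}  # knows that userA is person A
server_3 = {}                     # has never heard of userA

print(can_read(server_1, acl, "userA", "email"))  # allowed on server 1
print(can_read(server_3, acl, "userA", "email"))  # refused on server 3
```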

Al-Khwarizmi|4 months ago

Glad to see a mention of Opera Unite. I found it to be a really revolutionary idea, anyone could have a simple static website running in their browser with zero tech knowledge needed. I think the world would have been better if that idea succeeded as a way for people to share their content, rather than the highly monetized and manipulative social networks.

Khaine|4 months ago

It was an idea that never went away. Many people have wanted to self host everything. Sadly companies have found it easier to centralise, and then as a bonus can monetise that data.

9dev|4 months ago

It wasn’t the companies but the users that found it easier. There’s a reason why everyone’s on Facebook, instagram, and gmail instead of running their own hosts—because it’s vastly easier for the majority of people to do so, and because everyone else is there.

We have not solved decentralisation in an accessible and useful way yet, and the incentives won’t change until we do. If ever.

Forgeties79|4 months ago

I’ve always had this like 70% formed idea about Plex and how it’s indicative of how people want to self host more than we realize, but I’ve never quite been able to articulate what I’m thinking here and what the larger implications are.

Plex is obviously not true self hosting, but it’s a lot closer to it than a Netflix subscription, and the number of people who I do not consider very tech savvy who have not only been joining other people’s servers but trying to set up their own is staggering lately. And they’re not simply doing it because they want free movies or something. A lot of them have done it for the same reason I initially started: their kids.

I am concerned about the media that is put in front of my kids. I care about what shows they are watching. Kids are going to get their hands on screens there almost is no getting around it, so I would rather not trust YouTube et al with deciding what my kids do and don’t see. I can’t realistically be there to catch literally everything they watch, but if they’re using my server I know they only have access to a certain Library at all times so I can rest a lot easier. In a lot of ways I imagine this is how our parents felt when we were kids. On cable television growing up there were only so many “weird” or troubling things that could pop up, definitely nothing as extreme as we see today, and you could be reasonably aware of what most of those things were and know what channels to forbid/what times your kids should not have free access to the TV.

I found a lot of other parents feel the same way here. They’re just tired of feeling like the Internet is such an incredibly hostile place and want to find ways to take a little power back into their own hands.

I don’t know, hopefully something useful popped up in that rant above. I have a lot of disjointed thoughts about this that I really haven’t been able to bring together.

crazygringo|4 months ago

> Rather than being in countless separate places on the internet in the hands of whomever it had been resold to, your data is in one place, controlled by you.

I don't see how this follows. The moment you create/share data with a site, what's to prevent them from reselling it?

The only thing this seems to attempt to solve is portability/interop (and moving control of and responsibility for blocking/moderation/spam to users rather than sites).

I don't see how it helps at all with privacy or you "controlling" who gets your data. If you give it to site A but not data collector B, what's preventing A from selling it to B? As far as I can tell, the situation will remain identical to how it is today.

Your data will never be in one place unless you never share it. The moment you use it with other sites or services, it is stored there too, out of your control.

majkinetor|4 months ago

Nothing is preventing it, but the 3rd party operates on a copy. You are still the owner of the data and it is in one place, which makes it easier for you to access it, share it, back it up, and analyze it. So this doesn't prevent reselling in general, but it prevents data lock-in. From there, I guess it's not that hard to demonstrate which 3rd party sold your data and sue them. It also mandates nonproprietary data formats.

All that is much, much better than what we have now.

theshrike79|4 months ago

"Control" here means that there are people who write blogpost worthy comments or messages on Facebook or other Meta properties. That's the only place where they exist.

When Meta (or any other company) decides to destroy them, they go away forever. You have no "control" over it.

https://indieweb.org/POSSE is the way to go.

You want to write a long post on a 3rd party platform? Write it on your own device, that you control. Then you save it, copy the content and post wherever you like.

If your 3rd party blogging or social media platform goes tits up and everything disappears, you still have your own copy you can just Ctrl-C Ctrl-V anywhere.

You can go as fancy with this as you like, depending on your nerd-level. You can have a self-hosted N8N system that automatically reposts everything to new sites you add to the flow. Or you can just have your stuff in a directory in Obsidian.

erlend_sh|4 months ago

> The moment you create/share data with a site, what's to prevent them from reselling it?

If I can clearly assert origin and personal ownership of my data, I can forbid further reselling of it.

EU legislation shows that we can actually have the right to demand that a company forgets about us. Asserting such rights becomes easier the more accurately we define what data is ours.

anonbuddy|4 months ago

current data points are much more valuable than historical data points, so storing old data doesn't have much incentives

also by having ability to enable/disable access to your data, you have the power of who gets what and for which purpose

also reselling of your data should become illegal to start with, would you be OKAY if your lawyer sells your data? or your colorectal surgeon? of course not, we have laws in place for that, and same laws should be applied to whoever handles your personal data

dd_xplore|4 months ago

When I was a kid, a 4GB pendrive was a huge thing for me. I used to think my 40GB HDD would never fill up, but then the Internet started to grow. Today it doesn’t even matter how much storage you have, it’ll always fill up.

I have started to self host quite a lot of stuff, but even then every storage solution has a life of 5-6 years in which at least one of the components will fail. We click enormous amounts of photos but they do not have any impact like printed photo albums. With ever growing storage costs (both cloud based and self hosted) I’m thinking of going back to keeping only the important stuff, and that too in print format.

AdrianB1|4 months ago

I have run a NAS, in various forms, for almost 20 years. The lifetime is quite a bit longer: I still have ~10-year-old drives in the backup NAS built on a Ryzen 1600 (8 years old), and the average power supply lasts me 10-12 years. The primary NAS is still on hardware that is more than 5 years old, except the drives, which I just replaced with higher-capacity ones.

As I find the size of current drives bigger than my yearly additions (personal pictures and movies), I am quite happy with a 10 year lifetime at low usage. I would love some reliable and affordable long term offline storage, but backup tapes and a reader are not affordable and not in common use for end users. Otherwise I would build a tiered storage system with more reliability and even performance (nvme hot tier? maybe).

Hendrikto|4 months ago

> ever growing storage costs (both cloud based and self hosted)

That’s not my experience at all.

Jaxan|4 months ago

We still print photo albums. I can strongly recommend this!

ivanjermakov|4 months ago

In the age of abundance, smart prioritization is needed.

herf|4 months ago

Vertically integrated apps are much cheaper to run - Instagram stores only a small fraction of your photos and makes a lot of money from them. It is somewhat harder to explain why we pay for things like iCloud, which mostly has no web API, only APIs for Apple devices. (Plenty of value there because it keeps you from having to buy a bigger iPhone.) But there are lots of these "almost general purpose" solutions, paying to upload files and store them, but where you cannot use them as you like.

Why not dozens of apps running over the "web filesystem" like happens on the desktop? Two reasons: 1. Amazon pricing for transit/bandwidth is way higher than storage, and so it makes accessing your own data quite expensive if it is not in the same datacenter. 2. And there is a huge security and usability gap between "pick one photo" vs "give me [scoped] access to your Dropbox". Often the general-purpose mode does not work that well, is quite slow, or just costs a lot in bandwidth, a thing nobody wants to pay extra for when they're already paying for storage.

akoboldfrying|4 months ago

Who has an incentive to provide a Solid server? Not big social media companies, who want the personal information that Solid attempts to withhold. I don't think anyone is prepared to offer a convenient, high quality Solid-based social media experience to everyone for free, because that costs a lot of money. And if you know anything about human nature, it will have to be convenient and completely free in order to have a chance of capturing any mindshare outside of weird tech nerd circles.

> the platforms should be asking us what kinds of data they may copy from our servers, and only with strictly temporary allowances.

Until practical homomorphic encryption arrives, I don't see how this temporariness can be enforced. If we rely on promises or regulation instead of the technical ability to enforce this, how is that any better than today's social media companies promising not to do anything bad with the data they have on us?

anonbuddy|4 months ago

'that costs a lot of money'

price of intelligence is dropping day by day like it or not, sooner or later price incentives for someone to host such social media experience could become financially viable

seu|4 months ago

The fact that the AT Protocol relies on everyone having a domain name, which is a centralized system over which few people have control, and about whose workings most people have no clue, is problematic. Also impractical, once we consider that - as far as I can understand - 8 billion people should have their own domain name.

switknee|4 months ago

What's impractical about everyone having a domain name? It surely isn't due to lack of domain names, because foo.bar.baz.bim.bim.bap.com is a valid domain name.

It is true that full data sovereignty isn't something most people are interested in, but this is more about a cooperative model for data ownership and access. Having your data identifier be JackDaniels@yahoo.com isn't particularly different from it being jackdaniels.is.technically.bourbon.com. In both cases another organization owns some of the path to your identifier and could potentially lock you out of it. In both cases, Verisign is near the top of that list (.com).

As far as the domain name system being centralized, I'm not sure I agree. DNS is like a feudal system with hundreds of kings (top level domains) who all work together with one pope (ICANN), and various lords and ladies occupying positions under those kings. If ICANN goes completely bonkers the kings can get a new pope, some of them are literally sovereign because they are nation states. Just for fun, some of those states are ruled by literal kings, too. There are experiments to run a TLD by Decentralized Autonomous Organization (DAO), but I think for the most part nobody really cares because the current system happens to work pretty OK. If you have an idea for a more decentralized way to organize a namespace that doesn't involve your grandmother typing in a massive UUID or onion address, and doesn't result in someone being able to domain squat literally everything; I would love to hear about it.

diggan|4 months ago

> The fact that the AT Protocol relies on everyone having a domain name

Well, either that or someone else hosting their identity (see did:plc), which seems to be the part you say should exist?

Probably DNS is the most decentralized centralized system we have available today that most people can actually use, unless I'm missing some obviously better way of doing the same thing?

erlend_sh|4 months ago

It doesn’t really rely absolutely on domain names; at the very root there’s just a DID. DNS happens to be the best we’ve got right now as a human-readable username and address in one go.

We can work to make DNS/ICANN et al. more democratically operated and people-owned while at the same time devising wholly alternate paradigms like Handshake and similar: https://blog.webb.page/2025-08-21-dap-the-handshake-successo...
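
For readers wondering how a domain-name handle relates to the underlying DID: as far as I understand the AT Protocol handle spec, one resolution route is a DNS TXT record at `_atproto.<handle>` whose value looks like `did=did:plc:...`. The Python sketch below only parses such values offline. The helper name `did_from_txt` and the example DID are made up, and treating ambiguous results as a failure is a conservative choice in this sketch.

```python
def did_from_txt(txt_values):
    """Pick the DID out of the TXT values found at _atproto.<handle>.

    Returns None when there is no usable record or when several
    different DIDs are listed.
    """
    prefix = "did="
    dids = {v[len(prefix):] for v in txt_values if v.startswith(prefix)}
    if len(dids) != 1:
        return None
    did = next(iter(dids))
    # Assumption: only the did:plc and did:web methods are accepted.
    if did.startswith(("did:plc:", "did:web:")):
        return did
    return None


# Made-up record values, as a DNS lookup might return them.
print(did_from_txt(["v=spf1 -all", "did=did:plc:abc123example"]))
```

The handle is just a pointer: if the owner later moves to a different domain, the DID (and the data tied to it) stays the same and only this record changes.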

btbuildem|4 months ago

> 8 billion people should have their own domain name

That is something that could be feathered in gradually -- your country, region, city, neighbourhood, etc could have their own domains, and you could be anon237@milan.italy or whatever, until you find it necessary or inspiring to obtain your own domain.

layer8|4 months ago

There are around 10^99 different possible domain name labels (the part between the dots), so I don’t quite see the impracticality. Even going the route of Reddit’s autogenerated usernames like Eloquent-Salad9443.net would be viable.
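
A quick Python check of that order of magnitude, under simplified rules that are my assumption rather than anything stated in the thread: a label is 1 to 63 characters drawn from a-z, 0-9 and the hyphen, and may not start or end with a hyphen (other reserved patterns are ignored).

```python
import math


def count_labels(max_len: int = 63) -> int:
    """Count possible labels under the simplified rules described above."""
    total = 0
    for n in range(1, max_len + 1):
        if n == 1:
            total += 36  # a single letter or digit
        else:
            # 36 choices for each end, 37 for every interior character
            total += 36 * 37 ** (n - 2) * 36
    return total


total = count_labels()
print(f"roughly 10^{math.log10(total):.1f} possible labels")
```

Under these assumptions the count lands a little below 10^99, so the figure quoted above is the right order of magnitude.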

weinzierl|4 months ago

But what is the alternative? Systems that bind identity to the phone number give even less control. Systems that use a self generated cryptographic key (like Scuttlebutt) are even less practical.

DNS is not perfect but I think it's the best we have for now.

Hendrikto|4 months ago

With did:plc, you don’t have to have your own domain, if you are willing to delegate some responsibility.

est|4 months ago

> everyone having a domain name

This idea is an incremental improvement over "everyone is posting x.com"

pydry|4 months ago

The problem isn't technical feasibility it is market incentives.

Most companies have no incentive to let you hold your data when they can just hold it for you.

If they do this they can mine it for data to improve their product as well as sell or otherwise indirectly profit from it. And, it's easier.

Also, while the market for privacy focused products isn't nothing, the number of people willing to pay a lot extra to compensate for the missed opportunities companies get by collecting your data is, I think, smaller than many people imagine. Which is sad.

I think the only way it will grow to an appreciable size is by seeing up close and personal what a really vicious stasi-like secret police does with dragnet surveillance and come out the other side, with scars. I believe we've only seen a small taste of this.

fidotron|4 months ago

> The problem isn't technical feasibility it is market incentives.

This is understating it honestly.

The software industry has become completely reliant on renting data access back to users to maintain subscription revenue. One effect of this is it has devalued the actual software in the eyes of users to such a degree that virtually no one will pay for alternatives, certainly not enough to compensate the development cost.

dist-epoch|4 months ago

You got the market incentives wrong.

Most people have no incentive to own their data. Otherwise the companies which don't give you that would die out because people wouldn't use them if they cared.

Same fallacy as believing smartphones are giant and with non-user swappable batteries because somehow smartphone making companies are forcing this on the market, instead of the real reason which is that it's what consumers want.

theshrike79|4 months ago

Of all the big name corporations Apple is the only one I can see doing this.

I'm still hoping they release an Apple TV Pro with fully local LLM capability that's shared with everyone in the family - adding a few TB of disk space to it for local data storage and backups wouldn't be a massive thing.

Lumoscore|4 months ago

It’s completely true that the system we use today—where a few big companies hold all of our private information in one place—is a bad model. It’s risky for security, and it means you have no real power or ownership over your own data.

The good news is that we don’t have to wonder if a better way is possible. The technology is already here! Projects like Solid (Pods) and AT Protocol (PDS) have proven we can separate your information from the applications you use. You can put your data into your own secure digital "locker" or vault.

The difficulty now is not the technology, but getting people to actually use it:

1- It’s Too Hard to Use: Setting up and managing your personal data locker is currently as complicated as managing a super-secret password for a crypto account. For everyone to adopt it, it needs to be way simpler than just clicking "Log in with Google." If it’s too much work for regular people, it will fail.

2- Big Companies Don't Want to Change (The Incentive Problem): The biggest tech companies make billions by collecting and using your data. They have no reason to switch to a system where they have to ask permission to use data they don't own, unless a major law forces them to, or a new competitor steals their users.

3- Privacy Isn't Enough (The Benefit Problem): Most people won't switch just for "privacy." The new system must offer clear, positive benefits, like letting you move all your friends to a new social app instantly, or securely filling out long forms with a single click from your data locker.

The key to success is building user-friendly tools that hide all the complexity and make this new, secure way of managing data simple for everyone.

zeroCalories|4 months ago

I find the idea of data coops to be very appealing. I don't want to depend on faceless mega-corps like Google to host stuff like my email, but I also don't find the idea of self-hosting to be realistic. I wouldn't mind paying for the security since losing access to certain accounts would be a disaster, but I'm already locked in, and the benefits of existing services would be marginal compared to the cost of moving.

anonbuddy|4 months ago

Ideally you should be able to host your stuff in a simple way, in this case in a Pod. That service should be provided by a utility company, the same way we have internet providers now. They will be well regulated and it would be in their interest to safely hold your data because if not, they would face legal and financial consequences.

All other services would read/write from your Pod.

ksec|4 months ago

In terms of NAS, I have long wondered if there is a market for a combination of both online and offline. We will need at least 2 HDD for redundancy and to prevent bit riot. And the NAS will be sold as a whole package and subscription, with an encrypted backup services included for first 2 years and requires the backup subscription to work there after. The profit margin is first on the hardware and then on long tail backup, which is charged per tier like iCloud and Google storage, so that your 1.5TB of storage will be charged at the 2TB tier.
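To make the per-tier billing concrete, here is a small sketch of rounding usage up to the next plan size. The tier sizes are hypothetical and not taken from any real provider's price list.

```python
from bisect import bisect_left

# Hypothetical tier sizes in TB; real plans differ by provider.
TIERS_TB = [0.2, 2.0, 6.0, 12.0]


def billed_tier(used_tb: float) -> float:
    """Return the smallest tier (in TB) that can hold used_tb."""
    if used_tb < 0:
        raise ValueError("usage cannot be negative")
    # bisect_left finds the first tier that is >= used_tb.
    i = bisect_left(TIERS_TB, used_tb)
    if i == len(TIERS_TB):
        raise ValueError("usage exceeds the largest tier")
    return TIERS_TB[i]


print(billed_tier(1.5))  # billed at the 2.0 TB tier
```

The gap between what is used and what is billed is where the long-tail margin would come from in this model.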

Before 2014 I would have thought Apple would potentially take this route for Time Capsule. Instead they doubled down on iCloud. Google will never take this route. Microsoft is not interested. Amazon should have done this and bundled it with cold storage backup, but their track record is not good enough. I doubt people trust Meta enough even if the solution was perfect.

In pre 2012 you could at least bet on Apple to be somewhat customer centric.

Maybe UniFi will do it. They just announced their 2 Bay UNAS and I only just discovered they are a 40B market cap company. (I thought they were much smaller.)

Larrikin|4 months ago

>with an encrypted backup services included for first 2 years and requires the backup subscription to work there after.

It's unclear whether you mean the NAS will stop working if you stop paying for the subscription. If you can no longer access your data on the NAS without a subscription, then the NAS just becomes the cloud with an extra up front cost plus the cost of your own electricity.

Personally I have started moving as much of my data out of the cloud as possible. I've got a Synology and a few single board computers running various services, with a Synology in my parents' home for their photos. Their photos back up to my NAS and my data to their Synology.

It's a shame Synology decided to enshittify this year for all products going forward, but UGreen looks like a suitable replacement when I outgrow my current NAS.

detaro|4 months ago

Synology sells cloud backup services for their NASes. And a bunch of other brands at least can easily connect to other services.

anticorporate|4 months ago

> for redundancy and to prevent bit riot

What are you doing to your hard drives that the bits are rioting?

phkahler|4 months ago

>> And the NAS will be sold as a whole package and subscription...

Misses the point entirely.

nayuki|4 months ago

> Data Ownership as a conversation changes when data resides primarily with people-governed institutions rather than corporations.

This is a false contrast. Corporations are institutions governed by people - specifically a board of directors, elected by shareholders. They aren't governed by aliens nor are they self-sentient. https://en.wikipedia.org/wiki/Institution#Examples

Perhaps you meant that you are against for-profit corporations where the customer (who stores data) has no vote in the operation of the corporation? If so, then say that and don't imply it.

People often use "corporation" as a pejorative, often in contrast to individual people. But they forget that a corporation is composed of people and ultimately owned by (some) people - but the kind of people that the writer does not like (shareholders, profit-makers, etc.).

> Notice that Alice’s handle is now @alice.com.

It's funny you're using .com as the example, because:

> The domain com is a top-level domain (TLD) in the Domain Name System (DNS) of the Internet. Created in the first group of Internet domains in March of 1985, its name is derived from the word commercial, indicating its original intended purpose for subdomains registered by commercial organizations. Later, the domain opened for general purposes. -- https://en.wikipedia.org/wiki/.com

Even while you're arguing against commercial organizations storing personal data, you're now naming individual people as if they were companies.

HenriTEL|4 months ago

To be fair, nowadays .com refers much more to the default, main or official domain of an entity. Say you know the name of a non-corporate website: are you going to try .com first or something else?

righthand|4 months ago

> Whether these providers are strictly cooperatives in the formal sense isn't what's most important here though;

I think in the context of “encouraging people to switch” to a PDS/Solid/data coop, how they operate IS important. For two reasons:

- a data coop and controlling data opens the door to a new market. If we’re going to join data coops, then we may as well try to share the profits from said coop fairly. Otherwise Facebook can step in as a “data-coop” and keep-on-keeping-on

- a secondary effect is that now there is an incentive to move off Facebook. If I can join my local Nowheresville.USA.town data coop and directly benefit my community by storing data together, then I am encouraged to switch to this new paradigm

That is the major undiscussed shift to me. I believe the only way out of the Big Tech dystopia is to incentivize the switch. Even if the reward is pennies. Invest in the community oil well.

tjpnz|4 months ago

If this takes off I fear big tech very quickly finding friends among those pushing for things like chat control, while potentially reevaluating some of its more consumer friendly "views" towards privacy. Very easy to undermine something when you start speaking of its potential to facilitate CSAM.

anonbuddy|4 months ago

that is exactly what is going to happen, as more people become aware.

that's why we all need to exercise our rights and freedoms. I'm scared that if we fail to do this in the next few years, and let AI be used in ways similar to how it has been used to create social media algorithms, then we are all fucked!

Whoever owns your AI owns you, so it better be you who owns it!

outime|4 months ago

This guy has eyes and eyes can be used to visualize CSAM! What if...

mactavish88|4 months ago

For those of us who've been around for some time and still value privacy, this sort of paradigm is obvious.

The trouble isn't a lack of the right technologies - I'd argue it's a problem in the go-to-market strategy of those building these products/technologies.

Ideas flow along lines carved out by power/influence. Facebook's early strategy was to start with restricting its usage to people at Harvard University - arguably a highly influential institution - and then expand outwards to other highly influential institutions. Only once the "who's who" from those institutions were already onboard did they let down the walls to allow us plebs in, and we all rushed in head-first.

X's current strategy leverages Musk's visibility and influence (for better or worse).

Get the most prominent influencers onboard with your decentralized social network, and others will follow (dramatically easier said than done, of course). But without a significant contingent of influencers/powerful people, your network's DoA.

btbuildem|4 months ago

> prominent influencers onboard with your decentralized social network

That's sort of a contradiction, no? Or at least it assumes transplanting the same mechanisms into a new milieu -- which I argue is something to leave behind, because it's those very mechanisms that have ruined the current internet.

I think instead of tapping into the same addictive attention economy schemes, the distributed / decentralized socials could onboard people en masse by providing what's missing there, and filling a real need.

dangus|4 months ago

This article seems pretty far detached from the problems that people experience using technology. It’s the kind of thing that only deeply technical people consider.

When someone uses a service like Dropbox or iCloud Drive or Google Drive, they really aren’t experiencing any kind of problem where their data “isn’t theirs” or is “trapped.” It’s not that hard to migrate to something else and the services themselves are reasonably low-friction.

In terms of social data, users don’t really have a major issue with the status quo, and those who do have already developed relatively popular solutions like Mastodon and BlueSky.

Even “proprietary” photos applications like Apple Photos and Google Photos have very easy migration paths to other services.

So what exactly is the problem we’re trying to solve here? Giving me an @Bob handle? Did I want that or need that?

crazygringo|4 months ago

> In terms of social data, users don’t really have a major issue with the status quo

That's exactly it. And with social media (unlike files and photo storage) migration isn't really something people care about, because it's about the present not the past.

If you move from Twitter to Bluesky, does anyone care about moving their tweet history? They just want their list of followers to migrate over as much as possible, which happens relatively organically anyways.

dist-epoch|4 months ago

How do I post a message on Discord/Twitter/Instagram from my personal data storage? If this is not supported, this idea is born-dead. Very few will use it; for the regular person the conversation goes like this:

- Who can see my personal data storage posts? Can someone with Twitter see them?

- No, but you'll own your data

- Bye

So maybe start with something which backs-up what you post on Twitter/Instagram/Discord to your personal data storage through APIs/data export.... This has no downside if it's easy to "activate"

CuriouslyC|4 months ago

At this point distributed protocols are getting good enough that for a large class of social applications, network effects are the only thing keeping the incumbents in place.

The irony of ad-supported free services is that if you just let the advertisers pay you directly for eyeball time and then paid for your services, it'd be better for you financially while keeping the web pure outside of the "paid to consume ads" app.

theshrike79|4 months ago

The push model is easier; all three of the above protect against automated data exfiltration pretty severely.

There are SO MANY bots on both Twitter and Instagram that a legit developer shouldn't have any issues automating posts.

Discord is a bit harder: you can post as a "bot" easily, but if you want the posts coming from your actual user, you need to poke the actual client.

viraptor|4 months ago

You just wait. The closed services will close down or become hostile enough that people will migrate. Not everyone will, but over a longer period - enough.

People getting into Solid and ATproto today are like people using their own XMPP servers decades ago, or Mastodon years ago, or Matrix. Some projects like that will succeed, others will fade. But one day, you won't be able to post to Discord due to some policy changes and you'll have to reevaluate options.

Also, you can't back up from Twitter anymore. Or Discord. Or Google Photos. Or many others - they cut off that option once they're big enough.

BoredPositron|4 months ago

The creator/consumer divide is still 90/10. Your example just doesn't matter.

dzonga|4 months ago

I like the convenience of the cloud, but I don't know whether it's due to declining literacy rates, awareness, etc. The cloud is nice, e.g. Google storage and iCloud, but now with fast microSD cards you can buy 1TB for $100. Have a few copies, then boom, you own your own data. But now phones don't allow you to have microSD cards, so here we are.

Likewise with things like email: instead of all of us being on Gmail, we could have community email servers, etc.

Larrikin|4 months ago

Sony phones continue to have MicroSD slots, headphone jacks, AND remain water resistant. They have been that way for at least a decade.

layer8|4 months ago

I use Dropbox, but with an encryption overlay that also integrates into the iOS Files app for ease of use on mobile. So it’s possible to use cloud storage and still keep your data private.

AlienRobot|4 months ago

When I read the title I couldn't help but think "did everyone forget about hard disks?"

I'm sure Tim Berners-Lee is much smarter than me, but I kind of feel there are some parallels between the idea of "owning" posts you made in a platform and the ludicrous idea of "owning" game items as NFTs in a blockchain. The latter promises interoperability that games would never deliver. I wonder about the former.

At least I feel the major dealbreaker with this technology is just that it's not worth it for both parties involved.

Right now, Facebook hosts all the posts and monetizes them with ads. So long as they are making money with ads, they have no reason to delete the posts they're hosting, as the posts are their money maker.

But what happens if Facebook no longer "owns" the posts?

So now your posts are in your "personal cloud", which means that unless they are encrypted any website or local app can display them, even without any ads. This means Facebook is no longer making money off the posts. Why would they accept this?

On the flip side, who is paying for the hosting? Facebook? It's no longer their servers hosting the content, so I don't think so? Is Facebook supposed to pay the cloud service for metered API access? Can a cloud service offer different rates to different companies? Is the user supposed to pay for their cloud storage? So you're going to make users pay money to use facebook?

What happens if a post violates the ToS? Can facebook delete my post in my cloud storage against my will? What happens if content that is legal where facebook operates is illegal where the cloud servers operate?

Can I manually edit the data in my cloud storage like I'd be able to with a file, so that Facebook then has to treat every post as if it were untrusted input?

What happens if my cloud storage closes my account? I just lose everything? Will I be able to back up my cloud to my hard disk and reupload it to another cloud so facebook can access it? How is facebook going to handle a single user with 2 clouds that have different content?

I feel like this is a very complex thing and there are infinite questions that we can have about how this would be implemented in practice, while it's presented as simply "you own your data."

selinkocalar|4 months ago

The concept makes sense but the execution is always where these things fall apart. Most people don't want to manage their own data infrastructure.

The bigger issue is interoperability. Your personal data store is only useful if apps actually integrate with it, and getting developers to adopt new standards is tough.

gatestone|4 months ago

No one mentioned Upspin? A global file namespace (URL, but better...) and protocol to isolate public data users from private governance and storage, by gurus like Rob Pike. https://github.com/upspin/upspin

system7rocks|4 months ago

I love this idea, and I imagine with years of successful lobbying efforts we could potentially get some laws passed to provide rights and clarity around our own data that could move us into this direction. But until then, while BlueSky is solid, I'll wait and see.

didip|4 months ago

As in self hosting? I love the idea of self hosting for myself, out of principle.

But unfortunately it will never take off in a huge way because convenience is king. Average Joe and Jane want to install things with as little effort as possible.

AdrianB1|4 months ago

You can self host, but in order to be reachable you need to be discoverable. If the discovery is based on a mechanism that is controlled by someone else that can become an evil party, self-hosting in isolation is not too useful.

est|4 months ago

PDS is a cool idea. I hope the community addresses problems like content farms, spam and original attribution as a higher priority.

Otherwise I can see malicious actors wrecking the federation mechanism.

This is already the case with email (SMTP).

lerp-io|4 months ago

you store your photos on fb the same way you store your money at the bank and your code on github. it's delegation of concerns, and you can make the same argument for literally anything: not using your own silicon, growing your own food, financing your own venture, owning your own land, etc. maybe it's more "secure" vs "less efficient" or some other tradeoff, and you have to get the right balance or take risks for optimal efficiency / profit / whatever your values are

esafak|4 months ago

Isn't this what web3 was about? Was it the wrong approach?

purpleKiwi|4 months ago

How do I, as a complete noob, use the powers of atproto and the fact I own a domain?

bawolff|4 months ago

This is never going to happen.

The incentives do not make sense.

Any utopian future that requires a party to put in a lot of effort to change something in a way that would be a net negative for them, is just not going to happen.

People do not spend money to change the world in a way that would be worse for them but better for other people.

JumpCrisscross|4 months ago

> The incentives do not make sense

Commercial incentives, no. If this preference exists, it would need to be pursued civically.

jauntywundrkind|4 months ago

> Another spiritually similar idea being championed at the time came from the Opera browser folks who wanted to put "a web server in your browser".

Opera Unite was such an awesome idea. https://arstechnica.com/information-technology/2009/06/opera...

There was a neat idea a bit back to allow Service Workers to work across origin: foreign fetch. It wasn't on the internet, was only in the scope of your browser, but I thought it was such a neat advancement. Would have done so much to allow the offline web to weave itself. Alas, deprecated. https://developer.chrome.com/blog/foreign-fetch

BrenBarn|4 months ago

There are good ideas here. They won't come to fruition without some form of force. I'm not sure if TBL doesn't realize, is unwilling to accept, or just wants to avoid saying out loud that the only reason it worked for him to create the web as an open protocol is that no one was prepared for it so no one was in a position to co-opt it, commercialize it, and enshittify it. Now corporations are prepared. They will co-opt, commercialize, and enshittify whatever system you come up with unless it is accompanied by a giant hammer that will brutally destroy them if they don't change their wicked ways.

xenodium|4 months ago

> Meanwhile - Nothing changes, everything generally gets worse

https://LMNO.lol is my grain of sand.

I wasn't happy with the state of blogging (tracking, bloat, ads, paywalls...), so I built https://LMNO.lol. It's offline first and you can browse blogs from anywhere (even a terminal). Your blog is a single Markdown file. Drag and drop it to the browser and your entire blog is generated.

Custom domains are welcome. My blog is running off LMNO.lol at https://xenodium.com

browningstreet|4 months ago

Ideas like the Solid protocol have a limited timeframe to make it or go away. Not sure why anyone is still talking about it. TBL is rightfully a legend but this is now just a windmill.

Next, please.

righthand|4 months ago

This comment has inspired me to target SOLID and “things I can do to help” on my Sunday afternoon research block. This type of commentary is rife in this article thread and is now just a windmill.

Next, please.

impure-aqua|4 months ago

I don't see what advantage any company gets from choosing to build products that enable personal data ownership. I say this as someone working on a venture with these sorts of design aims, it feels like pushing a boulder uphill often.

The business model of cloud service providers makes a lot of sense- we have a system which stores and operates on your data, you pay some rental fee for us to store it and operate on it, easy peasy. The cost is related to both the utility of the operations the operator performs (to both the operator and the user) and the amount of data the user stores.

Fundamentally this is how everything from Dropbox to Facebook is governed: Dropbox does not derive much utility per GB and users store a lot, so you rent per GB. At Facebook, they don't store lots of your stuff, and on the data side maybe you don't get much value from it as it's a cesspit, but the data is valuable to Facebook to sell ads, etc., so they can provide the service for free.

Importantly, you don't need to improve the product to continue extracting this rent, because the product you are selling is not Dropbox v4, Facebook v2.3, rather you are selling ongoing access to the rental.

As soon as you introduce even simply a federated system where a few corporate operators are involved, it becomes very hard to justify extracting rent there as the network designer, as the operators are taking on the cost of actually storing the data. You have to really be iterating on the core product to use a SaaS business model here. Some things simply don't need a v4, does Dropbox really need that much iteration?

Meanwhile as the system designer, life has become a lot more complex for you. Suddenly you cannot push unilateral sweeping changes to APIs, you need to version things in a way that is compatible between, say, one university updating their system but not the other. Since your users are a few large operators rather than millions of individuals, you lose the network effect advantage of being able to screw over a few users for the "greater good", since if you irritate one corporate client, you lose a lot of your install base. Why would you voluntarily choose this harder path as a company?

Things get even worse as you increase the level of decentralization. The reality is users expect the polished experience that the rental companies can give you; they want their data always accessible so that their friend can see the pic they shared without needing to keep their own computers running, they want the "like counter" to go up without their personal node subscribing to messages from other nodes, etc. The only users that will accept a worse experience are people who are motivated by their philosophy re: personal data ownership, and this crowd will want a FOSS solution, so you can say goodbye to charging them for Dropbox v4; they are simply not interested if you're not giving them the source code for free. (I suspect this is where the author sits, but fundamentally I don't think it will get mass appeal; most people simply do not care about data ownership above something that "just works".)

So now you are dealing with problems like dynamic generation of redundant data and fault- and Byzantine-tolerant consensus algorithms so that your system can maintain function even when the user turns their computer off, and you have to deal with wrapped-key cryptography so that the redundant data can be split across all these user nodes without you worrying that an unauthorized user can read it, and then you have issues like how do you deal with nodes that are too slow to process updates (perhaps some user data needs to be stored in this conflict-free replicated datatype you devise), and eventually you go through all of this to... create a system that is less monetizable than the rental model, because you can't extract that rent for ongoing data storage, and we know users are not interested in actually paying for software.
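For readers who haven't met a conflict-free replicated datatype: the simplest one is a grow-only counter (G-Counter), which happens to fit the "like counter" case. What follows is a minimal textbook sketch, not the design of any particular product; the class and method names are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: each node only ever bumps its own slot."""
    node_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, n: int = 1) -> None:
        if n < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative and idempotent,
        # so replicas converge regardless of message order or repeats.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


# Two replicas count likes independently, then sync in both directions.
laptop, phone = GCounter("laptop"), GCounter("phone")
laptop.increment(2)
phone.increment()
laptop.merge(phone)
phone.merge(laptop)
print(laptop.value(), phone.value())
```

A slow or offline node just merges whatever state it eventually receives, which is why this family of types is attractive when nodes come and go. It does nothing for the monetization problem, of course.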

keepamovin|4 months ago

I’m continuing to explore ideas like this in my DN project (short for DownloadNet or Discernet). The core concept: a browser controller / instrumentation harness that, by default, saves everything you browse to disk, and makes it available via full-text search or a browsable alphabetical index.

The browser controller actually runs its own local server that handles indexing and archiving on your disk, while the front end lives inside your browser as a dashboard or control pane. So it’s both a locally hosted app and a browser extension of sorts.
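To illustrate the two access paths mentioned above (full-text search and an alphabetical index), here is a toy in-memory sketch. It is not DownloadNet's actual code; the `PageIndex` class, its methods, and the example URLs are all invented for illustration, and a real archive would persist to disk and rank results.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set, Tuple

TOKEN = re.compile(r"[a-z0-9]+")


class PageIndex:
    """Toy inverted index over saved pages (illustrative only)."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)
        self._titles: Dict[str, str] = {}

    def add(self, url: str, title: str, text: str) -> None:
        """Record a page and index every word in its title and body."""
        self._titles[url] = title
        for token in set(TOKEN.findall(f"{title} {text}".lower())):
            self._postings[token].add(url)

    def search(self, query: str) -> List[str]:
        """Return URLs containing every word of the query."""
        tokens = TOKEN.findall(query.lower())
        if not tokens:
            return []
        hits = set.intersection(
            *(self._postings.get(t, set()) for t in tokens)
        )
        return sorted(hits)

    def alphabetical(self) -> List[Tuple[str, str]]:
        """Return (url, title) pairs sorted by title."""
        return sorted(self._titles.items(), key=lambda kv: kv[1].lower())


index = PageIndex()
index.add("https://a.example", "Solid pods", "personal data storage")
index.add("https://b.example", "AT Protocol", "personal data server")
print(index.search("personal data"))
print(index.alphabetical())
```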

This is still a work in progress, but one direction I want to push further is allowing users to publish curated collections or search indexes of their browsing history.

More likely, though, you’d create a separate archive centered on a topic you care about, and as you browse you selectively add pages to that topic. Over time, you end up with a niche search engine tied to your expertise.

If that archive is good, others might find it valuable—and you might choose to publish it from your own machine. With tunneling tech (Cloudflare, Tor, etc.), you can expose your local box to the public internet. The vision is: user-sovereign data, but still shareable.

You could even federate groups of topic-based archives into a shared search ecosystem, useful for domains like biotech or other specialized fields.

Another crucial point: DownloadNet archives your browsing in real time. It doesn’t crawl externally; it captures exactly what you see, including sites you access via institutional credentials (e.g. research journals behind paywalls). Then you can optionally share those archives with a trusted group.

I’m also exploring a web-document bundle format: package an interactive set of web pages (not just one) into a self-contained snapshot you can send (e.g. via email). The recipient can browse that snapshot locally, with all internal links intact, as of a particular moment in time. It’s a simple but powerful idea, and I think it has real growth potential in the data-sovereignty space. I started this as a passion project, and I believe many others care deeply about these ideas too. If you’re interested or want to get involved, head to the repository.

One way my vision differs from something like Solid is the philosophy of adoption: rather than launching with a full-blown protocol, you start with a simple tool that users adopt, extend, and share. Over time, emergent use cases and community practices shape the system. It’s bottom-up rather than top-down.

I’m not dissing Solid — I understand its aims and don’t see this as strictly competitive or exclusive. But I feel the incremental, user-led route is likelier to produce something sustainable. You grow it in the wild, learn what users actually need, and adapt. Instead of trying to design for all cases in advance, you let real-world use teach you what matters.

Anyway, that’s the gist of my vision, and how it diverges from other approaches like the one in the article you referenced. While it may seem like a condemnation of other ideas, it's not. So please don't take it that way.

If this is something you could get into, I encourage you to come on over to the repo and share your contribution. I also riff more on Solid, this article and the approach of DN, if you're interested, here: https://github.com/DO-SAY-GO/dn/wiki/What-is-DiskerNet-and-h...

rob_c|4 months ago

Aka, more dunking on "the cloud". Now it's cool to be able to do so.

How about we go back 20yr and train a generation of unix sysadmins and self host at companies and at home.