top | item 43423070

harhargange | 11 months ago

Can IPFS or torrent and large local databases decentralised by people be a solution to this? I personally have the resources to share and host TBs of data but didn't find a good use to it.

finnthehuman | 11 months ago

For that to work, a website has to push a mirror into that alternate system, and the scraper has to know the associated mirror exists.

That's two big "ifs" for something I'm not aware of a standardized way of announcing. And the entire thing crumbles as soon as someone who wants every drop of data possible says "crawl their sites anyway to make sure they didn't forget to publish anything into the 2nd system."
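There is indeed no standard for this today. As a purely hypothetical sketch, a site could announce its mirror with a small JSON document at a well-known path, and a scraper could resolve it to an IPFS gateway URL. The path name, JSON fields, and CID below are all made up for illustration, not an actual spec:

```python
import json

# Hypothetical convention (an assumption, not a real standard):
# the site serves this document at https://example.com/.well-known/mirror.json
WELL_KNOWN_PATH = "/.well-known/mirror.json"

def parse_mirror_announcement(raw: str) -> dict:
    """Parse a hypothetical mirror announcement into gateway info."""
    data = json.loads(raw)
    cid = data["ipfs_cid"]  # content identifier of the site snapshot
    return {
        "cid": cid,
        "updated": data.get("updated"),
        # Any public IPFS gateway can serve content by CID via /ipfs/<CID>/.
        "gateway_url": f"https://ipfs.io/ipfs/{cid}/",
    }

# Example announcement body (CID is a placeholder, not a real hash):
example = '{"ipfs_cid": "bafybeigdyrzt5example", "updated": "2024-03-01T00:00:00Z"}'
mirror = parse_mirror_announcement(example)
print(mirror["gateway_url"])  # → https://ipfs.io/ipfs/bafybeigdyrzt5example/
```

Even with such a convention, the second problem stands: a maximalist scraper has no incentive to trust the mirror is complete, so it crawls the origin anyway.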

GTP | 11 months ago

I doubt it, as the article mentions scraping the same resource after just 6 hours. AI companies want to make sure they have fresh data, while it would be hard to keep such a database that current.

Self-Perfection|11 months ago

This thought came to me as well.

This way crawlers might contribute back by providing extra storage and bandwidth.

Though something like ZeroNet seems a better approach to allow dynamic content.

smashah | 11 months ago

The real solution to this is for cloud fees to fall by 80%.