ElevenLathe | 8 days ago

How much did it cost to produce all the data on the internet and every book ever published? Surely even the most conservative calculations put it at multiple years of planetary GDP. The same argument can be made that letting the big labs get away with pirating it will disincentivize people from publishing anything.

ceroxylon|8 days ago

I personally have stopped publishing publicly, since my research is still on the fuzzy boundary of AI's current knowledge, my website gets scraped daily, and I don't want to contribute to paid models for zero acknowledgement or compensation.

Imustaskforhelp|8 days ago

> I personally have stopped publishing publicly, since my research is still on the fuzzy boundary of AI's current knowledge, my website gets scraped daily, and I don't want to contribute to paid models for zero acknowledgement or compensation.

I don't know your work, so pardon me, but thinking about it, wouldn't gated communities, say on Matrix, XMPP, or IRC, be a better solution?

I suppose scraping bots for Matrix would be quite hard for AI companies to set up? Anyone interested in reading your content could still find it, and you'd get the additional benefit of a community of like-minded people.

piva00|8 days ago

It's not only publishing. It has already disincentivised a huge part of what made Web 2.0: public APIs for data access to platforms.

It was amazing to be able to create toy projects using data from big platforms. Now they're all afraid LLM trainers will scrape their content and use it to build a competitor, since the data is their moat.

It just sucks at many different levels.