opportune | 2 years ago
It's a lazy argument to imply this 1. invalidates the technological achievement or 2. prevents iterative improvement à la the singularity. For one, the Internet itself is not bullshit just because a lot of spammers/hustlers put bad content on it to try to make money. And secondly, you can curate datasets... nothing's stopping researchers from training LLMs on shitty SEO now, and if they wanted to they could curate datasets going forward to try to prevent LLM-spam from entering the training sets of future models.
And finally, people already use reputation/identity/branding, and proxies for them, as quality filters on the internet. For example, this is an unfamiliar blogger to me, so I entered with a skepticism I wouldn't have with people like Gwern or Lyn Alden. Good writing from people like that won't disappear just because LLM content exists on the internet - it just makes reputation and identity (e.g. being tied to a real human) more important.
horeszko | 2 years ago
It's the needle-in-a-haystack problem: AI is used to increase the size of the haystack, making the needle (i.e. quality content) harder to find.
opportune | 2 years ago
People choose to consume content through push models - Meta properties, TikTok, and aggregators like Reddit and HN - but nothing is forcing them to. If those platforms push enough bad content, people won't keep using them. That already happened to the predecessors of Facebook and Reddit, and it's probably happening to Reddit now.
It doesn't matter how big the haystack is when you have the ability to go directly to the needle.