WingNews

notahacker|2 years ago

> Honestly, given that ChatGPT produces higher quality output, the next generation of spam blogs will (sadly) probably be an improvement over the sea of crap we have now.

I think that's questionable. Gibberish is easily skipped over, and bargain basement ESL writing with no familiarity with the product sticks out a mile. But with GPT you get a kind of reverse "uncanny valley" effect: it's written in polished English and says just enough in the opening paragraphs that sounds informed to lead you to read half the piece before you realise it's machine generated bullshit. So you waste a lot more time on something that has no more insight (and cost even less to produce).

majormajor|2 years ago

The "demand" for bottom-of-the-barrel "10 things you didn't know about X" or "a lazy tutorial on how to patch a roof" spam is probably close to being met already.

So if demand stays the same, but the cost to produce it gets even lower, the ecosystem can support more players each pumping out their own version of all the spam sites. Stupid fake number example: Instead of spending $1000 a month to push out 100 articles, you spend $200 a month to push out 100 articles, and now only need to bring in $300 instead of $1100. So now there's another $800 worth of views that others could capture without putting you in the red, while letting them be in the green. So we may just end up with even more blogspam pushing the unique sources down the results list in Google. It's not likely to be meaningfully better based on my experience with "naive" prompting (the people creating these articles aren't going to be tuning their prompts to get good stuff per topic, they'll just get the quick and dirty stuff).
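To make the fake numbers above concrete, here is a rough sketch of the arithmetic. The $100 monthly margin is not stated outright; it is inferred from "$1100 vs $1000" and "$300 vs $200", and the function and variable names are purely illustrative.

```python
# Figures come from the made-up example above. TARGET_MARGIN is an assumption
# inferred from the gap between cost and required revenue in that example.
ARTICLES_PER_MONTH = 100
TARGET_MARGIN = 100  # assumed monthly profit, in dollars


def required_revenue(monthly_cost: int, margin: int = TARGET_MARGIN) -> int:
    """Revenue needed to cover costs and still clear the target margin."""
    return monthly_cost + margin


old_needed = required_revenue(1000)  # paying people to write the articles
new_needed = required_revenue(200)   # cheaper generated articles
freed_up = old_needed - new_needed   # revenue a competitor could take

print(f"needed before: ${old_needed}, after: ${new_needed}, freed up: ${freed_up}")
print(f"cost per article: ${1000 / ARTICLES_PER_MONTH:.2f} -> ${200 / ARTICLES_PER_MONTH:.2f}")
```

The freed-up figure is the part that matters for the argument: it is the slice of views a new entrant could capture while the original spammer still clears the same margin.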

But I don't think that's where it's gonna get fun.

You know where there could be a lot of demand for higher-quality, harder-to-spot spam? Corporate product marketing and politics. Astroturfing in far more writing styles for far less money. Individuals running more fake accounts on Reddit, HN, etc., pushing more "unique"-but-repetitive copies of their views, while sprinkling in a fair bit of on-topic info on a wider variety of other topics.

barbariangrunge|2 years ago

There is no demand. It's just SEO spam, and users like me just ctrl+click most of the first page of search results hoping that one of them will be usable. Over time, though, I'm less and less able to find a single relevant result. The SEO spam is incredible now.

kristianp|2 years ago

They said GPT would increase productivity, but I don't think they meant the productivity of spammers!

paulgb|2 years ago

> This has been a problem for years.

Hah, yep. I used to build tools to detect it at an adtech startup. A common approach at the time was to take someone else's text and use naive thesaurus replacement so that it would be just barely comprehensible but statistically look like English. So “You can catch the mouse” might become “You jar trap the rodent”. Glad to see how far technology has progressed! /s
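For anyone who hasn't seen it, naive thesaurus replacement can be sketched in a few lines. This is only an illustration of the general idea, not the tooling any real spinner used; the toy `THESAURUS` mapping and the `spin` function are made up for the example.

```python
import re

# Hypothetical toy thesaurus. A real spinner would presumably use a much larger
# word list and pick synonyms with no regard for word sense or context.
THESAURUS = {"can": "jar", "catch": "trap", "mouse": "rodent"}


def spin(text: str, thesaurus: dict = THESAURUS) -> str:
    """Swap each known word for its listed synonym, ignoring context."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = thesaurus.get(word.lower())
        if replacement is None:
            return word  # unknown words pass through untouched
        # Keep the original capitalization so the result still looks like prose.
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"[A-Za-z]+", swap, text)


print(spin("You can catch the mouse"))  # You jar trap the rodent
```

Because the lookup ignores part of speech, the verb "can" turns into the noun "jar", which is exactly why the output is only barely comprehensible while keeping English-like word statistics.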

duskwuff|2 years ago

And, unfortunately, it's still an ongoing problem. Some of it even ends up getting published by IEEE, e.g. [1].

There was some research a few years ago [2] into just how widespread this issue was in scientific publishing. The situation has likely only gotten worse with the introduction of higher-quality text generation LLMs.

[1]: https://ieeexplore.ieee.org/document/8597261

[2]: https://arxiv.org/abs/2107.06751

TechBro8615|2 years ago

I've noticed this sometimes where it replaces proper nouns with a synonym, e.g. "Bill Gates" becomes "Invoice Gates." Unfortunately that pattern only applies to the most bottom-of-the-barrel SEO spam. I expect ChatGPT output will be more subtle, but if a lazy spammer doesn't obfuscate it enough, there will still be some tells - e.g. the five paragraph essay format with a conclusion beginning with "overall..."
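As a toy illustration of that last tell, a check could look something like this. It is a deliberately crude sketch: the function name, the blank-line paragraph splitting, and the exact rule are all assumptions, and it would be trivial to evade.

```python
def looks_like_template_essay(text: str) -> bool:
    """Flag text with exactly five paragraphs whose last one opens with 'overall'."""
    # Assumes paragraphs are separated by blank lines.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    if len(paragraphs) != 5:
        return False
    return paragraphs[-1].lower().startswith("overall")


sample = "\n\n".join([
    "Roof patching is an important skill.",
    "First, gather your tools.",
    "Second, find the leak.",
    "Third, apply the patch.",
    "Overall, patching a roof is rewarding.",
])
print(looks_like_template_essay(sample))
```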

seanp2k2|2 years ago

Yay, the arms race between the ROI and computational power required to generate spam vs. the ROI and computational power required to tell if something is indeed spam. It's the inescapable cat-and-mouse game between users just trying to find something that isn't garbage and a million hucksters attempting to dupe them into buying their garbage products.

bko|2 years ago

We've had spam for a long time. Much of it used to be farmed out to low-cost countries where people would summarize or aggregate data and present it in the most aggressively monetized way possible. AI spam will be better. I personally don't mind reading AI generated content. I prefer GPT over Stack Overflow answers written by humans for 95% of what I need help with. I also don't think it'll lead to more spam, as it probably wasn't supply constrained anyway. The demand for spam is pretty inelastic, meaning that as the supply shifts, it'll just lower the price and not impact quantity demanded much. It'll be better spam and the advertisers will have to pay less for it. Spammer margins will also likely go down as there is a lot more competition from the tooling.

transcriptase|2 years ago

One of the questions I have is whether models are being trained on the SEO {spam|blogspam|adsense optimized|spun} websites.

visarga|2 years ago

I believe low to medium quality algorithmically-generated content is easily identifiable with large language models. Consequently, spammers may find themselves needing to leverage LLMs to produce text that appears high in quality. This implies that the content generated could either be of substantial quality or well-disguised gibberish.

Therefore, some form of reputation system remains necessary, like those used for scientific publications. I predict websites that provide a trusted reputation system will have a lot to gain in the future. Github stars, upvotes, retweets, citation count, or just good moderation - they will be essential in solving the spam/hallucination problem.

rchaud|2 years ago

Respectfully, expecting to use AI to solve a problem created by using AI is wishful thinking.

We haven't even solved the email spam problem, and that started 30 years ago! We have simply accepted that the largest players in the business (MS, Gmail, Yahoo) will decide on our behalf which emails we will actually see. If you want to start your own email service, fine, just keep in mind that the big players might think you're a spammer.

Kim_Bruning|2 years ago

https://xkcd.com/810/

Mission accomplished?

seer|2 years ago

I thought I’d seen all of the xkcd comics, but from time to time I stumble on something like this and marvel at how prophetic his work is.

Totally off topic, but I just think that the producers of shows like Black Mirror should just hire the guy and he’ll come up with plots for really disturbing things, not the mellow obvious crap they try to push now…

rdtsc|2 years ago

The real fun will start when ChatGPT starts using its own gibberish for learning. Eventually everything on the internet will be garbage recycled many times over by ChatGPT. It will rewrite Wikipedia, news, history.

thih9|2 years ago

Will search engines turn into searchable GPT directories then?

It does sound like an improvement short term, but long term I wish there was another option.

hot_gril|2 years ago

At least the "123" keyword still works on Google.

klntsky|2 years ago

> the next generation of spam blogs will (sadly) probably be an improvement over the sea of crap we have now.

Maybe even to the extent that they will actually be useful.

kristianp|2 years ago

The problem is that they will be filled with hallucinations, unlike "human spam", which is based on facts.
