
levesque | 2 years ago

Predicting whether a text was written by an LLM is not trivial. What was the latest number from OpenAI's detector? 30%? As LLMs get better, it seems like we won't be able to distinguish real text from fake text. Your LLM will be able to summarize it, but it will still be 99% spam.


grumbel | 2 years ago

You don't need to predict whether it was written by an LLM; whether a human or a machine wrote it makes no difference to the validity of a text. You just need to be able to extract the actual information from it and cross-check it against other sources.

The summary an LLM can provide is not of just one text, but of all the texts about the topic that it has access to. Thus you never need to access the actual texts themselves, just whatever the LLM condenses out of them.
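A minimal sketch of that extract-and-cross-check idea, with everything hypothetical: extract_claims is a toy stand-in for a real extraction step (which in practice might itself be an LLM), and "corroboration" here is just keeping claims that at least two independent sources repeat:

    from collections import Counter

    def extract_claims(text: str) -> set[str]:
        # Toy stand-in: treat each sentence as one normalized "claim".
        return {s.strip().lower() for s in text.split(".") if s.strip()}

    def corroborated(claims: set[str], sources: list[str], quorum: int = 2) -> set[str]:
        # Keep only the claims that at least `quorum` sources also make.
        counts = Counter()
        for src in sources:
            counts.update(extract_claims(src) & claims)
        return {c for c, n in counts.items() if n >= quorum}

    doc = "The bridge opened in 1932. It is painted green."
    sources = [
        "The bridge opened in 1932. It spans the harbour.",
        "Records agree. The bridge opened in 1932.",
        "The bridge is painted grey.",
    ]
    print(corroborated(extract_claims(doc), sources))
    # -> {'the bridge opened in 1932'}  (the paint claim finds no support)

Of course, as the replies below point out, this only helps if the "other sources" are themselves independent and trustworthy.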

Sindisil | 2 years ago

"just" need to "extract the actual information out of it and cross check it against other sources".

How do you determine the trustworthiness of those other sources when an ever increasing portion are also LLM generated?

All the "you just need to" responses are predicted on being able to police the LLM output based upon your own expertise (e.g., much talk about code generation being like working with junior devs, and so being able to replace all your juniors and just have super productive seniors).

Question: how does one become an expert? Yep, it's right there: experts are made through experience.

So if LLMs replace all the low experience roles, how exactly do new experts emerge?

jerf | 2 years ago

You're trusting the LLM a lot more than you should. It's entirely possible to skew those summaries too. (Even ignoring the philosophical question of what an "unskewed" LLM would even be.) I'm actually impressed by OpenAI's efforts to do so. I also deplore them and think it's an atrocity, but I'm still impressed. The "As an AI language model" boilerplate is just the most obvious way they're skewed. I wouldn't trust an LLM any farther than I can throw it to accurately summarize anything important.

pixl97 | 2 years ago

>cross check it against other sources.

The problem comes in when 99.999999% of other sources are also bullshit.

mensetmanusman | 2 years ago

If LLMs start writing a majority of HN comments, we won’t know what is true or not. HN will be noise and worthless then.

satisfice | 2 years ago

Banal is banal, whether written by a human or not.

But GPT text is inherently deceptive, even when factually flawless, because we humans never evaluate a message merely on its factuality. We read between the lines. Just as insects are confused by a light and fly in spirals around it, we will fly in spirals around GPT text, guided by our assumptions about its nature, or about the nature of the human we presume wrote it.