
mightybyte | 1 year ago

I have a question for all the LLM and LLM-detection researchers out there. Wikipedia says that the Turing test "is a test of a machine's ability to exhibit intelligent behaviour equivalent to, or indistinguishable from, that of a human."

Three things seem to be in conflict here:

1. This definition of intelligence, i.e. "behavior indistinguishable from a human's"

2. The idea that LLMs are artificial intelligence

3. The idea that we can detect if something is generated by an LLM

This feels to me like one of those trilemmas, where only two of the three can be true. Or, if we take #1 as an axiom, then it seems like the extent to which we can detect when things are generated by an LLM would imply that the LLM is not a "true" artificial intelligence. Can anyone deeply familiar with the space comment on my reasoning here? I'm particularly interested in thoughts from people actually working on LLM detection. Do you think that LLM-detection is technically feasible? If so, do you think that implies that they're not "true" AI (for whatever definition of "true" you think makes sense)?


warkdarrior|1 year ago

> 3. The idea that we can detect if something is generated by an LLM

The idea behind watermarking (the topic of the paper) is that the LLM service specially marks the output in some way at generation time. Afterwards, any text can be checked for the presence of the watermark, so "detecting whether something was generated by an LLM" reduces to checking for the watermark. This all works only if the watermark is robust.
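A minimal sketch of how such a scheme can work, in the style of a generic "green list" watermark (the function names, the GREEN_FRACTION parameter, and the toy generator below are illustrative assumptions, not the paper's actual method): the previous token seeds a keyed partition of the vocabulary, generation favors the "green" half, and detection counts how often tokens land in their predecessor's green list.

```python
import hashlib
import random

GREEN_FRACTION = 0.5  # fraction of vocabulary marked "green" at each step (illustrative choice)

def green_list(prev_token: str, vocab: list[str]) -> set[str]:
    """Seed an RNG with the previous token and deterministically partition the vocabulary."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    return set(rng.sample(vocab, int(len(vocab) * GREEN_FRACTION)))

def generate_watermarked(start: str, vocab: list[str], n: int, seed: int = 0) -> list[str]:
    """Toy 'LLM': at each step, sample only from the green list keyed by the previous token."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(n):
        out.append(rng.choice(sorted(green_list(out[-1], vocab))))
    return out

def green_ratio(tokens: list[str], vocab: list[str]) -> float:
    """Detection: fraction of tokens that land in their predecessor's green list."""
    hits = sum(tok in green_list(prev, vocab) for prev, tok in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)
```

Watermarked text scores near 1.0 here while ordinary text scores near GREEN_FRACTION; real schemes softly bias logits rather than hard-restricting the vocabulary, and robustness to paraphrasing and editing is the hard part.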

roywiggins|1 year ago

The original Turing test began with an imitation game: you try to work out which of two hidden people is the man and which is the woman, based on their written answers to questions alone.

But suppose you ran that test and one of the hidden people were a confederate who steganographically embedded a gender marker in their answers, obvious to no one but you. You could break the game even if your confederate perfectly mimicked the other gender.

That is to say, embedding a secret recognition code into a stream of responses works on humans, too, so it doesn't say anything about computer intelligence.
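One hypothetical way such a human confederate scheme could work, sketched in Python (the parity-of-word-count channel, the filler word, and the shared-secret key derivation are my own illustrative choices, not anything from the thread): each reply is padded so its word-count parity encodes one bit of a keyed sequence, and the examiner checks how well the parities match.

```python
import hashlib

def key_bits(secret: str, n: int) -> list[int]:
    """Derive a deterministic bit sequence from the shared secret."""
    digest = hashlib.sha256(secret.encode()).digest()
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(n)]

def embed(replies: list[str], secret: str) -> list[str]:
    """Pad each reply with a filler word, if needed, so its word-count parity encodes one key bit."""
    out = []
    for reply, bit in zip(replies, key_bits(secret, len(replies))):
        if len(reply.split()) % 2 != bit:
            reply += " indeed"  # innocuous filler flips the parity
        out.append(reply)
    return out

def detect(replies: list[str], secret: str) -> float:
    """Fraction of replies whose parity matches the keyed bits: ~1.0 for a confederate, ~0.5 otherwise."""
    bits = key_bits(secret, len(replies))
    hits = sum(len(reply.split()) % 2 == bit for reply, bit in zip(replies, bits))
    return hits / max(len(replies), 1)
```

The channel says nothing about the speaker's intelligence, which is exactly the point: a keyed side channel distinguishes the marked party whether the marked party is a human or a machine.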

And for that matter, passing the Turing test is supposed to be sufficient for proving that something is intelligent, not necessary. You could imagine all sorts of deeply inhuman but intelligent systems that completely fail the Turing test. In Blade Runner, we aren't supposed to conclude that failing the Voight-Kampff test makes the androids mindless automatons, even if that's what humans in the movie think.

visarga|1 year ago

I think measuring intelligence in isolation is misguided; it should always be measured in context, both the social context and the problem context. This removes a lot of the mystique and, unfortunately, doesn't make for heated debates.

In its essentialist form it's impossible to define, but in context it is nothing but skilled search for solutions. And because most problems are more than any one person can handle, it's a social process.

Can you measure the value of a word in isolation from language? In the same way, you can't meaningfully measure intelligence in a vacuum; you get only a very narrow representation of it.