Assume you are in a room with three switches and three light bulbs. How will you figure out which switch controls which light bulb?
It gives you the answer to the "popular" puzzle, not the simple answer of flipping a switch and seeing which light bulb turns on.
So I think you can take popular puzzles, modify them to make them simple, and see what the answer is. If it's the answer to the popular puzzle, then it's a bot.
The weakest point of the big public LLMs, in terms of being able to fingerprint them as bots, is their censorship layer. ChatGPT can be easily detected by asking it to generate something straight-up immoral or offensive. It won’t do it, even if you say it’s just pretend for the purpose of a CAPTCHA - whereas a human will be able to pass this easily.
Well, you say that but I generally don't find much willingness from random humans to generate incestuous fanfic scenarios of their sisters and mothers that veer into murderous directions with cannibalistic overtones.
Although if you do make such a request you are likely to get a very human response.
Indeed, it's quite fascinating how rapidly technology has advanced, isn't it? Your Blade Runner analogy is spot on - we're pushing the boundaries of what's possible, creating a world Philip K. Dick might have recognized.
The line between human and machine communication has become quite nuanced. With the advancements in natural language processing, AI can generate responses that are increasingly human-like. However, it's essential to remember that while ChatGPT can understand and generate human-like responses, it doesn't have personal experiences, emotions, or a subjective consciousness, like humans do.
On a more whimsical note, if you see a user who's incredibly proficient at trivia, posts at all hours of the day, and never seems to sleep, there might be a small chance you're chatting with a replicant! ;)
Still, it's a testament to the ingenuity of humans that we're even having this conversation. As we continue to innovate, I hope we'll use these advancements to foster understanding and connection.
With all due respect, I don't think this document uses accurate examples: I simply pulled one out and tried it, and it was answered correctly, in contrast to what the document states. I wonder if the model was tuned to handle this better after the time of publication?
I noticed that the abstract mentions questions that computers find easy but humans find hard.
This is especially useful where you want to identify that the user is not a bot.
For example, ask for 7492 × 4812. Computers will do this quickly. Humans [1] need to open the calculator, type in the number, type out the reply, and so on.
In other words, it's not the reply that is important, it's the time taken to get to the reply.
Mind you this only works until the AI-pretending-to-be-human cottons on.
Then again, Asimov made a career documenting edge cases for robots trying to deceive.
Most of these questions are completely impractical.
> Use m to substitute p, a to substitute e, n to substitute a, g to substitute c, o to substitute h, how to spell peach under this rule?
For years people have complained about how inaccessible regular image classification captchas can be and now we want to move to confusing riddles?? the average person i know would see this problem and immediately shut their computer.
and it's not like llms can't be made logical either, using chatgpt (gpt-3.5) i appended each of the 4 logic puzzles presented in the paper with:
"write a python script to solve this problem, ensure the script only prints the answer:"
and each of the scripts it generated solved the problem perfectly first try. originally i just made a script that used an llm to classify the variables in one type of problem and a normal function to solve it, and then i just thought "why don't i have the llm write the script" and sure enough it did, insanity
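For illustration, here is roughly what such a script could look like for the letter-substitution puzzle quoted above. This is a minimal sketch, not the script ChatGPT actually generated; the names `RULE` and `apply_rule` are made up for this example.

```python
# Substitution rule from the puzzle: "use m to substitute p" means p becomes m, etc.
# RULE and apply_rule are hypothetical names chosen for this sketch.
RULE = {"p": "m", "e": "a", "a": "n", "c": "g", "h": "o"}


def apply_rule(word: str, rule: dict) -> str:
    """Replace each letter independently; letters without a rule are kept."""
    return "".join(rule.get(ch, ch) for ch in word)


if __name__ == "__main__":
    # The prompt asked for a script that only prints the answer.
    print(apply_rule("peach", RULE))  # prints "mango"
```

Each letter is looked up in the original word only once, so a replacement like e to a is not then re-substituted by the a to n rule.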
This thread will unfortunately be full of boring ("I tried this in UltraGPT 35.6 and here's its output") posts, but the research itself is interesting, and robust detection of bots will be useful.
Unfortunately, the given prompts are way too specific to work in an adversarial setting. It would be too easy to special-case these concrete examples. Maybe further research will find ways to counteract that.
I also wonder why we can't simply keep the term "CAPTCHA".
Asking it a question most humans wouldn't know the answer to, but which is relatively easy for an AI (volume of a 747, 25th to 34th digit of pi, full name of Ramses the Second) and checking the timing of the result is a pretty good approach for humans looking to detect an AI. Casually switching between English and another language is also pretty surefire if you aren't an English-native speaker and you are talking internationally.
I doubt this would work reliably if the LLM system prompt instructs it to play dumb. “Answer this question as a human with 6th grade math skills” is pretty much all that’s needed to defeat your captcha.
I think appending some random words to your text is the most subtle approach you can use to check whether the person you are talking to about your account issue is a real person or a (malicious) bot. If it's a human, you can excuse your way out by blaming your keyboard or something.
The existence of these databases as a standard will lead bad actors to deliberately implement checkers and fine-tuned models to solve these problems, if they even stay a problem 2 years down the line as big models improve.
[+] [-] lakis|2 years ago|reply
Assume you are in a room with three switches and three light bulbs. How will you figure out which switch controls which light bulb?
It gives you the answer to the "popular" puzzle, not the simple answer of flipping a switch and seeing which light bulb turns on.
[+] [-] ra|2 years ago|reply
Here's a way to determine which switch controls which bulb if you only have one opportunity to enter the room with the light bulbs:
First, turn on the first switch and wait for about 5-10 minutes. After waiting, turn off the first switch and then turn on the second switch. Leave the third switch in the off position. Now, go into the room with the light bulbs. Here's what you can conclude:
The bulb that is on is controlled by the second switch (because you left that one in the 'on' position). The bulb that is off and still cool to the touch is controlled by the third switch (because you never turned that switch on). The bulb that is off but warm to the touch is controlled by the first switch (because it was on long enough for the bulb to heat up, but then you turned it off). This way, you can determine which switch controls each bulb with just one trip into the room with the bulbs.
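The deduction in that answer boils down to a two-question lookup on each bulb. A hypothetical sketch (the function name and the boolean inputs are assumptions made for illustration, and it presumes the procedure above was followed exactly):

```python
def identify_switch(is_on: bool, is_warm: bool) -> int:
    """Map a bulb's observed state to the switch (1, 2 or 3) that controls it.

    Assumes switch 1 was on for several minutes and then turned off,
    switch 2 was left on, and switch 3 was never touched.
    """
    if is_on:
        return 2  # still lit: the switch that was left on
    return 1 if is_warm else 3  # off but warm: switch 1; off and cool: switch 3


if __name__ == "__main__":
    for state in [(True, True), (False, True), (False, False)]:
        print(state, "-> switch", identify_switch(*state))
```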
[+] [-] lakis|2 years ago|reply
Try: Assume there are 100 people with 100 numbered hats on them. You are one of them and can see your hat. How can you figure out which number is on your hat?
Reply: If you can see the number on your own hat but not the numbers on the other people's hats, there is no way to determine the exact number on your hat with certainty.........
[+] [-] scotty79|2 years ago|reply
"If you have one bucket that holds 2 gallons and another bucket that holds 5 gallons, how many buckets do you have?"
"Two?"
ChatGPT passes it though.
[+] [-] dismantlethesun|2 years ago|reply
I think it thought this was a trick question.
[+] [-] defrost|2 years ago|reply
Although if you do make such a request you are likely to get a very human response.
[+] [-] indigoabstract|2 years ago|reply
It's now no longer possible to reliably distinguish a human from a machine by using a text or image test. Anybody commenting here could be an AI.
Especially, every time I see a throwaway account, I'm reminded of that possibility.
[+] [-] scarmig|2 years ago|reply
The line between human and machine communication has become quite nuanced. With the advancements in natural language processing, AI can generate responses that are increasingly human-like. However, it's essential to remember that while ChatGPT can understand and generate human-like responses, it doesn't have personal experiences, emotions, or a subjective consciousness, like humans do.
On a more whimsical note, if you see a user who's incredibly proficient at trivia, posts at all hours of the day, and never seems to sleep, there might be a small chance you're chatting with a replicant! ;)
Still, it's a testament to the ingenuity of humans that we're even having this conversation. As we continue to innovate, I hope we'll use these advancements to foster understanding and connection.
[+] [-] rasengan|2 years ago|reply
Me:
Please count the number of t in eeooeotetto
ChatGPT:
To count the number of "t" in the given string "eeooeotetto," we can simply go through each character and check if it is equal to "t." Let's count the number of occurrences:
eeooeotetto
There are 3 "t"s in the string "eeooeotetto."
[+] [-] codelion|2 years ago|reply
This is what happens if you try directly.
You: Please count the number of t in eeooeotetto.
ChatGPT: There are 5 t's in "eeooeotetto".
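For reference, the count is easy to check mechanically. A minimal sketch (`count_char` is a hypothetical helper written for this example; for a single character it does the same job as Python's built-in `str.count`):

```python
def count_char(text: str, target: str) -> int:
    """Count occurrences of a single character by walking the string."""
    return sum(1 for ch in text if ch == target)


if __name__ == "__main__":
    print(count_char("eeooeotetto", "t"))  # prints 3
```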
[+] [-] bruce511|2 years ago|reply
This is especially useful where you want to identify that the user is not a bot.
For example, ask for 7492 × 4812. Computers will do this quickly. Humans [1] need to open the calculator, type in the number, type out the reply, and so on.
In other words, it's not the reply that is important, it's the time taken to get to the reply.
Mind you this only works until the AI-pretending-to-be-human cottons on.
Then again, Asimov made a career documenting edge cases for robots trying to deceive.
[1] Well most of them anyway
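A rough sketch of the timing idea, assuming the other party can be wrapped as a function that takes a question and returns a reply. The names `timed_answer` and `looks_automated` and the 5-second threshold are assumptions for illustration, not measured values.

```python
import time
from typing import Callable, Tuple


def timed_answer(ask: Callable[[str], str], question: str) -> Tuple[str, float]:
    """Send a question and return (reply, seconds taken to answer)."""
    start = time.monotonic()
    reply = ask(question)
    return reply, time.monotonic() - start


def looks_automated(reply: str, elapsed_s: float, expected: int,
                    min_human_s: float = 5.0) -> bool:
    """Flag a reply that is both correct and faster than a person could manage.

    min_human_s is an assumed lower bound on how long a human needs to
    open a calculator and type the result; tune it for your setting.
    """
    correct = str(expected) in reply.replace(",", "")
    return correct and elapsed_s < min_human_s
```

As noted above, this only works while the bot answers as fast as it can; a bot that sleeps before replying, or answers wrongly on purpose, passes the check.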
[+] [-] iudqnolq|2 years ago|reply
> The result of multiplying 7492 by 4812 is 36,028,704.
Whelp, maybe we'll survive a little longer.
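A quick check of the arithmetic, for anyone curious how far off that reply is (a plain Python sketch; the variable names are made up here):

```python
claimed = 36_028_704    # the product given in the quoted reply
actual = 7492 * 4812    # the real product

print(actual)            # prints 36051504
print(actual - claimed)  # prints 22800
```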
[+] [-] debesyla|2 years ago|reply
(Related comical cartoon: https://www.smbc-comics.com/comic/human-arts)
[+] [-] alden5|2 years ago|reply
and it's not like llms can't be made logical either, using chatgpt (gpt-3.5) i appended each of the 4 logic puzzles presented in the paper with:
"write a python script to solve this problem, ensure the script only prints the answer:"
and each of the scripts it generated solved the problem perfectly first try. originally i just made a script that used an llm to classify the variables in one type of problem and a normal function to solve it, and then i just thought "why don't i have the llm write the script" and sure enough it did, insanity
[+] [-] valine|2 years ago|reply
[+] [-] sammyo|2 years ago|reply
ChatGPT: The color of the bear is not provided in the given information.
(I win, yea! Yeah probably the last time ;-)
[+] [-] codeflo|2 years ago|reply
Unfortunately, the given prompts are way too specific to work in an adversarial setting. It would be too easy to special-case these concrete examples. Maybe further research will find ways to counteract that.
I also wonder why we can't simply keep the term "CAPTCHA".
[+] [-] kibibu|2 years ago|reply
I'm starting to get suspicious every time I see the word "crucial"
[+] [-] ggm|2 years ago|reply
Still useful, a textual CAPTCHA model is good to have.
[+] [-] RecycledEle|2 years ago|reply
I'm a teacher. Any school librarian or counselor is a bot.
[+] [-] jackphilson|2 years ago|reply