

photonthug | 4 months ago

The outcome would depend on the rest of the test, but I'd say the "human" version of this answer adds zero or negative value to the chances of being judged human, on grounds of strict compliance, sycophancy, and/or omniscience. "No such thing" would probably be a very popular answer. Elaboration would probably take the form of "love it" or "hate it", instead of reaching for a comprehensive answer describing the inside and the outside.

Experimental design comes in here, and the one TT paper mentioned in this thread has instructions for people like "persuade the interrogator [you] are human". Answering that a green eggplant is green feels like humans trying to answer questions correctly and quickly while being wary of a trap. We don't know the participants' background knowledge, but anyone who has used ChatGPT would know that ignoring the question and maybe telling an eggplant-related anecdote was a better strategy.
