(no title)
tyushk | 5 months ago
Gemini will often start responses that use the canvas tool with "Of course", which would force the model into going down a line of tokens that end up with attempting to fulfill the user's request. It happens often enough that it seems like it's not being generated by the model, but instead inserted by the backend. Maybe "you're absolutely right" is used the same way?
nicce|5 months ago
They fight for the user attention and keeping them on their platform, just like social media platforms. Correctness is secondary, user satisfaction is primary.
ZaoLahma|5 months ago
I get it - we don't want LLMs to be reinforces of bad ideas, but sometimes you need a little positivity to get past a mental barrier and do something that you want to do, even if what you want to do logically doesn't make much sense.
An "ok cool" answer is PERFECT for me to decide not to code something stupid (and learn something useful), and instead go and play video games (and learn nothing).
diggan|5 months ago
Kind of makes sense, not every user wants 100% correctness (just like in real-life).
And if I want correctness (which I do), I can make the models prioritize that, since my satisfaction is directly linked to the correctness of the responses :)
kuschku|5 months ago
And that's where everything is going wrong. We should use technology to further the enlightenment, bring us closer to the truth, even if it is an inconvenient one.
CGamesPlay|5 months ago
layer8|5 months ago
ACCount37|5 months ago
If we have RLHF in play, then human evaluators may generally prefer responses starting with "you're right" or "of course", because it makes it look like the LLM is responsive and acknowledges user feedback. Even if the LLM itself was perfectly capable of being responsive and acknowledging user feedback without emitting an explicit cue. The training will then wire that human preference into the AI, and an explicit "yes I'm paying attention to user feedback" cue will be emitted by the LLM more often.
If we have RL on harder targets, where multiturn instruction following is evaluated not by humans that are sensitive to wording changes, but by a hard eval system that is only sensitive to outcomes? The LLM may still adopt a "yes I'm paying attention to user feedback" cue because it allows it to steer its future behavior better (persona self-consistency drive). Same mechanism as what causes "double check your prior reasoning" cues such as "Wait, " to be adopted by RL'd reasoning models.
Jotalea|5 months ago
the_af|5 months ago
You have "someone" constantly praising your insight, telling you you are asking "the right questions", and obediently following orders (until you trigger some content censorship, of course). And who wouldn't want to come back? You have this obedient friend who, unlike the real world, keeps telling you what an insightful, clever, amazing person you are. It even apologizes when it has to contradict you on something. None of my friends do!
zozbot234|5 months ago
You're absolutely right! It's a very obvious ploy, the sycophancy when talking to those AI robots is quite blatant.
the_af|5 months ago