top | item 44272198

(no title)

xer0x | 8 months ago

Claude's increasing euphoria as a conversation goes can mislead me. I'll be exploring trade offs, and I'll introduce some novel ideas. Claude will use such enthusiasm that it will convince me that we're onto something. I'll be excited, and feed the idea back to a new conversation with Claude. It'll remind me that the idea makes risky trade offs, and would be better solved by with a simple solution. Try it out.

discuss

order

slooonz|8 months ago

They failed hard with Claude 4 IMO. I just can't have any feedback other than "What a fascinating insight" followed by a reformulation (and, to be generous, an exploration) of what I said, even when Opus 3 has no trouble finding limitations.

By comparison o3 is brutally honest (I regularly flatly get answers starting with "No, that’s wrong") and it’s awesome.

SamPatt|8 months ago

Agreed that o3 can be brutally honest. If you ask it for direct feedback, even on personal topics, it will make observations that, if a person made them, would be borderline rude.

simonw|8 months ago

Thanks for this, I just tried the same "give me feedback on this text" prompt against both o3 and Claude 4 and o3 was indeed much more useful and much less sycophantic.

SatvikBeri|8 months ago

I put this in my system prompt: "Never compliment me. Critique my ideas, ask clarifying questions, and offer better alternatives or funny insults" and it works quite well. It has frequently told me that I'm wrong, or asked what I'm actually trying to do and offered better alternatives.

renewiltord|8 months ago

LLM sycophancy is a really annoying tool, but one must imagine that most humans get a lot of pleasure from it. This is probably the optimization function that led to Google being useless to us: the rest of humanity is a lot more worthwhile to Google and they all want the other thing. The Tyranny of the Majority, if you will.

The LLM anti-sycophancy rules also break down over time, with the LLM becoming curt while simultaneously deciding that you are a God of All Thoughts.

makeset|8 months ago

My favorite is when I typo "Why is thisdfg algorithm the best solution?" and it goes "You are absolutely right! Algorithm Thisdfg is a much better solution than what I was suggesting! Thank you for catching my mistake!"

XenophileJKO|8 months ago

Just ask Claude to be "critical" and it is brutally critical.. but also a bit nihilistic. So you kind of have to temper it a little.

gexla|8 months ago

I have found it's the most brutal of all of them if you simply tell it to be "hard-nosed" or play "Devil's Advocate." Brutal partially because it will destroy an argument formulated in Gemini or ChatGPT. Using whatever I can get without subs across the board. Debating seems to be one of Claude's strong points.