Willish42 | 24 days ago

I feel like this anecdote represents the differing incentives / philosophies of each group rather well.

I've noticed that ChatGPT is lavish with praise regardless of how valuable the input is, Gemini is less placating but still largely influenced by the prompter's perspective, and Claude feels the most "honest", though humans are rather poor at judging this sort of thing.

Does anyone know whether "sycophancy" has documented benchmarks that models are compared against? Maybe it's subjective and hard to measure, but given the issues with GPT-4o, it seems worth measuring model to model, both to track an individual company's changes over time and to compare across companies.


endymion-light | 21 days ago

The issue, I think, is that to model sycophancy you'd need another model that can identify signs of sycophancy. It's turtles all the way down.