Chatgpt got very sycophantic for me about a month ago already (I know because I complained about it at the time) so I think I got it early as an A/B test.
Interestingly at one point I got a left/right which model do you prefer, where one version was belittling and insulting me for asking the question. That just happened a single time though.
I'm not sure how this problem can be solved. How do you test a system with emergent properties of this degree that whose behavior is dependent on existing memory of customer chats in production?
im3w1l|10 months ago
Interestingly at one point I got a left/right which model do you prefer, where one version was belittling and insulting me for asking the question. That just happened a single time though.
thethethethe|10 months ago
remoquete|10 months ago
ahoka|10 months ago