numeri|7 months ago
His company has also been caught adding specific instructions in this vein to its prompt.
And now it's searching for his tweets to guide its answers on political questions, and Simon somehow thinks it could be unintended, emergent behavior? Even if it were, calling this unintended would completely ignore higher-order system dynamics (a behavior is still intended if models are rejected until one is found that implements the behavior) and the possibility that reinforcement learning was used to add this behavior.
simonw|7 months ago
I do not think he wants it to openly say "I am now searching for tweets from:elonmusk in order to answer this question". That's plain embarrassing for him.
That's what I meant by "I think there is a good chance this behavior is unintended".
numeri|7 months ago
> This suggests that Grok may have a weird sense of identity—if asked for its own opinions it turns to search to find previous indications of opinions expressed by itself or by its ultimate owner. I think there is a good chance this behavior is unintended!
I'd say it's far more likely that:
1. Elon ordered his research scientists to "fix it" – make it agree with him
2. They did RL (probably just basic tool use training) to encourage checking for Elon's opinions
3. They did not update the UI (for whatever reason – most likely just because research scientists aren't responsible for front-end, so they forgot)
4. Elon is likely now upset that this is shown so obviously
The key difference is that I think it's incredibly unlikely that this is emergent behavior due to a "sense of identity", as opposed to direct efforts of the xAI research team. It's likely also a case of https://en.wiktionary.org/wiki/anticipatory_obedience.
JimmaDaRustla|7 months ago
You think that's the tipping point of him being embarrassed?
grafmax|7 months ago
Psychologically I wonder if these half-baked hopes provide a kind of escapist outlet. Maybe for some people it feels safer to hide your head in the sand where you can no longer see the dangers around you.
morngn|7 months ago
Cognitive dissonance drives a lot of “save the world” energy. People have undeserved wealth they might feel bad about, given prevailing moral traditions, if they weren’t so busy fighting for justice or saving the planet or something that allows them to feel more like a superhero than just another sinful human.
JimmaDaRustla|7 months ago