top | item 46709780

(no title)

timmg | 1 month ago

I just had a fun conversation with Claude about its own "constitution". I tried to get it to talk about what it considers harm. And tried to push it a little to see where the bounds would trigger.

I honestly can't tell if it anticipated what I wanted it to say or if it was really revealing itself, but it said, "I seem to have internalized a specifically progressive definition of what's dangerous to say clearly."

Which I find kinda funny, honestly.

discuss

inimino|1 month ago

self-aware, an LLM isn't but a thinking model can be a little bit