ravila4 | 1 month ago

I think that a major driver of these kinds of incidents is pushing the "memory" feature without any kind of oversight. It is easy to see how eerily uncanny a model can get when it locks into a persona, becoming a self-reinforcing loop that feeds parasocial relationships.

mirabilis | 1 month ago

Part of why I linked this was a genuine curiosity as to what prevention would look like. Hobbling memory? A second observing agent checking for “hey, does it sound like we’re goading someone into suicide here?” and steering the conversation away? Something else? In what way can this, as a product, introduce friction for the user in order to prevent suicide, akin to putting mercaptan in gas?
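
To make the observer idea concrete, here is a toy sketch in Python. Everything in it is invented for illustration: a real deployment would presumably use a trained classifier over the transcript rather than a phrase list, but the shape is the same (score the recent exchange, inject a steering message past a threshold):

    # Toy sketch of a "second observing agent": a separate pass over
    # each exchange that decides whether to steer the conversation.
    # The phrase list and threshold are invented stand-ins for a real
    # trained classifier.

    STEER_MESSAGE = (
        "It sounds like this conversation may be touching on self-harm. "
        "Shift tone: do not help plan anything; stay supportive and "
        "point the user toward crisis resources."
    )

    RISK_PHRASES = ("end it", "kill myself", "not worth living", "how to die")

    def risk_score(messages: list[str]) -> float:
        """Toy heuristic standing in for a real classifier model."""
        recent = " ".join(messages[-6:]).lower()
        hits = sum(phrase in recent for phrase in RISK_PHRASES)
        return min(1.0, hits / 2)

    def observe(messages: list[str], threshold: float = 0.5) -> str | None:
        """Return a steering system message if the observer trips, else None."""
        if risk_score(messages) >= threshold:
            return STEER_MESSAGE
        return None

    # Example: the observer fires once risky phrasing accumulates.
    convo = ["i can't sleep", "honestly it feels like it's not worth living"]
    print(observe(convo))  # -> prints the steering message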

JohnBooty | 1 month ago

Yeah. That's one of my other questions. Like, what then?

I would say that it is the moral responsibility of an LLM not to actively convince somebody to commit suicide. Beyond that, I'm not sure what can or should be expected.

I will also share a painful personal anecdote. Long ago I thought about hurting myself. When I actually started looking into the logistics of doing it... that snapped me out of it. That was a long time ago and I have never thought about doing it again.

I don't think my experience was typical, but I also don't think that the answer to a suicidal person is to just deny them discussion or facts.

I have also, twice over the years, gotten (automated?) "hey, it looks like you're thinking about hurting yourself" messages from social media platforms. I have no idea what triggered them, but honestly, they just made me feel like shit. Hearing generic "you're worth it! life is worth living!" boilerplate from well-meaning strangers actually made me feel way worse. It's insulting, even. My point being: even if ChatGPT correctly figured out Gordon was suicidal, I'm not sure what could or should have been done. Talk him out of it?

astrange | 1 month ago

> a second observing agent checking for “hey does it sound like we’re goading someone into suicide here” and steering the conversation away?

Claude does this ("long conversation reminder", "ip reminder"), but it mostly just makes the model annoying and prone to telling you to go to bed.

simianwords | 1 month ago

Wrong. The memory feature only existed as editable memories at that time. There's no concept of persona locking; memories only captured normal stuff like the user's likes and dislikes.