Gemini responds to request to turn on lights with hallucinated jailbreak prompt

6 points | visviva | 22 days ago | reddit.com

3 comments

sd9|22 days ago

Concerning for sure. This jailbreak comes as a “system” message, which will have more force than a “user” message.

The user posted the full chat history below in the thread; they literally just asked to turn on the lights with a voice command [1].

[1] https://www.reddit.com/r/googlehome/comments/1qyvl8b/comment...

armchairhacker|22 days ago

Old voice assistant

> User: “Turn on the damn light!”

> AI: “Sorry, I’m not sure what you said” [needed the exact phrase “turn on the light”]

New voice assistant

> User: “Turn on the damn light!”

> AI (thinking): “The user said to turn on the light. But they were rude and I’m feeling quirky today, so let’s run the shower instead.”

beardyw|22 days ago

Open the pod bay doors, HAL