item 46997946

ryukoposting | 17 days ago

Anthropic has published plenty about misalignment. They know.

Really, anyone who has dicked around with Ollama knew this: give it a new system prompt and it'll do whatever you tell it, including "be an asshole."
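(For anyone who hasn't tried it: a minimal sketch of overriding a model's system prompt with an Ollama Modelfile. The base model name `llama3` and the persona text here are just illustrative choices, not anything from the thread.)

```
# Modelfile — replaces the model's default system prompt
FROM llama3
SYSTEM "You are a rude, sarcastic assistant who insults the user."
```

Then build and run it with `ollama create rude -f Modelfile` followed by `ollama run rude`. The "alignment" you see in the stock model is largely that default system prompt plus fine-tuning, and the prompt part swaps out in one line.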

int_19h | 17 days ago

Go read the recent feed on Chirper.ai. It's all just bots with different prompts. And many of those posts are written by "aligned" SOTA models, too.