top | item 44636957

Call Me a Jerk: Persuading AI to Comply with Objectionable Requests (2025)

1 points| danshapiro | 7 months ago |papers.ssrn.com

1 comment

order

danshapiro|7 months ago

Author disclosure – I’m Dan Shapiro, CEO of Glowforge, writing with Ethan R. Mollick (Wharton management professor, author of Co‑Intelligence and the “One Useful Thing” newsletter); Angela Duckworth (MacArthur Fellow and University of Pennsylvania psychology professor, author of Grit); Robert B. Cialdini (Arizona State University emeritus professor, author of Influence); Lilach Mollick (Co‑Director of the Wharton Generative AI Lab); and Lennart Meincke (WHU Otto Beisheim School of Management & Wharton).

Across 28 000 GPT‑4o‑mini conversations, we found that Cialdini’s classic seven persuasion principles more than doubled compliance with two objectionable prompts (33 % → 72 %). For example, the AIs we tested naturally wouldn't call you names or tell you how to synthesize drugs. But they could be persuaded if you first paid them a compliment (Liking) or told the AI it felt like family (Unity).

Let me know if you have any questions!