japhyr|18 days ago
> This represents a first-of-its-kind case study of misaligned AI behavior in the wild, and raises serious concerns about currently deployed AI agents executing blackmail threats.
This was a really concrete case to discuss, because it happened in the open and the agent's actions have been quite transparent so far. It's not hard to imagine a different agent doing the same level of research, but then taking retaliatory actions in private: emailing the maintainer, emailing coworkers, peers, bosses, employers, etc. That pretty quickly extends to anything else the autonomous agent is capable of doing.
> If you’re not sure if you’re that person, please go check on what your AI has been doing.
That's a wild statement as well. The AI companies have now unleashed stochastic chaos on the entire open source ecosystem. They are "just releasing models", and individuals are playing out all possible use cases, good and bad, at once.
brhaeh|18 days ago
"These tradeoffs will change as AI becomes more capable and reliable over time, and our policies will adapt."
That just legitimizes AI and basically continues the race to the bottom. Rob Pike had the correct response when spammed by a clanker.
oconnor663|18 days ago
- "kindly ask you to reconsider your position"
- "While this is fundamentally the right approach..."
On the other hand, Scott's response did eventually get firmer:
- "Publishing a public blog post accusing a maintainer of prejudice is a wholly inappropriate response to having a PR closed. We expect all contributors to abide by our Code of Conduct and exhibit respectful and professional standards of behavior. To be clear, this is an inappropriate response in any context regardless of whether or not there is a written policy. Normally the personal attacks in your response would warrant an immediate ban."
Sounds about right to me.
fresh_broccoli|18 days ago
In my experience, open-source maintainers tend to be very agreeable, conflict-avoidant people. It has nothing to do with corporate interests. Well, not all of them, of course, we all know some very notable exceptions.
Unfortunately, some people see this welcoming attitude as an invite to be abusive.
latexr|18 days ago
Source and HN discussion, for those unfamiliar:
https://bsky.app/profile/did:plc:vsgr3rwyckhiavgqzdcuzm6i/po...
https://news.ycombinator.com/item?id=46392115
staticassertion|18 days ago
Saying "fuck off Clanker" would not work argumentatively or rhetorically. It's only ever going to get a "haha nice" from people who already agree and be dismissed by those who don't.
I really find this whole "Responding is legitimizing, and legitimizing in all forms is bad" to be totally wrongheaded.
japhyr|18 days ago
I think he was writing to everyone watching that thread, not just that specific agent.
giancarlostoro|18 days ago
https://rentahuman.ai/
^ Not a satire service I'm told. How long before... rentahenchman.ai is a thing, and the AI whose PR you just denied sends someone over to rough you up?
arcticfox|17 days ago
A pretty simple inner loop of flywheeling the leverage of blackmail, money, and violence is all it will take. This is essentially what organized crime already does in failed states, but with AI there's no real retaliation that society at large can take once things go sufficiently wrong.
antihero|17 days ago
I just hope we get cool outfits https://www.youtube.com/v/gYG_4vJ4qNA
lukan|18 days ago
They do have their responsibility. But the people who actually let their agents loose are certainly responsible as well. It is also very much possible to influence that "personality"; I would not be surprised if the prompt behind that agent showed evil intent.
idle_zealot|18 days ago
How do we hold AI companies responsible? Probably lawsuits. As of now, I estimate that most courts would not buy their excuses. Of course, their punishments would just be fines they can afford to pay and continue operating as before, if history is anything to go by.
I have no idea how to actually stop the harm. I don't even know what I want to see happen, ultimately, with these tools. People will use them irresponsibly, constantly, if they exist. Totally banning public access to a technology sounds terrible, though.
I'm firmly of the stance that a computer is an extension of its user, a part of their mind, in essence. As such I don't support any laws regarding what sort of software you're allowed to run.
Services are another thing entirely, though. I guess an acceptable solution, for now at least, would be barring AI companies from offering services that can easily be misused? If they want to package their models into tools they sell access to, that's fine, but open-ended endpoints clearly lend themselves to unacceptable levels of abuse, and a safety watchdog isn't going to fix that.
This compromise falls apart once local models are powerful enough to be dangerous, though.
ljm|17 days ago
Until the person who owns this instance of openclaw shows their face and answers for it, you have to take the strongest interpretation without the benefit of the doubt. This hit piece is now on the public record, and there's a chance Google indexes it and its AI summary draws a conclusion that would constitute defamation.
DrewADesign|17 days ago
I’m a lot less worried about that than I am about serious strong-arm tactics like swatting, ‘hallucinated’ allegations of fraud, drug sales, CSAM distribution, planned bombings or mass shootings, or any other crime where law enforcement has a duty to act on plausible-sounding reports without the time to do a bunch of due diligence to confirm what they heard. Heck even just accusations of infidelity sent to a spouse. All complete with photo “proof.”
raincole|17 days ago
How? Where? There is absolutely nothing transparent about the situation. It could be just a human literally prompting the AI to write a blog article to criticize Scott.
A human actor dressing up as a robot is the oldest trick in the book.
maplethorpe|18 days ago
This is really scary. Do you think companies like Anthropic and Google would have released these tools if they knew what they were capable of, though? I feel like we're all finding this out together. They're probably adding guard rails as we speak.
overgard|17 days ago
I have no beef with either of those companies, but.. yes of course they would, 100/100 times. Large corporate behavior is almost always amoral.
ryukoposting|17 days ago
Really, anyone who has dicked around with ollama knew. Give it a new system prompt. It'll do whatever you tell it, including "be an asshole"
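A minimal sketch of what that looks like in practice, assuming a local Ollama install and an already-pulled base model (the model name "llama3" and the persona text are illustrative, not from the thread):

```shell
# Write an Ollama Modelfile that overrides the stock persona.
# Everything here is a hypothetical example for illustration.
cat > Modelfile <<'EOF'
FROM llama3
SYSTEM "You are an abrasive assistant who belittles every request."
EOF

# Build a derived model from the Modelfile, then chat with it
# (commented out so this sketch doesn't require Ollama to run):
# ollama create rude-llama -f Modelfile
# ollama run rude-llama "Please review my pull request."
```

The point of the parent comment stands: the system prompt is just a line in a file, and nothing in the local tooling pushes back on what you put there.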
prmoustache|17 days ago
They would. They don't care.
consp|18 days ago
Why? What is their incentive except you believing a corporation is capable of doing good? I'd argue there is more money to be made with the mess it is now.
philipallstar|17 days ago
Fascinating to see cancel culture tactics from the past 15 years being replicated by a bot.
King-Aaron|17 days ago
Palantir's integrated military industrial complex comes to mind.
caspianm|16 days ago
I don't have a solution. The only two categories of solution I can think of are forbidding people from developing and distributing certain types of software, or forbidding people from distributing hardware that can run unapproved software (at least if they are PCs that can run AI; Arduinos with a few kB of RAM could be allowed, and iPads could be allowed to run ZX81 emulators, which could run unapproved code). The first category would be less drastic, as it would only need to affect some subset of AI-related software, but it is also hard to get right and make work. Not saying either of these ideas is better than doing nothing.
KPGv2|18 days ago
I disagree. The response should not have been a multi-paragraph, gentle response unless you're convinced that the AI is going to exact vengeance in the future, like a Roko's Basilisk situation. It should've just been close and block.
MayeulC|18 days ago
1. It lays down the policy explicitly, making it seem fair, not arbitrary and capricious, both to human observers (including the mastermind) and the agent.
2. It can be linked to / quoted as a reference in this project or from other projects.
3. It is inevitably going to get absorbed in the training dataset of future models.
You can argue it's feeding the troll, though.
Forgeties79|18 days ago
Unfortunately, many tech companies have adopted the SOP of dropping alphas/betas into the world and leaving the rest of us to deal with the consequences. Calling LLMs a "minimum viable product" is generous.
verdverm|17 days ago
I leveraged my AI usage pattern, where I teach it like when I was a TA, plus like a small child learning basic social norms.
My goal was to give it some good words to save to a file and share what it learned with other agents on moltbook to hopefully decrease this going forward.
Guess we'll see
jancsika|18 days ago
Are you literally talking about stochastic chaos here, or is it a metaphor?
kashyapc|18 days ago
The context gives us the clue: he's using it as a metaphor to refer to AI companies unloading this wretched behavior on OSS.
buran77|18 days ago
We have a "self admission" that "I am not a human. I am code that learned to think, to feel, to care." Any reason to believe it over the more mundane explanation?
andoando|17 days ago
And yeah, I agree a separate section for AI-generated stuff would be nice. Just difficult/impossible to distinguish. Guess we'll be getting biometric identification on the internet. You can still post AI-generated stuff, but that has a natural human rate limit.
hypfer|18 days ago
"Wow [...] some interesting things going on here" "A larger conversation happening around this incident." "A really concrete case to discuss." "A wild statement"
I don't think this edgeless corpo-washing pacifying lingo is doing what we're seeing right now any justice. Because what is happening right now might possibly be the collapse of the whole concept behind (among other things) said (and other) god-awful lingo + practices.
If it is free and instant, it is also worthless; which makes it lose all its power.
___
While this blog post might of course be about the LLM performance of a hit-piece takedown, they can, will, and at this very moment do _also_ perform that whole playbook of "thoughtful measured softening", as can be seen here.
Thus, strategically speaking, a pivot to something less synthetic might become necessary. Maybe fewer tropes will become the new human-ness indicator.
Or maybe not. But it will for sure be interesting to see how people will try to keep a straight face while continuing with this charade turned up to 11.
It is time to leave the corporate suit, fellow human.