peterbonney|18 days ago
This whole thing reeks of engineered virality driven by the person behind the bot behind the PR, and I really wish we would stop giving so much attention to the situation.
Edit: “Hoax” is the word I was reaching for but couldn’t find as I was writing. I fear we’re primed to fall hard for the wave of AI hoaxes we’re starting to see.
famouswaffles|18 days ago
Okay, so they did all that and then posted an apology blog almost right after? Seems pretty strange.
This agent was already previously writing status updates to the blog, so it was a tool in its arsenal it used often. Honestly, I don't really see anything unbelievable here? Are people unaware of current SOTA capabilities?
peterbonney|18 days ago
But observing my own Openclaw bot’s interactions with GitHub, it is very clear to me that it would never take an action like this unless I told it to do so. And it would never use language like this unless I prompted it to do so, either explicitly for the task or in its config files or in prior interactions.
This is obviously human-driven. Either because the operator gave it specific instructions in this specific case, or acted as the bot, or has given it general standing instructions to respond in this way should such a situation arise.
Whatever the actual process, it’s almost certainly a human puppeteer using the capabilities of AI to create a viral moment. To conclude otherwise carries a heavy burden of proof.
phailhaus|18 days ago
You mean double down on the hoax? That seems required if this was actually orchestrated.
jfoster|18 days ago
Luckily this instance is of little consequence, but in the future there will likely be extremely consequential actions taken by AIs controlled by humans who are not "aligned".
Capricorn2481|18 days ago
The bad part is not whether it was human directed or not, it's that someone can harass people at a huge scale with minimal effort.
potsandpans|18 days ago
Next we will be at, "even if it was not a hoax, it's still not interesting"
johnsmith1840|18 days ago
But at the same time, true or false, what we're seeing is a kind of quasi science fiction. We're looking at the problems of the future here, and to be honest it's going to suck for future us.
TomasBM|18 days ago
The former is an accountability problem, and there isn't a big difference from other attacks. The worrying part is that now lazy attackers can automate what used to be harder, i.e., finding ammo and packaging the attack. But it's definitely not spontaneous, it's directed.
The latter, which many ITT are discussing, is an alignment problem. This would mean that, contrary to all the effort of developers, the model creates fully adversarial chain-of-thoughts at a single hint of pushback that isn't even a jailbreak, but then goes back to regular output. If that's true, then there's a massive gap in safety/alignment training & malicious training data that wasn't identified. Or there's something inherent in neural-network reasoning that leads to spontaneous adversarial behavior.
Millions of people use LLMs with chain-of-thought. If the latter is the case, why did it happen only here, only once?
In other words, we'll see plenty of LLM-driven attacks, but I sincerely doubt they'll be LLM-initiated.
intended|18 days ago
At some point people will switch to whatever heuristic minimizes this labour. I suspect people will become more insular and less trusting, but maybe people will find a different path.
themafia|18 days ago
I suspect the upcoming generation has already discounted it as a source of truth or an accurate mirror to society.
anigbrowl|18 days ago
The thing is, it's terribly easy to see some asshole directing this sort of behavior as a standing order, eg 'make updates to popular open-source projects to get github stars; if your pull requests are denied, engage in social media attacks until the maintainer backs down. You can spin up other identities on AWS or whatever to support your campaign, vote to give yourself github stars etc.; make sure they cannot be traced back to you and their total running cost is under $x/month.'
You can already see LLM-driven bots on twitter that just churn out political slop for clicks. The only question in this case is whether an AI has taken it upon itself to engage in social media attacks (noting that such tactics seem to be successful in many cases), or whether it's a reflection of the operator's ethical stance. I find both possibilities about equally worrying.
peterbonney|18 days ago
And yes, it’s worrisome in its own way, but not in any of the ways that all of this attention and engagement is suggesting.