top | item 45030868

biggestfan | 6 months ago

According to their own blog post, even after mitigations, the model still has an 11% attack success rate. There's still no way I would feel comfortable giving this access to my main browser. I'm glad they're sticking to a very limited rollout for now. (Sidenote, why is this page so broken? Almost everything is hidden.)

mkozlows|6 months ago

The strong sense I got from reading this is that they don't believe it's possible to safely do this sort of thing right now, and they want to warn people away from Perplexity etc. so they can avoid losing market share while also not launching a not-yet-ready product.

(The more interesting question will be whether they have any means to eventually make it safe. I'm pretty skeptical about it in the near term.)

AdieuToLogic|6 months ago

> The strong sense I got from reading this is that they don't believe it's possible to safely do this sort of thing right now, and they want to warn people away ...

This is directly contradicted by one of the first sentences in the article:

  We've spent recent months connecting Claude to your 
  calendar, documents, and many other pieces of software. The 
  next logical step is letting Claude work directly in your 
  browser.

Ascribing altruism to the quoted intent is dissembling at best.

Szpadel|6 months ago

Well, at least they are honest about it and don't try to hide it in any way. They probably want to gather more real-world data for training and validation, hence the limited release. OpenAI has had a browser agent for some time already, but I haven't heard about any security considerations from them. I bet they have the same issues.

latexr|6 months ago

> at least they are honest about it and don't try to hide it in any way.

Seems more likely they’re trying to cover their own ass, so when anything inevitably goes wrong they can point and say “see, we told you it was dangerous, not our fault”.

pharrington|6 months ago

Honesty would be Anthropic paying the 1000 alpha testers a fair wage for their very dangerous QA work.

aquova|6 months ago

I'm honestly dumbfounded this made it off the cutting room floor. A 1 in 9 chance for a given attack to succeed? And that's just the tests they came up with! You couldn't pay me to use it, which is good, because I doubt my account would keep that money in it for long.

rvz|6 months ago

> According to their own blog post, even after mitigations, the model still has an 11% attack success rate.

That is really bad. If that's the rate even after all those mitigations, imagine how the other AI browsers fare at their worst. Perplexity's Comet showed how a simple summarization can lead to your account being hijacked.

> (Sidenote, why is this page so broken? Almost everything is hidden.)

They vibe-coded the site with Claude and didn't test it before deploying. That is quite a botched, amateur launch for Anthropic engineers.

mark242|6 months ago

11% success rate for what is effectively a spear-phishing attempt isn't that terrible and tbh it'll be easier to train Claude not to get tricked than it is to train eg my parents.

zaphirplane|6 months ago

What! 1 in 10 successfully phished is OK? That's 1 in 10 page views. That has to approach a 100% success rate over a week, or say a month, of browsing the web, with targeted ads and/or link farms to get the page click.
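
For a rough sense of how a per-attempt rate compounds, here is a minimal sketch. It assumes every attempt is independent and succeeds at the 11% figure quoted upthread. Neither assumption comes from the blog post, which reports a rate on its own test set rather than per page view, and the function name is made up for illustration.

```python
def cumulative_compromise_probability(per_attempt_rate: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent attacks succeeds.

    Hypothetical helper: treats each attempt as an independent trial with the
    same success rate, which is a simplifying assumption, not a measured fact.
    """
    if not 0.0 <= per_attempt_rate <= 1.0:
        raise ValueError("per_attempt_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # P(at least one success) = 1 - P(every attempt fails)
    return 1.0 - (1.0 - per_attempt_rate) ** attempts


if __name__ == "__main__":
    # 0.11 is the post-mitigation rate quoted in the thread.
    for n in (1, 10, 50, 100):
        p = cumulative_compromise_probability(0.11, n)
        print(f"{n:>3} attempts -> {p:.1%} chance of at least one success")
```

Under those assumptions, ten attempts already give roughly a 69% chance of at least one success, and fifty push it past 99%. If real attempts are rarer than one per page view, or are correlated, the curve flattens accordingly.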

whatevertrevor|6 months ago

The kind of attack vector is irrelevant here, what's important is the attack surface. Not to mention this is a tool facilitating the attack, with little to no direct interaction with the user in some cases. Just because spear-phishing is old and boring doesn't mean it cannot have real consequences.

(Even if we agree with the premise that this is just "spear-phishing", which is honestly a semantic argument that is irrelevant to the more pertinent question of how important it is to prevent this attack vector.)

asdff|6 months ago

>Claude not to get tricked than it is to train eg my parents.

One would think so, but apparently, from this blog post, it is still susceptible to the same old prompt injections that have always been around. So I'm thinking it is not very easy to train Claude like this at all. Meanwhile, with parents you could probably eliminate an entire security vector outright if you merely told them "bank at the local branch," or "call the number on the card for the bank; don't try to look it up."

lelanthran|6 months ago

With spear phishing there are a limited number of attack attempts, maybe one a day, and the target will wise up.

With this you can probably try a few thousand attempts per minute.