top | item 46299853

(no title)

MrAlex94 | 2 months ago

I disagree that it’s fear mongering. Have we not had numerous articles on HN about data exfiltration in recent memory? Why would an LLM that is in the drivers seat of a browser (not talking about current feature status in Firefox wrt to sanitised data being interacted with) not have the same pitfalls?

Seems as if we’d be 3 for 3 in the “agents rule of 2” in the context of the web and a browser?

> [A] An agent can process untrustworthy inputs

> [B] An agent can have access to sensitive systems or private data

> [C] An agent can change state or communicate externally

https://simonwillison.net/2025/Nov/2/new-prompt-injection-pa...

Even if we weren’t talking about such malicious hypotheticals, hallucinations are a common occurrence as are CLI agents doing things it thinks best, sometimes to the detriment of the data it interacts with. I personally wouldn’t want my history being modified or deleted, same goes with passwords and the like.

It is a bit doomerist, I doubt it’ll have such broad permissions but it just doesn’t sit well which I suppose is the spirit of the article and the stance Waterfox takes.

discuss

dkdcio|2 months ago

> Have we not had numerous articles on HN about data exfiltration in recent memory?

there’s also an article on the front page of HN right now claiming LLMs are black boxes and we don’t know how they work, which is plainly false. this point is hardly evidence of anything and equivalent to “people are saying”

FeepingCreature|2 months ago

This is true though. While we know what they do on a mechanistic level, we cannot reliably analyze why the model outputs any particular answer in functional terms without a heroic effort at the "arxiv paper" level.

yunohn|2 months ago

I believe you are conflating multiple concepts to prove a flaky point.

Again, unless your agent has access to a function that exfiltrates data, it is impossible for it to do so. Literally!

You do not need to provide any tools to an LLM that summarizes or translates websites, manages your open tabs, etc. This can be done fully locally in a sandbox.

Linking to simonw does not make your argument valid. He makes some great points, but he does not assert what you are claiming at any point.

Please stop with this unnecessary fear mongering and make a better argument.

nazgul17|2 months ago

Thinking aloud, but couldn't someone create a website with some malicious text that, when quoted in a prompt, convinces the LLM to expose certain private data to the web page, and couldn't the webpage send that data to a third party, without the need for the LLM to do so?

This is probably possible to mitigate, but I fear what people more creative, motivated and technically adept could come up with.