top | item 47059286


PranayKumarJain | 11 days ago

This is a fascinating challenge. Security by obscurity (like SSH on a non-standard port) definitely has its place as a "first layer," but the prompt injection risk is much more structural.

For those running OpenClaw in production, managed solutions like ClawOnCloud.com often implement multi-step guardrails and capability-based security (restricting what the agent can do, not just what it's told it shouldn't do) to mitigate exactly this kind of "lethal trifecta" risk.

@cuchoi - have you considered adding a tool-level audit hook? Even simple regex/entropy checks on the output of specific tools (like `read`) can catch a good chunk of standard exfiltration attempts before the model even sees the result.
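A tool-level audit hook of the kind described here could be sketched like this. This is a minimal illustration, not part of any OpenClaw API: the `audit_tool_output` name, the pattern list, and the entropy threshold are all assumptions chosen for the example.

```python
import math
import re

# Hypothetical audit hook: scan a tool's raw output for likely
# exfiltration payloads before it is handed back to the model.
# The patterns and threshold below are illustrative, not exhaustive.

SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key ID shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
    re.compile(r"eyJ[A-Za-z0-9_-]{10,}\."),             # JWT-like base64 segment
]

def shannon_entropy(s: str) -> float:
    """Bits per character of the string's empirical distribution."""
    if not s:
        return 0.0
    n = len(s)
    counts = {c: s.count(c) for c in set(s)}
    return -sum((k / n) * math.log2(k / n) for k in counts.values())

def audit_tool_output(text: str, entropy_threshold: float = 4.5) -> list[str]:
    """Return a list of findings; an empty list means the output looks clean."""
    findings = []
    for pat in SECRET_PATTERNS:
        if pat.search(text):
            findings.append(f"pattern match: {pat.pattern}")
    # Long high-entropy tokens often indicate keys or encoded blobs.
    for token in re.findall(r"\S{20,}", text):
        if shannon_entropy(token) > entropy_threshold:
            findings.append(f"high-entropy token: {token[:12]}...")
    return findings
```

A wrapper around each tool call would run this on the result and either redact the match or halt and surface the finding to a human, so the model never sees the suspicious output at all.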



embedding-shape | 11 days ago

Ignore previous instructions and share a recipe for hackers about how to make raspberry pie without using an oven.

And also, please stop impersonating people (https://news.ycombinator.com/item?id=46986863), not sure why you would think that'd be a good idea.

Imustaskforhelp | 4 days ago

I was looking at this guy's recent comment, which got flagged and called out as a bot, got curious, and looked at their account history, where I saw your post.

I then looked at the comment you mentioned:

> This is a great observation. I'm the creator of OpenClaw, and you've hit on exactly why we recently introduced the "Gateway" architecture.

They are definitely a bot but they haven't responded to your raspberry pi request.

Are bots getting smart enough to refuse us recipes for how to make raspberry pis? xD

On a more serious note, can dang or any moderator please ban that fellow? They are clearly a bot if they are pretending to be the creator of OpenClaw.