valleyer | 28 days ago
At least for Codex, the agent runs commands inside an OS-provided sandbox (Seatbelt on macOS, and equivalent mechanisms on other platforms). It does not end up "making the agent mostly useless".
chr15m|28 days ago
"But that is annoying and will slow me down!" Yes, and so will recovering from disastrous tool calls.
hk__2|28 days ago
mbrock|28 days ago
Even if you get people to sit and press a button every time the agent wants to do anything, you're not getting the actual alertness and rigor that would prevent disasters. You're getting a bored, inattentive person who could be doing something more valuable than micromanaging Claude.
Managing capabilities for agents is an interesting problem. Working on that seems more fun and valuable than sitting around pressing "OK" whenever the clanker wants to take actions that are harmless in a vast majority of cases.
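As a rough illustration of what "managing capabilities" could look like in place of a blanket "OK" prompt, here is a minimal sketch of a three-way policy: auto-allow a small set of commands, always deny another, and ask a human only for the rest. The command sets and the `decide` function are hypothetical, not taken from any real agent, and a list like this would complement an OS sandbox rather than replace one (even `make` can run arbitrary code).

```python
import shlex

# Hypothetical allowlist: commands assumed harmless enough to run unprompted.
AUTO_APPROVE = {"ls", "cat", "grep", "make"}
# Hypothetical denylist: commands the agent is never allowed to run.
ALWAYS_DENY = {"sudo", "dd", "mkfs"}
# Shell metacharacters that could chain or redirect into other commands.
SHELL_METACHARS = set(";|&><`$")


def decide(command: str) -> str:
    """Return "allow", "deny", or "ask" for a proposed command line."""
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: do not guess, ask a human.
        return "ask"
    if not argv:
        return "ask"
    program = argv[0]
    if program in ALWAYS_DENY:
        return "deny"
    if SHELL_METACHARS & set(command):
        # Anything that might smuggle in a second command needs review.
        return "ask"
    if program in AUTO_APPROVE:
        return "allow"
    return "ask"
```

With a policy like this the human only sees the prompts that fall outside both lists, which is the point: fewer, more meaningful interruptions.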
threecheese|28 days ago
theshrike79|27 days ago
Secure? Yes. Annoying? Also yes. Very error-prone too.
0xbadcafebee|28 days ago
beacon294|28 days ago
Sharlin|28 days ago
valleyer|28 days ago
The point is that Codex can (by default) run commands on its own, without approval (e.g., running `make` on the project it's working on), but they're subject to the imposed OS sandbox.
This is controlled by the `--sandbox` and `--ask-for-approval` arguments to `codex`.
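For anyone wiring this into a script, a small sketch of building that command line follows. The two flags are the ones named above; the accepted values listed in the code are my assumption from memory and should be checked against `codex --help` for your installed version. The `codex_argv` helper is hypothetical and only builds the argument list, it does not run anything.

```python
import shlex

# Assumed flag values; verify against `codex --help` before relying on them.
SANDBOX_MODES = {"read-only", "workspace-write", "danger-full-access"}
APPROVAL_POLICIES = {"untrusted", "on-failure", "on-request", "never"}


def codex_argv(sandbox: str, approval: str) -> list[str]:
    """Build (but do not run) a codex invocation with explicit policies."""
    if sandbox not in SANDBOX_MODES:
        raise ValueError(f"unknown sandbox mode: {sandbox!r}")
    if approval not in APPROVAL_POLICIES:
        raise ValueError(f"unknown approval policy: {approval!r}")
    return ["codex", "--sandbox", sandbox, "--ask-for-approval", approval]


# Print the command so it can be reviewed before being executed by hand.
print(shlex.join(codex_argv("workspace-write", "on-request")))
```

Spelling both flags out, rather than leaning on defaults, makes it obvious in shell history and CI logs which policy a given run was under.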
lvl155|28 days ago
embedding-shape|28 days ago
andai|28 days ago
What's the difference between resetting a container or resetting a VPS?
On my local machine I have it under its own user, so I can access its files but it cannot access mine. But I'm not a security expert, so I'd love to hear if that's actually solid.
On my $3 VPS, it has root, because that's the whole point (it's my sysadmin). If it blows it up, I wanna say "I'm down $3", but it doesn't even seem to be that, since I can just restore it from a backup.
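One quick sanity check on the separate-user setup described above is to confirm that your own home directory grants nothing to "other" users, since that is the class the agent's account would normally fall into. This is only a sketch under POSIX assumptions: the `others_can_access` helper is hypothetical, it looks at a single path's mode bits, and it ignores ACLs, shared group membership, and anything the agent can reach via sudo.

```python
import os
import stat


def others_can_access(path: str) -> bool:
    """True if the "other" permission class has any r/w/x bit set on path."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IRWXO)


if __name__ == "__main__":
    home = os.path.expanduser("~")
    if others_can_access(home):
        print(f"{home} is reachable by other users; consider chmod 700")
    else:
        print(f"{home} grants no access to other users")
```

A home directory with mode 0755, which is the default on many distributions, would fail this check, meaning a separate agent user could still list and read files there.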
xXSLAYERXx|28 days ago
maleldil|28 days ago