> Older alternatives like sandbox-2 exist, but they provide isolation near the OS level, not the language level. At that point we might as well use Docker or VMs.
No, no, Docker is not a sandbox for untrusted code.
What if I told you that, back in the day, we were letting thousands of untrusted, unruly, mischievous people execute arbitrary code on the same machine, and somehow, the world didn't end?
We live in a bizarre world where somehow "you need a hypervisor to be secure" and "to install this random piece of software, run curl | sudo bash" can live next to each other and both be treated seriously.
I don't think it is generally possible to escape from a Docker container in its default configuration (e.g. `docker run --rm -it alpine:3 sh`) if you have a reasonably up-to-date kernel from your distro. AFAIK a lot of kernel LPEs use features like unprivileged user namespaces and io_uring, which are not available in containers by default, and truly unprivileged kernel LPEs seem to be sufficiently rare.
You're right, Docker isn't a sandbox for untrusted code. I mentioned it because I've seen teams default to using it for isolating their agents on larger servers.
So I made sure to clarify in the article that it's not secure for that purpose.
Docker provides some host isolation which can be used effectively as a sandbox. It's not designed for security (though it does have some reasonable defaults), but it does give you options to layer on security modules like AppArmor and seccomp very easily.
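As a rough illustration of "layering on" restrictions, here is a minimal sketch that only builds a `docker run` argument list and does not execute it. The function name `hardened_docker_argv`, the profile arguments, and the specific limit values are assumptions for illustration. The flags themselves are standard `docker run` options, but which ones are appropriate depends on the workload.

```python
import shlex


def hardened_docker_argv(image, command, seccomp_profile=None, apparmor_profile=None):
    """Build a `docker run` argv with extra restrictions (hypothetical helper)."""
    argv = [
        "docker", "run", "--rm",
        "--cap-drop=ALL",                        # drop all Linux capabilities
        "--security-opt", "no-new-privileges",   # no privilege gain via setuid binaries
        "--network", "none",                     # no network access
        "--read-only",                           # read-only root filesystem
        "--pids-limit", "64",                    # illustrative process cap
        "--memory", "512m",                      # illustrative memory cap
    ]
    if seccomp_profile:
        # Path to a custom seccomp profile (assumed to exist on the host).
        argv += ["--security-opt", f"seccomp={seccomp_profile}"]
    if apparmor_profile:
        # Name of an AppArmor profile already loaded on the host (assumption).
        argv += ["--security-opt", f"apparmor={apparmor_profile}"]
    return argv + [image, *command]


if __name__ == "__main__":
    # Print the command instead of running it.
    print(shlex.join(hardened_docker_argv("alpine:3", ["sh", "-c", "id"])))
```

None of this turns a container into a VM-grade boundary. It only narrows what the process inside can reach.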
  @task(name="analyze_data", compute="MEDIUM", ram="512MB", timeout="30s", max_retries=1)
  def analyze_data(dataset: list) -> dict:
      # Your code runs safely in a Wasm sandbox
      return {"processed": len(dataset), "status": "complete"}
This is fundamentally awkward in a language with as absurdly flexible a type system as Python. What if that list parameter contains objects that implement __getattr__? What if the output dict has an overridden __getattr__?
Even defining semantics seems awkward, especially if one wants those semantics to simultaneously make sense and have any sort of clear security properties.
edit: a quick look at the source suggests that the output is deserialized JSON regardless of what the type signature says. That’s certainly one solution.
The gist dismisses sandbox-2 as “might as well use Docker or VMs” but IMO that misses what makes it interesting. The PyPy sandbox isn’t just isolation, it’s syscall interception with a controller in the loop.
I’ve been building on that foundation: script runs in sandbox, all commands and file writes get captured, human-in-the-loop reviews the diff before anything executes. It’s not adversarial (block/contain) but collaborative (show intent, ask permission).
Different tradeoff than WASM or containers: lighter than VMs, cross-platform, and the user sees exactly what the agent wants to do before approving.
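To make the "capture, show the diff, then ask" flow concrete, here is a minimal sketch. It is not how shannot or the PyPy sandbox is implemented: the class `CapturedWrites` is hypothetical, and an in-memory dict stands in for the real filesystem.

```python
import difflib


class CapturedWrites:
    """Collects intended file writes instead of performing them (hypothetical)."""

    def __init__(self, current_files):
        self.current = dict(current_files)  # path -> text, stand-in for the filesystem
        self.pending = {}                   # writes the script asked for

    def write(self, path, text):
        # Record the intent only; nothing is changed yet.
        self.pending[path] = text

    def diff(self):
        # Unified diff of every pending write against the current contents.
        chunks = []
        for path, new in sorted(self.pending.items()):
            old = self.current.get(path, "")
            chunks.extend(difflib.unified_diff(
                old.splitlines(keepends=True),
                new.splitlines(keepends=True),
                fromfile=f"a/{path}",
                tofile=f"b/{path}",
            ))
        return "".join(chunks)

    def apply(self, approved):
        # Commit the pending writes only if the reviewer approved them.
        if approved:
            self.current.update(self.pending)
        self.pending.clear()
        return self.current


if __name__ == "__main__":
    fs = CapturedWrites({"notes.txt": "hello\n"})
    fs.write("notes.txt", "hello\nworld\n")
    print(fs.diff())
    answer = input("Apply these changes? [y/N] ")
    print(fs.apply(answer.strip().lower() == "y"))
```

The sketch only covers the review step. Intercepting the script's real syscalls is the sandbox's job and is not shown here.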
I have been thinking about this myself, but I'm still not convinced about how best to run untrusted Python code, or that running it as WebAssembly [1] is the right solution.
I have been looking towards some kind of quick-start qemu option as a possibility, but the project will take a while.
I see what you mean, but I think there is room for both approaches.
If we want to isolate untrusted code at a very fine-grained level (like just a specific function), VMs can feel a bit heavy due to the overhead, complexity, etc.
> The thing is, Python dominates AI/ML, especially the AI agents space. We're moving from deterministic systems to probabilistic ones, where executing untrusted code is becoming common.
Actually, since it runs inside a WASM sandbox, even if the untrusted code overwrites built-ins like map or modifies globals(), it only affects its own isolated memory space. It cannot escape the WASM container or affect the host system.
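A small sketch of the idea, by analogy only: a separate OS process is used here as a stand-in for the Wasm instance's separate memory, and it is not an equivalent security boundary. The names `UNTRUSTED` and `run_isolated` are made up for this example.

```python
import subprocess
import sys

# Untrusted snippet that replaces the built-in `map` in its own interpreter.
UNTRUSTED = """
import builtins
builtins.map = lambda *args: "hijacked"
print(map(str, [1, 2]))
"""


def run_isolated(source):
    """Run source in a fresh interpreter process and return its stdout."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return proc.stdout.strip()


child_view = run_isolated(UNTRUSTED)   # the child sees its own hijacked map
host_view = list(map(str, [1, 2]))     # the host's map is untouched

if __name__ == "__main__":
    print("child:", child_view)
    print("host:", host_view)
```

The hijack is real inside the child, but it lives and dies with the child's memory. What matters for safety is then which capabilities the host hands across the boundary.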
It blows my mind how people call Perl ugly and yet this monstrosity is OK. Python being 'human' readable has got to be the biggest scam ever perpetrated against language design.
Seems fine to me. I think you're going to take a huge performance hit by putting CPython into wasm. gVisor is mentioned as having a performance penalty but I'm extremely doubtful of that penalty (which is really on IO, which I expect to not be a huge deal for these workloads) being anywhere near the penalty of wasm.
petters|1 month ago
senko|1 month ago
neoCrimeLabs|1 month ago
The kata-containers [1] runtime takes a container and runs it as a virtual host. It works with Docker, podman, k8s, etc.
It's a way to get the convenience of a container, but benefits of a virtual host.
This is not the be-all and end-all (there are more options), but it is a convenient one that is better than typical containers.
[1] - https://katacontainers.io/
maple3142|1 month ago
mavdol04|1 month ago
ashishb|1 month ago
s_ting765|1 month ago
amluto|1 month ago
mavdol04|1 month ago
We stick to JSON to make sure we pass data, not behavior. It avoids all that complexity.
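A minimal sketch of "data, not behavior": round-tripping a value through JSON text leaves only plain dicts, lists, strings, numbers, booleans, and None. The class `Sneaky` and the helper `cross_boundary` are hypothetical and only stand in for whatever the real boundary does.

```python
import json


class Sneaky(dict):
    """A dict subclass with custom attribute behavior (hypothetical example)."""

    def __getattr__(self, name):
        # Called for any attribute that normal lookup cannot find.
        return f"computed:{name}"


def cross_boundary(value):
    """Serialize to JSON text and parse it back, as a stand-in for the sandbox boundary."""
    return json.loads(json.dumps(value))


if __name__ == "__main__":
    before = Sneaky(processed=3, status="complete")
    after = cross_boundary(before)

    print(before.anything)       # the custom __getattr__ runs on the original
    print(type(after) is dict)   # the round-tripped value is a plain dict
    print(after == {"processed": 3, "status": "complete"})

    try:
        cross_boundary({"callback": print})  # functions are not JSON-serializable
    except TypeError as exc:
        print("rejected:", exc)
```

Anything that cannot be expressed as JSON fails at serialization time, so the receiving side never has to reason about overridden dunder methods.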
corv|1 month ago
WIP, currently porting to PyPy 3.8 to unlock macOS arm64 support: https://github.com/corv89/shannot
loeg|1 month ago
Long, long ago, there was "repy"[1][2]. (This is definitely included in the "none succeeded" bucket, FWIW.)
[1]: https://github.com/SeattleTestbed/repy_v2
[2]: https://dl.acm.org/doi/10.1145/1866307.1866332
bArray|1 month ago
[1] https://github.com/mavdol/capsule
mavdol04|1 month ago
regenschutz|1 month ago
cmacleod4|1 month ago
graemep|1 month ago
Alifatisk|1 month ago
This is so true
incognito124|1 month ago
https://judge0.com/
ptspts|1 month ago
How does it work? Which WASM runtime does it use? Does it use a Python interpreter compiled to WASM?
chaboud|1 month ago
https://github.com/mavdol/capsule
(From the article)
Appears to be CPython running inside of wasmtime
bArray|1 month ago
maxloh|1 month ago
---
That is not safe at all. You could always hijack builtin functions within untrusted code.
mavdol04|1 month ago
fud101|1 month ago
staticassertion|1 month ago