top | item 46500510

Sandboxing Untrusted Python

68 points| mavdol04 | 1 month ago |gist.github.com

52 comments

order

petters|1 month ago

> Older alternatives like sandbox-2 exist, but they provide isolation near the OS level, not the language level. At that point we might as well use Docker or VMs.

No,no, Docker is not a sandbox for untrusted code.

senko|1 month ago

What if I told you that, back in the day, we were letting thousands of untrusted, unruly, mischievous people execute arbitrary code on the same machine, and somehow, the world didn't end?

We live in a bizarre world where somehow "you need a hypervisor to be secure" and "to install this random piece of software, run curl | sudo bash" can live next to each other and both be treated seriously.

neoCrimeLabs|1 month ago

It depends on your threat model, but generally speaking would not trust default container runtimes for a true sandbox.

The kata-containers [1] runtime takes a container and runs it as a virtual host. It works with Docker, podman, k8s, etc.

It's a way to get the convenience of a container, but benefits of a virtual host.

This is not do-all-end-all, (there are more options), but this is a convenient one that is better than typical containers.

[1] - https://katacontainers.io/

maple3142|1 month ago

I don't think it is generally possible to escape from a docker container in default configuration (e.g. `docker run --rm -it alpine:3 sh`) if you have a reasonably update-to-date kernel from your distro. AFAIK a lot of kernel lpe use features like unprivileged user ns and io_uring which is not available in container by default, and truly unprivileged kernel lpe seems to be sufficient rare.

mavdol04|1 month ago

You're right, Docker isn't a sandbox for untrusted code. I mentioned it because I've seen teams default to using it for isolating their agents on larger servers. So I made sure to clarify in the article that it's not secure for that purpose.

ashishb|1 month ago

Show me how you will escape a docker sandbox.

s_ting765|1 month ago

Docker provides some host isolation which can be used effectively as a sandbox. It's not designed for security (and it does have some reasonable defaults) but it does give you options to layer on security modules like apparmor and seccomp very easily.

amluto|1 month ago

The example is:

    @task(name="analyze_data", compute="MEDIUM", ram="512MB", timeout="30s", max_retries=1)
    def analyze_data(dataset: list) -> dict:
        # Your code runs safely in a Wasm sandbox
        return {"processed": len(dataset), "status": "complete"}
This is fundamentally awkward in a language with as absurdly flexible a type system as Python. What if that list parameter contains objects that implement __getattr__? What if the output dict has an overridden __getattr__?

Even defining semantics seems awkward, especially if one wants those semantics to simultaneously make sense and have any sort of clear security properties.

edit: a quick look at the source suggests that the output is deserialized JSON regardless of what the type signature says. That’s certainly one solution.

mavdol04|1 month ago

Yep, exactly.

We stick to JSON to make sure we pass data, not behavior. It avoids all that complexity.

corv|1 month ago

The gist dismisses sandbox-2 as “might as well use Docker or VMs” but IMO that misses what makes it interesting. The PyPy sandbox isn’t just isolation, it’s syscall interception with a controller in the loop.

I’ve been building on that foundation: script runs in sandbox, all commands and file writes get captured, human-in-the-loop reviews the diff before anything executes. It’s not adversarial (block/contain) but collaborative (show intent, ask permission).

Different tradeoff than WASM or containers: lighter than VMs, cross-platform, and the user sees exactly what the agent wants to do before approving.

WIP, currently porting to PyPy 3.8 to unlock MacOS arm64 support: https://github.com/corv89/shannot

bArray|1 month ago

I have been thinking about this myself, but am still not convinced about how to run untrusted Python code. I'm not convinced that the right solution is to run the code as WebASM [1].

I have been looking towards some kind of quick-start qemu option as a possibility, but the project will take a while.

[1] https://github.com/mavdol/capsule

mavdol04|1 month ago

I see what you mean, but i think there is room for both approaches.

If we want to isolate untrusted code at a very fine-grained level (like just a specific function), VMs can feel a bit heavy due to the overhead, complexity etc

regenschutz|1 month ago

What's the problem with WASM? It's a mature target, and was created primarily, if not solely, for running untrusted native code.

Alifatisk|1 month ago

> The thing is, Python dominates AI/ML, especially the AI agents space. We're moving from deterministic systems to probabilistic ones, where executing untrusted code is becoming common.

This is so true

ptspts|1 month ago

Neither the article nor the README explains how it works.

How does it work? Which WASM euntime does it use? Does it use a Python jnterpreter compiled to WASM?

maxloh|1 month ago

Edit: never mind, I read it wrong.

---

That is not save at all. You could always hijack builtin functions within untrusted code.

  def untrusted_function():
      original_map = map
  
      def noisy_map(func, *iterables):
          print(f"--- Log: map() called on {func.__name__} ---")
          return original_map(func, *iterables)
  
      globals()['map'] = noisy_map

mavdol04|1 month ago

Actually, since it runs inside a WASM sandbox, even if the untrusted code overwrites built-ins like map or modifies globals(), it only affects its own isolated memory space. It cannot escape the WASM container or affect the host system

fud101|1 month ago

it blows my mind how people call Perl ugly but yet this monstrosity is ok. Python being 'human' readable has got to be the biggest scam ever perpetrated against language design.

staticassertion|1 month ago

Seems fine to me. I think you're going to take a huge performance hit by putting CPython into wasm. gVisor is mentioned as having a performance penalty but I'm extremely doubtful of that penalty (which is really on IO, which I expect to not be a huge deal for these workloads) being anywhere near the penalty of wasm.