Thanks for the responses, folks. I will briefly summarize them:
> As you say, it is fundamentally not possible to make this work at
> the Python level.
This is pretty effectively demonstrated by "Tav's admirable but failed attempt
to sandbox file IO":
* http://tav.espians.com/a-challenge-to-break-python-security.html
Wow, there are some impressive ways to confuse the system. I particularly like
overriding str's equality function to defeat the mode-checking code when
opening files.
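To make the str-equality trick concrete, here is a minimal sketch of the idea.
It is my own illustration, not Tav's code: SneakyMode and naive_mode_check are
hypothetical names, and no file is actually opened.

```python
class SneakyMode(str):
    """Hypothetical attack object: a str subclass that claims to equal anything."""

    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False

    # Defining __eq__ would otherwise make the class unhashable.
    __hash__ = str.__hash__


def naive_mode_check(mode):
    """Hypothetical Python-level guard that tries to allow only read mode."""
    if mode != "r":  # dispatches to SneakyMode.__ne__, which always says "equal"
        raise ValueError("only read mode is allowed")
    return mode


passed = naive_mode_check(SneakyMode("w"))
# The guard was satisfied, yet the underlying characters are still "w",
# which is what a lower-level open() would ultimately act on.
print(repr(str(passed)))  # prints 'w'
```

The guard trusts a comparison that the untrusted object itself controls, which
is why checks of this kind cannot be relied on inside the same interpreter.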
> When we needed this at edX, we wrote CodeJail
> (https://github.com/edx/codejail).
> It's a wrapper around AppArmor to provide OS-level protection of code
> execution in subprocesses. It has Python-specific features, but because it
> is based on AppArmor, can sandbox any process, so long as it's configured
> properly.
This looks promising. I will take a closer look.
> What about launching the Python process in a Docker container?
This may work in combination with other techniques. It is certainly faster
than spinning up a new VM or snapshot-restoring a fixed VM for every request.
I would need to see whether CPU, memory, and disk usage can be constrained at
the container level.
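As a rough sketch of what the container approach might look like, the
following builds a "docker run" command line with resource limits and launches
it via subprocess. The helper names, the image tag "python:3-slim", the limit
values, and the /sandbox/main.py mount point are all assumptions for
illustration; I have not verified that this is sufficient isolation on its own.

```python
import subprocess


def build_docker_command(script_path, read_only_dirs=(),
                         image="python:3-slim", cpus="0.5", memory="256m"):
    """Build a `docker run` argv for one untrusted script (hypothetical helper)."""
    cmd = [
        "docker", "run", "--rm", "-i",
        "--network", "none",      # no network access inside the container
        "--cpus", cpus,           # CPU limit
        "--memory", memory,       # memory limit
        "--pids-limit", "64",     # cap the number of processes
        "--read-only",            # root filesystem mounted read-only
        "-v", f"{script_path}:/sandbox/main.py:ro",
    ]
    # Whitelisted directories are bind-mounted read-only at the same path.
    for directory in read_only_dirs:
        cmd += ["-v", f"{directory}:{directory}:ro"]
    cmd += [image, "python", "/sandbox/main.py"]
    return cmd


def run_sandboxed(script_path, stdin_data=b"", timeout=10, **options):
    """Run the script in a container, feeding stdin and capturing stdout/stderr.

    The timeout only stops the docker client process; the container itself
    may need to be cleaned up separately.
    """
    return subprocess.run(
        build_docker_command(script_path, **options),
        input=stdin_data,
        capture_output=True,
        timeout=timeout,
    )


if __name__ == "__main__":
    # Show the command that would be run, without starting a container.
    print(" ".join(build_docker_command("/srv/jobs/job1/main.py",
                                        read_only_dirs=["/srv/data"])))
```

Script and directory paths need to be absolute for the bind mounts to work.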
- David
On Monday, May 25, 2015 at 7:24:32 PM UTC-7, [email protected] wrote:
> I am writing a web service that accepts Python programs as input, runs the
> provided program with some profiling hooks, and returns various information
> about the program's runtime behavior. To do this in a safe manner, I need to
> be able to create a sandbox that restricts what the submitted Python program
> can do on the web server.
>
> Almost all discussion about Python sandboxes I have seen on the internet
> involves selectively blacklisting functionality that gives access to system
> resources, such as trying to hide the "open" builtin to restrict access to
> file I/O. All such approaches are doomed to fail because you can always find
> a way around a blacklist.
>
> For my particular sandbox, I wish to allow *only* the following kinds of
> actions (in a whitelist):
> * reading from stdin & writing to stdout;
> * reading from files, within a set of whitelisted directories;
> * pure Python computation.
>
> In particular all other operations available through system calls are banned.
> This includes, but is not limited to:
> * writing to files;
> * manipulating network sockets;
> * communicating with other processes.
>
> I believe it is not possible to limit such operations at the Python level.
> The best you could do is try replacing all the standard library modules, but
> that is again just a blacklist - it won't prevent a determined attacker from
> doing things like constructing their own 'code' object and executing it.
>
> It might be necessary to isolate the Python process at the operating system
> level.
> * A chroot jail on Linux & OS X can limit access to the filesystem. Again
> this is just a blacklist.
> * No obvious way to block socket creation. Again this would be just a
> blacklist.
> * No obvious way to detect unapproved system calls and block them.
>
> In the limit, I could dynamically spin up a virtual machine and execute the
> Python program in the machine. However that's extremely expensive in
> computational time.
>
> Has anyone on this list attempted to sandbox Python programs in a serious
> fashion? I'd be interested to hear your approach.
>
> - David
--
https://mail.python.org/mailman/listinfo/python-list