Saturday 2 August 2014

Sandboxing Python Scripts in a Java Application with Jython

The following post describes how (with a little hacking) you can use Jython and the standard Java security manager to create a sandboxed environment in which to securely run untrusted Python scripts.

Note: the following was tested with Jython 2.5.3 and Java 1.7u45. The command line examples assume Linux and Bash.

Let's say you've written a Java application, and you want your users to be able to customise it by providing Python scripts that will be run by the application. Let's also say that you have a deep-seated distrust of your users: they're inherently suspect, and any code they supply to be run as part of your application should be treated as if it were written by the Devil himself.

This poses a problem. We would like to be able to use Jython, the Java implementation of the Python interpreter. However, neither Jython nor the Java JSR 223 scripting API (javax.script) offers much in the way of security sandboxing features. We know that Java offers excellent sandboxing through the JVM security manager and security policies, so that looks like a prime candidate, but we need to grant sufficient permissions to the Jython interpreter while at the same time placing restrictions on the Python scripts that the interpreter is running on our behalf. Tricky.

As you might expect, the Jython interpreter needs some basic permissions in order to run: for instance, it must be able to load script files and read some Java system properties. All in all, it turns out that Jython needs us to grant it the following permissions (assuming that the Jython jar is in our working directory):

grant codebase "file:${user.dir}/jython-standalone-2.5.3.jar" {
  permission java.lang.RuntimePermission "createClassLoader";
  permission java.lang.RuntimePermission "getProtectionDomain";
  permission java.lang.RuntimePermission "accessDeclaredMembers";
  permission java.io.FilePermission "${user.dir}/*", "read";
  permission java.util.PropertyPermission "java.vm.name", "read";
  permission java.util.PropertyPermission "java.vm.vendor", "read";
  permission java.util.PropertyPermission "os.name", "read";
  permission java.util.PropertyPermission "os.arch", "read";
  permission java.util.PropertyPermission "user.dir", "read";
  permission java.util.PropertyPermission "line.separator", "read";
};

Note: The accessDeclaredMembers permission is only necessary if you need to allow scripts to extend Java classes. Given our scenario, we'll assume that's a likely requirement of our Java application's API.

Great, with the above in our Java security policy file (which we'll cunningly call security.policy), we can run Jython with the Java security manager installed, and the interpreter runs our untrusted script, the imaginatively-titled untrusted-script.py:

java -cp jython-standalone-2.5.3.jar -Djava.security.manager -Djava.security.policy==security.policy org.python.util.jython untrusted-script.py

This has already significantly restricted what our fiendishly evil script can do. For instance, if we try to run the following code:

f = open('/etc/passwd')
userdetails = f.read()
print(userdetails)

We get this error message:

java.security.AccessControlException: java.security.AccessControlException: access denied ("java.io.FilePermission" "/etc/passwd" "read")

Busted! This is great: we know that because the Python script is running in the Jython interpreter, and the Jython interpreter is running in the JVM, our untrusted Python script won't be able to do anything that we haven't granted the Jython jar permission to do.

However, there's still a problem. One of the permissions we had to grant the Jython jar in order for it to run at all was the java.lang.RuntimePermission "createClassLoader", and as it turns out, granting code this permission effectively invalidates your security policy, because code that can create its own classloader can create a classloader that grants full permissions to any classes it loads. From the Java API for RuntimePermission:

"This is an extremely dangerous permission to grant. Malicious applications that can instantiate their own class loaders could then load their own rogue classes into the system. These newly loaded classes could be placed into any protection domain by the class loader, thereby automatically granting the classes the permissions for that domain."

But wait, we're only running Python scripts: what harm can they do with the createClassLoader permission? Well, this is Jython, where Python code can call Java APIs, and do this:

from java.security import AccessControlException, Permissions, AllPermission, SecureClassLoader, CodeSource
from java.net import URL

import base64

class DodgyClassLoader(SecureClassLoader):

  def __init__(self):
    SecureClassLoader.__init__(self)
    self.datamap = {}
    self.codeSource = CodeSource(URL('file:/dummy'), None)

  def addClass(self, name, data):
    self.datamap[name] = data

  def findClass(self, name):
    data = self.datamap[name]
    return self.super__defineClass(name, data, 0, len(data), self.codeSource)

  def getPermissions(self, codesource):
    permissions = Permissions()
    permissions.add(AllPermission())
    return permissions

fileReaderClassDef = base64.b64decode('<<base64 encoding of a class file defining a class that reads files>>')

fileReaderInnerClassDef = base64.b64decode('<<base64 encoding of the inner class containing the code to read files within a doPrivileged block>>')

classloader = DodgyClassLoader()
classloader.addClass('dodgy.FileReader', fileReaderClassDef)
classloader.addClass('dodgy.FileReader$1', fileReaderInnerClassDef)
fileReaderClass = classloader.findClass('dodgy.FileReader')
fileReader = fileReaderClass.newInstance()
userDetails = fileReader.readFile('/etc/passwd')
print(userDetails)

Confound it all, we were so close. If only there were some way to allow the Jython interpreter permission to create classloaders, while at the same time preventing Python scripts from exploiting this. We could maybe do some filtering of the untrusted scripts to check they're not creating classloaders, but we're now relying on mechanisms outside of our nice safe Java sandbox, and we'd have to account for all manner of Python trickery.

Hold on though: in order to grant code permission to do some sensitive operation, the Java security manager needs every frame on the stack to have the necessary permission. At the moment, we've given all the code in the Jython jar the same set of permissions. What if there were a few classes that aren't on the stack when Jython needs to do its classloading, but are on the stack whenever a script tries to call Java APIs? We could place these classes in a separate jar file and not grant them any permissions that we don't want to grant to our Python scripts, limiting what scripts can do through the Java APIs.
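
To make the stack-walk idea concrete, here is a toy model in plain Python. It is only a sketch, not how the JVM implements the check: permissions are plain strings, system classes are left out, doPrivileged is ignored, and the jar and class names (jython.jar, restricted.jar, InterpreterInternals, InterpreterMain, JavaCallBridge) are made up for illustration.

```python
# Toy model of the security manager's stack walk (a simplification, not JVM code).
# Hypothetical policy: what each jar has been granted. restricted.jar is not
# listed in the policy file, so it has no permissions at all.
POLICY = {
    "jython.jar": {"createClassLoader", "accessDeclaredMembers"},
    "restricted.jar": set(),
}


def is_allowed(stack, permission):
    """Allow the operation only if every frame's jar holds the permission."""
    return all(permission in POLICY.get(jar, set()) for _class_name, jar in stack)


# Interpreter doing its own class loading: only jython.jar frames on the stack.
interpreter_stack = [
    ("InterpreterInternals", "jython.jar"),
    ("InterpreterMain", "jython.jar"),
]

# Script calling a Java API: a class from restricted.jar sits on the stack.
script_stack = [
    ("JavaCallBridge", "restricted.jar"),
    ("InterpreterInternals", "jython.jar"),
    ("InterpreterMain", "jython.jar"),
]

print(is_allowed(interpreter_stack, "createClassLoader"))  # True
print(is_allowed(script_stack, "createClassLoader"))       # False
```

In this model a single frame without the permission is enough to deny the whole call, even though every other frame on the stack has been granted it.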

As luck would have it, there are such classes:
  • org.python.core.PyReflectedConstructor
  • org.python.core.PyReflectedField
  • org.python.core.PyReflectedFunction

Let's pull these out of the Jython jar and put them in their own jar:

unzip jython-standalone-2.5.3.jar 'org/python/core/PyReflected*'
jar cvf jython-standalone-2.5.3-nonsecure.jar org/python/core/*
rm -rf org
zip -d jython-standalone-2.5.3.jar 'org/python/core/PyReflected*'

This removes the classes from the main Jython jar and puts them into a new jar: jython-standalone-2.5.3-nonsecure.jar. As this jar isn't listed in our security policy file, these classes don't get any security permissions. Now we can run Jython by adding this extra jar to the classpath as follows:

java -cp jython-standalone-2.5.3-nonsecure.jar:jython-standalone-2.5.3.jar -Djava.security.manager -Djava.security.policy==security.policy org.python.util.jython untrusted-script.py

This way, if our untrusted script tries to create a classloader, the Jython interpreter runs correctly, but the script throws the following exception:

java.security.AccessControlException: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "createClassLoader")

This is exactly the behavior we want, and it means that we've closed off the only permission that we had to grant the Jython interpreter that was a major concern. Huzzah!
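
The jar surgery above can also be scripted, which may help when repeating it for another Jython version. The following is a rough sketch using Python's zipfile module, meant to be run with an ordinary Python 3 rather than inside the sandbox. The function name split_jar and its parameters are invented for this example. Unlike the shell commands, it writes a new trimmed jar instead of editing the original in place, and it does not add a manifest to the new jar.

```python
import zipfile


def split_jar(source_jar, trimmed_jar, nonsecure_jar,
              prefix="org/python/core/PyReflected"):
    """Copy entries whose names start with `prefix` into nonsecure_jar and
    everything else into trimmed_jar. Returns the names that were moved."""
    moved = []
    with zipfile.ZipFile(source_jar) as src, \
            zipfile.ZipFile(trimmed_jar, "w", zipfile.ZIP_DEFLATED) as main, \
            zipfile.ZipFile(nonsecure_jar, "w", zipfile.ZIP_DEFLATED) as extra:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.startswith(prefix):
                # These classes end up in the jar that gets no permissions.
                extra.writestr(info, data)
                moved.append(info.filename)
            else:
                main.writestr(info, data)
    return moved
```

Calling split_jar("jython-standalone-2.5.3.jar", "jython-trimmed.jar", "jython-standalone-2.5.3-nonsecure.jar") would leave the original jar untouched; the trimmed jar (jython-trimmed.jar is a made-up name) would then be the one to reference in the policy file and on the classpath. Checking the returned list is a cheap way to confirm that the expected classes were actually found.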

Now, getting back to our original problem, if we wanted to take advantage of this setup in a Java application, all we'd need to do is set the Java security manager when we start up our application, grant java.security.AllPermission to our application code, and invoke Jython from Java, most likely with the JSR 223 API.

Notes:
  • The hacking around with jar files is only necessary in order to deal with the createClassLoader permission. This is something that we'd like to avoid granting because of the power it potentially gives to code running in the Jython interpreter. However, it's worth noting that the example exploit above, which creates a dodgy subclass of a classloader, is only possible because we also granted the accessDeclaredMembers permission. If there's no requirement to allow scripts to extend Java classes, this permission isn't necessary, and it becomes much harder to exploit the createClassLoader permission.

Shortcomings:
  • We had to grant Jython permission to read files in the working directory in order to read the script file, and the tweaks we made to the jar files don't block access to scripts. We therefore shouldn't put anything important in the working directory, or anywhere we intend to put script files.
  • For some reason I've not yet got to the bottom of, these changes prevent Jython from starting its interactive command line. That isn't much of a problem, as we want to supply a script rather than interact with Python via the command line.
  • This is something of a hack and very susceptible to changes in the Jython source code.

Comments:

Unknown said...

Thank you for this post, it helped me out a lot. There is a nagging problem that I'm running into though - I wonder if you've seen it.

When I run your python code that reads a file in another directory (like '/etc') I get the same behaviour that you describe. But I also get some log messages on stderr:

12-Dec-2014 2:01:45 PM org.python.google.common.base.internal.Finalizer getInheritableThreadLocalsField

INFO: Couldn't access Thread.inheritableThreadLocals. Reference finalizer threads will inherit thread local values.

*sys-package-mgr*: The java security manager isn't allowing access to the package cache dir, 'cachedir\packages'

I don't know what the first two mean, but the last one got me thinking: I should probably grant the secure jython jar permission to read and write the cachedir directory, and then test that my untrusted jython script is unable to write to that directory. (I get the impression that cachedir isn't used much when running a jython script via the command-line and org.python.util.jython, but it seems to be used a lot more when running a jython script in the context of a larger application via org.python.util.PythonInterpreter like I require, so I figured I should handle this cachedir problem.)

Granting read/write permissions to cachedir made the *sys-package-mgr* message go away, as expected. But then I found that the untrusted jython script can also read and write to the cachedir. This seems to me like a troubling security hole. I don't understand how it is possible, given the description of the situation that you've given here, which as far as I can tell is right on.

I'm using Jython 2.5.3 also, and confirmed that my Jython code is running inside those PyReflected* classes by putting "import java.lang; java.lang.Exception().printStackTrace()" in my jython script. Sure enough, that prints the following, with a PyReflect* class appearing, as expected:

java.lang.Exception
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.python.core.PyReflectedConstructor.constructProxy(PyReflectedConstructor.java:210)
at org.python.core.PyReflectedConstructor.__call__(PyReflectedConstructor.java:179)
at org.python.core.PyObject.__call__(PyObject.java:345)
at org.python.core.PyMethod.instancemethod___call__(PyMethod.java:220)
at org.python.core.PyMethod.__call__(PyMethod.java:211)
at org.python.core.PyMethod.__call__(PyMethod.java:206)
at org.python.core.Deriveds.dispatch__init__(Deriveds.java:19)
at org.python.core.PyObjectDerived.dispatch__init__(PyObjectDerived.java:1057)
at org.python.core.PyType.type___call__(PyType.java:1565)
at org.python.core.PyType.__call__(PyType.java:1548)
at org.python.core.PyObject.__call__(PyObject.java:371)
at org.python.core.PyObject.__call__(PyObject.java:375)
at org.python.pycode._pyx0.f$0(untrusted-script-blog.py:5)
at org.python.pycode._pyx0.call_function(untrusted-script-blog.py)
at org.python.core.PyTableCode.call(PyTableCode.java:165)
at org.python.core.PyCode.call(PyCode.java:18)
at org.python.core.Py.runCode(Py.java:1275)
at org.python.util.PythonInterpreter.execfile(PythonInterpreter.java:235)
at org.python.util.jython.run(jython.java:247)
at org.python.util.jython.main(jython.java:129)


So if you have a minute - can you tell me: have you ever seen anything like this?

Regards,

- Dan

Unknown said...

Subscribing to this thread...

alphaloop said...

Hi Daniel, thanks for the comment. I'd seen the ThreadLocal message, but it didn't seem to be causing any issues. The cachedir issue, however, wasn't one I'd noticed, and it seems more of a problem. I agree it seems odd that granting the permission to the Jython code should give scripts access if the PyReflected* classes are on the stack. I'm away at the moment for a few weeks, but I'll certainly look into it to see if I'm seeing the same effect. In the meantime, if you get any further I'd be interested to hear what you find. Thanks again for posting.

Unknown said...

Thank you. I look forward to hearing about what you find. If I figure out anything else, I'll post it here.

Unknown said...

I may have figured out some more about this.

I noticed that a line of Jython code such as "open('C:\\cygwin\\tmp\\1.txt', 'w')" produces an exception like this:

Traceback (most recent call last):
File "<script>", line 33, in <module>
java.security.AccessControlException: access denied (java.io.FilePermission C:\cygwin\tmp\1.txt read)
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:374)
at java.security.AccessController.checkPermission(AccessController.java:549)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
at java.lang.SecurityManager.checkRead(SecurityManager.java:871)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:206)
at org.python.core.io.FileIO.fromRandomAccessFile(FileIO.java:172)
at org.python.core.io.FileIO.<init>(FileIO.java:79)
at org.python.core.PyFile.file___init__(PyFile.java:150)
at org.python.core.PyFile$exposed___new__.createOfType(Unknown Source)
at org.python.core.PyOverridableNew.new_impl(PyOverridableNew.java:12)
at org.python.core.PyType.invokeNew(PyType.java:466)
at org.python.core.PyType.type___call__(PyType.java:1558)
at org.python.core.PyType.__call__(PyType.java:1548)
at org.python.core.OpenFunction.__call__(__builtin__.java:1564)
at org.python.core.PyObject.__call__(PyObject.java:404)
at org.python.core.PyObject.__call__(PyObject.java:408)
at org.python.pycode._pyx0.f$0(<script>:44)
at org.python.pycode._pyx0.call_function(<script>)
at org.python.core.PyTableCode.call(PyTableCode.java:165)
at org.python.core.PyCode.call(PyCode.java:18)
at org.python.core.Py.runCode(Py.java:1275)
at org.python.core.Py.exec(Py.java:1319)
at org.python.util.PythonInterpreter.exec(PythonInterpreter.java:215)
at T.main(T.java:19)

There are no PyReflected* frames in this stack trace, unlike the java.lang.Exception().printStackTrace() I mentioned before. So it seems I was barking up the wrong tree with that. I figure I'll need to add more classes to the nonsecure jython jar if I want to block unwanted file accesses.

I get the impression that you don't need your scripts to do any file access, but my needs are different. In addition to my desire to get Jython's package cache directory working, I need to allow my scripts access to white lists of directories for reading and writing. So that's why I've spent the time to get this right and not deny too much file access.

I thought about moving org.python.core.PyCode to the nonsecure jar but it appears that everything runs under that, so it would prevent Jython from using the package cache directory.

So I moved org.python.core.io.FileIO to the nonsecure jython jar. This seems to work so far. I hope to have the time to look into it more in January.

Unknown said...

Also, a few more notes if you're interested:

1) To block Jython scripts from calling java.lang.System.exit() effectively, I had to resort to some code in addition to the policy file, because, according to the javadocs for java.lang.RuntimePermission, "The "exitVM.*" permission is automatically granted to all code loaded from the application class path, thus enabling applications to terminate themselves."

So I added some code like this to my program's startup, before any Jython scripts are run:

protected static void initExitVMSecurityManager() {
    // This set reflects the contents of our jython-nonsecure.jar. If the list of classes in
    // that jar is ever changed, then this set should be changed too.
    final SortedSet nonSecureJythonClassNames = UtilC.set(new String[]{
        "org.python.core.PyReflectedConstructor",
        "org.python.core.PyReflectedField",
        "org.python.core.PyReflectedFunction",
        "org.python.core.io.FileIO",
    });
    System.setSecurityManager(new SecurityManager() {
        @Override
        public void checkPermission(Permission permission__) {
            if(permission__.getName() != null && permission__.getName().startsWith("exitVM")) {
                for(Class cls: getClassContext()) {
                    if(nonSecureJythonClassNames.contains(cls.getName())) {
                        throw new SecurityException("exitVM permission denied");
                    }
                }
            }
            // This does all the other checks, against our security policy file:
            super.checkPermission(permission__);
        }
    });
}

2) I found that to get an import such as "from java.lang import *" to work, I needed to add this permission to the secure jython jar's section of the policy file:

permission java.io.FilePermission "${user.dir}${/}-", "read,execute";

3) I added access to each directory in System.getProperty("java.ext.dirs") to the policy file programmatically. If I didn't, I would get an exception like this during my Jython initialization:

Exception in thread "main" java.lang.ExceptionInInitializerError
at T.main(T.java:13)
Caused by: java.security.AccessControlException: access denied (java.io.FilePermission C:\windows\Sun\Java\lib\ext read)
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:374)
at java.security.AccessController.checkPermission(AccessController.java:549)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
at java.lang.SecurityManager.checkRead(SecurityManager.java:871)
at java.io.File.isDirectory(File.java:752)
at org.python.core.packagecache.SysPackageManager.addJarDir(SysPackageManager.java:57)
at org.python.core.packagecache.SysPackageManager.addJarPath(SysPackageManager.java:78)
at org.python.core.packagecache.SysPackageManager.findAllPackages(SysPackageManager.java:106)
at org.python.core.packagecache.SysPackageManager.<init>(SysPackageManager.java:39)
at org.python.core.PySystemState.initPackages(PySystemState.java:978)
at org.python.core.PySystemState.doInitialize(PySystemState.java:890)
at org.python.core.PySystemState.initialize(PySystemState.java:802)
at org.python.core.PySystemState.initialize(PySystemState.java:752)
at org.python.core.PySystemState.initialize(PySystemState.java:745)
at org.python.util.PythonInterpreter.initialize(PythonInterpreter.java:57)
at com.activecore.al.common.Jython.initialize(Jython.java:37)
at com.activecore.al.common.Jython.<clinit>(Jython.java:20)
... 1 more

4) I also granted the permissions below to both the secure and nonsecure jython jars, for reasons that I can't remember right now. I think that some python libraries that my scripts want to use need them.

permission java.util.PropertyPermission "*", "read";
permission java.lang.RuntimePermission "getenv.*";

Unknown said...

5) I added a line like this to get the python 'subprocess' module to work.

permission java.io.FilePermission "%s${/}-", "read,execute";

... where "%s" is escapeBackSlashes((new File(System.getenv("COMSPEC"))).getParent()), and escapeBackSlashes() does the obvious.

(It might seem odd that I want the 'subprocess' module to work, but as with reading and writing files, I have a whitelist for executables that I add "execute" permissions for separately.)

Unknown said...

Thanks again for your original post. I don't think that I would have figured out the classloader angle.
