Rigid Interpreter/Thread binding is too restrictive #589
Comments
That is an interesting idea. A SharedInterpreter is pretty close to the minimum amount of work necessary to use Python on a thread; it is much more lightweight than a SubInterpreter. At the CPython level there is really only one interpreter for all SharedInterpreters, and a SharedInterpreter just creates the thread-specific state necessary to use the interpreter on the calling thread. Things like Python modules are not recreated or destroyed when a SharedInterpreter is opened or closed. You may want to re-examine your need for SubInterpreters; if you can find a way to get your framework working with SharedInterpreters instead, you would be able to access Python with less overhead.

One idea that I have been trying to find the time to implement for a while now is to allow SharedInterpreters to be created from SubInterpreters. Right now all SharedInterpreters use a single static CPython interpreter, but it should be possible to create a SharedInterpreter that uses a specific SubInterpreter instead. In that case the SharedInterpreter would just be the thread-specific state needed to access that sub-interpreter from a different thread. In the context of your application, that would mean you could have a single SubInterpreter for each application; when a thread runs for that application, it could create a SharedInterpreter based off the SubInterpreter on that thread and close the SharedInterpreter when it is done.

The idea of having an interpreter with no associated thread is also interesting. From what I know about the C API provided by Python, I think that is theoretically possible, but I have never heard of anyone trying to use CPython like this, so I am not sure whether it would actually work. Theoretically we could have some sort of DetachedInterpreter instance and allow you to create SharedInterpreters on different threads that reuse the state from the DetachedInterpreter.
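The per-application SubInterpreter plus per-thread SharedInterpreter idea described above can be sketched as follows. This is an illustrative Python model of the proposed lifecycle only, not the real Jep API; the names `SubInterpreterState`, `ThreadView`, and `run_request` are hypothetical stand-ins for the heavyweight shared state and the cheap per-thread view.

```python
import threading

class SubInterpreterState:
    """Models the heavyweight, per-application interpreter state
    (modules, globals) that is created once and then shared.
    Hypothetical; not a real Jep class."""
    def __init__(self, app_name):
        self.app_name = app_name
        self.modules = {}  # created once, reused by every view

class ThreadView:
    """Models a SharedInterpreter created *from* a SubInterpreter:
    cheap per-thread state that borrows the shared modules."""
    def __init__(self, state):
        self.state = state
        self.thread_id = threading.get_ident()  # bound at creation

    def exec(self, fn):
        # Jep enforces that only the creating thread may use it.
        if threading.get_ident() != self.thread_id:
            raise RuntimeError("interpreter used from wrong thread")
        return fn(self.state.modules)

def run_request(state, fn):
    """Open a view on the calling thread, run the work, discard the
    view; the shared module state survives across requests."""
    view = ThreadView(state)
    return view.exec(fn)
```

The point of the model is that opening and closing a view per request is cheap because the expensive state (`modules`) lives in the shared object and is never torn down.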
This is not possible at all. The PyGILState API is clearly documented as being incompatible with sub-interpreters. We have managed to partially bypass that incompatibility by imposing the requirement that there is only one interpreter per thread. If we allowed multiple interpreters on a thread, then any modules using the PyGILState API would break. You are looking at Jep because it is compatible with third-party modules, and the threading restrictions we have imposed were implemented specifically to achieve that compatibility.
My problem with shared interpreters is exactly that: they are shared and can therefore leak secrets between applications. I guess I can try to use shared interpreters in a POC, but it will never go into production because of security concerns. Even the workaround you suggested, using a shared interpreter to import from a modified sys.path (it does work, BTW), is questionable from a security point of view, again because imported modules become shared across the board, and any secrets stored within become discoverable by other applications if they are [maliciously] designed to do that (which would not be so hard if you know what to do). This is an application container, and by default it must protect itself from applications, and applications from each other, much like an OS does with processes.

Sharing a per-application static interpreter would definitely help: from what I see now, the cost of creating a brand new interpreter just to run a single request-processing function can be rather high, which may negate many advantages of running in the JVM.

I'm not a CPython API expert in any way, but the PyGILState link you gave does not seem to prohibit anything I've suggested. You only need to make sure that every PyGILState_Ensure call is paired with a corresponding PyGILState_Release. (Please correct me if I'm wrong.) After the last release is called, the interpreter is just dormant data which can be kept in a pool of interpreters. Any interpreter taken from the pool is activated again by calling ensure, executing code, and then calling release to deactivate it and return it to the pool. Unless the interpreter keeps a thread ID internally and checks that exactly the same thread is used as for the very first ensure call, everything should work. (I saw that you do this check in your Java code, but I don't understand why such a restriction is needed.)
Even if this were true, it still does not seem to prohibit having multiple interpreters created by the same thread and used one at a time (i.e. an interpreter pool that is thread-local). Once I'm done with the POC I may try to modify your code to relax these conditions and see if it works. Any pointers on where to look and/or where to start would be greatly appreciated.
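The ensure/release pairing proposed in the two comments above can be modeled as follows. This is a hedged Python sketch of the commenter's hypothesis only, not working CPython or Jep code; `PooledInterpreter`, `ensure`, `release`, and `handle_request` are all hypothetical names, and the maintainer's objection that PyGILState is documented as incompatible with sub-interpreters still applies to the real C API.

```python
import threading
from queue import Queue

class PooledInterpreter:
    """Models the hypothesis: an interpreter is 'active' between
    paired ensure()/release() calls and dormant otherwise, so a
    dormant one can safely sit in a pool. Hypothetical API."""
    def __init__(self):
        self._depth = 0
        self._owner = None

    def ensure(self):
        tid = threading.get_ident()
        if self._owner not in (None, tid):
            raise RuntimeError("interpreter active on another thread")
        self._owner = tid
        self._depth += 1

    def release(self):
        self._depth -= 1
        if self._depth == 0:
            self._owner = None  # dormant again: safe to re-pool

pool = Queue()
pool.put(PooledInterpreter())

def handle_request(work):
    """Borrow an interpreter, activate it on this thread, run the
    work, deactivate it, and return it to the pool."""
    interp = pool.get()
    interp.ensure()
    try:
        return work(interp)
    finally:
        interp.release()
        pool.put(interp)
```

In this model nothing pins an interpreter to the thread of its first ensure call, which is exactly the question in dispute between the two commenters.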
Is your feature request related to a problem? Please describe.
I'm trying to use JEP in a highly dynamic, high-performance server environment. Multiple requests can be executed at the same time, and multiple applications can be dynamically deployed/undeployed on the server. I'd also prefer to use a limited number (e.g. NCPU) of platform threads and a large number of virtual threads.

Given that any request gets assigned to a random thread from a pool, it is very restrictive that each interpreter is "welded" to the thread that created it. I understand and accept that two interpreters cannot be used simultaneously, but I don't see any reason why interpreters cannot be pooled and assigned to an executing thread dynamically. That is, for each Python application I'd have a pool of interpreters created for that application. Naturally the maximum number of interpreters in a pool would not exceed NCPU, so the total number of interpreters would be the number of apps times the number of CPUs, which is perfectly fine. Obviously I'd use SubInterpreters to provide isolation.

With the current implementation, however, this is not possible. The only way I can use JEP in this environment is to create and close an interpreter for each request, and thus incur the Python compilation and initialization expense on every request.
Describe the solution you'd like
Instead of binding an interpreter instance to its creating thread forever, implement interpreter.enter() and interpreter.leave() methods. enter() would bind the interpreter to the current thread and leave() would unbind it. Only a bound interpreter can be active; unbound interpreters are dormant.
This is the exact same solution employed by GraalVM to run Python.
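A minimal Python sketch of what such an enter()/leave() API could look like, modeling only the thread-binding bookkeeping. The `Interpreter` class here is hypothetical and is not real Jep code (for comparison, GraalVM's `org.graalvm.polyglot.Context` exposes `enter()` and `leave()` in a similar spirit).

```python
import threading

class Interpreter:
    """Model of the proposed API: the interpreter is bound to
    whichever thread entered it, and only while entered.
    Hypothetical; current Jep instead binds an interpreter
    permanently to its creating thread."""
    def __init__(self):
        self._bound_to = None
        self.globals = {}  # state survives across enter/leave cycles

    def enter(self):
        if self._bound_to is not None:
            raise RuntimeError("already bound to a thread")
        self._bound_to = threading.get_ident()

    def leave(self):
        self._bound_to = None  # dormant: no thread may use it

    def exec(self, fn):
        if self._bound_to != threading.get_ident():
            raise RuntimeError("not entered on this thread")
        fn(self.globals)
```

The key property of the model: after leave(), a *different* thread may enter() the same instance and see the state left behind by the previous thread, which is what makes pooling across a thread pool possible.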
At the very least, it should be possible to have multiple interpreter instances created by the same thread and selected dynamically. Each instance would still belong to its thread, but the thread could select which one to use at any given time. The maximum number of instances would then be the number of applications times the number of threads, which can be much worse than Napps x NCPU but is still bounded and quasi-static. This is similar to Netty's threading model, where a single thread is assigned to a number of connections, so each connection is guaranteed to be processed by the same thread, but not at the same time.
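The thread-local pool fallback described above can be sketched like this. It is an illustrative Python model under assumed names: `interpreter_for` and `AppInterpreter` are hypothetical, and a real implementation would wrap Jep interpreters rather than plain objects.

```python
import threading

class AppInterpreter:
    """Stands in for a per-application interpreter; hypothetical.
    Note it stays welded to its creating thread, matching Jep's
    current restriction."""
    def __init__(self, app):
        self.app = app
        self.owner = threading.get_ident()

    def run(self, fn):
        if threading.get_ident() != self.owner:
            raise RuntimeError("wrong thread")
        return fn(self.app)

_local = threading.local()

def interpreter_for(app):
    """Each thread lazily creates and caches one interpreter per
    application, bounding the total at Napps x Nthreads."""
    cache = getattr(_local, "cache", None)
    if cache is None:
        cache = _local.cache = {}
    if app not in cache:
        cache[app] = AppInterpreter(app)
    return cache[app]
```

Because the cache lives in `threading.local()`, no interpreter is ever touched by a thread other than its creator, so this variant would not require relaxing Jep's existing thread binding, only allowing several instances per thread.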
Describe alternatives you've considered
I started my project with GraalVM, but the problem with it is that it currently does not support all required Python module versions. They are working to extend the list, but it takes time, and I thought I could use JEP in the meantime. BTW, both can live in the same OSGi container just fine, so developers would have a choice of which implementation to use.