shared modules hook does not respect sys.path and modules imported via importlib #588

Open
novos40 opened this issue Jan 18, 2025 · 2 comments

Comments

@novos40

novos40 commented Jan 18, 2025

Describe the bug

Common precondition: The module is declared as shared via JepConfig.addSharedModules in Java

Case 1
Precondition: In Python, sys.path is modified, either via site.addsitedir() or by adding a folder to it directly
Effect: a subsequent import statement in Python fails with a ModuleNotFoundError.
Notes:

  1. running in verbose mode shows that the shared modules hook does not check the added folders and thus can't find the module
  2. removing the module from the shared list, or placing the module file in a standard site location, lets the module import without problems
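
For reference, the Case 1 precondition can be reproduced in plain Python, outside Jep, where the import succeeds. This is a minimal sketch under assumptions: the module name `case1_demo_mod`, the helper `add_folder_and_import`, and the temporary folder are hypothetical and only stand in for the real module and site folder. Under Jep, with the module declared as shared, the same import is the one reported to fail.

```python
import importlib
import site
import sys
import tempfile
from pathlib import Path

# Hypothetical module name, used only for illustration.
MODULE_NAME = "case1_demo_mod"

def add_folder_and_import(folder, name):
    """Add folder to sys.path via site.addsitedir(), then import name."""
    site.addsitedir(folder)          # appends the folder to sys.path if missing
    importlib.invalidate_caches()    # make sure newly written files are seen
    return importlib.import_module(name)

# Write a tiny module into a fresh temporary folder and import it.
tmp = tempfile.mkdtemp()
Path(tmp, MODULE_NAME + ".py").write_text("VALUE = 42\n")
mod = add_folder_and_import(tmp, MODULE_NAME)
print(mod.VALUE)
```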

Case 2
Precondition: The shared module is imported in Python via importlib, e.g. using the following function:

import importlib.util
import sys

def importFileAs(path, name):                                                   # base import utility
    """path is either a module file or a package's __init__.py file path; the module or package is imported under the given name"""
    spec = importlib.util.spec_from_file_location(name, path)                   # build a spec for the file
    module = importlib.util.module_from_spec(spec)                              # create an empty module from the spec
    sys.modules[name] = module                                                  # register the module
    spec.loader.exec_module(module)                                             # run the module's code
    return (name, module)

Effect: imported module is not shared even when declared as such

Expected behavior
Case 1: expect the shared modules hook to take the current sys.path into account
Case 2: expect the imported module to be shared

Environment (please complete the following information):
Windows 10 running in VMware Workstation
Python 3.11
Java 21
Jep 4.2.2

@bsteffensmeier
Member

I don't think it is technically possible to achieve your expected behavior.

For case 2, the problem is that if you close the SubInterpreter that created the module, the module will stop working (or worse, crash things). When the native Python code destroys an interpreter, it is very aggressive about freeing any memory associated with that interpreter. I am pretty sure the module itself would be destroyed. Even if we could figure out how to keep the module alive, things like the builtin functions are destroyed too, so your module could not rely on any builtin Python functionality, which makes it hard to do much of anything.

That helps explain what is happening in case 1. Shared modules are not actually created in the SubInterpreter where the import happens, because SubInterpreters can be closed. Instead, the shared module hook forwards the import to the MainInterpreter, an interpreter that is kept open for the life of the program. This ensures shared modules can be used for the life of your program even when interpreters are closed. Since the MainInterpreter is a completely different interpreter, it has a different sys.path. I don't see any way we could reasonably import a module with the exact environment of a SubInterpreter while running the import in the MainInterpreter.
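
The effect of forwarding can be modelled in plain Python. This is a toy sketch, not Jep's actual hook: two lists stand in for the sys.path of a SubInterpreter and of the MainInterpreter, and the names `sub_path`, `main_path`, `lookup`, and `toy_shared_mod` are all hypothetical. The standard `importlib.machinery.PathFinder` resolves the same module name against each list.

```python
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path

# Hypothetical stand-ins: each list plays the role of one interpreter's sys.path.
sub_dir = tempfile.mkdtemp()     # folder added only in the "SubInterpreter"
main_dir = tempfile.mkdtemp()    # folder known to the "MainInterpreter"
Path(sub_dir, "toy_shared_mod.py").write_text("VALUE = 1\n")

sub_path = [sub_dir]
main_path = [main_dir]

def lookup(name, search_path):
    """Return True if name can be located on the given search path."""
    return PathFinder.find_spec(name, search_path) is not None

# A local import would search sub_path and find the module.
found_locally = lookup("toy_shared_mod", sub_path)
# A forwarded import searches main_path instead, which lacks the folder.
found_when_forwarded = lookup("toy_shared_mod", main_path)
print(found_locally, found_when_forwarded)
```

In this model the module is only found when the lookup uses the path list that actually contains the added folder, which mirrors why a folder added in a SubInterpreter is not seen by an import that runs elsewhere.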

I may have a workaround for you. The MainInterpreter, which handles shared modules, is the same Python interpreter used by SharedInterpreters. You can create a SharedInterpreter and alter sys.path or import modules with importlib. Even if the SharedInterpreter is closed, sys.path and sys.modules will persist, and a SubInterpreter with shared modules can use the path or modules set up by the SharedInterpreter.
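
The Python side of this workaround might look like the sketch below, meant to be run once in a SharedInterpreter before any SubInterpreter imports the shared module. It rests on assumptions: `prepare_shared`, `extra_dir`, and the demo module name are hypothetical, and the Java calls that create the SharedInterpreter and run the script are not shown.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

def prepare_shared(name, file_path, extra_dir=None):
    """Extend sys.path and import file_path under name, at most once."""
    if extra_dir and extra_dir not in sys.path:
        sys.path.append(extra_dir)             # stays on this interpreter's sys.path
    if name in sys.modules:                    # already imported: reuse it
        return sys.modules[name]
    spec = importlib.util.spec_from_file_location(name, file_path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module                 # register before executing
    spec.loader.exec_module(module)
    return module

# Stand-alone demonstration with a temporary file (hypothetical names).
demo_dir = tempfile.mkdtemp()
demo_file = Path(demo_dir, "workaround_demo.py")
demo_file.write_text("GREETING = 'hello'\n")
first = prepare_shared("workaround_demo", str(demo_file), demo_dir)
second = prepare_shared("workaround_demo", str(demo_file), demo_dir)
print(first is second, first.GREETING)
```

The early return on `sys.modules` keeps a second call from executing the module file again, so the setup can safely be run more than once in the same interpreter.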

@novos40
Author

novos40 commented Jan 18, 2025

So do I understand it right:

  1. For each thread and Python application combination, I create a SharedInterpreter from a JepConfig with shared modules declared, and run my initialization Python module, which does the dynamic import
  2. I close the SharedInterpreter, as I no longer need it
  3. I create a SubInterpreter and run the same initialization module, which in this case should work fine, as it will inherit the shared module from the main interpreter

Is this correct? So the price is the creation of one extra interpreter, with all the Python compilation etc. Not ideal, but I'll try.

Is it possible to have a method like importShared(module) that takes either the module itself or a file path to the module? This would save the extra interpreter creation.

Alternatively, is it possible to get an instance of the MainInterpreter? There is already a sharedImport(String) method. If I could also run a script and/or modify the sys.path of the main interpreter, it would also solve the problem, right?

Thanks.
