Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The global config should not be stored on each interpreter #91120

Open
ericsnowcurrently opened this issue Mar 8, 2022 · 9 comments
Open

The global config should not be stored on each interpreter #91120

ericsnowcurrently opened this issue Mar 8, 2022 · 9 comments
Assignees
Labels
3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API

Comments

@ericsnowcurrently
Copy link
Member

BPO 46964
Nosy @ncoghlan, @vstinner, @ericsnowcurrently
PRs
  • bpo-46964: Move PyInterpreterState.config to _PyRuntimeState.config #31771
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ericsnowcurrently'
    closed_at = None
    created_at = <Date 2022-03-08.19:55:53.355>
    labels = ['interpreter-core', 'expert-C-API', '3.11']
    title = 'The global config should not be stored on each interpreter'
    updated_at = <Date 2022-03-30.15:26:47.339>
    user = 'https://github.com/ericsnowcurrently'

    bugs.python.org fields:

    activity = <Date 2022-03-30.15:26:47.339>
    actor = 'vstinner'
    assignee = 'eric.snow'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core', 'C API']
    creation = <Date 2022-03-08.19:55:53.355>
    creator = 'eric.snow'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46964
    keywords = ['patch']
    message_count = 8.0
    messages = ['414772', '414789', '415845', '415891', '416061', '416076', '416366', '416373']
    nosy_count = 3.0
    nosy_names = ['ncoghlan', 'vstinner', 'eric.snow']
    pr_nums = ['31771']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue46964'
    versions = ['Python 3.11']

    @ericsnowcurrently
    Copy link
    Member Author

    tl;dr let's move PyInterpreterState.config to _PyRuntimeState.config.

    Historically the runtime has been initialized using Py_Initialize(). PEP-587 added Py_InitializeFromConfig(), which takes a PyConfig and allows all sorts of customization of the runtime. This is valuable for embedders (and benefits core development too).

    During runtime initialization the given config is copied and stored internally on the main interpreter. Once initialization completes, the config is no longer modified. The config values are then used in a variety of places during execution. If a new interpreter is created then the config from the current (or main) interpreter are copied into it.

    Note the following:

    • the config is never modified
    • there is no public API for getting the config or changing it
    • there is no API for creating an interpreter with a different config
    • the fact that the config is stored on the interpreter is an internal detail and not documented (nor discussed in PEP-587)

    Consequently, PyConfig really is the global runtime config. Yet we are storing a copy of it on each interpreter. Doing so unnecessarily adds extra complexity (and, when multiple interpreters are used, extra CPU usage and extra memory usage).

    So I propose that we move the config to _PyRuntimeState. The change isn't big nor all that complex. Note that actually there is one field that can differ between interpreters: PyConfig._isolated_interpreter (set in _Py_NewInterpreter()). We can move that one field to a new per-interpreter config struct.

    @ericsnowcurrently ericsnowcurrently added the 3.11 only security fixes label Mar 8, 2022
    @ericsnowcurrently ericsnowcurrently self-assigned this Mar 8, 2022
    @ericsnowcurrently ericsnowcurrently added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API 3.11 only security fixes labels Mar 8, 2022
    @ericsnowcurrently ericsnowcurrently self-assigned this Mar 8, 2022
    @ericsnowcurrently ericsnowcurrently added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API labels Mar 8, 2022
    @vstinner
    Copy link
    Member

    vstinner commented Mar 9, 2022

    Doing so unnecessarily adds extra complexity (and, when multiple interpreters are used, extra CPU usage and extra memory usage).

    Right, PyConfig allocates memory per interpeter. But IMO it's an acceptable trade-off. I don't see the benefits of sharing the config between all interpreters.

    Have per-interpreter config allows to create an interpreter with a different config:

    • different sys.path
    • isolated vs non-isolated
    • dev mode
    • etc.

    While there is currently no easy way to use a custom config, recent changes make it easier to implement, like the function to modify a PyConfig from a Python dict: _PyInterpreterState_SetConfig().

    there is no public API for getting the config or changing it

    Barry Warsaw proposed to add a function to get the config, but I didn't add it.

    There are private functions for that:

    • _testinternalcapi.get_config()
    • _testinternalcapi.set_config()

    @ericsnowcurrently
    Copy link
    Member Author

    Here are reasons why PyConfig relates to the runtime and not each interpreter:

    • PyConfig was introduced so embedders could dictate how the *runtime*
      should be initialized
    • after initialization, it represents how the runtime was initialized
    • in the public API, PyConfig relates only to the runtime
    • more than a few PyConfig fields are specific to the entire runtime
      and not appropriate on a per-interpreter basis
    • such PyConfig fields are only used during runtime init
    • currently, every interpreter uses (a copy of) the original PyConfig,
      completely unchanged
    • there is no API for creating an interpreter with a different config
    • most of the things that one might want to customize on an interpreter
      can already be done right after the interpreter is initialized
    • thus far we have had no actual use cases for initializing an interpreter
      with a different config
    • for many of the PyConfig fields, allowing different values for each
      interpreter is an attractive nuisance and would invite confusion

    This is why PyConfig makes more sense as the global config, not a per-interpreter one. I think it was a mistake to store the config on PyInterpreterState. Keep in mind, though, that PyConfig was added several years before _PyRuntimeState existed. If _PyRuntimeState had been around, I'm sure that's where we would have kept the global config.

    Thus, it makes sense to move PyInterpreterState.config to _PyRuntimeState.config. If there's a need for a per-interpreter config later, we would do the following:

    • add a new PyInterpreterConfig with only the fields we need
    • add a new PyInterpreterState.config field of that type

    In fact, that is exactly what I'm proposing in PEP-684, where there is a need for creating an interpreter with various isolation options. That's what got me thinking about why PyConfig isn't good for customizing new interpreters. Even if we didn't move PyInterpreterState.config to _PyRuntimeState.config, PEP-684 would still not use PyConfig as the config to pass when creating a new interpreter. Using PyConfig would be confusing and an invitation for trouble. So I'd use a new PyInterpreterConfig either way. Consequently, having PyConfig on PyInterpreterState would be confusing, with no benefit.

    My intention with this BPO issue was to get our internal details aligned with what makes sense for a global config and set the stage for adding a proper per-interpreter config.

    @vstinner
    Copy link
    Member

    thus far we have had no actual use cases for initializing an interpreter
    with a different config

    I don't think that sub-interpreters should have config.install_signal_handlers=1. Same for config.configure_c_stdio=1. Only the main interpreter should handle signals and the configuration of C streams (stdio, stderr, stdio).

    I added _PyInterpreterState_SetConfig() to experiment changing config.bytes_warning. But this is no strong need to have an API to change it, so right now, there is still no API for that. If we allow to change it, it would be surprising to change all interpreters at once.

    I don't see why config.argv (sys.argv) must be the same in all interpreters. Same for config.module_search_paths (sys.path).

    For me, it makes sense to have a per-interpreter value for most PyConfig options.

    • there is no API for creating an interpreter with a different config

    There is _Py_NewInterpreter(int isolated_subinterpreter) to set config.config._isolated_interpreter. But since the feature is not "official", it remains a private API.

    currently, every interpreter uses (a copy of) the original PyConfig,
    completely unchanged

    Currently, it's possible to call Py_InitializeFromConfig() (or Py_Initialize()) multiple times and it changes the PyConfig of the current interpreter. See the pyinit_core_reconfigure() function. _testembed has an unit test for that: I had to implement this feature. Once, calling the function multiple times and it broke an application. See bpo-34008.

    @ncoghlan
    Copy link
    Contributor

    Moving the invariant bits would certainly make sense.

    As Victor notes, though, some things (like whether to install the signal handlers) should be different.

    @vstinner
    Copy link
    Member

    Moving the invariant bits would certainly make sense.

    IMO it's convenient to have a single structure for all "configuration", even if a few parameters are expected to be the same in all interpreters.

    Python/initconfig.c is already a long C file. It contains a lot of code to copy members, converts them from a dict/to a dict, check consistency, etc.

    It's also a little bit annoying (for me) to have a separated PyPreConfig structure int Python/preconfig.c which shares some code (like the "_PyPreCmdline" thing to parse "common" command line options).

    Well, that's the opinon of the author of PEP-587, biaised obviously ;-)

    @ncoghlan
    Copy link
    Contributor

    I agree the status quo has its benefits. However, it has the distinct downside that because each interpreter has its own storage for copied settings, other code has to assume the settings *might* be different in each interpreter, even if they're currently always the same.

    If the invariant bits are moved out, then it clearly indicates when code can safely assume that every interpreter will be seeing the same value.

    @vstinner
    Copy link
    Member

    If the invariant bits are moved out, then it clearly indicates when code can safely assume that every interpreter will be seeing the same value.

    What is the benefit of that? Do you have an example?

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @ericsnowcurrently ericsnowcurrently added 3.12 bugs and security fixes and removed 3.11 only security fixes labels Oct 31, 2022
    @vstinner
    Copy link
    Member

    there is no public API for getting the config or changing it

    Minor update. PEP 741 added PyConfig_Get() and PyConfig_Set() to get an set the current runtime configuration. The API doesn't leak implementation details, it's not directly related to PyConfig. For example, PyConfig_Get("module_search_paths") gets sys.path.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants