-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Give imported packages lower precedence than standard library. #6532
Conversation
This sounds great and seems it could also fix the issue we are running into with google/containerregistry#109 - where https://github.com/agronholm/pythonfutures is used for python 2 futures compatibility but it does not work with python 3, so there it should import the syslib but it is giving preference to the dependency. @brandjon if possible it would be great to get this in for the 0.20 release. |
Friendly ping - Jon, any thoughts as to whether a change like this could be workable? |
ping @brandjon - AFAICT this is part of improving python ecosystem support in bazel. |
Hi @brandjon and @aiuto - Can I ask for an update here? This PR was opened almost two months ago, and no feedback has been given. I understand if the change is more subtle/complex than it appears for google3-reasons, but I'd appreciate some indication of whether or not this (or something like it) is a viable direction. Thanks in advance! |
Hi Daniel, apologies for the delay. I've fallen behind on triaging some Python PRs as a result of some feature work on the 2/3 mode issue. I'm on a release manager rotation this week but aim to work through this backlog in the coming days. A quick comment about the substance of the change: Is it ever desirable in practice to continue to allow packages to override standard library modules? I.e., is that behavior consistent with any common workflow or default configuration in normal Python programming? (For example, ordinarily when there is a file whose name clashes with a standard library module, other files in the same directory will import it instead of the standard library. Though that particular example shouldn't be a concern for Bazel, since we use module names qualified by the package path.) If the answer is yes then we should provide some way to still get the old behavior, otherwise this is basically a bug fix and doesn't need to preserve compatibility. |
Hey Brandon - Thanks for getting back to me! I'm a relative newcomer to the Python ecosystem myself, so I can't speak to whether there are idiomatic examples of overriding the standard library. I would imagine that this ought to be discouraged. I believe I saw something about Bazel, in the future, allowing alternate template-stubs for Python "binaries". Perhaps this is the answer to projects that wish to use the legacy import-ordering? As you mentioned, it shouldn't be too much of an issue for Bazel, since it generally requires fully qualified package paths. |
True that having alternate stubs would help. I mainly ask because that feature's not available yet, and we're trying to be better about avoiding breaking changes unless they're gated behind an --incompatible flag that people can opt out of while they migrate. But I'm not sure this change requires that. |
What do you think of the following:
Thoughts? |
Hey Jon - Happy new year! 🎉 With the holidays behind us, I was hoping we could revisit and take another look at this change - It's been open for a few months now, with no concerns heard surrounding the "breaking"ness of the change. Is there anything I can do on my end to help move this forward, or help inform whether the change can be merged safely? Thanks in advance! |
PR #6532 proposes changing the semantics, but that'll require a test of what we're changing. If we decide to go ahead with that, the assertion in this new test case just needs to be flipped. RELNOTES: None PiperOrigin-RevId: 227876534
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "should we do this" question aside, we also should have a way to test this change. I wrote an integration test (034334d) that confirms the current semantics. If we make this change, the test assertion just needs to be flipped in this PR.
src/main/java/com/google/devtools/build/lib/bazel/rules/python/python_stub_template.txt
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/bazel/rules/python/python_stub_template.txt
Outdated
Show resolved
Hide resolved
Reviewed (see above) -- the PR sounds reasonable and it's just a matter of confirming with the community. I'm hoping to avoid the overhead of an incompatible change flag for this. |
src/main/java/com/google/devtools/build/lib/bazel/rules/python/python_stub_template.txt
Outdated
Show resolved
Hide resolved
Also -- what's the rationale for emitting a usercustomize file as opposed to just sticking them directly on the |
I think my rationale was that since Python provides a well-defined mechanism for configuring the |
While there are indeed many moving parts for determining how an import is resolved, it's not clear to me why usercustomize is a better hook than PYTHONPATH itself. I think of them as just two different alternative mechanisms without one necessarily being more preferred than the other. Actually, I would think of usercustomize as more appropriate for a persistent configuration change that applies to all builds done by a given user, a la bashrc. But I'm fine leaving this as-is. It's not hard to change it later if needed. (I imagine it might conflict with real usercustomize modules installed on the system, but consulting those would be non-hermetic anyway.) Mailing list results look pretty positive on just making this change. Will give it at least to later today before merging. |
I believe it may also have had to do with |
The current Bazel Python stub template prepends imported packages onto `PYTHONPATH`. This effectively gives them higher `import`-precedence than any directories: - included in `PYTHONPATH` by the user - included in `PYTHONPATH` or `sys.path` by the interpreter The later category includes the Python standard library. This can cause real and nontrivial problems for projects. If a project uses uses `pip_import` to depend on `flake8`, then it will also transitively import `enum34`, a library which provides compatibility with the `enum` library (introduced in Python 3.4) for pre-3.4 versions of Python, but expects never to be imported by Python>=3.4. The scheme currently used by Bazel will find the `enum34` module prior to finding the Standard Library `enum` module, even when run by Python3.7. The solution proposed here is to build and load a `usercustomize` module at program start, which will be loaded during program startup. This module appends the directory of each imported module to the `sys.path` variable, effectively making the packages available, but giving them lower precedence than built-in directories.
Ah, I assumed the standard libraries were on the I have a commit to update the failing test in accordance with this behavior change -- can you give access to the PR branch? |
I believe the "Allow edits from maintainers" option should already be enabled. |
My mistake, pushed. |
The failures look legit. Example failure for
It's failing to find the module entirely, not shadowing the module with something else of the same name, so it can't just be the order of |
I think the issue & reproduction I opened yesterday is relevant to this pull request: #7051 Do you think merging this PR fixes the problem? |
Yep; failure looks legit to me as well - I don't have access to a Windows box, so I'm going to need to iterate a bit on this, using your CI to help debug. |
@Profpatsch: No, this PR is about the case where a library declared within the Bazel build collides with a library setup by the system/environment, and resolves that case in favor of the system's library. The issue you filed is about two libraries clashing that are both declared within the Bazel build. I was actually just preparing a discussion about that problem, which is also reported in #6886 -- I'll post it on the bazel-sig-python list. |
Just to confirm: It sounds like this fixes issue #5899? |
Yes, thanks for the crosslink. |
780365b
to
9ecec27
Compare
|
See #7959 for other |
I realize that this is a very old branch, but is my reading correct that this change will make it harder for builds to isolate themselves from whatever random packages might be installed on the system? If so, this seems like a step backwards to me. I'd much rather bazel python targets declare all their dependencies and operate with the equivalent of Imagine the situation where the user is on a system that has package |
I'm not likely to have time or energy to revisit this PR in the near future - Would recommend closing for the time being, and restarting discussion of whether there's a better solution if it's still an issue for other users. |
Thanks for that update. |
The current Bazel Python stub template prepends imported packages onto
PYTHONPATH
. This effectively gives them higherimport
-precedencethan any directories:
PYTHONPATH
by the userPYTHONPATH
orsys.path
by the interpreterThe later category includes the Python standard library.
This can cause real and nontrivial problems for projects. If a project
uses uses
pip_import
to depend onflake8
, then it will alsotransitively import
enum34
, a library which provides compatibilitywith the
enum
library (introduced in Python 3.4) for pre-3.4 versionsof Python, but expects never to be imported by Python>=3.4. The scheme
currently used by Bazel will find the
enum34
module prior to findingthe Standard Library
enum
module, even when run by Python3.7.The solution proposed here is to build and load a
usercustomize
moduleat program start, which will be loaded during program startup. This
module appends the directory of each imported module to the
sys.path
variable, effectively making the packages available, but giving them
lower precedence than built-in directories.
cc: @brandjon