-
-
Notifications
You must be signed in to change notification settings - Fork 649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run python sources with a VenvPex #17700
Conversation
I marked this as cherry-pick back to 2.14.x, since I'd love to get the perf boost when upgrading to a minor release. (Hopefully the merge won't be painful) |
Cool change! I will review properly asap, but I'd be against the cherry-pick to 2.14 (or even 2.15). We really only want to backport bugfixes. This change is not a bugfix, and may be destabilizing. |
I highly recommend deferring the local_dists part of this change to a separate PR. I think I see a problem there, and it will be much easier to reason about if we do it separately. These are both potentially good changes, but they are also potentially disruptive, and if not reasoned about carefully could lead to subtle issues, so let's do one thing at a time? |
IIRC the local dists did come out of a speed bump I hit, although might not be there anymore. I'll split it out. While I'm doing that, come up with a unit test for what you're thinking, since everything passes currently. |
Ah I found the speed bump. |
@thejcannon I assume you ran into the fact that adding a PEX_PATH to a venv causes the venv to be re-built to include the items on the PEX_PATH in the new venv and so any older seed location is no longer the right one? |
Very specifically, I ran into a test failure where the code being run couldn't import modules from the local dist. What you describe would certainly explain the test failure, but I didn't look into it past "it failed, let's make it not fail". I'll keep this in mind for my own edification. As a related aside, our fondness of pex env vars in our process executions (for things that could be folded into the PEX) is only slightly alarming. But perhaps thats to maximize a particular PEX's re-usability? |
I think so. The number of perf hacks is alarming. Some day we'll get time to start to address the real costs of sandboxing and tune it specifically per-OS. Basically sandboxing just doesn't work, so we've added a ton of complexity to paper over that. Its sort of working, but pretty creaky. |
@@ -99,12 +99,17 @@ async def generate_python_from_protobuf( | |||
protoc_gen_mypy_script = "protoc-gen-mypy" | |||
protoc_gen_mypy_grpc_script = "protoc-gen-mypy_grpc" | |||
mypy_pex = None | |||
complete_pex_env = pex_environment.in_sandbox(working_directory=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A drive-by fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, below (as you may have seen) we now need to pass complete_pex_env
into VenvPexRequest
. We could default it, but I like the explicitness of forcing callers to choose.
src/python/pants/backend/python/goals/run_python_source_integration_test.py
Show resolved
Hide resolved
) -> VenvScriptWriter: | ||
# N.B.: We don't know the working directory that will be used in any given | ||
# invocation of the venv scripts; so we deal with working_directory once in an | ||
# `adjust_relative_paths` function inside the script to save rule authors from having to do | ||
# CWD offset math in every rule for all the relative paths their process depends on. | ||
complete_pex_env = pex_environment.in_sandbox(working_directory=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you stripped the comment of its claim. The now CompletePexEnvironment
may have a known working directory, which seems to suggest an assert complete_pex_env._working_directory is None, but that's obviously begging why let people pass this in at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I'm only partially grokking the comment, but it still feels true, albeit missing a qualifier.
Should it not start with "We may not know the ..."?
EDIT: No, if we may not know, then we don't know... so I think it's still OK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh and "why let people pass this in at all?": Because in the case of run
, working_directory
very much is set to something: the build root!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, missed that obvious case given the PR subject. As to the comment, it seems like "in any given invocation" -> "in any given sandbox invocation" then makes it True again with the creation of the complete_pex_env
now delegated upstream. Not worth another CI burn though.
PS if splitting off the local dists part of the change is disproportionally hard to do, then we don't absolutely have to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC I notice quite a few diff lines that were unrelated to the substantial purpose of the PR, but seem more related to a personal preference for inlining? Those can muddy the waters in a review, so for future changes I'd recommend considering whether that sort of subjective aesthetic choice is worth the diff lines... :)
Can you give an example? I suspect everything you're noticing is a result of |
# variables are not stripped. | ||
"--no-strip-pex-env", | ||
|
||
pex_request, sources = await MultiGet( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the example I was thinking of. AFAICT this diff is entirely due to inlining the two Gets into the MultiGet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, sorry that'd a byproduct of several refactors where it was part of the earlier MultiGet
(on line 51) which has them inlined.
I wish we had better lint tools and we could choose a convention 😮💨
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal at all, just something to think about in the future. I often do little cleanups like this in PRs and then realize it's contributing to the diff in ways that aren't directly pertinent to the change at hand...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great. That said, IIUC, this is a perf improvement that was actually along the lines of a footgun that let this slip out of the gate with poorer than needed perf. This PR just whacs the footgun mole and brings the perf up to par; so the emphasis in my mind here is that the existence of Pex
and VenvPex
is confusing and prone to perf errors.
Couldn't agree more (solely in the context of Pants) |
And the naming doesn't help. I naively originally thought a |
This change makes it so we use a `VenvPex` to run Python sources, which is a speed boost (I measure a gain of about ~500ms, which is also quoted in `pex.py`). In order to make this work (specifically ensuring we don't revert the fix for pantsbuild#12055) we now have to weave the complete pex environment through to `VenvPexRequest`.
This change makes it so we use a `VenvPex` to run Python sources, which is a speed boost (I measure a gain of about ~500ms, which is also quoted in `pex.py`). In order to make this work (specifically ensuring we don't revert the fix for #12055) we now have to weave the complete pex environment through to `VenvPexRequest`.
This change makes it so we use a
VenvPex
to run Python sources, which is a speed boost (I measure a gain of about ~500ms, which is also quoted inpex.py
).In order to make this work (specifically ensuring we don't revert the fix for #12055) we now have to weave the complete pex environment through to
VenvPexRequest
.