Run python sources with a VenvPex #17700

thejcannon · 2022-12-01T18:56:16Z

This change makes it so we use a VenvPex to run Python sources, which is a speed boost (I measure a gain of about 500ms, which is also quoted in pex.py).

In order to make this work (specifically ensuring we don't revert the fix for #12055) we now have to weave the complete pex environment through to VenvPexRequest.

thejcannon · 2022-12-01T18:57:40Z

I marked this as cherry-pick back to 2.14.x, since I'd love to get the perf boost when upgrading to a minor release. (Hopefully the merge won't be painful)

benjyw · 2022-12-02T02:53:27Z

Cool change! I will review properly asap, but I'd be against the cherry-pick to 2.14 (or even 2.15). We really only want to backport bugfixes. This change is not a bugfix, and may be destabilizing.

benjyw · 2022-12-02T03:01:38Z

I highly recommend deferring the local_dists part of this change to a separate PR. I think I see a problem there, and it will be much easier to reason about if we do it separately.

These are both potentially good changes, but they are also potentially disruptive, and if not reasoned about carefully could lead to subtle issues, so let's do one thing at a time?

thejcannon · 2022-12-02T12:21:38Z

IIRC the local dists change did come out of a speed bump I hit, although it might not be there anymore.

I'll split it out. While I'm doing that, come up with a unit test for what you're thinking, since everything passes currently.

thejcannon · 2022-12-02T16:08:52Z

Ah I found the speed bump. PEX_PATH doesn't work with VenvPex shenanigans, and so the local_dists pex needs to be a part of the VenvPex. It's doable, so I'll do that.

jsirois · 2022-12-02T20:00:04Z

PEX_PATH doesn't work with VenvPex shenanigans

@thejcannon I assume you ran into the fact that adding a PEX_PATH to a venv causes the venv to be re-built to include the items on the PEX_PATH in the new venv and so any older seed location is no longer the right one?
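
A toy model of the behavior described above, assuming (this is not Pex's actual cache layout) that a venv's location is derived from a hash of the PEX contents plus any `PEX_PATH` entries. The function name `venv_dir` and the `/cache/venvs` root are hypothetical; the only point is that adding a `PEX_PATH` entry yields a different location than the one seeded earlier.

```python
import hashlib
from typing import Sequence


def venv_dir(pex_hash: str, pex_path: Sequence[str] = ()) -> str:
    """Hypothetical: derive a venv location from the PEX and its PEX_PATH entries."""
    digest = hashlib.sha256()
    digest.update(pex_hash.encode())
    for entry in pex_path:
        # Every PEX_PATH entry contributes to the key, so the location changes.
        digest.update(b"\0" + entry.encode())
    return f"/cache/venvs/{digest.hexdigest()[:12]}"


seeded = venv_dir("abc123")  # location recorded when the venv was first built
with_local_dists = venv_dir("abc123", ["local_dists.pex"])  # PEX_PATH added later
print(seeded != with_local_dists)
```

In this model, anything that remembered `seeded` would keep pointing at a venv that lacks the local dists, which would be consistent with the import failure described in the reply below.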

thejcannon · 2022-12-02T20:05:19Z

PEX_PATH doesn't work with VenvPex shenanigans

@thejcannon I assume you ran into the fact that adding a PEX_PATH to a venv causes the venv to be re-built to include the items on the PEX_PATH in the new venv and so any older seed location is no longer the right one?

Very specifically, I ran into a test failure where the code being run couldn't import modules from the local dist. What you describe would certainly explain the test failure, but I didn't look into it past "it failed, let's make it not fail". I'll keep this in mind for my own edification.

As a related aside, our fondness for pex env vars in our process executions (for things that could be folded into the PEX) is only slightly alarming. But perhaps that's to maximize a particular PEX's re-usability?

jsirois · 2022-12-02T20:08:56Z

But perhaps that's to maximize a particular PEX's re-usability?

I think so. The number of perf hacks is alarming. Some day we'll get time to start to address the real costs of sandboxing and tune it specifically per-OS. Basically sandboxing just doesn't work, so we've added a ton of complexity to paper over that. It's sort of working, but pretty creaky.

jsirois · 2022-12-02T20:12:11Z

src/python/pants/backend/codegen/protobuf/python/rules.py

@@ -99,12 +99,17 @@ async def generate_python_from_protobuf(
    protoc_gen_mypy_script = "protoc-gen-mypy"
    protoc_gen_mypy_grpc_script = "protoc-gen-mypy_grpc"
    mypy_pex = None
+    complete_pex_env = pex_environment.in_sandbox(working_directory=None)


A drive-by fix?

No, below (as you may have seen) we now need to pass complete_pex_env into VenvPexRequest. We could default it, but I like the explicitness of forcing callers to choose.

src/python/pants/backend/python/goals/run_python_source_integration_test.py

jsirois · 2022-12-02T20:22:04Z

src/python/pants/backend/python/util_rules/pex.py

    ) -> VenvScriptWriter:
        # N.B.: We don't know the working directory that will be used in any given
        # invocation of the venv scripts; so we deal with working_directory once in an
        # `adjust_relative_paths` function inside the script to save rule authors from having to do
        # CWD offset math in every rule for all the relative paths their process depends on.
-        complete_pex_env = pex_environment.in_sandbox(working_directory=None)


It looks like you stripped the comment of its claim. The CompletePexEnvironment that is now passed in may have a known working directory, which seems to suggest an `assert complete_pex_env._working_directory is None`, but that obviously raises the question: why let people pass this in at all?

So I'm only partially grokking the comment, but it still feels true, albeit missing a qualifier.

Should it not start with "We may not know the ..."?

EDIT: No, if we may not know, then we don't know... so I think it's still OK

Oh and "why let people pass this in at all?": Because in the case of run, working_directory very much is set to something: the build root!

Aha, missed that obvious case given the PR subject. As to the comment, it seems like "in any given invocation" -> "in any given sandbox invocation" then makes it True again with the creation of the complete_pex_env now delegated upstream. Not worth another CI burn though.
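
For readers unfamiliar with the "CWD offset math" that the code comment refers to, here is a minimal sketch of the idea. It is an illustration only: the real `adjust_relative_paths` lives inside the generated venv script, and the Python function name, parameters, and `/sandbox` paths below are hypothetical.

```python
import posixpath


def adjust_relative_path(path: str, sandbox_root: str, cwd: str) -> str:
    """Re-express a sandbox-root-relative path relative to the actual cwd."""
    if posixpath.isabs(path):
        return path  # absolute paths need no offset
    return posixpath.relpath(posixpath.join(sandbox_root, path), start=cwd)


# Working directory is the sandbox root: the path is unchanged.
print(adjust_relative_path("venv/bin/python", "/sandbox", "/sandbox"))
# Working directory is two levels down: the path gains a "../.." offset.
print(adjust_relative_path("venv/bin/python", "/sandbox", "/sandbox/src/app"))
```

Doing this once in a shared helper is what saves each rule from repeating the same offset calculation for every relative path it passes along.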

benjyw · 2022-12-03T01:38:11Z

PS if splitting off the local dists part of the change is disproportionately hard to do, then we don't absolutely have to.

benjyw

IIUC I notice quite a few diff lines that were unrelated to the substantial purpose of the PR, but seem more related to a personal preference for inlining? Those can muddy the waters in a review, so for future changes I'd recommend considering whether that sort of subjective aesthetic choice is worth the diff lines... :)

src/python/pants/backend/python/goals/run_helper.py

thejcannon · 2022-12-05T18:20:26Z

IIUC I notice quite a few diff lines that were unrelated to the substantial purpose of the PR, but seem more related to a personal preference for inlining? Those can muddy the waters in a review, so for future changes I'd recommend considering whether that sort of subjective aesthetic choice is worth the diff lines... :)

Can you give an example? I suspect everything you're noticing is a result of black formatting a line that has exceeded our configured line length.

benjyw · 2022-12-05T18:30:56Z

src/python/pants/backend/python/goals/run_helper.py

-                # variables are not stripped.
-                "--no-strip-pex-env",
+
+    pex_request, sources = await MultiGet(


This was the example I was thinking of. AFAICT this diff is entirely due to inlining the two Gets into the MultiGet?

Ah yeah, sorry, that's a byproduct of several refactors where it was part of the earlier MultiGet (on line 51) which has them inlined.

I wish we had better lint tools and we could choose a convention 😮‍💨

Not a big deal at all, just something to think about in the future. I often do little cleanups like this in PRs and then realize it's contributing to the diff in ways that aren't directly pertinent to the change at hand...
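
To make the two styles under discussion concrete, here is a small sketch using `asyncio.gather` as a stand-in for `MultiGet`. This is an analogy, not Pants code: `fetch` and the request names are hypothetical, and the engine's `Get`/`MultiGet` have their own signatures.

```python
import asyncio


async def fetch(name: str) -> str:
    """Hypothetical stand-in for resolving one request."""
    await asyncio.sleep(0)
    return name


async def named_style() -> tuple:
    # Bind each request to a name first, then await them together.
    pex_request_get = fetch("pex_request")
    sources_get = fetch("sources")
    pex_request, sources = await asyncio.gather(pex_request_get, sources_get)
    return pex_request, sources


async def inlined_style() -> tuple:
    # Same requests, written inline in the combined call.
    pex_request, sources = await asyncio.gather(
        fetch("pex_request"),
        fetch("sources"),
    )
    return pex_request, sources


print(asyncio.run(named_style()) == asyncio.run(inlined_style()))
```

Both forms do the same work; switching between them changes only the diff, which is the reviewer's point about keeping stylistic churn out of a functional change.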

jsirois

This is great. That said, IIUC, this is a perf improvement that was actually along the lines of a footgun that let this slip out of the gate with poorer than needed perf. This PR just whacks the footgun mole and brings the perf up to par; so the emphasis in my mind here is that the existence of Pex and VenvPex is confusing and prone to perf errors.

thejcannon · 2022-12-05T20:37:34Z

Couldn't agree more (solely in the context of Pants)

thejcannon · 2022-12-05T20:38:31Z

And the naming doesn't help. I naively originally thought a --venv PEX was the speedup, instead of this particular hack/technique

Run Python sources using venvpex

4e0c71f

thejcannon added needs-cherrypick category:performance labels Dec 1, 2022

thejcannon added this to the 2.14.x milestone Dec 1, 2022

thejcannon requested review from stuhood, jsirois, benjyw and Eric-Arellano December 1, 2022 18:56

thejcannon mentioned this pull request Dec 1, 2022

Test DebugAdapter requests, and fix issues #17678

Merged

thejcannon removed the needs-cherrypick label Dec 2, 2022

thejcannon removed this from the 2.14.x milestone Dec 2, 2022

put local dists back

7ea32ac

jsirois reviewed Dec 2, 2022

benjyw approved these changes Dec 5, 2022

benjyw reviewed Dec 5, 2022

is not None

4b4760a

jsirois approved these changes Dec 5, 2022

thejcannon merged commit 780ff73 into pantsbuild:main Dec 5, 2022

thejcannon deleted the runvenv branch December 5, 2022 20:47

stuhood mentioned this pull request Jan 18, 2023

No such file or directory: 'local_dists.pex/PEX-INFO' when running a python_source #17987

Closed
