-
-
Notifications
You must be signed in to change notification settings - Fork 652
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Start incremental migration from rust-cpython to PyO3 #12110
Conversation
# Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
[ci skip-build-wheels]
# Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
# Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
I think the parallelism mechanism is similar to Rust-CPython iiuc: I didn't totally understand the section on the GIL and mutability. It'd be helpful if someone could check over that and confirm this is consistent with what we'd expect: https://pyo3.rs/v0.13.2/types.html. Are there other questions we should be asking in deciding if we want PyO3 vs. Rust-CPython? I imagine a performance benchmark, but not sure how to get a meaningful one because porting the core functionality of our extension (rule graph scheduler) is non-trivial. Fwit, PyO3 claims we can use references rather than ownership much more than with rust-cpython: https://pyo3.rs/v0.13.2/rust_cpython.html#ownership-and-lifetimes. |
@@ -0,0 +1,33 @@ | |||
// Copyright 2021 Pants project contributors (see CONTRIBUTORS.md). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to change this file organization at all for our FFI? I copied and pasted engine/src
's layout, but we can use this as an opportunity to reorganize things.
For one, I think that interface.rs
was way too big before, largely due to how hard it was with Rust-CPython to use types declared in other files. That is solved by PyO3 thanks to using normal Rust structs with normal visibility rules. We definitely want to break up interface.rs
, although I'm not sure how? For example, externs/interface/nailgun.rs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if we do want to change it, it probably makes sense to have that reorganization be a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, I would agree, but I think we can make some improvements along the way given this:
Because the migration is incremental, we can more safely improve our FFI implementation along the way, rather than merely preserving the status quo. For example, this PR makes the PyStubCAS API more ergnonomic.
Oh, @stuhood is this a blocker for ABI3?
|
Is the intent that |
Yes:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't resist taking a quick look at this after you linked it from the PyO3 issue. I'm excited to see this happening; please feel welcome to ping me if you want me to check in on any future reviews 🚀
// (e.g. Python 3.7 vs 3.8). | ||
println!("cargo:rerun-if-env-changed=PY"); | ||
|
||
if cfg!(target_os = "macos") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI this bit won't be necessary on the upcoming PyO3 0.14 (as it'll be done upstream for you).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Thanks!
Of the 4 justifications only the fourth is interesting. IIRC cpython did not use abi3 for perf reasons. Do you have any information on that? Are we trading one binary per OS for a loss in perf? This would be good to have a solid handle on before doing an incremental migration. Alternatively, a complete migration and resulting apples-apples perf numbers would of course be a fine avenue too. |
Does "interesting" == "compelling" here? If so, I disagree that the other reasons aren't compelling:
There's inherent value in things being easier to understand, and our project attempts to prioritize understanding and maintainability where possible. There's also inherent value in integrating with tooling like IDE IntelliSense. Compare
We have identified as a project that we prioritize bringing in new contributors. One of the best ways to do this is to make our code easier to understand.
This allows us to write more idiomatic and maintainable Rust code. For example, PyO3 works with
(This is a 5th reason). Better traction often (but not always) translates to a couple benefits:
--
I tried looking through their GitHub and couldn't find relevant references to "abi3" or "pep 384" explaining their rationale. But Stu explained it in #9593 as:
Iiuc, the performance thing is speculation from Stu. PyO3's docs on ABI 3 do not mention any perf hit, only these 3 limitations: https://pyo3.rs/v0.13.2/building_and_distribution.html#missing-features. I do see in Cython's issue some performance concerns with PEP 384, but not clear how much and if it would hit Pants and PyO3. Either way, I'm not proposing PyO3 exclusively because of PEP 384. It's not clear to me if we can even use it because we use |
ABI3 is definitely slower than building for a known Python version. We've been doing a lot of optimization work for 0.14 which makes use of APIs which aren't available for ABI3. There was already some perf gap and I'm certain it's gotten substantially wider, though I haven't yet made concrete benchmark numbers to back that claim up. (The docs for |
Ok, thanks @davidhewitt. @Eric-Arellano I think that takes the last justification off the table and makes this all simpler to think about. Is it worth the disruption for nicer APIs? Probably, but at least the motivation is now crisp. |
Fair enough - I agree. This is where an incremental migration comes in: reduce the size of the disruption. Because each change will be small, we can also clean up our FFI code with each PR, e.g. implementing this ticket Stu opened #11722. |
Another aspect worth considering here is whether Py03 eases or complicates embedding Python moving forward (or whether being mid-migration might do so). Entirely possible that it is neutral, but #7369 would be a higher priority than an API change, I think. It would make the ABI3 question moot, for example (since we'd embed a precise version). |
@stuhood see #7369 (comment) for PyOxidizer maintainer's response. Does that sound good? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. That risk seems manageable, especially since using PyOxidizer isn't scheduled, and if we think that we can push through a migration in less than ... 3 months or so?
Hopefully less! This migration is the main side project I want to do this summer, especially for me to gain more experience with Rust native extensions in prep for my EuroPython talk about the topic. |
…uild#12110)" This reverts commit 7a36876.
See #12110 for the original motivation. ## Why migrate Beyond the APIs requiring much less boilerplate and being more ergonomic (e.g. the `.call0()` and `call1()` variants of `.call()`), I think the biggest benefit is clearer modeling for when the GIL is held. With PyO3, the two [core types](https://pyo3.rs/v0.15.0/types.html) are `&PyAny` and `PyObject` (aka `Py<PyAny`). `&PyAny` is always a reference with the same lifetime as the `Python` token, which represents when the GIL is used. Whenever you want to do Python operations in Rust, like `.getattr()`, you use `&PyAny`. `PyObject` and any `Py<T>` are GIL-independent. In contrast, rust-cpython only had `PyObject` and `&PyObject`. It passed around `Python` as an argument to any Python functions like `.getattr()`. -- An additional benefit is that using proc macros instead of declarative macros means PyCharm no longer hangs for me when working in the `engine/` crate! PyCharm does much better w/ proc macros, and things feel zippy now like autocomplete working. ## Migration strategy I tried originally doing an incremental strategy, most recently with #12451, which proved technically infeasible. This PR completely swaps Rust-Cpython with PyO3, but tries to minimize the diff. It only makes changes when it was too difficult to get working with PyO3. For example, `interface.rs` is far too big, but this keeps the same organization as before. For now, we still have the `native_engine_pyo3` extension along with `native_engine`. That will be consolidated in a followup. ## Benchmark Before: ``` ❯ hyperfine -w 1 -r 10 './pants --no-pantsd list ::' Benchmark #1: ./pants --no-pantsd list :: Time (mean ± σ): 2.170 s ± 0.025 s [User: 1.786 s, System: 0.303 s] Range (min … max): 2.142 s … 2.227 s 10 runs ``` After: ``` ❯ hyperfine -w 1 -r 10 './pants --no-pantsd list ::' Benchmark #1: ./pants --no-pantsd list :: Time (mean ± σ): 2.193 s ± 0.060 s [User: 1.798 s, System: 0.292 s] Range (min … max): 2.144 s … 2.348 s 10 runs ```
Rust-cpython vs PyO3
While developing my talk on Rust native extensions, I found PyO3 has had lots of traction recently, including Python cryptography now using it. PyO3 started as a fork, but has since diverged: https://pyo3.rs/v0.13.2/rust_cpython.html.
We went with rust-cpython instead of PyO3 for Rust FFI in #9593 because, at the time, PyO3 required the Rust nightly compiler. While this was a huge win over CFFI, PyO3 would now bring us several benefits:
macro_rules!
DSL. This is more natural to work with and better understood by IDEs, rustfmt, etc.Python
. See https://pyo3.rs/v0.13.2/rust_cpython.html#error-handling.* Can use the Python stable ABI again, meaning we only need to build one wheel per platform, rather than platform x interpreter. (Altho we would need to stop usingWe're unlikely to use this due to performance hit.PyBuffer
API.) See https://pyo3.rs/v0.13.2/building_and_distribution.html#py_limited_apiabi3.The community has been very responsive too: PyO3/pyo3#1625, and they have an active Gitter room.
Incremental migration
To reduce risk and complexity, we use an incremental migration to PyO3 by building two distinct native extensions. At first,
native_engine_pyo3
will only have self-contained functionality, e.g.PyStubCAS
and, soon, the Nailgun server.Because the migration is incremental, we can more safely improve our FFI implementation along the way, rather than merely preserving the status quo. For example, this PR makes the
PyStubCAS
API more ergnonomic.When we reach critical mass with PyO3,
native_engine_pyo3
will be renamed tonative_engine
andnative_engine
will be changed tonative_engine_cpython
. (Or, we'll remove rust-cpython in one big swoop). This will result in some churn in our Python files (due to import module changing), but I argue that's worth the reduction in risk.