-
-
Notifications
You must be signed in to change notification settings - Fork 637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Port from cffi to the cpython crate. #9593
Conversation
ce80b6b
to
e38e759
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Epic! Have you done any profiling to check if this impacts performance?
src/python/pants/engine/scheduler.py
Outdated
@@ -132,45 +128,13 @@ def __init__( | |||
|
|||
def graph_trace(self, session, execution_request): | |||
with temporary_file_path() as path: | |||
self._native.lib.graph_trace(self._scheduler, session, execution_request, path.encode()) | |||
self._native.lib.graph_trace(self._scheduler, session, execution_request, path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, so it's safe to serialize unicode? Neat!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust and Python3 both use utf8-encoded unicode strings as their main string primitives (which is a very sane decision), so I'm not surprised this could be made to work automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not inherently zero copy (some conversion still happens at the call boundary), but it is possible to do some things as zero copy by accepting a PyString
on the rust side, and then temporarily viewing it as a Cow<str>
(copy on write &str
): https://docs.rs/cpython/0.5.0/cpython/struct.PyString.html#method.to_string
src/python/pants/engine/scheduler.py
Outdated
@@ -132,45 +128,13 @@ def __init__( | |||
|
|||
def graph_trace(self, session, execution_request): | |||
with temporary_file_path() as path: | |||
self._native.lib.graph_trace(self._scheduler, session, execution_request, path.encode()) | |||
self._native.lib.graph_trace(self._scheduler, session, execution_request, path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust and Python3 both use utf8-encoded unicode strings as their main string primitives (which is a very sane decision), so I'm not surprised this could be made to work automatically.
82a10c8
to
396e3c0
Compare
### Problem Two tests are failing on master, one of which affected #9593, and both of which repro locally. Not sure if this is Python version differences. [ci skip-rust-tests] [ci skip-jvm-tests]
be78faf
to
284d590
Compare
I've updated the description to cover some other benefits and downsides. I'll continue working on this in the background, but we'll likely want to drop Centos6 before attempting to ship more wheels: see #7288. |
284d590
to
3f6788b
Compare
7db2f1b
to
4c6283b
Compare
6a76ab1
to
980794e
Compare
980794e
to
65a87b6
Compare
I was able to fix the rust tests shard, and thanks to all of @Eric-Arellano's hard work this is now unblocked and reviewable. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay! Great work!
// NB: When built with Python 3, `native_engine.so` only works with a Python 3 interpreter. | ||
// When built with Python 2, it works with both Python 2 and Python 3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stale. We always use Py3, but it is true that Py3.6 != Py3.7 etc.
Value(Arc::new(handle)) | ||
} | ||
|
||
// NB: Longer name because overloaded in a few places. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is a longer name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consume_into_py_object
@@ -0,0 +1,1608 @@ | |||
// Copyright 2020 Pants project contributors (see CONTRIBUTORS.md). | |||
// Licensed under the Apache License, Version 2.0 (see LICENSE). | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not generated, right? It's pretty big so didn't do a close review. Looks good from a skim. Lmk if I should ready closely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the file formerly known as engine_cffi/src/lib.rs
: if you look at the commits independently, you can mostly ignore the second one, as it's a rename.
[ci skip-jvm-tests]
65a87b6
to
6f14ebc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely a nicer API performance completely aside.
[ci skip-jvm-tests]
… shard to a copy of `libpython3.6m.so.1.0`. [ci skip-jvm-tests]
6f14ebc
to
2f98835
Compare
## Rust-cpython vs PyO3 While developing my talk on Rust native extensions, I found PyO3 has had lots of traction recently, including Python cryptography now using it. PyO3 started as a fork, but has since diverged: https://pyo3.rs/v0.13.2/rust_cpython.html. We went with rust-cpython instead of PyO3 for Rust FFI in #9593 because, at the time, PyO3 required the Rust nightly compiler. While this was a huge win over CFFI, PyO3 would now bring us several benefits: * Uses procedural macros, rather than a custom `macro_rules!` DSL. This is more natural to work with and better understood by IDEs, rustfmt, etc. * Great documentation, including a book: https://pyo3.rs/v0.13.2/index.html. * Better error handling, especially due to not requiring `Python`. See https://pyo3.rs/v0.13.2/rust_cpython.html#error-handling. ~* Can use the Python stable ABI again, meaning we only need to build one wheel per platform, rather than platform x interpreter. (Altho we would need to stop using `PyBuffer` API.) See https://pyo3.rs/v0.13.2/building_and_distribution.html#py_limited_apiabi3.~ We're unlikely to use this due to performance hit. The community has been very responsive too: PyO3/pyo3#1625, and they have an active Gitter room. ## Incremental migration To reduce risk and complexity, we use an incremental migration to PyO3 by building two distinct native extensions. At first, `native_engine_pyo3` will only have self-contained functionality, e.g. `PyStubCAS` and, soon, the Nailgun server. Because the migration is incremental, we can more safely improve our FFI implementation along the way, rather than merely preserving the status quo. For example, this PR makes the `PyStubCAS` API more ergnonomic. When we reach critical mass with PyO3, `native_engine_pyo3` will be renamed to `native_engine` and `native_engine` will be changed to `native_engine_cpython`. (Or, we'll remove rust-cpython in one big swoop). This will result in some churn in our Python files (due to import module changing), but I argue that's worth the reduction in risk.
Problem
When initially porting the engine to rust, we used
cffi
because it had a mode that avoided the need to actually link against Python. But somewhere along the way, we began linking against Python, and needed to add C shims to expose a proper module. Though complex, it has served us well for a few years.But during that same time, Rust python-integration crates like
cpython
have matured, providing significant safety and usability benefits over a raw C API.Solution
Port to the
cpython
crate. When compared to usage of CFFI, this eliminates the passing of raw pointers back and forth across the ffi boundary (improving errors), and allows for direct access to Python objects from Rust (while the GIL is acquired). Thecpython
crate was chosen over PyO3 because it doesn't require a nightly Rust compiler, and is vouched for by the mercurial developers.The
cpython
crate does not use PEP-384, and updating it to do so would be a significant undertaking: for example, thePyObject::call
interface relies onPyTuple
, which is not present under PEP-384. It would likely also erase some of the performance and usability benefit of this API. So in order to switch to this crate, @Eric-Arellano did a ton of pre-work (culminating in #9881) to switch away from publishing ABI3 wheels, and toward publishing a wheel-per-python3.Result
./pants list ::
is 8-10% faster, usage is easier and safer, and there is less code to maintain. Additionally, since pyoxidizer (see #7369) relies on thecpython
crate, this helps to unblock potentially switching to distributing Pants as a native binary with an embedded Python.