Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to PyO3 #433

Closed
wants to merge 3 commits into from
Closed

Switch to PyO3 #433

wants to merge 3 commits into from

Conversation

indygreg
Copy link
Owner

I'm opening this PR so we have a better place to look at the code diff.

Current state is I'm able to build a working binary with PyO3 on Linux. macOS probably works as well. Windows will need more work (and likely some changes to PyO3's build system).

There's run-time crash with release builds on Linux. I think we were tickling an existing bug, however, as the crash is due to memory safety related to argv handling during Python interpreter initialization. It is a heisenbug and goes away if I sprinkle print statements around the suspect code (which I was doing because the stack trace was nearly worthless since a lot of functions were inlined). I may have to break out rr to debug it.

Anyway, the build system integration is the piece of this PR that still needs the most work. The Rust code speaking to PyO3 in the pyembed crate should be pretty solid and ready for review. All tests pass against the PyO3 branch I'm running. At present time, that branch only has commits for which there are open PRs in the PyO3 project.

CC @davidhewitt if you feel like looking at how I'm using PyO3. The main think I know I'm not doing right is I haven't implemented a GC visitor for types that contain references to other Python types. But that was a preexisting bug I think.

@davidhewitt
Copy link

👍 I will gladly take a good read of this, I might not get around to it until next week though just FYI.

@indygreg
Copy link
Owner Author

I still need to get the oxidized-importer CI working. But basic PyOxidizer workflows appear to be working on this branch on Linux, Windows, and macOS!

This still needs a bit of cleanup before merge. And I'd like to get all the PyO3 changes this depends on landed in the upstream repo. But things are looking pretty good!

One unfortunate side-effect of the switchover is PyOxidizer's Rust projects now take longer to build. Only ~1.5s slower on my Ryzen 5950X. But in terms of CPU time we went from ~111s to ~165s. The slowdown is noticeable when running tests, which build many Rust projects as part of integration testing. sccache helps a bit. But we don't get many cache hits in tests due to use of temp directories. This newfound slowness might be a good excuse to improve test efficiency...

@davidhewitt
Copy link

One unfortunate side-effect of the switchover is PyOxidizer's Rust projects now take longer to build. Only ~1.5s slower on my Ryzen 5950X. But in terms of CPU time we went from ~111s to ~165s. The slowdown is noticeable when running tests, which build many Rust projects as part of integration testing. sccache helps a bit. But we don't get many cache hits in tests due to use of temp directories. This newfound slowness might be a good excuse to improve test efficiency...

Hmmm interesting. This is presumably due to using PyO3's proc macros instead of macro_rules? I wonder if there's much we can do to optimize them.

Also we had contributors previously who reduced llvm line counts (& compile times) by tweaking bits of the pyO3 codebase, I wonder if there's more that can be done there?

@davidhewitt
Copy link

(Just a heads up that my offer to read this is still standing, however I'm giving a talk on PyO3 (virtually) at the Rust Manchester meetup this week, so I'm a bit swamped thinking about that until then. Hopefully at a quiet point over next weekend 🤞)

Copy link

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an absolutely massive PR! I'm humbled by the amount of work you've put in here to try out PyO3 for PyOxidizer.

I've given this a brief read-through this evening and picked up on a few things that were interesting. There's kinda too much for me to read everything in full detail, however if there are particular bits you want to point me at I'm also happy to reread those?

pyembed/build.rs Outdated Show resolved Hide resolved
@@ -30,11 +37,11 @@ const SURROGATEESCAPE: &[u8] = b"surrogateescape\0";
/// The optional encoding is the name of a Python encoding. If not used,
/// the default system encoding will be used.
#[cfg(target_family = "unix")]
pub fn cpython_osstr_to_pyobject(
py: cpython::Python,
pub fn osstr_to_pyobject<'p>(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyO3 has conversions from OsStr and Path to Python objects. These may potentially be usable for you instead of the ffi you're doing in this file?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it looks like I'll be able to port to your conversion implementations. I'll track this in a TODO and get around to it sometime.

Comment on lines 66 to +74
///
/// This creates the Python module object.
///
/// We don't use the macros in the cpython crate because they are somewhat
/// We don't use the macros in the pyo3 crate because they are somewhat
/// opinionated about how things should work. e.g. they call
/// PyEval_InitThreads(), which is undesired. We want total control.
#[allow(non_snake_case)]
pub extern "C" fn PyInit_oxidized_importer() -> *mut oldpyffi::PyObject {
let py = unsafe { cpython::Python::assume_gil_acquired() };
pub extern "C" fn PyInit_oxidized_importer() -> *mut pyffi::PyObject {
let py = unsafe { Python::assume_gil_acquired() };

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually removed the call to PyEval_InitThreads in PyO3/pyo3#1630

... so you might find you're able to use the #[pymodule] proc macro fine?!

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I can. That's because I have to define PyModuleDef::m_size to hold module state. I might be able to get a reference to the struct somehow at run-time to populate it accordingly. But since I already have this code written, I'll probably keep it as is.

pyembed/src/extension.rs Outdated Show resolved Hide resolved
if !unsafe { oldpyffi::PyErr_Occurred() }.is_null() {
return Err(cpython::PyErr::fetch(py));
if !unsafe { pyffi::PyErr_Occurred() }.is_null() {
return Err(PyErr::fetch(py).expect("Python exception should be set"));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting to see this. Makes me think PyErr::api_call_failed may want to be a public API of PyO3?


let optimize_level = match optimize_value {
0 => Ok(OptimizeLevel::Zero),
1 => Ok(OptimizeLevel::One),
2 => Ok(OptimizeLevel::Two),
_ => Err(cpython::PyErr::new::<cpython::exc::ValueError, _>(
py,
_ => Err(PyValueError::new_err(
"unexpected value for sys.flags.optimize",
)),
}?;

let capsule = unsafe {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be interested in potentially working through PyO3/pyo3#1193 sometime, looks like you might want it (and / or want to weigh in on its design).

pyembed/src/interpreter.rs Show resolved Hide resolved
@indygreg
Copy link
Owner Author

I'm attempting to port this series to use the 0.14 branch. It might be possible. But I won't know until I'm done.

I do know that pyo3-build-config::InterpreterConfig isn't a public symbol. Having it as a public symbol would save me from having to reinvent PyO3 config file writing logic.

@indygreg
Copy link
Owner Author

OK. My local tests and CI say I have things working against 0.14.4!

The linking behavior can be extremely finicky though and getting it just right has required days of effort in total. So I'm going to sit on this and keep refining the commits until I have more confidence this works and doesn't have undesirable regressions.

I think I can convince myself to land this without the additional InterpreterConfig fields requested in PyO3/pyo3#1793 because the behavior is essentially the same as it was with cpython/python3-sys. However, said behavior is extremely hacky and brittle because essentially what I'm doing is having python3-sys/pyo3 emit partial cargo: lines to control linking behavior and then I supplement the missing pieces in pyembed's build script. I would strongly prefer for pyo3's build script to emit all the cargo: lines needed to link libpython. Because PyOxidizer's libpython is so different from normal Python distributions, it is better to have the generic InterpreterConfig extension mechanism rather than support my specific needs in the PyO3 project.

What I'm trying to say is I think I can merge this against 0.14.4. But I'd strongly prefer PyO3/pyo3#1793 land so I can greatly simplify and de-risk PyOxidizer in the long term.

Also, I don't think I need the links = Cargo.toml annotation either if I use pyo3_build_config::get(). (But I still think adding that annotation is a good idea.)

@davidhewitt
Copy link

Brilliant!

It looks like I'll need to release a 0.14.5 anyway, so let's aim to merge the non-breaking bits of PyO3/pyo3#1793 so that PyOxidizer is derisked 👍

@indygreg
Copy link
Owner Author

indygreg commented Sep 1, 2021

OK. I've rebased onto the 0.14.5 release branch (commit 9dd16a828cedc583425088fb7221cffcc7058e2a) and things appear to be working great!

One small nit is the introduction of pyo3-build-config as a dependency in the pyoxidizer crate is requiring a Python interpreter be on PATH. This is a requirement that previously only existed for the pyembed crate (which holds all the Python API code). This dependency exists because pyo3-build-config has a build script that attempts to resolve a Python interpreter.

@davidhewitt: would you accept a patch to pyo3-build-config that adds a Cargo feature to effectively make the build script no-op? This would allow use of pyo3-build-config in library mode and would help ensure the crate builds in more environments.

@davidhewitt
Copy link

Awesome!

@davidhewitt: would you accept a patch to pyo3-build-config that adds a Cargo feature to effectively make the build script no-op? This would allow use of pyo3-build-config in library mode and would help ensure the crate builds in more environments.

Yes, but I'd prefer the feature work slightly differently, for the sake of being additive. I think it could be called something like resolve-config or embed-config and should enable the build script doing actual work, and maybe even add the pyo3_build_config::get API (and perhaps all the hidden exports used in pyo3's build script too). If it was made a default feature (maybe additionally required by both pyo3 and pyo3-macros-backend for good measure), then it should be a non-breaking addition. I can then remove it from being a default feature in 0.15, because maturin would eventually like to use parts of pyo3-build-config too and won't need the config resolution either.

I think with that design, you'd be able to depend on pyo3-build-config with default-features = false? How does that sound?

@indygreg
Copy link
Owner Author

indygreg commented Sep 2, 2021

I think with that design, you'd be able to depend on pyo3-build-config with default-features = false? How does that sound?

This sounds reasonable. Let me to try to throw up a PR in the next few hours.

@indygreg
Copy link
Owner Author

indygreg commented Sep 2, 2021

PyO3/pyo3#1856

@davidhewitt
Copy link

👍 that commit is now merged and also cherry-picked onto the release-0.14.5 branch - do you want to give it another try? Fingers crossed we're all good for the PyO3 release now! 🤞

@indygreg
Copy link
Owner Author

indygreg commented Sep 2, 2021

👍 that commit is now merged and also cherry-picked onto the release-0.14.5 branch - do you want to give it another try? Fingers crossed we're all good for the PyO3 release now! 🤞

error: couldn't read /home/gps/src/pyoxidizer.git/target/debug/build/pyo3-build-config-412784ceff9b2afb/out/pyo3-build-config-file.txt: No such file or directory (os error 2)
  --> /home/gps/.cargo/git/checkouts/pyo3-a22e69bc62b9f0fd/c3796c4/pyo3-build-config/src/lib.rs:82:27
   |
82 | const CONFIG_FILE: &str = include_str!(concat!(env!("OUT_DIR"), "/pyo3-build-config-file.txt"));
   |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: this error originates in the macro `include_str` (in Nightly builds, run with -Z macro-backtrace for more info)

error: couldn't read /home/gps/src/pyoxidizer.git/target/debug/build/pyo3-build-config-412784ceff9b2afb/out/pyo3-build-config-abi3.txt: No such file or directory (os error 2)
  --> /home/gps/.cargo/git/checkouts/pyo3-a22e69bc62b9f0fd/c3796c4/pyo3-build-config/src/lib.rs:87:27
   |
87 | const ABI3_CONFIG: &str = include_str!(concat!(env!("OUT_DIR"), "/pyo3-build-config-abi3.txt"));
   |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: this error originates in the macro `include_str` (in Nightly builds, run with -Z macro-backtrace for more info)

error: couldn't read /home/gps/src/pyoxidizer.git/target/debug/build/pyo3-build-config-412784ceff9b2afb/out/pyo3-build-config.txt: No such file or directory (os error 2)
  --> /home/gps/.cargo/git/checkouts/pyo3-a22e69bc62b9f0fd/c3796c4/pyo3-build-config/src/lib.rs:92:27
   |
92 | const HOST_CONFIG: &str = include_str!(concat!(env!("OUT_DIR"), "/pyo3-build-config.txt"));
   |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: this error originates in the macro `include_str` (in Nightly builds, run with -Z macro-backtrace for more info)

error: aborting due to 3 previous errors

I guess we did need to conditionalize some of the variables and APIs after all. I'll get another PR up soon.

@indygreg
Copy link
Owner Author

indygreg commented Sep 3, 2021

PyO3/pyo3#1859 is up. I cherry picked this to the release-0.14.5 branch and pushed to my fork as the pyoxidizer branch and rebased my series to use that branch. Everything looks good now.

@davidhewitt
Copy link

Thanks for fixing. I've added that commit to the 0.14.5 release branch now. I'll put it live on the weekend, looks like we've made it through the initial challenges!

This started out as a series of commits. But about 3-4 commits in
the hackery required to get cpython and pyo3 to play well together
was too much and writing microcommits proved to be too difficult.
So I just squased everything together into one giant commit.

This commit will break PyOxidizer in some configurations because the
crate configuration mechanism changed. This will be addressed in
a subsequent commit: this commit is already too large as it is.
This commit finishes the port of PyOxidizer to PyO3.

This entails a lot of changes. The gist of it is we write out
a PyO3 configuration file to configure that crate. Then we retool
pyembed and various Rust project building code to use that config
file and PyO3's configuration mechanisms. There are a lot of
changes in here.
After the switch to PyO3, the limitation documented by this disabled
test is no longer true. Since we provide PyO3 a config file with the
Python interpreter settings, its build script doesn't invoke a Python
interpreter and we should be able to cross build.
@indygreg
Copy link
Owner Author

indygreg commented Sep 6, 2021

I was able to rebase on top of the 0.14.5 release and things look great on my end. I'm shoring up a few things before I merge this. But hopefully within the next 48 hours.

I'm super excited about this migration:

  • The ergonomics of the Rust API make it vastly simpler for me to hack on the code interfacing with the Python interpreter.
  • It unblocks the ability to cross-compile binaries since pyo3's crates don't need to run a python when provided a config file.
  • It unblocks a lot of flexibility to move code around since we're no longer constrained by limitations in how libpython was hackily linked before. This should also lead to improved ergonomics around how builds work and making it easier for people to hack on Rust-based projects (as opposed to having PyOxidizer derive a Rust project for them).

So many great wins. Thank you, @davidhewitt, for your patience and willingness to work with me to get us to this point! I understand PyOxidizer is a bit of an edge case in terms of feature requests. I won't forget your support.

@indygreg
Copy link
Owner Author

indygreg commented Sep 6, 2021

This is now pushed to the main branch. So closing this PR.

Thanks again, @davidhewitt!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants