Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build configuration settings to facilitate complete link control #1793

Merged
merged 2 commits into from
Aug 31, 2021

Conversation

indygreg
Copy link
Contributor

This PR makes a handful of changes to the crate build configuration to support advanced control over linking, which is required by PyOxidizer.

I wrote extensive commit messages detailing each change. Please read the commit messages for more context.

I'm creating this as a draft until I have more confidence that these changes are sufficient to support PyOxidizer on all platforms. I figured I'd publish this as a PR in hopes of getting feedback about the general direction sooner.

The first commit is generally useful and could be cherry picked to main if you want.

@indygreg indygreg force-pushed the build-settings branch 2 times, most recently from 5ae8102 to 169e58b Compare August 14, 2021 22:34
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Overall I haven't really read the code super hard yet, I wanted to comment and ask questions about the high-level first. Covering each commit in turn:

  1. Yep this is much appreciated, thank you for writing this. It could be worth mentioning the defaults on deserialization (if the corresponding config key is missing)? In theory only version is required, depending on use case.
  2. I'm suprised that cargo/rustc doesn't have what you need already. Can you give an example of the lines you need to emit? It seems like something like using RUSTFLAGS env var to pass link args might do what you need?
  3. The links = "python" bit is a good idea, I think it probably does need to happen. A couple of questions about the use case though:
  • There's already pyo3_build_config::get() which can be used from build scripts to get the resolved InterpreterConfig. If that were stabilised and documented would it meet your needs?
  • If the above is no good, what about passing config file contents through the DEP_ env var, rather than writing out to a path? Seems like an extra step which may not be needed, however I'm happy to cede on this point.
  1. Yep I think having this option makes sense as an advanced feature (it's basically the extension-module cargo feature on steroids).

pyo3-build-config/src/impl_.rs Outdated Show resolved Hide resolved
Cargo.toml Outdated Show resolved Hide resolved
@indygreg
Copy link
Contributor Author

2. I'm suprised that cargo/rustc doesn't have what you need already. Can you give an example of the lines you need to emit? It seems like something like using RUSTFLAGS env var to pass link args might do what you need?

RUSTFLAGS isn't sufficient because you can't easily target single crates with environment variables.

One use case is we want to tell pyo3 to link additional static libraries. This is because PyOxidizer can derive a custom libpython. That libpython is defined as a static library of CPython's object files plus static libraries from other libraries, like libssl. It is easier to point Cargo at a bunch of static libraries than to assemble all the requisite object files into a single libpython static library/archive. The caveat is it currently fragments the build across multiple crates, as we need to emit cargo:rustc-link-* lines outside pyo3. But this feature will enable pyo3 to consume/link a working libpython and enable PyOxidizer to simplify build logic.

We also want to make use of cargo:rustc-link-lib=static-nobundle=pythonXY. Although come to think of it, if we tell pyo3 about all the static libraries, we may no longer need to use that static-nobundle hack!

* There's already `pyo3_build_config::get()` which can be used from build scripts to get the resolved `InterpreterConfig`. If that were stabilised and documented would it meet your needs?

Maybe? I'm not sure if that will work correctly since it relies on env!("OUT_DIR"). Won't the pyo3-build-config crate get built multiple times and have multiple configurations in play? By passing state through cargo:variable + environment variables, you ensure state is correct.

* If the above is no good, what about passing config file contents through the `DEP_` env var, rather than writing out to a path? Seems like an extra step which may not be needed, however I'm happy to cede on this point.

The commit message explains my reasoning on this topic.

@indygreg indygreg marked this pull request as ready for review August 15, 2021 00:32
@indygreg
Copy link
Contributor Author

I've got PyOxidizer working with this PR!

As hypothesized by my last comment, I was able to get rid of the cargo:rustc-link-lib=static-nobundle=pythonXY Windows hack because the ability to emit extra cargo:rustc-link-lib lines via the config file means that the crate linking libpython now has total awareness of the Python library and we no longer need to supplement that linking awareness with another crate. I have yet to clean up PyOxidizer's code, but this should simplify things considerably!

@davidhewitt
Copy link
Member

I've got PyOxidizer working with this PR!

Awesome! I think that if we need to add the links = "python" mechanism then it might take a bit longer for this to be released, because I wasn't planning to do another breaking release (0.15) too soon.

So in that spirit, it might be worth trying out pyo3_build_config::get() ? As long as all uses of pyo3_build_config are as [build-dependency] and/or dependencies of proc macro crates then I understood that it'd only be built once, for the host configuration.

(And I was expecting that you'd have the PYO3_BUILD_CONFIG env var present for the whole compilation.)

If pyo3_build_config::get() works for your use-case, then we can also avoid a bit of the complexity of adding a file-writing mechanism.

It's worth noting I had a bit of pain with sccache interacting with the config-file-writing - it wasn't reliably making it available to downwind crates. It was probably related to my (ab)use of getting PyO3's build script to write to the pyo3_build_config OUT_DIR. That was one of the additional motivations of #1758 - I made pyo3_build_config inline a lot more of the configuration to avoid bad interactions with sccache.

The commit message explains my reasoning on this topic.

Looks like you considered having multiple DEP_ keys? I was thinking of having a single cargo:PYO3_BUILD_CONFIG=<serialized config file> key. It might depend whether we can escape newlines and have a single key like this.

@davidhewitt davidhewitt mentioned this pull request Aug 15, 2021
indygreg added a commit to indygreg/pyo3 that referenced this pull request Aug 21, 2021
PyOxidizer has crates depending on `pyo3` that would like to access
the `pyo3` crate configuration. (This use case isn't unique to
PyOxidizer.)

Cargo has a facility for enabling the build scripts of dependent
crates to access _exported_ variables via `DEP_` environment
variables. However, this only works if the exporting crate defines a
`links` key in its Cargo manifest. See
https://doc.rust-lang.org/cargo/reference/build-scripts.html#the-links-manifest-key.

While `pyo3`'s build script doesn't yet export the variables that
PyOxidizer will need, a prerequisite to making this work is adding
the `links` key. Since this change could introduce unintended
side-effects, it warrants being made in its own commit, which is
why we're making this change outside of PyO3#1793.

I _think_ this change should be mostly safe: the `links` key is
effectively metadata advertising that a crate links against a named
library. The only side-effects setting it has is to enable the
aforementioned `DEP_` environment variables in build scripts and
enforcing a limitation that only a single crate may link against the
same native library. I believe the only potential for this change
to cause problems is if there are multiple crates with `links =
"python"` entries. I'm not aware of any other crates that advertise
`links = "python"`: even `python3-sys` / `cpython` use `links =
"python3"` so this change should not prevent dual use of `pyo3` and
`cpython` in the same build.
indygreg added a commit to indygreg/pyo3 that referenced this pull request Aug 21, 2021
PyOxidizer has crates depending on `pyo3` that would like to access
the `pyo3` crate configuration. (This use case isn't unique to
PyOxidizer.)

Cargo has a facility for enabling the build scripts of dependent
crates to access _exported_ variables via `DEP_` environment
variables. However, this only works if the exporting crate defines a
`links` key in its Cargo manifest. See
https://doc.rust-lang.org/cargo/reference/build-scripts.html#the-links-manifest-key.

While `pyo3`'s build script doesn't yet export the variables that
PyOxidizer will need, a prerequisite to making this work is adding
the `links` key. Since this change could introduce unintended
side-effects, it warrants being made in its own commit, which is
why we're making this change outside of PyO3#1793.

I _think_ this change should be mostly safe: the `links` key is
effectively metadata advertising that a crate links against a named
library. The only side-effects setting it has is to enable the
aforementioned `DEP_` environment variables in build scripts and
enforcing a limitation that only a single crate may link against the
same native library. I believe the only potential for this change
to cause problems is if there are multiple crates with `links =
"python"` entries. I'm not aware of any other crates that advertise
`links = "python"`: even `python3-sys` / `cpython` use `links =
"python3"` so this change should not prevent dual use of `pyo3` and
`cpython` in the same build.
@indygreg
Copy link
Contributor Author

I'm starting to split functionality from this into separate PRs to reduce the scope of this one.

I understand why we can't land some of this functionality in 0.14. But I want to make it clear that I'm blocked on adopting PyO3 in PyOxidizer until some aspects of this PR land. I'm on the fence about merging my PyO3 migration branch until there is a published pyo3 crate with the functionality I need. I've temporarily used Git branches/commits before and gotten burned because they block my ability to release a new version to crates.io. I don't like being blocked by factors outside my control.

With my discovery that cargo:rustc-link-lib=static-nobundle=pythonXY is no longer needed, it might be possible for me to reduce scope of this PR somewhat. But I'm pretty sure I need the links = Cargo.toml entry (#1819) in order to enable downstream crates to access exported variables. IMO that PR is backwards incompatible and requires a minor version bump. So PyOxidizer's dependence on 0.15 seems inevitable.

@davidhewitt
Copy link
Member

👍 I've merged the piece on interpreter config docs and also the new PyStringData api just earlier.

I'm on the fence about merging my PyO3 migration branch until there is a published pyo3 crate with the functionality I need. I've temporarily used Git branches/commits before and gotten burned because they block my ability to release a new version to crates.io. I don't like being blocked by factors outside my control.

I fully agree that having a git dependency blocking release for you is undesirable. Most of what you need is going into 0.14.3, which I'm about to release shortly.

If we can find ways to get the rest of what you need for PyOxidizer in as non-breaking changes I'd be happy to backport / cherry-pick to make a 0.14.4 to get you everything that you need.

I guess we could also release 0.15 before long to make breaking changes if necessary. However I do like to try to include new features (or other improvements) relevant to most users alongside breaking changes so that people have a reason to upgrade / don't get burned out by churn. 😄

indygreg added a commit to indygreg/pyo3 that referenced this pull request Aug 23, 2021
PyOxidizer has crates depending on `pyo3` that would like to access
the `pyo3` crate configuration. (This use case isn't unique to
PyOxidizer.)

Cargo has a facility for enabling the build scripts of dependent
crates to access _exported_ variables via `DEP_` environment
variables. However, this only works if the exporting crate defines a
`links` key in its Cargo manifest. See
https://doc.rust-lang.org/cargo/reference/build-scripts.html#the-links-manifest-key.

While `pyo3`'s build script doesn't yet export the variables that
PyOxidizer will need, a prerequisite to making this work is adding
the `links` key. Since this change could introduce unintended
side-effects, it warrants being made in its own commit, which is
why we're making this change outside of PyO3#1793.

I _think_ this change should be mostly safe: the `links` key is
effectively metadata advertising that a crate links against a named
library. The only side-effects setting it has is to enable the
aforementioned `DEP_` environment variables in build scripts and
enforcing a limitation that only a single crate may link against the
same native library. I believe the only potential for this change
to cause problems is if there are multiple crates with `links =
"python"` entries. I'm not aware of any other crates that advertise
`links = "python"`: even `python3-sys` / `cpython` use `links =
"python3"` so this change should not prevent dual use of `pyo3` and
`cpython` in the same build.
@davidhewitt
Copy link
Member

Based on indygreg/PyOxidizer#433, it sounds like if we change this PR to just contain just the two commits adding extra options to InterpreterConfig then we can proceed to merge this and include it in 0.14.5. (We don't need the mechanism to write the file out to the DEP_-advertised location any more, nor the links key?)

indygreg added a commit to indygreg/pyo3 that referenced this pull request Aug 30, 2021
PyOxidizer has crates depending on `pyo3` that would like to access
the `pyo3` crate configuration. (This use case isn't unique to
PyOxidizer.)

Cargo has a facility for enabling the build scripts of dependent
crates to access _exported_ variables via `DEP_` environment
variables. However, this only works if the exporting crate defines a
`links` key in its Cargo manifest. See
https://doc.rust-lang.org/cargo/reference/build-scripts.html#the-links-manifest-key.

While `pyo3`'s build script doesn't yet export the variables that
PyOxidizer will need, a prerequisite to making this work is adding
the `links` key. Since this change could introduce unintended
side-effects, it warrants being made in its own commit, which is
why we're making this change outside of PyO3#1793.

I _think_ this change should be mostly safe: the `links` key is
effectively metadata advertising that a crate links against a named
library. The only side-effects setting it has is to enable the
aforementioned `DEP_` environment variables in build scripts and
enforcing a limitation that only a single crate may link against the
same native library. I believe the only potential for this change
to cause problems is if there are multiple crates with `links =
"python"` entries. I'm not aware of any other crates that advertise
`links = "python"`: even `python3-sys` / `cpython` use `links =
"python3"` so this change should not prevent dual use of `pyo3` and
`cpython` in the same build.
@indygreg
Copy link
Contributor Author

Based on indygreg/PyOxidizer#433, it sounds like if we change this PR to just contain just the two commits adding extra options to InterpreterConfig then we can proceed to merge this and include it in 0.14.5. (We don't need the mechanism to write the file out to the DEP_-advertised location any more, nor the links key?)

Yeah, assuming pyo3_build_config::get() from my crate's build script won't diverge from what pyo3`` is actually using using, I just need the new fields in InterpreterConfigto provide explicit control overcargo:` directives. I'll rebase this PR accordingly.

PyOxidizer needs to do some... questionable things with regards to
configuring how the Python interpreter is linked. The way I solved this
problem for the `cpython` / `python3-sys` crates was by adding a bunch
of crate features to control what `cargo:` lines were emitted by the
build scripts. This added a lot of complexity to the those crates for
a target audience of ~1.

Now that PyO3 has support for config files to control settings, this
provides a richer mechanism than crate features to influence the build
script.

This commit defines a new field on the `InterpreterConfig` struct to
hold an arbitrary list of strings/lines that should be emitted by
the build script. This field is only every populated when parsing config
files and it is only read by pyo3's build script to `println!()`
additional values.

My intended use case for this is to have PyOxidizer effectively control
the interpreter link settings via the config file (at my own peril)
while having minimal impact on the maintainability of PyO3's code base.
Given the complexity of the link hacks employed, you probably don't want
this polluting pyo3's code base.
PyOxidizer requires advanced control over the settings used to link
libpython. We recently implemented support for configuration files
defining explicit lines to emit from build scripts to give callers
control over what lines to emit from build scripts so use cases
like PyOxidizer's are feasible without hacks in PyO3's code base.

However, the default logic in `emit_link_config()` may not be
appropriate in scenarios where link settings are provided via this
"extra lines" mechanism. The default logic may prohibit use of or
interfere with desired settings provided externally.

This commit defines a new field on the interpreter config that
suppresses the emission of the default link control logic from the
`pyo3` build script. It effectively gives advanced consumers like
PyOxidizer full control over link logic while minimally polluting
PyO3's build logic.

I thought about implementing this control as a crate feature. But
given the expected target audience size of ~1, I thought a crate
feature was too visible for a power user feature and decided to
implement it via the configuration file.
@indygreg
Copy link
Contributor Author

I rebased and reduced scope. I think this is ready for review.

(Although I haven't tested this exact branch against PyOxidizer yet. I may find time to do that in ~8 hours. But no guarantees.)

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks good to merge now!

@davidhewitt davidhewitt merged commit 44c0287 into PyO3:main Aug 31, 2021
@davidhewitt davidhewitt mentioned this pull request Aug 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants