relax multiple-import check to only prevent subinterpreters #3446

davidhewitt · 2023-09-11T13:27:37Z

Some issues observed in the wild from the current implementation of the multiple-import check:

Sphinx 7.2.5 breaks with PyO3: "modules may only be initialized once per interpreter process" sphinx-doc/sphinx#11662
ImportError when running with coverage measurement activated pydantic/pydantic#6584

The motivation for the check is from #2346 (comment) - it's to prevent use of PyO3 modules in sub-interpreters, which we are definitely not ready for yet.

Given that the issues above don't actually use sub-interpreters, it seemed worth trying to fix them. This PR does so by storing the current interpreter ID when initializing a #[pymodule] and checking that it does not change on subsequent imports.
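
For illustration, the interpreter-ID check described above might look roughly like the following. This is a minimal, std-only sketch rather than the code in this PR: `ModuleDef`, `check_interpreter`, and the `-1` sentinel are hypothetical names and choices, and the interpreter ID is passed in as a plain `i64` instead of being read from the Python C API.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Hypothetical stand-in for a module definition; not PyO3's real type.
pub struct ModuleDef {
    /// ID of the interpreter that first initialized the module, or -1 if none yet.
    interpreter: AtomicI64,
}

impl ModuleDef {
    pub const fn new() -> Self {
        Self {
            interpreter: AtomicI64::new(-1),
        }
    }

    /// Records `current_interpreter` on the first import and rejects any later
    /// import that arrives from a different interpreter.
    pub fn check_interpreter(&self, current_interpreter: i64) -> Result<(), String> {
        match self.interpreter.compare_exchange(
            -1,
            current_interpreter,
            Ordering::SeqCst,
            Ordering::SeqCst,
        ) {
            // First import: the ID has just been stored.
            Ok(_) => Ok(()),
            // Repeat import from the same interpreter: allowed.
            Err(stored) if stored == current_interpreter => Ok(()),
            // Import from a different interpreter: refuse.
            Err(_) => Err(
                "PyO3 modules do not yet support subinterpreters, see https://github.com/PyO3/pyo3/issues/576"
                    .to_string(),
            ),
        }
    }
}
```

With this shape, re-importing from the same interpreter (as Sphinx and coverage do) passes the check, while an import from a second interpreter still fails.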

I've also updated the error message to point at #576 so that community discussion can at least be centralized there.

adamreichold · 2023-09-11T14:04:49Z

src/impl_/pymodule.rs

            return Err(PyImportError::new_err(
-                "PyO3 modules may only be initialized once per interpreter process",
+                "PyO3 modules do not yet support subinterpreters, see https://github.com/PyO3/pyo3/issues/576",
            ));
        }
        (self.initializer.0)(py, module.as_ref(py))?;


Do we really want to call the initializer multiple times or should the second import become a no-op?

I think the spirit of PEP 489 (which we don't support yet, but hopefully are heading towards) is that modules can be torn down and re-imported, so I think it's correct to call the initializer multiple times.

> which we don't support yet, but hopefully are heading towards

I think this is the main issue here, isn't it? Does this change mean we opt into allowing tear-down and reinitialization of PyO3-based modules? What invariants would this break? Do we test this somewhere?

(I am not trying to say that it doesn't work. Just that I myself do not understand the consequences of this change and have little confidence about its effects on existing code.)

In pydantic/pydantic#6584 (comment), you mention opt-in. Might this be advisable even for this limited relaxation? What if the module is torn down but static data still contains pointers into the old module (e.g. capsules) which are now invalid? For example, is the code in https://github.com/PyO3/rust-numpy/blob/c16fbb1e630b0538fdf17cbc53043bf0467459f9/src/npyffi/mod.rs#L30-L32 valid after this change?

So I found the Cython implementation here:

https://github.com/cython/cython/blob/ce1aa59ab6300e18e50c752b5c9e28a04b665060/Cython/Utility/ModuleSetupCode.c#L1648C18-L1648C18

That code:

- Has a similar check that the interpreter hasn't changed ID.
- Caches a module instance in a C static and always returns that single module instance.

So I guess let's implement this strategy too and defer the question of PEP 489 compatibility to a future PR. My goal for this PR is primarily to fix the use cases in the OP, which I believe the Cython strategy would cover.

It's clearly awkward to get this right. On balance, I think it's desirable to match Cython so that more of the ecosystem fails in the same way, and it seems better for now that we only ever run the module initializer once. We can make progress towards per-module state to eventually remove this worry.

I agree with that. Do you know the upstream position on this, i.e. would they consider fixing this or does it work as intended?

(I made with_embedded_python_interpreter unsafe for this reason.)

I am somewhat confused as to what the consequences of this discovery are, especially with the Cython module immediately showing the failure mode I was concerned about. Can we still relax this? Is the unsafe on with_embedded_python_interpreter sufficient? (I guess any other way to re-initialize Python in the current process would have to call unsafe Rust code or code in inherently unsafe languages like C, i.e. code that is trusted?)

I have just posted python/cpython#109785 - let's see what the outcome of that discussion is.

No movement yet on the upstream issue. @adamreichold, are you happy for me to proceed with merging this? As above, at least we'll then be consistent with Cython, and I think the generally accepted position is that things are expected to be wonky after finalizing and re-initializing the interpreter.

Yes, I think the language on with_embedded_python_interpreter is strong enough. This is the only API we provide to finalize and then re-initialize the interpreter?

Yes (apart from the raw FFI bindings). auto-initialize never bothers finalizing the interpreter.

davidhewitt · 2023-09-14T12:46:32Z

Ok, I've pushed a new commit to this PR which:

- Replaces the conditional OnceLock code with a plain AtomicI64.
- Keeps the existing behaviour on CPython 3.7 and 3.8, where we cannot detect subinterpreter changes.
- Makes 3.9 and newer cache the module object in a GILOnceCell and return it on every import, as long as the interpreter is the same one that created the module object.
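
As a rough sketch of the 3.9+ behaviour described above (run the initializer once, cache the module, and hand the same object back on later imports from the same interpreter), something like the following could work. It is std-only and makes several assumptions: `std::sync::OnceLock` stands in for `GILOnceCell`, `Module` and `CachedModuleDef` are hypothetical placeholder types, and the interpreter ID is passed in as an `i64` instead of being queried from Python.

```rust
use std::sync::atomic::{AtomicI64, AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Placeholder for a Python module object (hypothetical, not a PyO3 type).
pub struct Module {
    pub name: &'static str,
}

/// Hypothetical module definition that caches the module it creates.
pub struct CachedModuleDef {
    /// ID of the interpreter that created the module, or -1 if none yet.
    interpreter: AtomicI64,
    /// The cached module; `OnceLock` is used here in place of `GILOnceCell`.
    module: OnceLock<Arc<Module>>,
    /// Counts initializer runs so the caching is observable.
    init_calls: AtomicUsize,
}

impl CachedModuleDef {
    pub const fn new() -> Self {
        Self {
            interpreter: AtomicI64::new(-1),
            module: OnceLock::new(),
            init_calls: AtomicUsize::new(0),
        }
    }

    /// Returns the cached module, creating it on the first call. Fails if the
    /// calling interpreter differs from the one that created the module.
    pub fn make_module(&self, current_interpreter: i64) -> Result<Arc<Module>, String> {
        if let Err(stored) = self.interpreter.compare_exchange(
            -1,
            current_interpreter,
            Ordering::SeqCst,
            Ordering::SeqCst,
        ) {
            if stored != current_interpreter {
                return Err("PyO3 modules do not yet support subinterpreters".to_string());
            }
        }
        // The closure plays the role of the module initializer; it runs at most once.
        let module = self.module.get_or_init(|| {
            self.init_calls.fetch_add(1, Ordering::SeqCst);
            Arc::new(Module { name: "example" })
        });
        Ok(Arc::clone(module))
    }

    /// Number of times the initializer has actually run.
    pub fn init_calls(&self) -> usize {
        self.init_calls.load(Ordering::SeqCst)
    }
}
```

Calling `make_module` twice with the same ID yields the same `Arc` and leaves `init_calls()` at 1. Note that this sketch shares the limitation discussed in the review thread: if the same ID shows up again after the interpreter has been finalized and re-initialized, the check cannot tell the difference and the stale cached object is returned.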

The use of GILOnceCell may at first seem at odds with the discussion in #576 to remove it, but hey, I figure this is explicitly aimed at preventing subinterpreters so it seems fine to use it for now 😄

davidhewitt · 2023-09-14T13:12:32Z

(I'm not a huge fan of the divergence of behaviour between Python 3.7/3.8 and 3.9+, but I think in this case it is justified as the 3.9 behaviour is now a relaxation of the existing constraint to improve ergonomics where we can safely do so.)

davidhewitt mentioned this pull request Sep 11, 2023

Sphinx 7.2.5 breaks with PyO3: "modules may only be initialized once per interpreter process" sphinx-doc/sphinx#11662

Closed

davidhewitt force-pushed the relax-import-check branch from f5d6b6d to 3beadbd Compare September 11, 2023 13:30

adamreichold reviewed Sep 11, 2023

adamreichold reviewed Sep 11, 2023

davidhewitt commented Sep 11, 2023

mejrs reviewed Sep 11, 2023

davidhewitt force-pushed the relax-import-check branch from 3beadbd to ca7fa9f Compare September 14, 2023 12:41

davidhewitt force-pushed the relax-import-check branch from ca7fa9f to 9061ffa Compare September 14, 2023 13:09

davidhewitt mentioned this pull request Sep 16, 2023

0.20 release #3246

Closed

6 tasks

adamreichold reviewed Sep 17, 2023

davidhewitt force-pushed the relax-import-check branch from 9061ffa to de2faba Compare September 23, 2023 10:13

davidhewitt added 2 commits September 23, 2023 11:13

relax multiple-import check to only prevent subinterpreters

1338020

return existing module on Python 3.9 and up

f17e703

davidhewitt force-pushed the relax-import-check branch from de2faba to f17e703 Compare September 23, 2023 10:13

davidhewitt mentioned this pull request Sep 23, 2023

Increment Interpreter ID on each Py_Initialize / Py_Finalize cycle python/cpython#109785

Open

adamreichold approved these changes Sep 28, 2023

davidhewitt added this pull request to the merge queue Sep 28, 2023

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 28, 2023

davidhewitt added the CI-build-full label Sep 29, 2023

adjust cfgs for windows 3.9

1a349c2

davidhewitt force-pushed the relax-import-check branch from bc0222e to 1a349c2 Compare September 29, 2023 13:23

davidhewitt added this pull request to the merge queue Sep 29, 2023

Merged via the queue into PyO3:main with commit f335f42 Sep 29, 2023
60 checks passed

davidhewitt deleted the relax-import-check branch September 29, 2023 17:57

alex mentioned this pull request Oct 13, 2023

PyO3 error pyca/cryptography#9719

Closed

hauntsaninja mentioned this pull request Dec 3, 2023

ImportError: PyO3 modules may only be initialized once per interpreter process openai/tiktoken#141

Closed
