Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit HPy ABI version into the binary and check it when loading a module #387

Merged
merged 5 commits into from
Dec 7, 2022

Conversation

steve-s
Copy link
Contributor

@steve-s steve-s commented Dec 2, 2022

#371 "Introduce the Hybrid ABI" introduced:

  • ABI tag in shared object filenames, e.g., mymodule.hpy0.so
  • HPY_ABI_VERSION constant

This PR adds:

  1. another constant HPY_ABI_VERSION_MINOR (see below)
  2. when compiling HPy extension: add exported symbols with values of HPY_ABI_VERSION and HPY_ABI_VERSION_MINOR == ABI versions of HPy that compiled the extension
  3. when loading HPy extension: first read those symbols and compare them to its own versions

Details:

When compiling an extension HPy_MODINIT macro creates exported symbols:

export uint32_t get_required_hpy_major_version_mymodule() { return HPY_ABI_VERSION; } // expands to 0 at the moment
export uint32_t get_required_hpy_minor_version_mymodule() { return HPY_ABI_VERSION_MINOR; } // expands to 0 at the moment

The suggestion is that we have two ABI versions. Major versions are ABI incompatible, higher minor version is ABI compatible to lower minor versions, i.e., when adding something at the end of HPyContext => minor++. This is useful, because when checking compatibility one can do something like expected_major == actual_major && expected_minor <= actual_minor.

I expect that we will likely change minor version often, OTOH, introducing ABI breaking changes will happen rarely. Note that with this scheme, one HPy implementation can support multiple ABI versions and the only thing we cannot ever change are the names and types of the two exported symbols. We can turn HPy upside down and move from handles to reference counting and still stay binary compatible.

Question is if we want to encode also minor version in the ABI tag? CPython has abi3 tag, but it also does binary compatible additions. This is solved by also adding Python version, so cp37-abi3 is tag for stable ABI with all the functions available in Python 3.7 stable ABI, it can run on any Python 3.7+. IIRC the cp37 part is sometimes dropped and it's just abi3?

When loading an extension:

  • assume we compiled example extension with ABI version 0.3 last year
  • assume the current ABI version now is 1.0 and some HPy implementation of that ABI version is loading our extension
    • as the very first thing it loads the ABI versions that the extension requires (the symbols generated in step 1)
    • as a sanity check, it verifies that the shared object filename does have the right ABI tag (now checking just major ABI version == 0)
    • it checks that it can support this major version, let's say it does (maybe HPy implementations would eventually drop support for old major versions, but it should not be forced by anything else but choice).
    • now it starts loading the module like it should with ABI version 0 (example: in ABI version 1 we decide to rename the entry point from HPyInit_{modulename} to something else).
    • it creates HPyContext of major version 0

I was re-reading this article: https://blog.trailofbits.com/2022/11/15/python-wheels-abi-abi3audit/ and it seems to me that having the ABI version at two places (inside the extension as exported symbols and in the extension filename) is actually a good thing that would prevent the security issues (segfaults) when something goes wrong during packaging/distribution like it happens with CPython stable ABI.

@steve-s steve-s marked this pull request as ready for review December 2, 2022 21:08
@mattip
Copy link
Contributor

mattip commented Dec 5, 2022

... it seems to me that having the ABI version at two places (inside the extension as exported symbols and in the extension filename) is actually a good thing

It took me a while to understand this, and now I agree it would be a good thing. Would it help non-c-based languages to export functions int get_required_hpy_major_version_mymodule(), int get_required_hpy_minor_version_mymodule() rather than as constants?

@steve-s
Copy link
Contributor Author

steve-s commented Dec 5, 2022

Would it help non-c-based languages to export functions int get_required_hpy_major_version_mymodule(), int get_required_hpy_minor_version_mymodule() rather than as constants?

Absolutely. I missed the non-c languages argument when thinking about constants vs some getter functions.

@steve-s
Copy link
Contributor Author

steve-s commented Dec 7, 2022

No merge conflict, green gates, review comment fixed. This is ready for final review & merge. The open question whether we want to include minor ABI version in the tag can be addressed in another PR.

/cc @antocuni, @mattip

@mattip mattip merged commit 10f222e into hpyproject:master Dec 7, 2022
@mattip
Copy link
Contributor

mattip commented Dec 7, 2022

Thanks @steve-s

@fangerer
Copy link
Contributor

fangerer commented Dec 7, 2022

AFAIK, this also closes issue #374

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants