Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upcoming changes to PyO3's API in 0.21 #3382

Closed
davidhewitt opened this issue Aug 11, 2023 · 22 comments
Closed

Upcoming changes to PyO3's API in 0.21 #3382

davidhewitt opened this issue Aug 11, 2023 · 22 comments
Milestone

Comments

@davidhewitt
Copy link
Member

davidhewitt commented Aug 11, 2023

Update: to avoid churn we have decided to keep Py<T> as-is and introduce the new API as Bound<'py, PyAny>. This naming can be revisited in future, but we have enough breaking changes with this 0.21 release that we do not wish to ship a wholesale rename at this point. (See #3674 (comment))

The tracking issue of the remaining work to implement and make Bound<'py, PyAny> a public API is in #3684

We are also building documentation at https://pyo3.rs/main/migration#from-020-to-021


This is a ticket to pin to give visibility to discussion and feedback of an overhaul we're considering for PyO3's API, projected for 0.21 release later this year.

TLDR; we plan to move types like &'py PyAny to be stored instead in a smart pointer type Py<'py, PyAny>. The existing Py<T> type will continue to exist, but renamed, maybe to PySend<T> or PyStatic<T> (name ideas welcome).

The reason for doing this is because we believe this will offer a faster and lower memory usage API. The &'py PyAny types are reliant on an internal datastructure we call the "reference pool", and it's the cause of memory frustrations such as #1056. It's a thread-local datastructure, so it also has a performance overhead whenever it is accessed. From benchmarks in prototypes I think as much as 30% overhead can be removed when doing lots of operations with Python objects.

There will be some work needed to migrate to this new API, mostly due to changes in ownership semantics. I think it may be possible to expose the existing &'py PyAny API in a separate crate (say pyo3_pool, name to be bikeshed) to make migration easier.

There has been a lot of thought put into this API design over a large span of time. We want to preserve as many existing semantics as possible as well as provide an as simple a migration path as possible from current PyO3 code. We want PyO3 to be as fast as possible while also being accessible to Python users coming to Rust for the first time.

For the discussion of the latest prototype of this new API, see #3361

All feedback and questions welcome.

@davidhewitt davidhewitt pinned this issue Aug 11, 2023
@davidhewitt davidhewitt added this to the 0.21 milestone Aug 11, 2023
@adamreichold
Copy link
Member

but renamed, maybe to PySend or PyStatic (name ideas welcome).

I was somewhat enamoured with PyUngil<T> matching the Ungil trait.

@davidhewitt
Copy link
Member Author

I could live with that, even if we have to rename it again in a nogil future.

@adamreichold
Copy link
Member

That's right, this might unnecessarily complicate things. My second least favourite was PyDetached because it comes with nicely fitting verbs attach/detach for method to switch between Py/PyDetached.

@davidhewitt
Copy link
Member Author

Yes I quite like PyDetached too

@hkpeaks

This comment was marked as off-topic.

@hkpeaks

This comment was marked as off-topic.

@adamreichold
Copy link
Member

@hkpeaks Please do not hijack this issue which is specific to the upcoming API changes. I you would like to discuss your experiments with us, please open a separate discussion. Thank you.

@hkpeaks

This comment was marked as off-topic.

@davidhewitt
Copy link
Member Author

I propose that PyCell<'py, T> becomes an alias for Py<'py, T>, and we can deprecate the alias. That way the API for pyclass and native types becomes more uniform.

@davidhewitt
Copy link
Member Author

(We would still need a different PyCell<T> internally for all the borrow tracking, we can just detach it from the public API.)

@alex
Copy link
Contributor

alex commented Aug 18, 2023

What would the equivalent of borrow(), try_borrow(), etc. be with this approach?

@davidhewitt
Copy link
Member Author

The existing Py<T> already has borrow(), try_borrow() etc (if T: PyClass), so I would think it would work exactly the same. Py<'py, T> would have borrow(), try_borrow() etc if T: PyClass.

@alex
Copy link
Contributor

alex commented Aug 18, 2023

Ah. Well then, I'd never noticed that. Sorry for the noise!

@davidhewitt
Copy link
Member Author

All good worth stating explicitly!

@davidhewitt davidhewitt mentioned this issue Aug 19, 2023
6 tasks
@holzschu
Copy link

I'm not sure if this qualifies as "API change", but I'd love for modules to have an m_free and m_clear method, that explicitly releases the memory and de-initializes them.

@davidhewitt
Copy link
Member Author

@holzschu I agree that'd be a nice feature but unrelated to the proposal in the OP. Probably best discussed on #3294 or similar proposals.

@davidhewitt
Copy link
Member Author

One thing which I noted when implementing #3531 is that if multiple traits methods with the same name are applicable for one type we get ambiguity, and that would be the advantage of inherent methods.

I think we are expecting to have a 1:1 mapping of traits to types so this should not be a problem. How I noticed this is because our (internal) Py2::borrowed_from_gil_ref is generic over T: AsTypeInfo and takes &T::AsRefTarget as the input, so that means that the output Py2<T> is not fully constrained and needs additional annotating sometimes.

@davidhewitt
Copy link
Member Author

Since we've got quite a few features and a couple of breaking changes beginning to take shape, I'm beginning to think it'll make sense to release 0.21 separate to (before?) this major API overhaul.

In the meanwhile I'll continue to chip away at the implementations like #3572, so hopefully if this doesn't make it into 0.21 we can land alpha releases of 0.22 with this API change committed shortly after.

@davidhewitt
Copy link
Member Author

Just a status update here: after discussion of #3668 the leaning in #3674 (comment) was that it makes sense to release this new API in 0.21, so for better or worse 0.21 is going to be quite a large release!

@davidhewitt davidhewitt changed the title Upcoming changes to PyO3's API (possibly in 0.21) Upcoming changes to PyO3's API in 0.21 Feb 9, 2024
@davidhewitt
Copy link
Member Author

PyO3 0.21 is released, this is now done.

@davidhewitt davidhewitt unpinned this issue Mar 29, 2024
@grothesque
Copy link

This change to PyO3 is great! I'm very excited by the prospect of an almost zero-cost but safe layer around the CPython API.

While studying these changes and PyO3 in general, I wonder: why was GILPool even introduced in the first place? Why wasn't something like Bound present in PyO3 since the beginning? I would be very grateful for some historic pointers.

Does gil-refs memory management have advantages over the new way? Or was it simpler to implement? Or just more obvious to come up with?

Thanks!

@davidhewitt
Copy link
Member Author

While studying these changes and PyO3 in general, I wonder: why was GILPool even introduced in the first place? Why wasn't something like Bound present in PyO3 since the beginning? I would be very grateful for some historic pointers.

A great question. The GIL Refs design was made long ago (it predates me), and perhaps at the time it wasn't realised what the performance consequences were likely to be. You can find it in early commits to this repository but with very little discussion. GIL Refs main advantage was simplicity. The syntax &PyAny is concise and allowed users to avoid writing lifetimes in many cases, which was certainly very friendly for newcomers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants