Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Perform borrow checking through a capsule-based API #361

Merged
merged 5 commits into from
Dec 31, 2022

Conversation

adamreichold
Copy link
Member

This places the global state required for dynamic borrow checking of NumPy arrays into a capsule exposed as the numpy.core.multiarray._BORROW_CHECKING_API attribute.

This has two benefits: First, all extensions built using rust-numpy will share this global state and hence cooperate in borrow checking. Second, in the future,
other libraries that want to borrow check their access to NumPy array can cooperate using this C-compatible API.

This does not checking the borrow checking implementation at all, but instead of accessing Rust static data, all accesses are funneled through the above-mentioned capsule API with access to it being cached for performance reasons as we already do for the _ARRAY_API and _UFUNC_API capsules.

This means that eventually, the implementation of the borrow checking API could be different from the one the current extension would provide with the adaptors to the C API taking of any differences in e.g. the definition of BorrowKey. For now, they will of course usually be the same and this change just adds a huge amount of boiler plate to go from Rust to C-compatible back to Rust.

Closes #359

cc @alex This is not the buffer-oriented protocol change you proposed in "Buffers on the edge", but may be of interested anyway as it is related if limited to NumPy arrays.

@adamreichold adamreichold force-pushed the borrow-checking-capsule-api branch 2 times, most recently from 74b7c1b to 6fbbec1 Compare November 12, 2022 11:25
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense, though am slightly cautious. I see the benefit and the global borrow checking is a clear improvement.

Will there need to be a way for the capsule contents to be "upgraded" at runtime if a too old version is present? Not really sure what functionality might be broken if a newer rust-numpy has to use an older capsule version. (Which might be unpredictable due to initialization order being deterministic but hard to predict, I think.)

Also, is it a safety/security concern if arbitrary code could spoof the capsule before this loads?

src/borrow/shared.rs Outdated Show resolved Hide resolved
let capsule = PyCapsule::new_with_destructor(
py,
shared,
Some(CString::new("_BORROW_CHECKING_API").unwrap()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be _RUST_NUMPY_BORROW_CHECKING_API to make it clear where it originates from?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered both options. I think including RUST_NUMPY in the name is the safe choice as this limits the intent of this API to extensions using rust-numpy. The generic name would cater more general aspirations of getting other languages to participate in the borrow checking along the lines @alex envisions in his blog post.

Personally, I am very much open to advice here and am fine with including RUST_NUMPY in the name. Especially as those more general aspirations are nothing that has been discussed with larger the community of potential users beforehand.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a separate commit changing this which can keep or drop after further discussion.

fn insert_shared(py: Python) -> PyResult<*const Shared> {
let module = get_array_module(py)?;

let capsule: &PyCapsule = match module.getattr("_BORROW_CHECKING_API") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly can use PyCapsule::import here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried, but I don't think it is too helpful here: This code path is only taken if static Shared: SharedPtr; is still null, i.e. this extension tries to access the capsule for the first time, meaning that this is always a slow path.

In this case, it is also highly likely that the capsule is not present at all which means I do need the PyModule to call setattr on it.

Additionally, I cannot reasonably determine T prior to the import as I need to check the version field first. I could theoretically get a reference to u64 to just access the version and then cast that reference into a pointer and then a pointer the relevant struct and so on, but that feels more complicated than helpful as well.

@adamreichold
Copy link
Member Author

adamreichold commented Nov 13, 2022

Will there need to be a way for the capsule contents to be "upgraded" at runtime if a too old version is present? Not really sure what functionality might be broken if a newer rust-numpy has to use an older capsule version. (Which might be unpredictable due to initialization order being deterministic but hard to predict, I think.)

At the moment, I would not think this reasonably possible as the existing state cannot be safely transformed by an extension which wishes to upgrade the API if just due to ABI concerns as both extensions might have been built using different Rust versions. (Or there might eventually be implementations not written in Rust at all having a completely incompatible internal representation behind the flags field.) So this would mean that the first extension to load and instantiate the capsule determines the API version.

However, since changes to this API should be additive with older parts of it kept working (by just ignoring the additional fields), I suspect that we might be able to degrade gracefully if a newer rust-numpy encounters an older API version, i.e. functionality which worked before continues to work and newer functionality fails with a hopefully clear enough error message (by failing functions needing fields not yet present).

I think the main problem is that I am not confident whether even the promise of backwards compatibility can be kept since this is relatively new and we have yet to get any external feedback on the borrow checking we released with rust-numpy 0.17.0. IMHO, this is a good argument for outright waiting with the capsule API for a few rust-numpy releases to gather more insight whether at least the functionality we have now can be expected to stay in place for a reasonable amount of time.

@adamreichold
Copy link
Member Author

Also, is it a safety/security concern if arbitrary code could spoof the capsule before this loads?

I think this is certainly a safety concern, but I am not sure it can be helped. It is always possible to subvert Rust's type system by malicious or misguided action. Similarly, the borrow checking itself is not a sandbox, it just prevents programmers from making genuine mistakes.

More specially, loading e.g. a no-op borrow checker API would be similar to preloading libeatmydata to turn fsync into a no-op as far I can see. One can shoot oneself into the foot by doing so, but one has certainly asked for it then.

Base automatically changed from borrow-checking-simpler-hash to main November 13, 2022 15:47
This places the global state required for dynamic borrow checking of NumPy
arrays into a capsule exposed as the numpy.core.multiarray._BORROW_CHECKING_API
attribute.

This has two benefits: First, all extensions built using rust-numpy will share
this global state and hence cooperate in borrow checking. Second, in the future,
other libraries that want to borrow check their access to NumPy array can
cooperate using this C-compatible API.

This does not checking the borrow checking implementation at all, but instead of
accessing Rust static data, all accesses are funneled through the
above-mentioned capsule API with access to it being cached for performance reasons
as we already do for the _ARRAY_API and _UFUNC_API capsules.

This means that eventually, the implementation of the borrow checking API could
be different from the one the current extension would provide with the adaptors
to the C API taking of any differences in e.g. the definition of `BorrowKey`.
For now, they will of course usually be the same and this change just adds a
huge amount of boiler plate to go from Rust to C-compatible back to Rust.
This improves the changes of actually keeping it backwards compatible while
precluding the reuse of base address and borrow key for cloning and dropping.
@adamreichold
Copy link
Member Author

I think the main problem is that I am not confident whether even the promise of backwards compatibility can be kept since this is relatively new and we have yet to get any external feedback on the borrow checking we released with rust-numpy 0.17.0. IMHO, this is a good argument for outright waiting with the capsule API for a few rust-numpy releases to gather more insight whether at least the functionality we have now can be expected to stay in place for a reasonable amount of time.

I added another commit which makes the shared API much narrower, taking only *mut PyArrayObject instead of the various parts of the borrow key. This means it should be much more likely that we can keep this backwards compatible as it is not bound to the current implementation of borrow checking. Since it also uses the NumPy FFI exclusively, it would be easier to split this out into a separate crate as well which could be used by other crates besides rust-numpy. However, it does have a performance cost (I have not benchmarked it yet) because it means we cannot reuse the borrow key derived during acquire for release but have to rederive it inside the shared implementation.

@adamreichold
Copy link
Member Author

adamreichold commented Nov 27, 2022

I have not benchmarked it yet

Since the capsule API looks as minimal as it will get, I ran the bechmarks:

 name                      main ns/iter  capsule ns/iter  diff ns/iter  diff %  speedup 
 additional_shared_borrow  61            115                        54  88.52%   x 0.53 
 exclusive_borrow          123           134                        11   8.94%   x 0.92 
 initial_shared_borrow     137           148                        11   8.03%   x 0.93 

All of them being regressions are expected as there is the added cost of crossing the FFI boundary and using dynamic dispatch via the capsule. additional_shared_borrow being the worst of the bunch is also expected as it relies on reusing the borrow key which now needs to be recomputed to avoid making its structure part of the capsule API.

Finally, at less than 150 ns/iter for all three operations, I think the improved prospects for backwards compatibility and future improvements of the internal data structures used to perform the borrow checking which the minimal API provides are worth it.

Most applications will likely spent much more time operating on the resulting array views themselves and if not, there is always the unsafe escape hatch that bypasses the dynamic borrow checking.

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the long delay in revisiting this. This looks good to me to merge!

The benefit this allows multiple extensions using rust-numpy to work together is definitely worth the slight additional overhead and compatibility constraints.

@adamreichold
Copy link
Member Author

The benefit this allows multiple extensions using rust-numpy to work together is definitely worth the slight additional overhead and compatibility constraints.

I agree, the cross extension borrow checking alone makes this worth it IMHO.

@alex Any thoughts on the cross language borrow checking discussed above?

@kngwyu Any concerns before I merge this which imples that it will be part of the 0.18 release thereby committing us to backwards compatibility?

@alex
Copy link

alex commented Dec 28, 2022

Unfortunately not really, I haven't spent that much time thinking about what it would look like for extensions to provide this functionality themselves, most of my thoughts have been about what'd it mean to have this for buffers in general.

That said, I think the hardest question is: how (if at all) do you handle objects that represents sub-slices (views) of memory owned by some other object?

@adamreichold
Copy link
Member Author

That said, I think the hardest question is: how (if at all) do you handle objects that represents sub-slices (views) of memory owned by some other object?

Our existing dynamic borrow checking tries to handle this by always traversing the chain of NumPy views via the base attribute to its origin and then explicitly considering the data pointer, dimensions and strides by checking a necessary condition for aliasing between two views to be unsatisfied, i.e. we over-approximate conflicts which is safe even if we produce some incorrect borrow checking errors, c.f. the documentation of the borrow module.

(A complete solution of the Diophantine equation governing aliasing between views is possible, but more involved and slower and we released the current version to solicit feedback whether the approximate checking is a problem for anyone.)

@adamreichold adamreichold merged commit 65b6693 into main Dec 31, 2022
@adamreichold adamreichold deleted the borrow-checking-capsule-api branch December 31, 2022 17:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Borrow checking should use a capsule API
3 participants