-
Notifications
You must be signed in to change notification settings - Fork 763
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New #[pyclass] system that is aware of its concrete layout #683
Conversation
b0e0777
to
1723599
Compare
71649ed
to
dc10d45
Compare
dc10d45
to
b86de93
Compare
Still incomplete, but here I note the rough sketch. #[repr(C)]
pub struct PyClassShell<T: PyClass> {
ob_base: <T::BaseType as PyTypeInfo>::ConcreteLayout,
pyclass: ManuallyDrop<T>,
...
} Currently, we use a hacky way to deal with (value.as_ptr() as *const u8).offset(T::OFFSET) as *const T (From https://github.com/PyO3/pyo3/blob/v0.8.4/src/conversion.rs#L399) By using (*(value.as_ptr() as *const PyClassShell<T>)).pyclass Additionally, this PR does
Now #[pymethods]
impl BaseClass {
#[new]
fn new() -> Self {
BaseClass { val1: 10 }
}
}
#[pyclass(extends=BaseClass)]
struct SubClass {
#[pyo3(get)]
val2: usize,
}
#[pymethods]
impl SubClass {
#[new]
fn new() -> PyClassInitializer<Self> {
let mut init = PyClassInitializer::from_value(SubClass { val2: 5 });
init.get_super().init(BaseClass { val1: 10 });
init
}
} The benefit of new |
One thing that should be supported (I didn't check) is the ability to skip initialization and throw away the new object when raising an exception from It would also be nice, but not required, to support conditionally returning an unrelated object from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the time taken for me to pick this up. (I'd been out travelling and hadn't the time to give this a proper read through.)
Overall, this looks nice. While we're here, I think there's two main things we should be aiming for:
- It would be great if we can keep
PyClassShell<T>
out of thepyo3
api, so that users can just use&T
/&mut T
. I don't know how easy this is. pybind11
has mechanisms to be able to wrap references for Python, without copying. I've made a comment onPyClassShell
which hints how the references are stored in there. Doing this safely also requires a "keep-alive" mechanism so that Python knows to keep the original data alive (so we don't end up with dangling references).
src/type_object.rs
Outdated
type BaseType: PyTypeInfo + PyTypeObject; | ||
|
||
/// Layout | ||
type ConcreteLayout: PyConcreteObject<Self>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary for the ConcreteLayout
to be part of the TypeInfo
trait? It seems like this is only needed for #[pyclass]
types. Having it here creates a lot of work to get the right definitions for all the py native types, but I'm not sure if it helps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to give a fixed size to PyClassShell
, we need it.
It seems like this is only needed for #[pyclass] types.
We also need it to derive PyDict
or so.
Having it here creates a lot of work to get the right definitions for all the py native types,
Yeah, but we can use some scripts to get them.
#[repr(C)] | ||
pub struct PyClassShell<T: PyClass> { | ||
ob_base: <T::BaseType as PyTypeInfo>::ConcreteLayout, | ||
pyclass: ManuallyDrop<T>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PyClassShell
idea is nice, and is heading in the direction that pybind11
goes. There's two main differences between this idea and what pybind11
does:
-
The
pybind11
equivalent just stores*mut void
instead of being generic overT
. This means that the python type allocation for all pybind11 classes is more or less exactly the same. It also means smaller binary sizes, because there's only one copy of thePyClassShell
methods instead of one for eachT
. -
pybind11
allows references to be stored in thePyClassShell
as well as owned values. In Rust speak, instead ofpyclass: ManuallyDrop<T>
, we'd have something like:enum PyClassValue<T> { Borrowed(*const T), MutBorrowed(*mut T), Owned(ManuallyDrop<T>), }
The great thing about storing references in PyClassShell
is that it allows us to do a lot less copying when going Rust -> Python -> Rust.
For example, the code below doesn't currently compile with pyo3
because the &Address
return type. If we could store references in the PyClassShell
, then this should be fairly straightforward to support:
#[pyclass]
struct Address {
value: String
}
#[pyclass]
struct Employee {
address: Address
}
#[pymethods]
impl Employee {
pub fn address(&self) -> &Address {
&self.address
}
}
We need to make sure that we never can simultaneously hand out more than one This is important since |
Also, since python modifies refcounts and stuff, that part needs to be wrapped in |
Agreed. |
enum PyClassValue<T> {
Borrowed(*const T),
MutBorrowed(*mut T),
Owned(ManuallyDrop<T>),
} We never should do so. |
I agree with this in principle, but I don't think it's possible. As far as I know, Rust can't express the lifetime of the borrow in Python. That's why a keep-alive mechanism is necessary. I can demonstrate a PR for this if you like. |
@davidhewitt |
👍 that makes total sense. It's for reasons like this that I recommend we keep Then we can consider adding features like reference support without more breaking changes. I could then write a PR for that another time. |
819ee06
to
176cb2a
Compare
176cb2a
to
6b84401
Compare
I added some additional features to this PR.
#[pyclass(extends=PyDict)]
struct DictWithName {
#[pyo3(get(name))]
_name: &'static str,
} This SIGSEGV is fixed by properly using baseclass's
#[pyclass(extends=PyTuple)]
struct TupleWithName {
_name: &'static str,
} |
src/pyclass.rs
Outdated
let PyClassInitializer { init, super_init } = self; | ||
if let Some(value) = init { | ||
unsafe { shell.py_init(value) }; | ||
} else if !T::ConcreteLayout::NEED_INIT { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the condition reversed here, erroring when NEED_INIT
is false
seems backwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
erroring when NEED_INIT is false seems backwards
Sorry I'm not sure what you are saying..., but you mean we should not raise an error for redundant initialization?
I understand, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good!
src/pyclass.rs
Outdated
|
||
#[must_use] | ||
#[doc(hiddden)] | ||
pub fn init_class(self, shell: &mut T::ConcreteLayout) -> PyResult<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be unsafe
, it seems likely to be UB to invoke twice on the same shell
object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm 🤔 ... , I don't think it's UB.
init
just do shell.pyclass = value
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't this call the CPython init functions for the base classes? Those are likely to not be allowed to be called twice on the same object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's a part of tp_new
and doesn't call init function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, ok. Related, if shell
is partially uninitialized, it would be UB for safe code to read from it, so it should have MaybeUninitialized
and/or be passed in as *mut T::ConcreteLayout
. Also, if later code depends on shell
being initialized instead of being uninit
, then that's also a good reason for init_class
to be unsafe
.
We would presumably be able to work around this if the layout contained a pointer to the actual tuple object, e.g. This is actually how C++'s |
de14fb1
to
766a520
Compare
It would still be good to know how it affects performance. Does this add overhead, and if so, how much? One thing to consider is that the more overhead PyO3 adds, the less likely it is to be adopted (at least from the Python side), since many people are expicitly looking for a safer way to get speed in their Python files. If PyO3 adds a ton of overhead over C or Cython, people will simply say, "Oh well it would be nice if we could have safety and speed, but I guess we can't." If this sort of thing is super expensive but necessary to provide safety guarantees mainly in strange edge cases, it may make sense to have a middle-of-the-road "fast but possibly unsafe" interface for people who really care about speed and a "slow but safe" version for people whose bottlenecks are elsewhere. |
No. |
Important change for
|
I very much like the new I opened a PR against your fork to propose a simpler API for |
Further to my thoughts about
This doesn't seem desirable to me. What do others think? (If we merge my suggested PR, then it would be as simple as changing |
55fe489
to
f26e07c
Compare
Following the line of @davidhewitt's simplification, I replaced |
src/prelude.rs
Outdated
// This is only part of the prelude because we need it for the pymodule function | ||
pub use crate::pyclass_init::PyClassInitializer; | ||
pub use crate::types::PyModule; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new pub use crate::pyclass_init::PyClassInitializer;
has split up a comment from the line beneath it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
src/pyclass_init.rs
Outdated
super_init: Box<<T::BaseType as PyTypeInfo>::Initializer>, | ||
} | ||
|
||
impl<T: PyClass> PyClassInitializer<T> { | ||
pub fn new(init: T, super_init: <T::BaseType as PyTypeInfo>::Initializer) -> Self { | ||
Self { | ||
init, | ||
super_init: Box::new(super_init), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to box super_init
? This seems to be an allocation we may not need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, thank you.
After the changes in the `#[pyclass]` system in PyO3#683, `new` can return self and there is no reason anymore to ignore this lint.
This PR introduces a new
#[pyclass]
system that usesstruct PyClassShell
instead of offset access.I'll add a more detailed explanation later.
cc: @davidhewitt
TODO
#[new]
APIsuper().__init()__
weakref
anddict
supportConcreteLayout
s for C-FFI types