Make instance_essentials hold references to patients #880
The approach seems sound to me, but it would be really good to avoid a
What do you think? |
That's the only way I've managed to trigger the bug, but does Python make any guarantees or is this just a CPython implementation detail? Is it worth asking about on python-dev? By the way, code where I first ran into the problem doesn't use |
I haven't had a chance to look through this PR in detail yet, but I do have one question: does this split the nurse and patient destruction into separate GC passes, or do they remain in one? |
I guess this comes down to: When is an instance's weakref list cleared? AFAIK this happens only in two places:
We control the first case so it's not an issue. The second case can be avoided by excluding the patients from the list at
That can be the case if a pybind11-exported class is used as a base in Python, i.e.
As far as I understand, it's still the same pass; it just delays patient destruction a bit. |
I don't think that was the case I had. I'll see if I still have a commit of the version of my code that ran into the bug. But it's an interesting case nevertheless: does your suggestion of putting the list in the dict work, given that the dict is part of the subclass rather than the pybind11-managed class? I'm also a little uneasy about the idea of storing this list in Some possible alternatives (other than the two approaches already discussed):
|
I had another look at where I was originally encountering the bug - it looks like it was indeed a Python subclass of a C++ class. What's the next step? Does anyone have opinions on the alternatives I suggested? |
It should still work because Regarding the alternatives that you proposed: (1) AFAIK this wouldn't work for subclasses of non-GC types (they wouldn't have the pointer in their instances). (2 & 3) I'm not a fan of either opt-in or opt-out. As you mentioned, it would be difficult to decide which one to use. (4) I don't think removing weakref support is an option. Another possibility is keeping an
Can you add tests for the situations that you found? (Keep the existing fix.) That way we can be sure about the source of the issue and that any proposed fix does actually work (also in light of PyPy vs CPython differences). Then we'll see about minimizing overhead. |
Okay, when I get a chance I'll add tests. |
I've added two unit tests: one where the parent participates in GC due to |
Re: options 2-4: #693 gets away from possibly-differing instance sizes because you can't multiply-inherit from classes in Python with different instance sizes. #693 also provides another option: we could allocate an extra pointer at the end of Alternatively, we could go with @dean0x7d 's |
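The general interpreter rule behind the multiple-inheritance remark can be checked with built-in types. This is a rough illustration only: it uses `int`, `str`, `dict` and `list` as stand-ins for classes with incompatible instance layouts, says nothing about how #693 lays out pybind11 instances, and `can_combine` is a hypothetical helper.

```python
def can_combine(*bases):
    """Return True if a single class can inherit from all of `bases`."""
    try:
        # type() builds a class dynamically, like a `class` statement would.
        type("Combined", bases, {})
    except TypeError:
        # Raised, among other reasons, when the bases have conflicting
        # instance layouts.
        return False
    return True


class PlainA:
    """Ordinary Python class with the default instance layout."""


class PlainB:
    """Another ordinary Python class with the default instance layout."""
```

Ordinary Python classes combine freely, while `can_combine(int, str)` reports a conflict.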
I haven't looked at all the details of #693, but it sounds like a reasonable idea. I guess one question is whether the cost of converting to the non-simple layout would actually be higher than the cost of adding to an What's the timeframe for #693? If it's going to be merged soon, then I'll park this until it is and then work on that approach. If it's unlikely to make the next release, maybe we should make a fix for this bug in the meantime (either this patch or the suggestion of using internals). I see the PyPy test failed because |
It probably isn't worthwhile complicating the layout, so go with the unordered_map internal. |
+1 for the |
I've implemented that, and also fixed the PyPy test. Let me know if you want me to rebase and squash (I haven't yet, just in case we spot some reason this approach won't work and want to revert to the previous one). One question: should I clear the |
tests/test_call_policies.py
Outdated
@@ -100,6 +100,7 @@ class Derived(Parent):
    with capture:
        del p, lst
        pytest.gc_collect()
        pytest.gc_collect()  # Needed to make PyPy free the child
There's a ConstructorStats.detail_reg_inst() you can call here, which handles the GC collection (including doing it twice for PyPy) and also returns the number of currently tracked instances. Besides removing the need for the manual GC calls (and for invoking them twice under PyPy), it could serve as a useful check here to make sure that another object hasn't been created or copied somewhere along the line:
n_inst = ConstructorStats.detail_reg_inst()
# ... create parent, child
assert ConstructorStats.detail_reg_inst() == n_inst + 2
# ... destroy
assert ConstructorStats.detail_reg_inst() == n_inst
include/pybind11/pybind11.h
Outdated
if (!nurse || !patient)
    pybind11_fail("Could not activate keep_alive!");

if (patient.is_none() || nurse.is_none())
    return; /* Nothing to keep alive or nothing to be kept alive by */

cpp_function disable_lifesupport(
    [patient](handle weakref) { patient.dec_ref(); weakref.dec_ref(); });
auto tinfo = get_type_info(Py_TYPE(nurse.ptr()));
This should be: auto &tinfo = all_type_info(Py_TYPE(nurse.ptr())); if (!tinfo.empty()) { ... } so that it can support a nurse which is an instance of a Python class inheriting from multiple pybind-registered bases.
Yes, definitely clear it just after you remove it from the map. Alternatively, you can change the |
Thanks for the comments @jagerman. They should all be addressed in my latest push. |
Ping - anything still need fixing here? |
Sorry for the delay. Playing a bit of catch-up with the PRs now.
This looks like a great straightforward implementation that nicely solves the circular reference issue. I left just one minor comment.
include/pybind11/class_support.h
Outdated
@@ -304,6 +304,20 @@ inline void clear_instance(PyObject *self) {
    PyObject **dict_ptr = _PyObject_GetDictPtr(self);
    if (dict_ptr)
        Py_CLEAR(*dict_ptr);

    if (instance->has_patients) {
Since there's an add_patient() function, it might be nice to pair it with clear_patients():
if (instance->has_patients)
    clear_patients(instance);
Sounds like a reasonable idea - pushed. I've also rebased and squashed the branch so it should be ready to merge. |
Looks good to me, but it looks like #915 added another conflict, sorry about that! Fix that up and I think it's good to merge. |
This fixes pybind#856. Instead of the weakref trick, the internals structure holds an unordered_map from PyObject* to a vector of references. To avoid the cost of the unordered_map lookup for objects that don't have any keep_alive patients, a flag is added to each instance to indicate whether there is anything to do.
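The design in this commit message can be sketched as a small pure-Python model. It is an illustration under assumptions, not the C++ code from the PR: a dict keyed by `id(nurse)` stands in for the unordered_map keyed by PyObject*, a list stands in for the vector of references, and the `Instance` and `Internals` classes and the `patients` attribute name are hypothetical. The names `add_patient`, `clear_patients`, `clear_instance` and `has_patients` are taken from the review comments above.

```python
class Instance:
    """Stand-in for a pybind11 instance; only the flag matters here."""

    def __init__(self, name):
        self.name = name
        self.has_patients = False  # lets teardown skip the map lookup


class Internals:
    """Stand-in for the shared internals structure."""

    def __init__(self):
        # Maps id(nurse) -> list of strong references to its patients.
        self.patients = {}


internals = Internals()


def add_patient(nurse, patient):
    """Record that `patient` must stay alive as long as `nurse` does."""
    nurse.has_patients = True
    internals.patients.setdefault(id(nurse), []).append(patient)


def clear_patients(nurse):
    """Drop every reference held on behalf of `nurse`."""
    # Take the entry out of the map and reset the flag before releasing
    # the references, so the bookkeeping is consistent while patients die.
    patients = internals.patients.pop(id(nurse))
    nurse.has_patients = False
    patients.clear()


def clear_instance(nurse):
    """Teardown hook: only nurses that have patients pay for a lookup."""
    if nurse.has_patients:
        clear_patients(nurse)
```

An instance that never takes part in keep_alive only carries the flag, which is the trade-off the message describes.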
It was a pretty minor conflict fortunately. I've rebased and pushed. |
Merged, thanks for the PR! |
PR pybind#880 changed the implementation of keep_alive to avoid weak references when the nurse is pybind11-registered, but the documentation didn't get updated to match.
This is a first attempt to address #856. There is still a bit of work to do (e.g., some tests, and double-checking that I'm not leaking references). At this point I'm looking for feedback on whether the idea is sound and would be accepted.
Instead of the weakref trick, every pybind-managed object holds a (Python) list of references, which gets cleared in clear_instance. The list is only constructed the first time it is needed.
The one downside I can see is that every pybind object will get bigger by sizeof(PyObject *), even if no keep_alive is used. On the other hand, it should significantly reduce the overhead of using keep_alive, since there is no need to construct a weakref object and a callback object.
One thing I'm not sure about is whether the call to Py_CLEAR(instance->patients); belongs inside or outside the has_value test. At the moment it's inside, but that's cargo culting rather than from an understanding of the conditions under which an instance might not have a value.
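The per-instance list described in this first version of the PR can be modeled in a few lines of Python. This is a sketch under assumptions, not the patch itself: `ListInstance`, `add_patient_to_list`, `clear_instance_list` and `demo_list` are hypothetical names, and a `__slots__` entry stands in for the extra `PyObject *` stored in every instance.

```python
import gc
import weakref


class ListInstance:
    """Stand-in for an instance carrying one extra pointer-sized slot."""

    __slots__ = ("patients",)

    def __init__(self):
        self.patients = None  # the list is created only when first needed


def add_patient_to_list(nurse, patient):
    """Keep `patient` alive by storing it in the nurse's own list."""
    if nurse.patients is None:
        nurse.patients = []
    nurse.patients.append(patient)


def clear_instance_list(nurse):
    """Teardown hook: dropping the list releases every patient."""
    nurse.patients = None


def demo_list():
    """Return (patient alive before teardown, patient alive after it)."""

    class Patient:
        """Stand-in for the object that must outlive the nurse."""

    nurse, patient = ListInstance(), Patient()
    probe = weakref.ref(patient)  # observes the patient without owning it
    add_patient_to_list(nurse, patient)
    del patient
    gc.collect()
    alive_before = probe() is not None
    clear_instance_list(nurse)
    gc.collect()
    return alive_before, probe() is not None
```

Compared with the weakref trick, no weakref or callback object is created per keep_alive call, at the cost of the extra slot in every instance.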