gh-115874: Fix segfault in FutureIter_dealloc
#117741
cc: @brandtbucher for his review as well
I'll leave it to @brandtbucher to review this -- I am somewhat disappointed that in order to fix this we have to copy a bunch of subtle code from inside `PyType_GetModuleByDef()` -- especially since there the code is enclosed in macros `BEGIN_TYPE_LOCK()` and `END_TYPE_LOCK()` (though those seem needed only because of the call to `lookup_tp_mro()` there).
At the same time I understand you don't want to call `PyType_GetModuleByDef()` and have to deal with the exception it raises in exactly the scenario we're trying to tiptoe around here.
It looks like it's actually a bit more subtle than that. If I understand correctly, this isn't being done to avoid an exception, but to avoid a crash due to the type's MRO being
@savannahostrowski, do you mind adding a NEWS blurb? No need to get too technical, just explaining that we've fixed a possible crash during garbage-collection of Can you also add a comment or two to the new code referencing the issue number and explaining that:
Can you also confirm for me that our old reproducer crashes on 3.12? If so, we can add the test (it's okay if it no longer works on 3.13, probably still good to have) and flag this for backport.
Hm, that would point to a pretty universal problem (the instance being cleared after the type, when both are involved in a cycle). Why isn’t that crashing other code? What’s special about this example? (I am pushing back on this because the fix requires breaking through a nice abstraction, potentially in many more cases.)
Thanks for the feedback, folks. I'll add some comments to the code. @brandtbucher I can confirm that the segfault still repros on 3.12, so I can add a test here. I know you did some investigation in other modules and that's how you found this issue. That said, is it worth doing a bit more spelunking to understand if this is happening in other places as well? I'm too new to this part of the codebase to really understand how widespread this might be 😅 but I want to address concerns here.
This comment was marked as outdated.
What's special is that only
In the original issue, we found a similar crash in A perhaps more "principled" fix would be to change In my opinion, the bug here is assuming that your type hasn't been cleared in a clear or dealloc func. That's incorrect.
See my comment above. I'm reasonably confident that there aren't other offenders, but another pair of eyes definitely wouldn't hurt.
Hmm... But it's an easy trap to fall into. Not exactly a bug magnet (only two instances in CPython itself), perhaps, but hard to debug, and hard to reason about: it seems it only happens when the type is cleared first. So isn't the bug that the type is cleared while it still has instances? Do we understand how exactly this happened? Is
My hunch is that we have a cycle:
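For illustration only (this is not the cycle from the comment above, which is not reproduced here): a minimal Python sketch of how a heap type and one of its instances can sit in the same reference cycle, so that a single collector pass reclaims both. The class `Node`, the attribute `keep`, and the function name are hypothetical. The sketch does not show which object is cleared first, nor the C-level crash; it only shows that neither object is guaranteed to outlive the other.

```python
import gc
import weakref


def collect_type_and_instance():
    """Return (alive_before, alive_after) for a type/instance cycle."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections from running mid-demo
    try:
        class Node:  # hypothetical heap type, for illustration only
            pass

        inst = Node()
        # type -> instance via a class attribute;
        # instance -> type via its class pointer.
        Node.keep = inst

        type_ref = weakref.ref(Node)
        inst_ref = weakref.ref(inst)
        del Node, inst  # drop the only outside references

        # Both objects are still alive: only the cycle keeps them around.
        alive_before = (type_ref() is not None, inst_ref() is not None)
        gc.collect()  # a single full collection reclaims both
        alive_after = (type_ref() is not None, inst_ref() is not None)
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()
```

Calling `collect_type_and_instance()` reports both objects as alive before the collection and both as gone after it, which is the situation in which a clear or dealloc function cannot assume its type is still intact.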
Co-authored-by: Brandt Bucher <[email protected]>
Yes, I noticed that after reading through the discussion again (which is why I marked my comment as outdated); it is an unfortunate issue. Well, thanks a lot for implementing this workaround :)
Thanks @savannahostrowski for the PR, and @brandtbucher for merging it 🌮🎉. I'm working now to backport this PR to: 3.12.
(cherry picked from commit d8f3503) Co-authored-by: Savannah Ostrowski <[email protected]>
GH-118114 is a backport of this pull request to the 3.12 branch.
GH-115874: Fix segfault in FutureIter_dealloc (GH-117741) (cherry picked from commit d8f3503) Co-authored-by: Savannah Ostrowski <[email protected]>
Thanks for fixing this crash!
Misc/NEWS.d/next/Core and Builtins/2024-04-13-18-59-25.gh-issue-115874.c3xG-E.rst
…ealloc`) (GH-121638) Address comments
…Iter_dealloc`) (pythonGH-121638) Address comments (cherry picked from commit 65feded) Co-authored-by: Savannah Ostrowski <[email protected]>
…Iter_dealloc`) (pythonGH-121638) Address comments
Per discussion here: #115874 (comment)
tp_dealloc
(itertools teedataobject clear) #115874