[core] fix exit handling of FiberState threads #45834
Currently we are relying on the behavior that `channel_.close();` and `fiber_stopped_event_->Wait();` can be called multiple times, which I checked is true. But this seems fragile, since it relies on the underlying behavior of these libraries. Can we just have our own `stopped_` and `joined_` flags and return early if the calls have already been made? Thoughts @hongchaodeng @rynewang
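A minimal sketch of the flag-based early return suggested above. This is not the actual FiberState code: `GuardedWorker`, its condition-variable shutdown signal, and the call counters are all hypothetical stand-ins, used only to show how `stopped_` and `joined_` could make `Stop()` and `Join()` safe to call repeatedly without depending on the underlying primitives being idempotent.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the class under review (NOT Ray's FiberState).
class GuardedWorker {
 public:
  GuardedWorker() : thread_([this] { Run(); }) {}

  ~GuardedWorker() {
    Stop();
    Join();
  }

  // Asks the worker thread to exit. Only the first call reaches the
  // underlying primitive; later calls return early.
  void Stop() {
    if (stopped_.exchange(true)) {
      return;
    }
    {
      std::lock_guard<std::mutex> lock(mu_);
      should_exit_ = true;
    }
    cv_.notify_all();
    ++underlying_stop_calls_;
  }

  // Waits for the worker thread to finish. Only the first call joins.
  // Caveat: a concurrent second caller returns immediately and does not
  // wait for the first caller's join to complete.
  void Join() {
    if (joined_.exchange(true)) {
      return;
    }
    thread_.join();
    ++underlying_join_calls_;
  }

  int underlying_stop_calls() const { return underlying_stop_calls_.load(); }
  int underlying_join_calls() const { return underlying_join_calls_.load(); }

 private:
  // Worker body: block until Stop() has been requested.
  void Run() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return should_exit_; });
  }

  std::mutex mu_;
  std::condition_variable cv_;
  bool should_exit_ = false;  // guarded by mu_
  std::atomic<bool> stopped_{false};
  std::atomic<bool> joined_{false};
  std::atomic<int> underlying_stop_calls_{0};
  std::atomic<int> underlying_join_calls_{0};
  std::thread thread_;  // declared last so it starts after the other members exist
};

// Usage check: repeated calls reach the underlying primitives only once.
inline bool RepeatedCallsAreNoOps() {
  GuardedWorker worker;
  worker.Stop();
  worker.Stop();
  worker.Join();
  worker.Join();
  return worker.underlying_stop_calls() == 1 &&
         worker.underlying_join_calls() == 1;
}
```

`std::atomic<bool>::exchange` returns the previous value, so exactly one caller sees `false` and performs the real work. Whether this is worth the extra state over relying on the libraries' current behavior is the judgment call raised in the comment.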
Let's merge this and fix the issue first.
We are basically guaranteeing something that the library does not provide: thread cancellation.
That's orthogonal to enhancing the capabilities.
Besides the unit test, can we also add an e2e test using the repro script mentioned in the GH issue?
I don't think this is worth it.
The original repro script is something of a red herring. This test covers the root of the problem.
I know this unit test covers the root cause of this issue, but it would still be nice to make sure users' workloads work well e2e even if, for example, we remove fiber completely in the future (at which point this unit test becomes irrelevant).