[Cherry-pick] [Core] Defer SIGINT interrupt during task argument deserialization. (#30476) #30754

clarkzinzow
Contributor

This PR cherry-picks #30476 onto the 2.2.0 release branch.

Importing certain libraries (e.g. Arrow, Pandas, Torch) is not reentrant. Task cancellation occasionally interrupts the Arrow import that this deserialization add-on triggers during task argument deserialization, and we then try to import Arrow again when serializing the error. See here for an example failure: https://buildkite.com/ray-project/oss-ci-build-branch/builds/1115#018485e1-df32-480f-9c36-cc898341f0a2

This PR prevents this import reentrancy in the task cancellation case by deferring interrupts until task argument deserialization finishes, so the serialization-related imports are guaranteed to complete before the interrupt is processed.
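
For readers unfamiliar with the technique, here is a minimal sketch of deferring SIGINT around a critical section in plain Python. It is not the code from this PR: `defer_sigint` is a hypothetical name, and the sketch assumes the interrupt arrives as SIGINT on the main thread and that re-delivering it through the previously installed handler is acceptable.

```python
import signal
import threading
from contextlib import contextmanager


@contextmanager
def defer_sigint():
    """Hypothetical helper: hold any SIGINT until the with-block exits."""
    original = signal.getsignal(signal.SIGINT)
    on_main = threading.current_thread() is threading.main_thread()
    if not on_main or original is None:
        # Handlers can only be swapped from the main thread, and a handler
        # installed from C (reported as None) cannot be restored from Python,
        # so in both cases run the block without deferral.
        yield
        return

    pending = []

    def _record(signum, frame):
        # Remember the interrupt instead of raising inside the block.
        pending.append((signum, frame))

    signal.signal(signal.SIGINT, _record)
    try:
        yield
    finally:
        # The block (e.g. argument deserialization and its imports) is done;
        # restore the old handler and replay the first deferred interrupt.
        signal.signal(signal.SIGINT, original)
        if pending and callable(original):
            original(*pending[0])
```

Wrapping the deserialization step in `with defer_sigint(): ...` lets any import it triggers run to completion; with Python's default handler, a deferred interrupt then surfaces as `KeyboardInterrupt` when the block exits.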

@clarkzinzow
Contributor Author

Closing this to reduce churn since I don't think that we need to cherry-pick this for 2.2.0.
