[core] Fix a corner case where GCS will crash when actor is deleted #30563

Merged Nov 22, 2022 (2 commits)

Conversation

fishbone
Contributor

Why are these changes needed?

When the lease has been granted and GCS is writing to the table, GCS crashes if the actor is deleted in the meantime. This PR fixes it by checking the actor status first, before creating the actor.
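
To make the race concrete, here is a minimal Python sketch of the guard described above. It is not the GCS code (which is C++): `ToyActorManager`, its methods, and the state strings are all hypothetical names chosen for illustration, and the placement of the check in the write callback is one plausible reading of the description.

```python
from typing import Callable, Dict, List


class ToyActorManager:
    """Hypothetical stand-in for an actor manager; not Ray's GCS implementation."""

    def __init__(self) -> None:
        # actor_id -> state for actors that are still registered
        self.registered: Dict[str, str] = {}
        # Simulated asynchronous table writes whose callbacks have not run yet
        self.pending_writes: List[Callable[[], None]] = []

    def register(self, actor_id: str) -> None:
        self.registered[actor_id] = "PENDING_CREATION"

    def destroy(self, actor_id: str) -> None:
        # The actor can be deleted at any time, including while a write is in flight.
        self.registered.pop(actor_id, None)

    def on_lease_granted(self, actor_id: str) -> None:
        # The table write is asynchronous, so its callback runs later.
        self.pending_writes.append(lambda: self._on_write_done(actor_id))

    def _on_write_done(self, actor_id: str) -> None:
        # Guard: check the actor status first; skip creation if it was deleted.
        if actor_id not in self.registered:
            return
        self.registered[actor_id] = "ALIVE"

    def flush_writes(self) -> None:
        # Run the callbacks of all writes that have "completed".
        pending, self.pending_writes = self.pending_writes, []
        for callback in pending:
            callback()


mgr = ToyActorManager()
mgr.register("a1")
mgr.register("a2")
mgr.on_lease_granted("a1")
mgr.on_lease_granted("a2")
mgr.destroy("a1")      # deleted while its table write is still in flight
mgr.flush_writes()     # the callback for "a1" returns early instead of failing
print(mgr.registered)  # {'a2': 'ALIVE'}
```

Without the membership check in `_on_write_done`, the callback for `a1` would act on an actor that no longer exists, which is the kind of situation the description says crashed GCS.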

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Yi Cheng <[email protected]>
@fishbone fishbone linked an issue Nov 22, 2022 that may be closed by this pull request
@fishbone
Contributor Author

Sadly, I don't know how to test this one. It usually happens when GCS is overloaded (writes to the table are slow) and a driver creates a lot of actors and then exits.

Any ideas are welcome.
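
One possible direction, sketched in Python rather than against the real GCS classes: replace the table with a fake whose writes complete only when the test says so, so the "slow write, driver exits" interleaving becomes deterministic. `DelayedTable` and `run_scenario` are hypothetical names, and the `KeyError` only stands in for the crash.

```python
from typing import Callable, Dict, List, Tuple


class DelayedTable:
    """Hypothetical fake table: writes complete only when flush() is called."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}
        self._queue: List[Tuple[str, str, Callable[[], None]]] = []

    def put(self, key: str, value: str, on_done: Callable[[], None]) -> None:
        # Queue the write instead of applying it, to mimic a slow table.
        self._queue.append((key, value, on_done))

    def flush(self) -> None:
        queue, self._queue = self._queue, []
        for key, value, on_done in queue:
            self.rows[key] = value
            on_done()


def run_scenario(guarded: bool, num_actors: int = 100) -> bool:
    """Return True if the simulated crash (a KeyError) occurs."""
    table = DelayedTable()
    live: Dict[str, str] = {}

    def on_done(actor_id: str) -> None:
        if guarded and actor_id not in live:
            return  # actor already deleted: skip creation
        _ = live[actor_id]  # KeyError here stands in for the crash
        live[actor_id] = "ALIVE"

    # The driver creates many actors; every table write is still pending.
    for i in range(num_actors):
        actor_id = f"actor-{i}"
        live[actor_id] = "PENDING_CREATION"
        table.put(actor_id, "ALIVE", lambda a=actor_id: on_done(a))

    live.clear()  # the driver exits, so all of its actors are deleted

    try:
        table.flush()  # the slow writes finally complete
    except KeyError:
        return True
    return False


print(run_scenario(guarded=False))  # True
print(run_scenario(guarded=True))   # False
```

A real test would need an equivalent hook in the GCS table storage layer; whether such a hook exists is not something this sketch assumes.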

@fishbone fishbone added the release-blocker P0 Issue that blocks the release label Nov 22, 2022
@fishbone
Contributor Author

The test failure is not related.

@fishbone fishbone merged commit 572464c into ray-project:master Nov 22, 2022
fishbone added a commit to fishbone/ray that referenced this pull request Nov 22, 2022
…ay-project#30563)
scv119 pushed a commit that referenced this pull request Nov 23, 2022
…30563) (#30595)
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
…ay-project#30563)

Signed-off-by: Weichen Xu <[email protected]>
Labels
release-blocker P0 Issue that blocks the release
Development

Successfully merging this pull request may close these issues.

[Core] GCS crashed in actor creation.