Wait for all Sessions during pantsd shutdown #11929

Merged: stuhood merged 4 commits into pantsbuild:main from stuhood/clean-pantsd-shutdown on Apr 16, 2021

@stuhood (Member) commented Apr 16, 2021

As described in #11618, when pantsd intentionally exits due to low memory, a few types of work can be cut short:

  1. If the run ends in Ctrl+C, processes that were cancelled may not have had time to be dropped before `pantsd` exits.
  2. Async `StreamingWorkunitHandler` (SWH) threads might still be running.

This change adds orderly-shutdown mechanisms to the `Scheduler`/`Core` to join all ongoing `Sessions` (including the SWH), and improves tests to ensure that the SWH is waited for.
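
To illustrate the general idea of joining all ongoing sessions before exiting, here is a minimal Python sketch. It is not the code from this PR: `SessionRegistry`, `start_session`, and `shutdown` are hypothetical names, and plain threads stand in for `Sessions` and SWH threads.

```python
import threading
import time


class SessionRegistry:
    """Hypothetical stand-in for the bookkeeping that tracks ongoing sessions."""

    def __init__(self):
        self._lock = threading.Lock()
        self._threads = []
        self._accepting = True

    def start_session(self, work):
        # Refuse new work once shutdown has begun, so the set of sessions
        # that shutdown() has to wait for cannot grow underneath it.
        with self._lock:
            if not self._accepting:
                raise RuntimeError("shutting down: no new sessions accepted")
            thread = threading.Thread(target=work)
            self._threads.append(thread)
            thread.start()
        return thread

    def shutdown(self):
        # Stop accepting sessions, then join every session that was started.
        with self._lock:
            self._accepting = False
            threads = list(self._threads)
        for thread in threads:
            thread.join()


finished = []
registry = SessionRegistry()
for name in ("a", "b"):
    registry.start_session(lambda name=name: (time.sleep(0.05), finished.append(name)))
registry.shutdown()  # returns only after both sessions have completed
```

The joins happen outside the lock so that a slow session does not block other callers from observing that shutdown has started.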

Additionally, the last commit purges the `pantsd` metadata as soon as we decide to restart, which should reduce (but probably not eliminate) the incidence of item 1 from #11618. Work for #11831 will likely further harden this path.
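
The ordering described above (purge the metadata first, then wait for ongoing work) can be sketched as follows. This is an illustration only: `begin_restart`, the `pantsd` directory, and the `pid` file are hypothetical names, and the assumption that the metadata lives in a single directory is mine, not something this PR states.

```python
import shutil
import tempfile
from pathlib import Path


def begin_restart(metadata_dir, wait_for_sessions):
    """Purge daemon metadata before waiting for ongoing work (hypothetical helper)."""
    # Remove the metadata first, so nothing that reads it during the
    # (possibly slow) wait below still sees the exiting daemon's state.
    shutil.rmtree(metadata_dir, ignore_errors=True)
    wait_for_sessions()


# Hypothetical metadata directory created in a temp location for the demo.
metadata_dir = Path(tempfile.mkdtemp()) / "pantsd"
metadata_dir.mkdir()
(metadata_dir / "pid").write_text("12345")

observed = []
# The callback records whether the metadata still exists while we are waiting.
begin_restart(metadata_dir, lambda: observed.append(metadata_dir.exists()))
```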

[ci skip-build-wheels]

@stuhood stuhood marked this pull request as ready for review April 16, 2021 19:25
@tdyas (Contributor) commented Apr 16, 2021

Is there any way for shutdown to hang? I'm wondering if a timeout to switch to a "forced shutdown" mode would be advisable.

@stuhood stuhood added this to the 2.4.x milestone Apr 16, 2021
@stuhood (Member, Author) commented Apr 16, 2021

> Is there any way for shutdown to hang? I'm wondering if a timeout to switch to a "forced shutdown" mode would be advisable.

Good idea: added.
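
For readers unfamiliar with the pattern, a timeout that falls back to a forced shutdown can be sketched like this. It is a generic illustration, not the implementation added to this PR: `join_with_deadline` is a hypothetical helper, and threads again stand in for sessions.

```python
import threading
import time


def join_with_deadline(threads, timeout):
    """Join threads under one shared deadline; return those still running."""
    deadline = time.monotonic() + timeout
    for thread in threads:
        # Each join gets only the time left on the shared deadline, so the
        # total wait is bounded by `timeout` regardless of thread count.
        remaining = deadline - time.monotonic()
        thread.join(max(0.0, remaining))
    return [thread for thread in threads if thread.is_alive()]


release = threading.Event()
quick = threading.Thread(target=lambda: None)
stuck = threading.Thread(target=release.wait, daemon=True)  # blocks until released
quick.start()
stuck.start()

# After 0.2 seconds the caller gets back whatever did not finish and can
# switch to a forced shutdown (for example: log the stragglers and exit).
stragglers = join_with_deadline([quick, stuck], timeout=0.2)
```

Using a single deadline rather than a per-thread timeout keeps the worst-case shutdown time predictable.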

@stuhood stuhood merged commit c443d71 into pantsbuild:main Apr 16, 2021
@stuhood stuhood deleted the stuhood/clean-pantsd-shutdown branch April 16, 2021 22:07
stuhood added a commit to stuhood/pants that referenced this pull request Apr 16, 2021
stuhood added a commit that referenced this pull request Apr 16, 2021
…11934)
