-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[tune] Enforce one future at a time for any given trial at any given time. #20783
Conversation
1472a44
to
5f51c1c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good but please run the result throughput single node and cluster release tests to make sure they still pass (you may have to set the environment variable there)
assert len( | ||
out | ||
) <= 1, "Expecting one future for any given trial at any given time." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly unrelated to this PR: Should we rename this into find_training_future
as we only use this helper to find futures in self._running
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and instead of retuning a list, we should just return the item.
May do that separately tho. Want to first enforce this basic behavior first.
Added env variable for result throughput release tests. |
Why are these changes needed?
Also enforce disabling (instead of allowing user to override this) buffer training when
checkpoint_at_end
is used.Related issue number
Checks
scripts/format.sh
to lint the changes in this PR.