[tune] Enforce one future at a time for any given trial at any given time. #20783

xwjiang2010 · 2021-11-30T01:04:09Z

Why are these changes needed?

Also force buffered training to be disabled when checkpoint_at_end is used, instead of allowing the user to override this.
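To illustrate the rule, here is a minimal sketch, not the code in this PR. The helper name `resolve_buffer_length`, the `default` parameter, and the reading that a length of 1 means "no buffering" are all assumptions made for illustration; only the `TUNE_RESULT_BUFFER_LENGTH` variable name comes from this PR.

```python
import os
from typing import Mapping, Optional


def resolve_buffer_length(
    checkpoint_at_end: bool,
    env: Optional[Mapping[str, str]] = None,
    default: int = 1,
) -> int:
    """Hypothetical helper: pick the result buffer length for a trial.

    Assumption: a length of 1 means results are not buffered.
    """
    env = os.environ if env is None else env
    if checkpoint_at_end:
        # Buffering is forced off; any user-provided value is ignored.
        return 1
    # Otherwise honor the environment variable, falling back to the default.
    return int(env.get("TUNE_RESULT_BUFFER_LENGTH", default))
```

With this shape, setting `TUNE_RESULT_BUFFER_LENGTH=1000` has no effect on a trial that uses checkpoint_at_end, which is the "enforce instead of override" behavior described above.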

Related issue number

Checks

I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

krfricke

Looks good, but please run the result throughput single-node and cluster release tests to make sure they still pass (you may have to set the environment variable there).

krfricke · 2021-12-01T16:57:50Z

python/ray/tune/ray_trial_executor.py

+        assert len(
+            out
+        ) <= 1, "Expecting one future for any given trial at any given time."


Possibly unrelated to this PR: Should we rename this into find_training_future as we only use this helper to find futures in self._running?

Yes, and instead of returning a list, we should just return the item.
May do that separately tho. Want to enforce this basic behavior first.
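A minimal sketch of what that follow-up could look like, not the code in this PR. It assumes `self._running` is a dict mapping each in-flight future to its trial; the standalone function signature and the name `find_training_future` (taken from the suggestion above) are hypothetical.

```python
from typing import Dict, Hashable, Optional


def find_training_future(
    running: Dict[Hashable, object], trial: object
) -> Optional[Hashable]:
    """Return the single in-flight future for `trial`, or None if there is none.

    Assumption: `running` maps future -> trial, as `self._running` is
    presumed to do in the executor.
    """
    matches = [future for future, t in running.items() if t is trial]
    # Same invariant as the assert added in this PR.
    assert len(
        matches
    ) <= 1, "Expecting one future for any given trial at any given time."
    return matches[0] if matches else None
```

Returning the item (or None) instead of a list means callers no longer need to index into the result, and the invariant is checked in one place.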


xwjiang2010 · 2021-12-02T20:38:54Z

Added env variable for result throughput release tests.

xwjiang2010 · 2021-12-02T20:58:33Z

link: https://buildkite.com/ray-project/periodic-ci/builds/1847

[try out] One future at a time for any given trial at any given time.

5f51c1c

xwjiang2010 force-pushed the find_item branch from 1472a44 to 5f51c1c November 30, 2021 01:48

Clean up the assumption around buffered training.

be68175

xwjiang2010 changed the title ~~[try out] One future at a time for any given trial at any given time.~~ [tune] Enforce one future at a time for any given trial at any given time. Dec 1, 2021

xwjiang2010 added this to the Tune Tech Debt Reduction milestone Dec 1, 2021

Merge branch 'ray-project:master' into find_item

722bed2

xwjiang2010 requested a review from krfricke December 1, 2021 15:27

xwjiang2010 assigned krfricke Dec 1, 2021

krfricke reviewed Dec 1, 2021


xwjiang2010 added 2 commits December 1, 2021 11:39

Set TUNE_RESULT_BUFFER_LENGTH=1000 for result throughput tests.

cea85a7

Merge branch 'find_item' of https://github.com/xwjiang2010/ray into f…

6feac3f

…ind_item

krfricke approved these changes Dec 3, 2021


krfricke merged commit 368da17 into ray-project:master Dec 3, 2021

xwjiang2010 deleted the find_item branch July 26, 2023 19:50
