[Tune] ProgressReporterTest::testVerboseReporting is non-deterministic #29693

Closed
bveeramani opened this issue Oct 26, 2022 · 0 comments · Fixed by #29748
Labels: P0, ray-team-created, testing, tune

bveeramani commented Oct 26, 2022

What happened + What you expected to happen

test_progress_reporter.py::ProgressReporterTest::testVerboseReporting checks that multi-line strings like

Trial train_xxxxx_00001 reported _metric=6 with parameters={'do': 'once'}.
Trial train_xxxxx_00001 completed. Last result: _metric=6

are in STDOUT.

The problem is that we print status updates like

== Status ==
Current time: 2022-10-24 17:07:56 (running for 00:00:05.31)
Memory usage on this node: 2.7/7.6 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/2 CPUs, 0/0 GPUs, 0.0/3.43 GiB heap, 0.0/1.71 GiB objects
Result logdir: /root/.cache/bazel/_bazel_root/5fe90af4e7d1ed9fcf52f59e39e126f5/execroot/com_github_ray_project_ray/_tmp/3fba95ff54c67161134a94c8463b3e4b/train_2022-10-24_17-07-50
Number of trials: 3/3 (1 RUNNING, 2 TERMINATED)

at non-deterministic times.

This causes outputs to sometimes look like

Trial train_xxxxx_00001 reported _metric=6 with parameters={'do': 'once'}.
== Status ==
Current time: 2022-10-24 17:07:56 (running for 00:00:05.31)
Memory usage on this node: 2.7/7.6 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/2 CPUs, 0/0 GPUs, 0.0/3.43 GiB heap, 0.0/1.71 GiB objects
Result logdir: /root/.cache/bazel/_bazel_root/5fe90af4e7d1ed9fcf52f59e39e126f5/execroot/com_github_ray_project_ray/_tmp/3fba95ff54c67161134a94c8463b3e4b/train_2022-10-24_17-07-50
Number of trials: 3/3 (1 RUNNING, 2 TERMINATED)


Trial train_xxxxx_00001 completed. Last result: _metric=6

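When a status update lands between the two expected lines, they no longer form one contiguous string in the output, so the multi-line check fails. We should rewrite the test so that its result does not depend on when status updates are printed.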
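One possible shape for such a check, as a rough sketch only: compare line by line and require the expected lines to appear in order, while tolerating unrelated lines (such as a status block) in between. The helper name `lines_appear_in_order` and the shortened sample output are hypothetical, and this is not necessarily what the eventual fix does.

```python
def lines_appear_in_order(expected: str, output: str) -> bool:
    """Return True if each non-empty line of `expected` is found in `output`,
    in the same relative order, with any other lines allowed in between."""
    remaining = iter(output.splitlines())
    for want in (line.strip() for line in expected.splitlines() if line.strip()):
        # `any` consumes `remaining` up to and including the first match,
        # so the search for the next expected line resumes after it.
        if not any(want in got for got in remaining):
            return False
    return True


EXPECTED = (
    "Trial train_xxxxx_00001 reported _metric=6 with parameters={'do': 'once'}.\n"
    "Trial train_xxxxx_00001 completed. Last result: _metric=6"
)

# Shortened version of the interleaved output shown above.
INTERLEAVED = (
    "Trial train_xxxxx_00001 reported _metric=6 with parameters={'do': 'once'}.\n"
    "== Status ==\n"
    "Number of trials: 3/3 (1 RUNNING, 2 TERMINATED)\n"
    "\n"
    "Trial train_xxxxx_00001 completed. Last result: _metric=6\n"
)

# A plain substring check breaks on the interleaved output...
assert EXPECTED not in INTERLEAVED
# ...while the order-only check still passes.
assert lines_appear_in_order(EXPECTED, INTERLEAVED)
```

The trade-off is that this check is weaker than a contiguous match: it would also accept other trial lines between the two expected ones, which may or may not be acceptable for this test.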

Versions / Dependencies

Ray: 8b4283f

Reproduction script

If you run test_progress_reporter.py::ProgressReporterTest::testVerboseReporting enough times, you should eventually hit this failure.
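A small loop can automate the repeated runs. This is a sketch under assumptions: it expects `pytest` to be installed and to be launched from the directory that contains `test_progress_reporter.py`, and the helper names `run_pytest_once` and `count_failures` are made up for illustration.

```python
import subprocess
import sys
from typing import Callable

# Assumption: this node ID resolves relative to the current working directory.
TEST_ID = "test_progress_reporter.py::ProgressReporterTest::testVerboseReporting"


def run_pytest_once(test_id: str = TEST_ID) -> bool:
    """Run the test in a fresh pytest process; True means the run passed."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "-q", test_id],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def count_failures(run_once: Callable[[], bool], attempts: int) -> int:
    """Call `run_once` `attempts` times and count how many calls returned False."""
    return sum(0 if run_once() else 1 for _ in range(attempts))
```

Calling `count_failures(run_pytest_once, 50)` then gives the number of failed runs out of 50. Each run starts a new process, so no state carries over between attempts.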

Issue Severity

This test randomly fails 10% of the time.

Low: It annoys or frustrates me.

@bveeramani added the bug and triage labels on Oct 26, 2022
@bveeramani changed the title from "[Tune] test_progress_reporter.py::ProgressReporterTest::testVerboseReporting is non-deterministic" to "[Tune] ProgressReporterTest::testVerboseReporting is non-deterministic" on Oct 26, 2022
@bveeramani added the P2, tune, and testing labels and removed the bug and triage labels on Oct 26, 2022
@bveeramani self-assigned this on Oct 26, 2022
@bveeramani added the P0 label and removed the P2 label on Oct 27, 2022
krfricke added a commit that referenced this issue Oct 27, 2022
`ProgressReportingTest::testVerboseReporting` is flakey. For more information, see #29693.

Signed-off-by: Balaji Veeramani <[email protected]>
Co-authored-by: Kai Fricke <[email protected]>
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this issue Dec 19, 2022
`ProgressReportingTest::testVerboseReporting` is flakey. For more information, see ray-project#29693.

Signed-off-by: Balaji Veeramani <[email protected]>
Co-authored-by: Kai Fricke <[email protected]>
Signed-off-by: Weichen Xu <[email protected]>
@richardliaw added the ray-team-created label on Dec 21, 2022