[air/tuner] Expose number of errored/terminated trials in ResultGrid #26655

Merged 3 commits on Jul 18, 2022

Conversation

krfricke
Contributor

Signed-off-by: Kai Fricke [email protected]

Why are these changes needed?

This introduces an easy interface to retrieve the number of errored and terminated (non-errored) trials from the result grid.

Previously, `tune.run(raise_on_failed_trial)` could be used to raise a `TuneError` if at least one trial failed. We've removed this option so that a return value is always available. `ResultGrid.num_errored` lets users check whether any trials failed and react accordingly, replacing the old try/except pattern.
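
As a rough illustration of the pattern described above, the sketch below uses small stand-in classes rather than Ray itself, so it runs without a cluster. `FakeResult`, `FakeResultGrid`, and `summarize` are hypothetical names; `num_errored` follows the name used in this description, and `num_terminated` is an assumed name for the non-errored count.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeResult:
    """Hypothetical stand-in for a single trial result."""
    metrics: dict
    error: Optional[Exception] = None


class FakeResultGrid:
    """Hypothetical stand-in for a result grid; not Ray's implementation."""

    def __init__(self, results: List[FakeResult]):
        self._results = results

    @property
    def num_errored(self) -> int:
        # Trials that finished with an exception attached.
        return sum(1 for r in self._results if r.error is not None)

    @property
    def num_terminated(self) -> int:
        # Trials that finished without an error (assumed property name).
        return sum(1 for r in self._results if r.error is None)


def summarize(grid: FakeResultGrid) -> str:
    # Inspect the counts on the returned grid instead of wrapping the
    # whole run in try/except.
    if grid.num_errored:
        return f"{grid.num_errored} errored, {grid.num_terminated} terminated"
    return f"all {grid.num_terminated} trials terminated"


grid = FakeResultGrid([
    FakeResult({"loss": 0.1}),
    FakeResult({}, error=RuntimeError("boom")),
    FakeResult({"loss": 0.3}),
])
message = summarize(grid)
```

The point of the design is that the caller always gets a grid back and decides what a failure means, rather than losing the successful results to a raised exception.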

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@richardliaw
Contributor

Can you also somehow capture the exception and traceback?

The number of failed trials is not as useful if you can't debug the trial failures.

@krfricke
Contributor Author

Users have access to this via Result already, e.g.

errors = [result.error for result in result_grid if result.error]

or

result_grid[4].error

These contain the full exceptions.
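
To make the traceback question concrete, here is a minimal sketch of turning collected errors into printable tracebacks. It assumes each result's `error` attribute holds the original exception object with its traceback still attached, which is not confirmed here for Ray. `StubResult`, `run_stub_trial`, and `failing_trial` are hypothetical helpers so the example runs standalone.

```python
import traceback
from typing import Callable, List, Optional


class StubResult:
    """Hypothetical stand-in for a result object with an `error` attribute."""

    def __init__(self, error: Optional[BaseException] = None):
        self.error = error


def failing_trial() -> None:
    raise ValueError("bad hyperparameter")


def run_stub_trial(fn: Callable[[], None]) -> StubResult:
    # Capture the exception instead of letting it propagate.
    try:
        fn()
    except Exception as exc:
        return StubResult(error=exc)
    return StubResult()


result_grid: List[StubResult] = [
    run_stub_trial(lambda: None),
    run_stub_trial(failing_trial),
]

# Same comprehension as in the comment above.
errors = [result.error for result in result_grid if result.error]

# Render each captured exception, including its traceback, as text.
reports = [
    "".join(traceback.format_exception(type(e), e, e.__traceback__))
    for e in errors
]
```

Each entry in `reports` can then be logged or printed for debugging.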

I can implement a shortcut for this, e.g. result_grid.errors?

@krfricke krfricke merged commit 66ca7b1 into ray-project:master Jul 18, 2022
@krfricke krfricke deleted the air/tuner-result-grid-errors branch July 18, 2022 22:12