Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Tune] Fix CheckpointConfig validation for function trainables #31255

Merged
merged 6 commits into from
Dec 22, 2022

Conversation

justinvyu
Copy link
Contributor

@justinvyu justinvyu commented Dec 20, 2022

Why are these changes needed?

This fixes an issue where a ValueError wasn't being properly raised when passing in a function trainable and setting checkpoint_at_end=True or checkpoint_frequency > 0. Previously, the error was only raised for function trainables of the form def train_func(config, checkpoint_dir):, which is the old checkpoint dir function API.

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@justinvyu justinvyu changed the title [Tune] Fix CheckpointConfig for function trainables [Tune] Fix CheckpointConfig validation for function trainables Dec 21, 2022
Copy link
Member

@Yard1 Yard1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Contributor

@krfricke krfricke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@krfricke krfricke merged commit 51b56ad into ray-project:master Dec 22, 2022
AmeerHajAli pushed a commit that referenced this pull request Jan 12, 2023
This fixes an issue where a ValueError wasn't being properly raised when passing in a function trainable and setting `checkpoint_at_end=True` or `checkpoint_frequency > 0`. Previously, the error was only raised for function trainables of the form `def train_func(config, checkpoint_dir):`, which is the old checkpoint dir function API.

Signed-off-by: Justin Yu <[email protected]>
tamohannes pushed a commit to ju2ez/ray that referenced this pull request Jan 25, 2023
…roject#31255)

This fixes an issue where a ValueError wasn't being properly raised when passing in a function trainable and setting `checkpoint_at_end=True` or `checkpoint_frequency > 0`. Previously, the error was only raised for function trainables of the form `def train_func(config, checkpoint_dir):`, which is the old checkpoint dir function API.

Signed-off-by: Justin Yu <[email protected]>
Signed-off-by: tmynn <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants