[RLlib] Move Learner Hp assignment to validate #33392
Conversation
…le with rllib yaml files Signed-off-by: Avnish <[email protected]>
@@ -395,9 +378,20 @@ def validate(self) -> None:
                "term/optimizer! Try setting config.training("
                "_tf_policy_handles_more_than_one_loss=True)."
            )
        # Learner HPs need to be updated inside of config.validate in order to have
        # the correct values when a user starts an experiment from a dict. This is
        # as opposed to assigning the values in the builder functions such as `training`
I found your explanation from the PR itself more informative. Together they would read:
"LearnerHPs need to be updated inside of AlgorithmConfig.validate()
in order to work with tune's HP search. This is because, if a user passes their experiment config to tune.run as a dictionary, the LearnerHPs will never be set. In that case, downstream Learners will not receive the passed HPs."
I'll throw it in my current PR. I've never seen my tests this green in my life and want to keep it that way.
I'm just gonna say we shouldn't make changes in a random PR just because we don't want to touch CI again.
Attributions will be wrong, and people will get confused if they git blame.
Maybe it's better to send a separate comment PR for it, instead of including it in whatever next PR you are working on.
Although I have to say that this is way too green ... :)
You know that's actually very fair.
I had not thought about it like that.
I will not do this again.
Change makes absolute sense, just got a nit concerning the added explanation.
Nit: I think your PR has a better description of what's going on.
iiuc, we could add some more info here.
* Move adding params to learner hps to validate in order to be compatible with rllib yaml files * Move learner_hp assignment from builder functions to validate Signed-off-by: Avnish <[email protected]> Signed-off-by: Jack He <[email protected]> Signed-off-by: Edward Oakes <[email protected]> Signed-off-by: chaowang <[email protected]> Signed-off-by: elliottower <[email protected]>
Signed-off-by: Avnish [email protected]
If the assignment of hyperparameters to LearnerHP objects happens inside of builder functions such as
config.training
then, when a user passes their experiment config to tune.run as a dictionary, the learner_hps will never be set, meaning that downstream Learners will not receive the passed hparams.
This can be resolved by assigning learner_hps in algorithm_config.validate instead.
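The failure mode above can be illustrated with a minimal sketch. All class and attribute names here (`LearnerHPs`, `update_from_dict`, `lr`) are simplified stand-ins, not RLlib's actual API; the point is only that a dict-based config path sets attributes directly and never calls builder methods, so any side effects inside them are skipped, while `validate()` runs on both paths:

```python
class LearnerHPs:
    """Container for hyperparameters handed to downstream Learners."""

    def __init__(self):
        self.lr = None


class AlgorithmConfig:
    def __init__(self):
        self.lr = 0.001
        self.learner_hps = LearnerHPs()

    def training(self, lr=None):
        # Builder-style API: only runs when the user calls it explicitly.
        if lr is not None:
            self.lr = lr
        return self

    def update_from_dict(self, config_dict):
        # Tune/YAML path: sets plain attributes and never calls training(),
        # so any LearnerHP assignment done inside training() would be skipped.
        for key, value in config_dict.items():
            setattr(self, key, value)
        return self

    def validate(self):
        # Assigning here keeps LearnerHPs in sync on BOTH code paths.
        self.learner_hps.lr = self.lr


# Builder path:
cfg1 = AlgorithmConfig().training(lr=0.01)
cfg1.validate()

# Dict path (what happens when a dict/YAML experiment config is used):
cfg2 = AlgorithmConfig().update_from_dict({"lr": 0.01})
cfg2.validate()

assert cfg1.learner_hps.lr == cfg2.learner_hps.lr == 0.01
```

If the `self.learner_hps.lr = self.lr` line lived inside `training()` instead, `cfg2.learner_hps.lr` would remain `None` after `validate()`, which is exactly the bug this PR fixes.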
Why are these changes needed?
Related issue number
Checks
I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.