-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AIR] Refactor ScalingConfig
key validation
#25549
Conversation
@Yard1 could u please take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! I added an extra validation step and removed the now unused function, PTAL @JiahaoYao
Also enabled validation for the GBDT trainer which didn't have it in the first place! |
keys_not_in_dict = [key for key in allowed_keys if key not in default_data.__dict__] | ||
if keys_not_in_dict: | ||
raise ValueError( | ||
f"Key(s) {keys_not_in_dict} are not present in " | ||
f"{dataclass.__class__.__name__}. " | ||
"Remove them from `allowed_keys`. " | ||
f"Valid keys: {list(default_data.__dict__.keys())}" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somehow this feels weird to me, as this is meant to be for the developer rather than the end user. I.e. if we check in a Trainer that fails here this isn't really actionable for the user.
Is this more suitable to be added in tests instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this will fail in the tests. There is no chance a user will see this error, unless they develop their own trainer (in which case that's actually good IMO. Makes it harder to shoot yourself in the foot).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah sounds reasonable. Can we update method name/docs?
Longer term maybe we should split this logic out so that we have one validation step for the Trainer definition (developer), and one for the instantiation (end user).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matthewdeng done, ptal
Follow another approach mentioned in #25350.
The scaling config is now converted to the dataclass letting us use a single function for validation of both user supplied dicts and dataclasses. This PR also fixes the fact the scaling config wasn't validated in the GBDT Trainer and validates that allowed keys set in Trainers are present in the dataclass.
Why are these changes needed?
Related issue number
Checks
scripts/format.sh
to lint the changes in this PR.