[RLlib]: Raise deprecation warning in MARWIL OPE methods #26893
Signed-off-by: Rohan138 <[email protected]>
love it
# TODO: Remove this when the off_policy_estimation_methods
# default config is removed from MARWIL
# No off-policy estimation.
self.off_policy_estimation_methods = {}
Nice :)
Any implications to our CI?
Nope, but there might be a few minor changes once we actually do the eventual deprecation.
…t#26893) Signed-off-by: Rohan138 <[email protected]>
…t#26893) Signed-off-by: Stefan van der Kleij <[email protected]>
MARWIL currently defaults to
off_policy_estimation_methods = {"is": {"type": ImportanceSampling}, "wis": {"type": WeightedImportanceSampling}}
instead of {}, which every other algorithm uses. This default should be deprecated and then removed in a future release. We can't simply remove it, because some users may be running MARWIL with the current default.
Closes #26667
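The deprecation path described above could look roughly like the sketch below. It is not the code from this PR: `SketchMARWILConfig`, `resolve_ope_methods`, the `_UNSET` sentinel, the warning text, and the two placeholder estimator classes are all hypothetical stand-ins, used only to show how a config can keep the legacy default for now while warning users who rely on it implicitly.

```python
import warnings


class ImportanceSampling:
    """Placeholder for the RLlib estimator named in the PR (assumption)."""


class WeightedImportanceSampling:
    """Placeholder for the RLlib estimator named in the PR (assumption)."""


# Sentinel that distinguishes "user passed nothing" from an explicit {}.
_UNSET = object()

LEGACY_DEFAULT = {
    "is": {"type": ImportanceSampling},
    "wis": {"type": WeightedImportanceSampling},
}


class SketchMARWILConfig:
    """Toy config illustrating the idea; not the real MARWILConfig."""

    def __init__(self):
        self.off_policy_estimation_methods = _UNSET

    def evaluation(self, *, off_policy_estimation_methods=_UNSET):
        # Only overwrite the setting when the caller passed a value.
        if off_policy_estimation_methods is not _UNSET:
            self.off_policy_estimation_methods = off_policy_estimation_methods
        return self

    def resolve_ope_methods(self):
        # Users relying on the implicit default still get it, plus a warning.
        if self.off_policy_estimation_methods is _UNSET:
            warnings.warn(
                "The default off_policy_estimation_methods for MARWIL is "
                "deprecated and will become {} in a future release; "
                "set it explicitly to keep the current behavior.",
                DeprecationWarning,
                stacklevel=2,
            )
            return dict(LEGACY_DEFAULT)
        return self.off_policy_estimation_methods
```

With this shape, a user who passes `off_policy_estimation_methods={}` or an explicit dict sees no warning, and only the implicit-default path is flagged, so the default can later be switched to `{}` without silently changing behavior for anyone.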
Checks
I've run scripts/format.sh to lint the changes in this PR.