-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RLlib] Examples folder cleanup: ModelV2 -> RLModule wrapper for migrating to new API stack. #47425
[RLlib] Examples folder cleanup: ModelV2 -> RLModule wrapper for migrating to new API stack. #47425
Conversation
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
…api_stack_migration_modelv2
…api_stack_migration_modelv2
…api_stack_migration_modelv2
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Great PR. This enables hopefully more users to move from old to new stack!
from ray.rllib.core.rl_module.apis.value_function_api import ValueFunctionAPI | ||
|
||
|
||
__all__ = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!!
del policy | ||
|
||
def _forward_inference(self, batch: Dict[str, Any], **kwargs) -> Dict[str, Any]: | ||
nn_output, state_out = self._model_v2(batch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!!! So simple. I wonder if this works fine with any more complicated setup like LSTM or MARWIL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not :| but it's just a start ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a LSTM example script coming up in another PR (the one that adds the wrapper "from config").
@@ -31,11 +31,9 @@ | |||
} | |||
) | |||
.training( | |||
gamma=0.99, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks. Forgot these ...
@@ -37,10 +37,8 @@ | |||
) | |||
.training( | |||
lr=0.0003 * ((args.num_gpus or 1) ** 0.5), | |||
gamma=0.99, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing b/c they are default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct.
@@ -29,10 +29,8 @@ | |||
) | |||
.training( | |||
lr=0.0003 * ((args.num_gpus or 1) ** 0.5), | |||
gamma=0.99, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here ..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
…ating to new API stack. (ray-project#47425) Signed-off-by: ujjawal-khare <[email protected]>
…ating to new API stack. (ray-project#47425) Signed-off-by: ujjawal-khare <[email protected]>
…ating to new API stack. (ray-project#47425) Signed-off-by: ujjawal-khare <[email protected]>
…ating to new API stack. (ray-project#47425) Signed-off-by: ujjawal-khare <[email protected]>
…ating to new API stack. (ray-project#47425) Signed-off-by: ujjawal-khare <[email protected]>
Adds example script and RLModule example class for easy ModelV2 -> RLModule migration to new API stack using an old API stack Policy checkpoint.
Up next: Enhance the wrapper to work with a
from_config
option instead of the currently needed checkpoint.Why are these changes needed?
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.