-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RLlib] AlgorithmConfig cleanup 03: Cleaner names and structuring of API-stack config settings. #44920
[RLlib] AlgorithmConfig cleanup 03: Cleaner names and structuring of API-stack config settings. #44920
Conversation
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am just blindly trusting you Sven (as a doc reviewer)
…rithm_config_cleanup_03_new_api_stack_settings
Signed-off-by: sven1977 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Great change!
@@ -113,7 +113,7 @@ def __init__(self, *args, **kwargs): | |||
|
|||
config = ( | |||
PPOConfig() | |||
.experimental(_enable_new_api_stack=True) | |||
.api_stack(enable_rl_module_and_learner=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, I feel the new power rising! :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, me, too. EnvRunners
growing to become adults :D
# This enables the use of | ||
# a) RLModule (replaces ModelV2) and Learner (replaces Policy) | ||
# b) and automatically picks the correct EnvRunner (single-agent vs multi-agent) | ||
# and enables ConnectorV2 support. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, if a user overrides the SingleAgentEnvRunner
, she still can pass it into Algorithm.env_runners
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you can bring your own EnvRunner sub-class as you like.
See e.g. rllib.algorithms.tests.test_worker_failures
, which uses sub-classes of SingleAgentEnvRunner
and MultiAgentEnvRunner
.
self.experimental( | ||
_enable_new_api_stack=config_dict["_enable_new_api_stack"] | ||
) | ||
enable_rl_module_and_learner = config_dict.get( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this? Imo users using still this "old" form must switch anyways in near future - so they could do it now. It's a small change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mostly for ourselves :|
Some scripts of ours work for both API stacks and we don't want to make the exploration_config
dependent on this. I'm fully with you: This is currently very hackish and should be cleaned up. I'll create another PR to see, what the impact would be for our code base (and tests), if we get rid of this hack.
Signed-off-by: sven1977 <[email protected]>
…rithm_config_cleanup_03_new_api_stack_settings
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
AlgorithmConfig cleanup 03:
AlgorithmConfig.experimental(_enable_new_api_stack=...)
renamed toAlgorithmConfig.api_stack(enable_rl_module_and_learner)
.env_runner_cls=SingleAgentEnvRunner|MultiAgentEnvRunner
automated toAlgorithmConfig.api_stack(enable_env_runner_and_connector_v2=True)
num_rollout_workers
and replace with the newnum_env_runners
setting.Why are these changes needed?
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.