[RLlib] AlgorithmConfig cleanup 03: Cleaner names and structuring of API-stack config settings. #44920
@@ -1,22 +1,19 @@
 # __enabling-new-api-stack-sa-ppo-begin__

 from ray.rllib.algorithms.ppo import PPOConfig
-from ray.rllib.env.single_agent_env_runner import SingleAgentEnvRunner


 config = (
     PPOConfig().environment("CartPole-v1")
-    # Switch the new API stack flag to True (False by default).
-    # This enables the use of the RLModule (replaces ModelV2) AND Learner (replaces
-    # Policy) classes.
-    .experimental(_enable_new_api_stack=True)
-    # However, the above flag only activates the RLModule and Learner APIs. In order
-    # to utilize all of the new API stack's classes, you also have to specify the
-    # EnvRunner (replaces RolloutWorker) to use.
-    # Note that this step will be fully automated in the next release.
-    # Set the `env_runner_cls` to `SingleAgentEnvRunner` for single-agent setups and
-    # `MultiAgentEnvRunner` for multi-agent cases.
-    .env_runners(env_runner_cls=SingleAgentEnvRunner)
+    # Switch both the new API stack flags to True (both False by default).
+    # This enables the use of
+    # a) RLModule (replaces ModelV2) and Learner (replaces Policy)
+    # b) and automatically picks the correct EnvRunner (single-agent vs multi-agent)
+    #    and enables ConnectorV2 support.
Inline review comment: So, if a user overrides the EnvRunner class, does that still work?

Reply: Yes, you can bring your own EnvRunner sub-class as you like. (A custom-EnvRunner sketch appears at the end of this page.)
+    .api_stack(
+        enable_rl_module_and_learner=True,
+        enable_env_runner_and_connector_v2=True,
+    )
     # We are using a simple 1-CPU setup here for learning. However, as the new stack
     # supports arbitrary scaling on the learner axis, feel free to set
     # `num_learner_workers` to the number of available GPUs for multi-GPU training (and
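For reference, here is a minimal, self-contained sketch of the single-agent example as it reads after this PR. The `config.build()` and `algo.train()` calls at the end are an illustrative addition and not part of the diff:

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")
    # Both new API stack flags on: RLModule + Learner, EnvRunner + ConnectorV2.
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
)
# RLlib now picks SingleAgentEnvRunner automatically for this env.
algo = config.build()
print(algo.train())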
@@ -43,25 +40,22 @@
 # __enabling-new-api-stack-ma-ppo-begin__

 from ray.rllib.algorithms.ppo import PPOConfig  # noqa
-from ray.rllib.env.multi_agent_env_runner import MultiAgentEnvRunner  # noqa
 from ray.rllib.examples.envs.classes.multi_agent import MultiAgentCartPole  # noqa


 # A typical multi-agent setup (otherwise using the exact same parameters as before)
 # looks like this.
 config = (
     PPOConfig().environment(MultiAgentCartPole, env_config={"num_agents": 2})
-    # Switch the new API stack flag to True (False by default).
-    # This enables the use of the RLModule (replaces ModelV2) AND Learner (replaces
-    # Policy) classes.
-    .experimental(_enable_new_api_stack=True)
-    # However, the above flag only activates the RLModule and Learner APIs. In order
-    # to utilize all of the new API stack's classes, you also have to specify the
-    # EnvRunner (replaces RolloutWorker) to use.
-    # Note that this step will be fully automated in the next release.
-    # Set the `env_runner_cls` to `SingleAgentEnvRunner` for single-agent setups and
-    # `MultiAgentEnvRunner` for multi-agent cases.
-    .env_runners(env_runner_cls=MultiAgentEnvRunner)
+    # Switch both the new API stack flags to True (both False by default).
+    # This enables the use of
+    # a) RLModule (replaces ModelV2) and Learner (replaces Policy)
+    # b) and automatically picks the correct EnvRunner (single-agent vs multi-agent)
+    #    and enables ConnectorV2 support.
+    .api_stack(
+        enable_rl_module_and_learner=True,
+        enable_env_runner_and_connector_v2=True,
+    )
     # We are using a simple 1-CPU setup here for learning. However, as the new stack
     # supports arbitrary scaling on the learner axis, feel free to set
     # `num_learner_workers` to the number of available GPUs for multi-GPU training (and
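A minimal sketch of the multi-agent variant after this PR. The two-policy `multi_agent()` setup and the mapping function below are assumptions added for illustration (MultiAgentCartPole uses integer agent IDs); they are not part of the diff:

from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.examples.envs.classes.multi_agent import MultiAgentCartPole

config = (
    PPOConfig()
    .environment(MultiAgentCartPole, env_config={"num_agents": 2})
    # Same two flags; RLlib picks MultiAgentEnvRunner for this env automatically.
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
    # Illustrative: one policy per agent (agent IDs are 0 and 1).
    .multi_agent(
        policies={"p0", "p1"},
        policy_mapping_fn=lambda agent_id, episode, **kwargs: f"p{agent_id}",
    )
)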
@@ -95,20 +89,19 @@
 # __enabling-new-api-stack-sa-sac-begin__

 from ray.rllib.algorithms.sac import SACConfig  # noqa
-from ray.rllib.env.single_agent_env_runner import SingleAgentEnvRunner  # noqa


 config = (
     SACConfig().environment("Pendulum-v1")
-    # Switch the new API stack flag to True (False by default).
-    # This enables the use of the RLModule (replaces ModelV2) AND Learner (replaces
-    # Policy) classes.
-    .experimental(_enable_new_api_stack=True)
-    # However, the above flag only activates the RLModule and Learner APIs. In order
-    # to utilize all of the new API stack's classes, you also have to specify the
-    # EnvRunner (replaces RolloutWorker) to use.
-    # Note that this step will be fully automated in the next release.
-    .env_runners(env_runner_cls=SingleAgentEnvRunner)
+    # Switch both the new API stack flags to True (both False by default).
+    # This enables the use of
+    # a) RLModule (replaces ModelV2) and Learner (replaces Policy)
+    # b) and automatically picks the correct EnvRunner (single-agent vs multi-agent)
+    #    and enables ConnectorV2 support.
+    .api_stack(
+        enable_rl_module_and_learner=True,
+        enable_env_runner_and_connector_v2=True,
+    )
     # We are using a simple 1-CPU setup here for learning. However, as the new stack
     # supports arbitrary scaling on the learner axis, feel free to set
     # `num_learner_workers` to the number of available GPUs for multi-GPU training (and
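The SAC example follows the exact same pattern. As a usage sketch, a short training loop (an illustrative addition, not part of the diff):

from ray.rllib.algorithms.sac import SACConfig

config = (
    SACConfig()
    .environment("Pendulum-v1")
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
)
algo = config.build()
for _ in range(3):
    results = algo.train()  # one training iteration per call
algo.stop()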
Review comment: Oh yes, I feel the new power rising! :D

Reply: Yeah, me, too. EnvRunners growing to become adults :D
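Following up on the inline question above about overriding the EnvRunner: a minimal sketch, assuming `env_runners(env_runner_cls=...)` keeps accepting a custom sub-class alongside the new `api_stack()` flags, as the reply confirms. The `LoggingEnvRunner` class is hypothetical:

from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.env.single_agent_env_runner import SingleAgentEnvRunner


class LoggingEnvRunner(SingleAgentEnvRunner):
    # Hypothetical sub-class: log each sampling round, then defer to the default.
    def sample(self, *args, **kwargs):
        print("Collecting a new batch of experiences ...")
        return super().sample(*args, **kwargs)


config = (
    PPOConfig()
    .environment("CartPole-v1")
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
    # Replace the auto-picked EnvRunner with the custom sub-class.
    .env_runners(env_runner_cls=LoggingEnvRunner)
)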