-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[rllib] Env to base env refactor #20785
[rllib] Env to base env refactor #20785
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be my favorite PR of the week :)
There are also a few test failures. Let me know if you need help debugging those. Something may be off with MultiAgentEnv.
@Deprecated( | ||
old="ray.rllib.env.base_env.BaseEnv.to_base_env", | ||
new="ray.rllib.env.base_env.convert_to_base_env", | ||
error=False) | ||
def to_base_env( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sven1977 @avnishn do you think we can just delete these internal functions classes?
these are neither @publicapi nor @DeveloperAPI, so not part of the APIs we publish.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, let's leave those in for a while. I feel like users should always have n releases to catch the new deprecation warnings and change their code accordingly before we completely remove the old code. I know we have lots of deprecation_warnings
in the code right now, but feel free to - here and there - flip some of these to error=True
to start the next stage in their deprecation (the last stage being to remove the code entirely).
rllib/env/base_env.py
Outdated
def _with_dummy_agent_id(env_id_to_values: Dict[EnvID, Any], | ||
dummy_id: "AgentID" = _DUMMY_AGENT_ID | ||
) -> MultiEnvDict: | ||
return {k: {dummy_id: v} for (k, v) in env_id_to_values.items()} | ||
|
||
|
||
@DeveloperAPI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why make this @DeveloperAPI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, probably not necessary here. Let's not make this DevAPI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kk
# `env` is not a BaseEnv yet -> Need to convert/vectorize. | ||
|
||
# MultiAgentEnv (which is a gym.Env). | ||
if isinstance(env, MultiAgentEnv): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think things will be even cleaner if we can create some kind of common module level apis for these Envs.
Something like the following:
- external_env.py
- def is_env_type(env) -> bool: return isinstance(env, ExternalEnv)
- def to_base_env(env, make_env, num_envs, ...) -> BaseEnv:
... logics for converting ExternalEnv to BaseEnv.
- multi_agent_env.py
- def is_env_type(env) -> bool: return isinstance(env, MultiAgentEnv)
- def to_base_env(env, make_env, num_envs, ...) -> BaseEnv:
... logics for converting MultiAgentEnv to BaseEnv.
- vector_env.py
- def is_env_type(env) -> bool: return isinstance(env, VectorEnv)
- def to_base_env(env, make_env, num_envs, ...) -> BaseEnv:
... logics for converting VectorEnv to BaseEnv.
- generic_env.py
- def is_env_type(env) -> bool: return isinstance(env, EnvType)
- def to_base_env(env, make_env, num_envs, ...) -> BaseEnv:
... logics for converting basically gym.Env to BaseEnv.
... basically the logics in the last big else clause.
Then with this structure, we can clean up this function to be as simple as:
def convert_to_base_env(env, ...) -> BaseEnv:
if isinstance(env, BaseEnv):
return env
from ray.rllib.env import external_env, multi_agent_env, vector_env, generic_env
for env_type in [external_env, multi_agent_env, vector_env, generic_env]:
if env_type.is_env_type(env):
return env_type.to_base_env(env, make_env, ...)
raise ValueError("Unknown Env type: ", type(env))
Let me know if there is anything that's not clear here.
I can also help send a patch if it makes things faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about the is_env_type()
methods. Wouldn't these just add extra weight to the code?
One could also just do (w/o needing these methods):
def convert_to_base_env(env, ...) -> BaseEnv:
if isinstance(env, BaseEnv):
return env
from ray.rllib.env import external_env, multi_agent_env, vector_env, generic_env
for env_type in [external_env, multi_agent_env, vector_env, generic_env]:
if isinstance(env, env_type):
return env_type.to_base_env(env, make_env, ...)
raise ValueError("Unknown Env type: ", type(env))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to move the to_base_env
logics into the individual classes. That makes a lot of sense.
Every env-class that has such a method is then directly visible as RLlib-supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@avnishn , we can do this in a follow-up PR, though. This one already has lots of changes in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason why we can't instead keep make this a static but standalone function, like we do in this PR anyways? The reason that we have this to_base_env function right now is because RLlib requires a BaseEnv for most of its operations. Is that reason enough that we should have this to_base_env function?
Perhaps it is cleaner, but it does definitely add weight to the class overall.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the above solution though, because it removes the need for circular imports in the BaseEnv class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could probably call it something else? to_base_env implies that the function doesn't return a new class. Maybe we could call it get_base_env
or get_new_base_env
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point actually. I think if we can't delete the static BaseEnv.to_base_env() anyways, we might as well just keep using it. No need to add the top-level convert_to_base_env().
the part I care about is to keep BaseEnv.to_base_env() as minimal as possible, and make per-Env conversion logic live with those specific modules.
regarding is_env_type(), that's just so we don't have to import 4 Env classes using 4 import statements. and it provides a little bit of flexibility when you need more logics to decide whether a specific Env should handle the conversion. I won't insist on this either. the code will look like:
def convert_to_base_env(env, ...) -> BaseEnv:
if isinstance(env, BaseEnv): return env
from ray.rllib.env.external_env import ExternalEnv
from ray.rllib.env.multi_agent_env import MultiAgentEnv
... import VectorEnv
... import generic_env_to_base_env
for env_type in [ExternalEnv, MultiAgentEnv, VectorEnv]:
if isinstance(env, env_type):
return env_type.to_base_env(env, make_env, ...)
return generic_env_to_base_env(env, make_env, ...)
Just a little less concise.
Either way is fine. Let's worry about it with the followup PR.
Rename _XXXEnvToBaseEnv classes into XXXBaseEnvWrapper(BaseEnv)
e9a8da4
to
0bbe71b
Compare
@gjoliver I removed the developer API tag, and got the tests passing I think. Could you please re-review? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool man. looking forward to the followup PR.
# `env` is not a BaseEnv yet -> Need to convert/vectorize. | ||
|
||
# MultiAgentEnv (which is a gym.Env). | ||
if isinstance(env, MultiAgentEnv): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point actually. I think if we can't delete the static BaseEnv.to_base_env() anyways, we might as well just keep using it. No need to add the top-level convert_to_base_env().
the part I care about is to keep BaseEnv.to_base_env() as minimal as possible, and make per-Env conversion logic live with those specific modules.
regarding is_env_type(), that's just so we don't have to import 4 Env classes using 4 import statements. and it provides a little bit of flexibility when you need more logics to decide whether a specific Env should handle the conversion. I won't insist on this either. the code will look like:
def convert_to_base_env(env, ...) -> BaseEnv:
if isinstance(env, BaseEnv): return env
from ray.rllib.env.external_env import ExternalEnv
from ray.rllib.env.multi_agent_env import MultiAgentEnv
... import VectorEnv
... import generic_env_to_base_env
for env_type in [ExternalEnv, MultiAgentEnv, VectorEnv]:
if isinstance(env, env_type):
return env_type.to_base_env(env, make_env, ...)
return generic_env_to_base_env(env, make_env, ...)
Just a little less concise.
Either way is fine. Let's worry about it with the followup PR.
Why are these changes needed?
BaseEnv
is awkwardly the owner of a significant amount of code that does not belong to BaseEnv. This PR addresses that by movingBaseEnv
wrappers to their corresponding class files, and (in progress) removing theto_base_env
method from theBaseEnv
class, as this class awkwardly returns an instance of itself doesn't rely on having access to itself, is a static function, and should probably be a standalone function.Checks
scripts/format.sh
to lint the changes in this PR.Should we add tests for these wrappers? There weren't tests for them already, so we probably should go back and add them eventually?