[RLlib] [WIP] [MultiAgentEnv Refactor #1] Add new methods to base env #21027
Conversation
Force-pushed from 4755aea to 1c6d566.
Force-pushed from 1c6d566 to 956759a.
2 quick questions.
logger.warning("last has not been implemented for this environment.")
return {}, {}, {}, {}, {}

@PublicAPI
def observation_space_contains(self, x: MultiEnvDict) -> bool:
wait, is the type still MultiEnvDict here and below??
I thought we were saying gym and multi-agent envs return different types now?
Yes -- the return types of poll() and try_reset() are MultiEnvDicts, so I thought it would be appropriate for observations and actions produced by the environment/policy to be easily passed back to the environment for checking.
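For context, a minimal sketch of that MultiEnvDict shape (the values below are made up purely for illustration; the nesting is (sub-)env index first, then agent ID):

```python
# Illustrative only: poll()/try_reset() return MultiEnvDicts keyed by
# (sub-)env index and then by agent ID.
obs = {
    0: {"agent_0": [0.1, 0.2], "agent_1": [0.3, 0.4]},
    1: {"agent_0": [0.5, 0.6], "agent_1": [0.7, 0.8]},
}

# With the new methods, that dict can be handed straight back for checking:
# base_env.observation_space_contains(obs)
```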
ok, it seems to me we are settling on using the multi-agent API for single-agent envs as well, which is totally fine, and probably logical.
The BaseEnv is never a single-agent env. If there is only one agent and we derive the BaseEnv from e.g. a gym.Env, we auto-create "DUMMY_AGENT_ID" in the env as the agent's ID.
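A small sketch of what that wrapping looks like from the outside (the constant name follows _DUMMY_AGENT_ID in rllib/env/base_env.py; the observation values are invented for illustration):

```python
from ray.rllib.env.base_env import _DUMMY_AGENT_ID

# Hypothetical illustration: a single-agent gym.Env wrapped as a BaseEnv still
# reports MultiEnvDicts; the lone agent simply appears under the auto-created
# dummy agent ID, keyed by the (sub-)env index.
single_agent_obs = [0.03, -0.01, 0.02, 0.04]  # e.g. a CartPole observation
wrapped_obs = {0: {_DUMMY_AGENT_ID: single_agent_obs}}
```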
return self._space_contains(self.observation_space, x)

@PublicAPI
def action_space_contains(self, x: MultiEnvDict) -> bool:
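For what it's worth, here is a minimal sketch of how I read the shared _space_contains() helper working over a MultiEnvDict (my own approximation under the assumption that the space is a per-agent dict of gym spaces, not the exact code in this diff):

```python
from typing import Any, Dict

import gym


def _space_contains(space: Dict[str, gym.Space], x: Dict[Any, Dict[Any, Any]]) -> bool:
    # x is a MultiEnvDict: {env_id: {agent_id: value}}. Every value must fall
    # inside the corresponding per-agent space for the whole dict to count
    # as contained.
    return all(
        space[agent_id].contains(value)
        for multi_agent_dict in x.values()
        for agent_id, value in multi_agent_dict.items()
    )
```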
a very minor question, do you envision obs/action_space_contains() getting used outside of the environment checking module?
I can imagine it having other uses for users who are developing their own environments -- I have definitely used functions like this while building my own.
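For example (hypothetical snippet; env here stands in for a custom BaseEnv under development):

```python
# Quick ad-hoc sanity check while iterating on a custom environment:
obs = env.try_reset(0)
assert env.observation_space_contains(obs), "reset() returned out-of-space observations"
```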
Looks good to me!
RLlib environments are difficult to test, particularly BaseEnvs and MultiAgentEnvs, because they are missing fields required to test them.
The biggest issue I face right now is that the MultiAgentEnv API is too loosely defined to write basic environment tests following this rough workflow (a sketch of the kind of check I have in mind is shown below):
Because the MultiAgentEnv API is too loosely defined to follow this workflow, I can't write a BaseEnv checking module.
This PR adds the BaseEnv methods necessary to make BaseEnvs unit-testable and checkable with an environment checking module.
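A hedged sketch of that checking workflow, assuming helper names such as check_base_env() and action_space_sample() purely for illustration (they are assumptions, not necessarily part of this PR's diff):

```python
# Hypothetical checking pass over a BaseEnv using the new containment methods.
# check_base_env() and action_space_sample() are assumed names for illustration.
def check_base_env(env) -> None:
    obs = env.try_reset(0)
    assert env.observation_space_contains(obs), (
        "Reset observations are not contained in the declared observation space."
    )

    actions = env.action_space_sample()
    assert env.action_space_contains(actions), (
        "Sampled actions are not contained in the declared action space."
    )

    env.send_actions(actions)
    next_obs, rewards, dones, infos, _ = env.poll()
    assert env.observation_space_contains(next_obs), (
        "Stepped observations are not contained in the declared observation space."
    )
```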
Checks
I've run scripts/format.sh to lint the changes in this PR.