Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib] [WIP] [MultiAgentEnv Refactor #1] Add new methods to base env #21027

Merged
merged 1 commit into from
Dec 16, 2021

Conversation

avnishn
Copy link
Member

@avnishn avnishn commented Dec 11, 2021

RLlib Environments are difficult to test. Particularly Base Environments and MultiAgentEnvironments. This is because they are missing necessary required fields in order to test them.

The biggest issue that I face right now is that the MultiAgentEnv API is too loosely defined to write basic environment tests where I would follow this rough workflow:

reset_obs = env.try_reset()
assert env.obs_space_contains(reset_obs)
sampled_actions = env.action_space_sample()
env.send_actions(sampled_actions)
next_obs, dones, rewards, infos = env.poll()

assert env.obs_space_contains(next_obs)
assert isinstance(dones, dict)
assert isinstance(rewards, dict)
assert isinstance(infos, dict)

Because the MultiAgentEnv API is too loosely defined to be able to follow this workflow, I can’t write a BaseEnv checking module.

This PR Adds the necessary BaseEnv methods for making BaseEnvs unit-testable and checkable with an environment checking module.

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@avnishn avnishn force-pushed the add_new_methods_to_base_env_2 branch from 1c6d566 to 956759a Compare December 15, 2021 22:09
Copy link
Member

@gjoliver gjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 quick questions.

logger.warning("last has not been implemented for this environment.")
return {}, {}, {}, {}, {}

@PublicAPI
def observation_space_contains(self, x: MultiEnvDict) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, is the type still MultiEnvDict here and below??
I thought we are saying gym and multi-agent envs return different types now?

Copy link
Member Author

@avnishn avnishn Dec 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes -- the return type of poll() and try_reset() are MultiEnvDicts, and so I thought it would be appropriate if observations and actions produced by the environment/policy should be able to be easily passed to the environment for checking.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, it seems to me we are settling on using multi-agent api for single agent env as well, which is totally fine, and probably logical.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BaseEnv is never a single-agent env. If there is only one agent and we derive the BaseEnv from e.g. a gym.Env, we auto-create "DUMMY_AGENT_ID" in the env as the agent's ID.

self._space_contains(self.observation_space, x)

@PublicAPI
def action_space_contains(self, x: MultiEnvDict) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a very minor question, do you envision obs/action_space_contains() getting used outside of the environment checking module?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can imagine it having other uses for users who are trying to develop their own environments -- I have definitely used functions like this while developing my own environments

@gjoliver gjoliver self-requested a review December 16, 2021 04:56
Copy link
Member

@gjoliver gjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@sven1977 sven1977 merged commit 85a368c into ray-project:master Dec 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants