Removing return_info argument to env.reset() and deprecated env.seed() function (reset now always returns info) #2962

balisujohn · 2022-07-13T17:43:50Z

Description

This PR removes the return_info argument so env.reset always returns a tuple of the form
(obs, info), where info is a dict which contains observation metadata.

Additionally, this PR removes the env.seed() function, for both the core env and vector envs.
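As an illustration of the API shape described above, here is a minimal sketch. `CounterEnv` is a hypothetical toy class that does not import gym; it only mimics the convention that `reset` accepts `seed` and `options` and always returns `(obs, info)`, with seeding done through `reset` rather than a separate `seed()` method.

```python
import random
from typing import Any, Dict, Optional, Tuple


class CounterEnv:
    """Hypothetical toy environment (not gym code) following the new reset convention."""

    def __init__(self) -> None:
        self._rng = random.Random()
        self._state = 0

    def reset(
        self, *, seed: Optional[int] = None, options: Optional[dict] = None
    ) -> Tuple[int, Dict[str, Any]]:
        # Seeding goes through reset(seed=...) instead of a separate env.seed() call.
        if seed is not None:
            self._rng.seed(seed)
        self._state = self._rng.randint(0, 9)
        # info is always returned, even when there is little to report.
        return self._state, {"seeded": seed is not None}


env = CounterEnv()
obs, info = env.reset(seed=123)      # callers always unpack two values
obs_again, _ = env.reset(seed=123)   # same seed, same first observation
```

Code written against the old default (`obs = env.reset()`) would need to unpack the tuple instead.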

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Checklist:

I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

pseudo-rnd-thoughts

It mostly looks good. I just have a couple of questions, mainly on the environment checker and passive env checker.

pseudo-rnd-thoughts · 2022-08-03T09:38:23Z

gym/core.py

@@ -482,11 +446,8 @@ def observation(self, obs):

    def reset(self, **kwargs):


could we change this to the actual reset parameters?

I'm not necessarily against this, but some of our wrappers use this strategy to allow the wrapper to be agnostic to changes in the function definitions, such as https://github.com/openai/gym/blob/master/gym/wrappers/order_enforcing.py. Is this suggestion part of a broader goal of moving towards explicit argument type hinting for wrappers?

That is a good point. Let's not do it, because it allows the wrappers to be partially backward compatible.
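The trade-off discussed here can be sketched as follows. Both classes are hypothetical stand-ins rather than gym code; the point is only that a wrapper whose `reset` takes `**kwargs` keeps working when the wrapped signature changes.

```python
from typing import Any, Dict, Tuple


class InnerEnv:
    """Hypothetical stand-in for a wrapped environment (not gym code)."""

    def reset(self, *, seed=None, options=None) -> Tuple[int, Dict[str, Any]]:
        return 0, {"seed": seed, "options": options}


class PassThroughWrapper:
    """Hypothetical wrapper whose reset forwards whatever keyword arguments it receives."""

    def __init__(self, env) -> None:
        self.env = env
        self.has_reset = False

    def reset(self, **kwargs):
        # The wrapper never names seed/options itself, so a change to the
        # inner signature does not require editing the wrapper.
        self.has_reset = True
        return self.env.reset(**kwargs)


wrapped = PassThroughWrapper(InnerEnv())
obs, info = wrapped.reset(seed=7, options={"difficulty": "easy"})
```

The cost is that the wrapper's own signature no longer documents which arguments are valid, which is what the original suggestion was trying to address.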

pseudo-rnd-thoughts · 2022-08-03T09:41:21Z

gym/utils/env_checker.py

@@ -281,7 +234,6 @@ def check_env(env: gym.Env, warn: bool = None, skip_render_check: bool = False):
    # ==== Check the reset method ====
    check_reset_seed(env)
    check_reset_options(env)
-    check_reset_info(env)


Should we replace this with a signature check?

Added, but I also brought back check_reset_info. Open to feedback if the current tests seem redundant, since test_reset_info and test_passive_env_reset_checker overlap somewhat.

pseudo-rnd-thoughts · 2022-08-03T09:42:22Z

gym/utils/env_checker.py

@@ -126,53 +126,6 @@ def check_reset_seed(env: gym.Env):
        )


-def check_reset_info(env: gym.Env):


Would we want to keep these with a signature check?

Do you mean if "return_info" is in the signature, check to see if the return_info argument is handled correctly? So it would be for legacy third-party envs implementing return_info in their signature?

More like, if return_info is in the signature, then say that this feature is deprecated and the environment should return the obs and info

done, I added a test for the signature check as well

pseudo-rnd-thoughts · 2022-08-03T09:44:14Z

gym/vector/async_vector_env.py

@@ -250,15 +218,13 @@ def reset_wait(
        self,
        timeout: Optional[Union[int, float]] = None,
        seed: Optional[int] = None,
-        return_info: bool = False,
        options: Optional[dict] = None,
    ) -> Union[ObsType, Tuple[ObsType, List[dict]]]:
        """Waits for the calls triggered by :meth:`reset_async` to finish and returns the results.

        Args:
            timeout: Number of seconds before the call to `reset_wait` times out. If `None`, the call to `reset_wait` never times out.
            seed: ignored


The seed is ignored; how do vector envs seed their environments without it?

So I looked into this, basically reset_wait doesn't use its arguments, since seeding is performed in reset_async, so I think we can safely remove the arguments in reset_wait for vector envs. I think this should be fixed in a separate PR though.

pseudo-rnd-thoughts · 2022-08-03T09:46:50Z

gym/vector/sync_vector_env.py

-                observation, info = env.reset(**kwargs)
-                observations.append(observation)
-                infos = self._add_info(infos, info, i)
+            observation, info = env.reset(**kwargs)


Do we need to use kwargs?

going to hold off on this unless specifically requested, in accordance with our discussion about kwargs on the other comment.

pseudo-rnd-thoughts · 2022-08-03T09:55:18Z

gym/utils/passive_env_checker.py

        logger.warn(
            "Future gym versions will require that `Env.reset` can be passed `options` to allow the environment initialisation to be passed additional information."
        )

    # Checks the result of env.reset with kwargs
    result = env.reset(**kwargs)
-    if kwargs.get("return_info", False) is True:


Personally, I would like the passive environment checker to raise as few errors as possible. Therefore, keep some of these checks as warnings, e.g. that the result is a tuple and its length == 2

I made the tuple check into a warning.

balisujohn · 2022-08-04T04:26:51Z

@pseudo-rnd-thoughts Alright so I had a few questions about some of the comments so I left my own comments and I made changes to address the other ones and resolved them. Tests are passing locally so I don't know what the deal with CI/CD is. I think we should remove RandomNumberGenerator in a different PR, since it breaks space seeding tests when done naively.

balisujohn · 2022-08-09T03:21:57Z

Alright passing tests and ready for final review. Definitely we need to be as careful as possible about this PR. It's going to lead to a lot of issues being created since it's a hard deprecation of a core feature. If anyone has ideas for where to add more deprecation warnings, I think that could make this easier on users.

balisujohn · 2022-08-09T05:41:21Z

It occurs to me this will need a counterpart PR in the docs as well

pseudo-rnd-thoughts

A couple more comments for the changes to the environment checker

pseudo-rnd-thoughts · 2022-08-10T12:48:35Z

gym/utils/env_checker.py

@@ -73,7 +73,7 @@ def check_reset_seed(env: gym.Env):
        and signature.parameters["kwargs"].kind is inspect.Parameter.VAR_KEYWORD
    ):
        try:
-            obs_1 = env.reset(seed=123)


If this is the first check in the environment checker, we should probably check that the output is a tuple with length 2 so users don't get a weird, unexplained error

So now I call check_reset_return_type before calling check_reset_seed, in check_env, so this should address that possibility.
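A rough sketch of why the ordering matters, using hypothetical helper names (`check_return_type`, `check_seed`, `check_env_sketch`) and a deliberately broken toy env rather than gym's actual checker: if the seed check ran first, unpacking a non-tuple result would fail with a generic `TypeError`; running the return-type check first gives the user a readable message.

```python
class BrokenEnv:
    """Hypothetical env whose reset still returns only an observation."""

    def reset(self, seed=None, options=None):
        return 5


def check_return_type(env) -> None:
    # Validate the shape of the reset result before anything tries to unpack it.
    result = env.reset()
    assert isinstance(result, tuple), (
        f"reset should return a tuple (obs, info), got {type(result).__name__}"
    )
    assert len(result) == 2, f"reset should return 2 values, got {len(result)}"
    assert isinstance(result[1], dict), "the second value returned by reset should be a dict"


def check_seed(env) -> None:
    # This unpacking would raise an unhelpful TypeError for BrokenEnv.
    obs_1, _ = env.reset(seed=123)
    obs_2, _ = env.reset(seed=123)
    assert obs_1 == obs_2, "reset(seed=...) should be deterministic"


def check_env_sketch(env) -> None:
    check_return_type(env)  # runs first so later checks can assume (obs, info)
    check_seed(env)


try:
    check_env_sketch(BrokenEnv())
except AssertionError as err:
    message = str(err)
```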

pseudo-rnd-thoughts · 2022-08-10T12:49:53Z

gym/utils/passive_env_checker.py

-        )
-
-    if "options" not in signature.parameters and "kwargs" not in signature.parameters:
+    if "options" not in signature.parameters or "kwargs" in signature.parameters:


Why have we lost the not for the kwargs check?

I think this is actually an automatic merge error since my original fork for this commit is sort of old. Good catch. I'm reverting it.
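To make the difference between the two conditions concrete, the sketch below evaluates both against three hypothetical reset signatures. The helper names are illustrative only, and the sketch checks parameter names the same way the diff line does.

```python
import inspect


def reset_with_options(self, seed=None, options=None):
    return 0, {}


def reset_with_kwargs(self, **kwargs):
    return 0, {}


def reset_bare(self):
    return 0, {}


def intended_check(fn) -> bool:
    """True when reset accepts neither `options` nor `kwargs`, i.e. a warning is due."""
    params = inspect.signature(fn).parameters
    return "options" not in params and "kwargs" not in params


def merged_in_check(fn) -> bool:
    """The condition from the diff above, with `or` and without the second `not`."""
    params = inspect.signature(fn).parameters
    return "options" not in params or "kwargs" in params


results = {
    fn.__name__: (intended_check(fn), merged_in_check(fn))
    for fn in (reset_with_options, reset_with_kwargs, reset_bare)
}
```

The two conditions disagree only for `reset_with_kwargs`: the intended check accepts it, while the merged-in version would flag a reset that can in fact receive `options` through `**kwargs`.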

pseudo-rnd-thoughts · 2022-08-10T12:55:43Z

gym/utils/env_checker.py

        )


-def check_reset_options(env: gym.Env):
-    """Check that the environment can be reset with options.
+def check_reset_info(env: gym.Env):


I'm not sure this is needed; it could probably be easily covered by the other functions

So I split check_reset_info out into check_reset_return_type and check_reset_return_info_deprecation. Some behavior is redundant with the call to env_reset_passive_checker, so we may want to stop calling the env_reset_passive_checker in the standard env_checker in a later PR. I think the other option would be to instrument env_reset_passive_checker with an argument to put it into "assert mode," but I feel like that's out of scope for this PR.

Good idea for the changes, though I'm not a fan of the assert mode, as it feels like it could add a lot more if statements

pseudo-rnd-thoughts

Looks good to me. Do we want to add a check in the environment checker for seed? I don't think it should raise an error, just raise a warning

pseudo-rnd-thoughts · 2022-08-21T08:32:53Z

gym/utils/env_checker.py

+        UserWarning
+    """
+    seed_fn = getattr(env, "seed", None)
+    print(seed_fn)


Remove prints

pseudo-rnd-thoughts · 2022-08-21T08:34:37Z

tests/utils/test_env_checker.py

+        assert callable(env.seed)
+        check_seed_deprecation(env)
+
+    with warnings.catch_warnings():


To check this works as intended we might want to add

with warnings.catch_warnings(record=True) as caught_warnings:
    <test code>
assert len(caught_warnings) == 0

sounds good, I changed to this pattern
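A self-contained sketch of that pattern, assuming a hypothetical `warn_if_seed_method` helper and two toy classes (none of these are gym's actual code). Recording the warnings lets a test assert both that a legacy env triggers one and that a compliant env triggers none.

```python
import warnings


def warn_if_seed_method(env) -> None:
    """Hypothetical check: warn when an env still exposes a callable `seed` attribute."""
    if callable(getattr(env, "seed", None)):
        warnings.warn(
            "`env.seed()` is deprecated; use `env.reset(seed=...)` instead.",
            UserWarning,
        )


class LegacyEnv:
    def seed(self, seed=None):
        return [seed]


class ModernEnv:
    pass


# An env with a seed method should produce exactly one recorded warning.
with warnings.catch_warnings(record=True) as caught_legacy:
    warnings.simplefilter("always")  # avoid suppression of repeated warnings
    warn_if_seed_method(LegacyEnv())

# An env without one should leave the recorded list empty.
with warnings.catch_warnings(record=True) as caught_modern:
    warnings.simplefilter("always")
    warn_if_seed_method(ModernEnv())
```

Without `record=True`, a bare `catch_warnings()` block only restores the filter state afterwards, so it cannot show that no warning was raised.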

pseudo-rnd-thoughts

Other than the print statement, LGTM

RedTachyon · 2022-08-21T20:21:29Z

gym/utils/env_checker.py

-        raise gym.error.Error(
-            "The `reset` method does not provide an `options` or `**kwargs` keyword argument."
+    if "return_info" in signature.parameters:
+        raise AssertionError(


I think it's better to make this a warning - it is important information, but at the same time, in principle, it's not incorrect to implement a custom environment which takes extra optional arguments.

Sounds good, I'm changing this to a UserWarning.
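A minimal sketch of a signature-based deprecation warning along these lines. `warn_on_return_info` and the two reset functions are hypothetical names for illustration; the actual checker in the PR may be structured differently.

```python
import inspect
import warnings


def warn_on_return_info(reset_fn) -> bool:
    """Hypothetical helper: warn (rather than raise) if `return_info` is a reset parameter."""
    flagged = "return_info" in inspect.signature(reset_fn).parameters
    if flagged:
        warnings.warn(
            "`return_info` is deprecated as a reset argument; "
            "reset should always return (obs, info).",
            UserWarning,
        )
    return flagged


def legacy_reset(self, seed=None, return_info=False, options=None):
    return 0


def current_reset(self, seed=None, options=None):
    return 0, {}


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_flagged = warn_on_return_info(legacy_reset)
    current_flagged = warn_on_return_info(current_reset)
```

Because this is a warning, an environment that deliberately accepts extra optional arguments can still pass the checker.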

RedTachyon · 2022-08-21T20:26:37Z

gym/vector/sync_vector_env.py

@@ -164,8 +136,8 @@ def step_wait(self):
                info,
            ) = step_api_compatibility(env.step(action), True)
            if self._terminateds[i] or self._truncateds[i]:
+                observation, info = env.reset()


Is this a mistake? In the old version, reset is after updating the info dict, now it's before, so I think it will record an incorrect final_observation

Yeah this seems to be either a mistake or a merge error, so I'm reverting it.

This seems like a mistake rather than a merge error, and I also found the mistake in the async vector env. Tests break when I switch the order of these statements to the correct order, so I need to examine why the tests are wrong.

I addressed this and tests are now passing. One issue with a naive swap of the order is that calling obs, info = env.reset() overwrites the info object where "final_observation" was saved. So I used a temp variable to write the observation from the last call to .step() into the info returned by the call to .reset()
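The temp-variable approach described above can be sketched for a single environment as follows. `ShortEpisodeEnv` and `step_with_autoreset` are hypothetical; the real vector envs handle one info dict per sub-environment, so this only shows the ordering.

```python
from typing import Any, Dict, Tuple


class ShortEpisodeEnv:
    """Hypothetical env (not gym code) whose episodes terminate after two steps."""

    def __init__(self) -> None:
        self.t = 0

    def reset(self) -> Tuple[int, Dict[str, Any]]:
        self.t = 0
        return 0, {"from_reset": True}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        self.t += 1
        return self.t, 1.0, self.t >= 2, False, {}


def step_with_autoreset(env, action: int):
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        # Hold on to the last observation before reset overwrites both variables.
        final_observation = observation
        observation, info = env.reset()
        # Store it in the info returned by reset, not in the discarded step info.
        info["final_observation"] = final_observation
    return observation, reward, terminated, truncated, info


env = ShortEpisodeEnv()
env.reset()
step_with_autoreset(env, 0)
obs, reward, terminated, truncated, info = step_with_autoreset(env, 0)
```

After the second step the returned observation comes from `reset`, while `info["final_observation"]` still holds the observation that ended the episode.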

RedTachyon · 2022-08-21T20:29:42Z

tests/utils/test_env_checker.py

-
-
-def _reset_return_info_type(self, seed=None, return_info=False, options=None):
+def _deprecated_return_info(self, return_info=False):


What exactly is the purpose of this?

(some comments/description, and type hints)

The idea is we want to throw a warning if return_info is in the signature of an environment's reset function

Yea, but these specific functions are extremely odd. Please at least annotate these tests so that it's clear to us 6 months from now what these functions are meant to do exactly, or perhaps restructure the tests so that it's more obvious.

ah, I missed your second comment, yeah I can add some more explanation of the specific reason for the deprecation test, as well as type hints.

RedTachyon · 2022-08-22T15:42:33Z

tests/utils/test_env_checker.py

-    else:
-        return self.observation_space.sample()
+def _reset_return_info_type(self, seed=None, options=None):
+    return [1, 2]


I still don't like that this is a random list of integers. Is this meant to emulate a reset output of an incorrectly implemented env? We need some commentary about this specific sequence of functions.

The test can be updated to
return [self.observation_space.sample(), {}]
I think this makes more sense

The point of the function is as a reset function where the type of the return is wrong.

pseudo-rnd-thoughts

LGTM

wookayin · 2022-09-04T17:14:33Z

Are we sure about this breaking change? Is it a part of 0.26.0 or v1.0? If it is 0.26.0, I don't understand why we have merged this so early; in my opinion this should not be the new default yet because it will break most of the existing code.

It'd be nice if such a breaking change can be introduced ONLY with MAJOR versions. My understanding is that the next release we're expecting is v0.26: #3056

wookayin · 2022-09-04T22:17:55Z

Through gym v0.25, users were NEVER warned that env.reset() would start returning a Tuple instead of a single observation. There were some warnings about env.step() API changes (see new_step_api), but we had no comparable compatibility flag for env.reset().

It would be REALLY surprising if v0.26 suddenly changes the behavior and there is no way back. This doesn't make much sense, and I strongly suggest the new defaults should be a part of v1.0, and hopefully after some deprecation period.

wookayin · 2022-09-07T19:57:03Z

This has been modified to a full compatibility wrapper in #3066 (EnvCompatibility)

balisujohn added 5 commits July 10, 2022 15:11

removed return_info, made info dict mandatory in reset

c004fe2

tenatively removed deprecated seed api for environments

f4541c3

added more info type checks to wrapper tests

312bbce

Merge branch 'master' of github.com:openai/gym into dev-return-info-s…

0ae71ce

…eeding

formatting/style compliance

affa349

balisujohn changed the title ~~Removing deprecated return_info argument to env.reset and deprecated env.seed() function~~ Removing return_info argument to env.reset and deprecated env.seed() function Jul 13, 2022

balisujohn changed the title ~~Removing return_info argument to env.reset and deprecated env.seed() function~~ Removing return_info argument to env.reset() and deprecated env.seed() function Jul 13, 2022

pseudo-rnd-thoughts suggested changes Aug 3, 2022

addressed some comments

beddc39

balisujohn added 3 commits August 8, 2022 21:54

polish to address review

7ee12a1

Merge branch 'master' of github.com:openai/gym into dev-return-info-s…

7b4c6e5

…eeding

fixed tests after merge, and added a test of the return_info deprecat…

a153196

…ion assertion if found in reset signature

pseudo-rnd-thoughts suggested changes Aug 10, 2022

some organization of env_checker tests, reverted a probably merge error

1449c51

pseudo-rnd-thoughts reviewed Aug 19, 2022

balisujohn added 3 commits August 21, 2022 01:56

added deprecation check for seed function in env

dda5f3f

updated docstring

0626136

Merge branch 'master' of github.com:openai/gym into dev-return-info-s…

ce9b15d

…eeding

pseudo-rnd-thoughts reviewed Aug 21, 2022

pseudo-rnd-thoughts suggested changes Aug 21, 2022

balisujohn changed the title ~~Removing return_info argument to env.reset() and deprecated env.seed() function~~ Removing return_info argument to env.reset() and deprecated env.seed() function (reset now always returns info) Aug 21, 2022

removed debug prints, tweaked test_check_seed_deprecation

57d3a9d

RedTachyon reviewed Aug 21, 2022

changed return_info deprecation check from assertion to warning

daea9be

RedTachyon reviewed Aug 21, 2022

balisujohn added 3 commits August 21, 2022 16:30

fixes to vector envs, now should be correctly structured

bd42805

added some explanation and typehints for mockup depcreated return inf…

49bf8ab

…o reset function

re-removed seed function from vector envs

6e03a48

RedTachyon reviewed Aug 22, 2022

added explanation to _reset_return_info_type and changed the return s…

4c09aec

…tatement

pseudo-rnd-thoughts approved these changes Aug 23, 2022

jkterry1 merged commit 3a8daaf into openai:master Aug 23, 2022

wookayin mentioned this pull request Sep 4, 2022

Release notes for v0.26.0 #3056

Merged
