Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib] Discussion 3644: Fix bug for complex obs spaces containing Box([2D shape]) and discrete component. #18917

Conversation

sven1977
Copy link
Contributor

@sven1977 sven1977 commented Sep 27, 2021

Discussion 3644: Fix bug for complex obs spaces containing Box([2D shape]) and discrete component.

See also this discussion here:
https://discuss.ray.io/t/unpack-obs-doesnt-know-to-expect-1-hot/3644

Reinstate test_supported_spaces_pg (was commented out).

Why are these changes needed?

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Copy link
Member

@avnishn avnishn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I had a few questions, and also left some comments (some were NITS feel free to ignore)

@@ -58,7 +58,7 @@ class TestSAC(unittest.TestCase):
def setUpClass(cls) -> None:
np.random.seed(42)
torch.manual_seed(42)
ray.init()
ray.init(local_mode=True) #TODO
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason why you modified this test in this change? @sven1977

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sorry, this was debug. I always add the #TODO to make the LINTer fail so I won't forget to remove it at the end :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now.

orig_obs = restore_original_dimensions(input_dict[SampleBatch.OBS],
self.obs_space, "tf")
orig_obs = restore_original_dimensions(
input_dict[SampleBatch.OBS], self.processed_obs_space, "tf")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this and line 41 the core of the change?

@@ -69,6 +69,11 @@ def _do_check(alg, config, a_name, o_name):

try:
a = get_trainer_class(alg)(config=config, env=RandomEnv)
except ray.exceptions.RayActorError as e:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: is there a way to check for a tighter error than just RayActorError? AFAICT this could mean a host of issues with the actor, not just issues with the space of the environment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could maybe check the actual error message. You are right, we would miss any other error thrown on any of the workers (even if it's not a space exception) and interpret it as a "harmless" space-not-supported exception, which it may not be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, no, we do this already here:

            if isinstance(e.args[2], UnsupportedSpaceException):
                stat = "unsupported"
            else:
                raise

So only an underlying UnsupportedSpaceException makes this test know, it's harmless. All other underlying error types are re-raised.

@@ -146,7 +146,7 @@ def zero_logps_from_actions(actions: TensorStructType) -> TensorType:
# `deterministic_actions` or `stochastic_actions`). In case
# actions are just [B], zeros_like works just fine here, but if
# actions are [B, ...], we have to reduce logp back to just [B].
if len(logp_.shape) > 1:
while len(logp_.shape) > 1:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the actual core change here, allowing action spaces of Box(n, m, shape=(x, y, ...), int32) to be supported.
This and the reinstantiation of the supported_spaces tests, making sure all these combinations (observation + action-spaces) work well.

@sven1977 sven1977 merged commit ac3371a into ray-project:master Sep 30, 2021
@sven1977 sven1977 deleted the discussion_3644_action_placeholder_multidiscrete branch June 2, 2023 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants