-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RLlib] Discussion 3644: Fix bug for complex obs spaces containing Box([2D shape])
and discrete component.
#18917
[RLlib] Discussion 3644: Fix bug for complex obs spaces containing Box([2D shape])
and discrete component.
#18917
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I had a few questions, and also left some comments (some were NITS feel free to ignore)
rllib/agents/sac/tests/test_sac.py
Outdated
@@ -58,7 +58,7 @@ class TestSAC(unittest.TestCase): | |||
def setUpClass(cls) -> None: | |||
np.random.seed(42) | |||
torch.manual_seed(42) | |||
ray.init() | |||
ray.init(local_mode=True) #TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason why you modified this test in this change? @sven1977
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, sorry, this was debug. I always add the #TODO
to make the LINTer fail so I won't forget to remove it at the end :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed now.
rllib/models/tf/complex_input_net.py
Outdated
orig_obs = restore_original_dimensions(input_dict[SampleBatch.OBS], | ||
self.obs_space, "tf") | ||
orig_obs = restore_original_dimensions( | ||
input_dict[SampleBatch.OBS], self.processed_obs_space, "tf") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this and line 41 the core of the change?
@@ -69,6 +69,11 @@ def _do_check(alg, config, a_name, o_name): | |||
|
|||
try: | |||
a = get_trainer_class(alg)(config=config, env=RandomEnv) | |||
except ray.exceptions.RayActorError as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: is there a way to check for a tighter error than just RayActorError? AFAICT this could mean a host of issues with the actor, not just issues with the space of the environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could maybe check the actual error message. You are right, we would miss any other error thrown on any of the workers (even if it's not a space exception) and interpret it as a "harmless" space-not-supported exception, which it may not be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, no, we do this already here:
if isinstance(e.args[2], UnsupportedSpaceException):
stat = "unsupported"
else:
raise
So only an underlying UnsupportedSpaceException
makes this test know, it's harmless. All other underlying error types are re-raised.
…ussion_3644_action_placeholder_multidiscrete
…ussion_3644_action_placeholder_multidiscrete
…ussion_3644_action_placeholder_multidiscrete
…ussion_3644_action_placeholder_multidiscrete
…ussion_3644_action_placeholder_multidiscrete
@@ -146,7 +146,7 @@ def zero_logps_from_actions(actions: TensorStructType) -> TensorType: | |||
# `deterministic_actions` or `stochastic_actions`). In case | |||
# actions are just [B], zeros_like works just fine here, but if | |||
# actions are [B, ...], we have to reduce logp back to just [B]. | |||
if len(logp_.shape) > 1: | |||
while len(logp_.shape) > 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the actual core change here, allowing action spaces of Box(n, m, shape=(x, y, ...), int32) to be supported.
This and the reinstantiation of the supported_spaces tests, making sure all these combinations (observation + action-spaces) work well.
Discussion 3644: Fix bug for complex obs spaces containing
Box([2D shape])
and discrete component.See also this discussion here:
https://discuss.ray.io/t/unpack-obs-doesnt-know-to-expect-1-hot/3644
Reinstate
test_supported_spaces_pg
(was commented out).Why are these changes needed?
Related issue number
Checks
scripts/format.sh
to lint the changes in this PR.