-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RLlib] [Connectors] Fix test nested action spaces connectors #30459
[RLlib] [Connectors] Fix test nested action spaces connectors #30459
Conversation
@@ -552,6 +551,18 @@ def _unflatten_as_buffer_struct( | |||
"""Unflattens the given to match the buffer struct format for that key.""" | |||
if key not in self.buffer_structs: | |||
return data[0] | |||
if key == SampleBatch.ACTIONS and not self.disable_action_flattening: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this! Can you explain a little what's happening here for me and Jun?
Have you tested if this breaks other tests /w connectors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
action flattening is never actually happening at any point in the episode. This fix enables action flattening to happen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks man! I'll merge this into the collective branch soon to see if this breaks anything else. One question though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the flattening you are trying to do.
this commit flattens actions only if action flattening is not disabled. It does the action flattenning as elements are being added to the agent_collector buffer. Signed-off-by: Avnish <[email protected]>
7818310
to
1b26e68
Compare
…test_nested_action_spaces_connectors
Signed-off-by: Avnish <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one nit and contingent on tests passing. Let's wait for the tests to finish before another push.
rllib/algorithms/algorithm_config.py
Outdated
@@ -223,7 +223,7 @@ def __init__(self, algo_class=None): | |||
self.sample_collector = SimpleListCollector | |||
self.create_env_on_local_worker = False | |||
self.sample_async = False | |||
self.enable_connectors = False | |||
self.enable_connectors = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to revert this once the tests pass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ofc
@@ -266,6 +269,8 @@ def add_action_reward_next_obs(self, input_values: Dict[str, TensorType]) -> Non | |||
or k.startswith("state_out_") | |||
or (k == SampleBatch.ACTIONS and not self.disable_action_flattening) | |||
): | |||
if k == SampleBatch.ACTIONS and not self.disable_action_flattening: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we can rewrite the code to avoid repetition of the condition in two place:
should_flatten_action_key = (k == SampleBatch.ACTIONS and not self.disable_action_flattening)
if should_flatten_action_key:
v = flatten_to_single_ndarray(v)
if x or y or should_flatten_action_key:
self.buffers[k][0].append(v)
@@ -511,6 +516,8 @@ def _build_buffers(self, single_row: Dict[str, TensorType]) -> None: | |||
or col.startswith("state_out_") | |||
or (col == SampleBatch.ACTIONS and not self.disable_action_flattening) | |||
): | |||
if col == SampleBatch.ACTIONS and not self.disable_action_flattening: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same thing here
Signed-off-by: Avnish <[email protected]>
…test_nested_action_spaces_connectors
Signed-off-by: Avnish <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. merge contingent on tests passing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understand this is an easy change. But do we have to flatten the action here inside agent_collector?
In my imagination, we should create a super simple FlatteningActionConnector, and make it part of the action connector pipeline if config.disable_action_flattening
is False.
We will then be able to look at the action connectors and say "oh, ok, dude wants actions to be flattened ...".
Does this make sense? Are we trying to say that the actual action output doesn't need flattening, it only requires flattening when being added to agent_collector?
The action flattening only should happen during training not inference. Is there a way to write an action connector that only is invoked during training? |
It should be fed to the environment as unflattened, based on some flag that I should be able to set about the connector. |
…test_nested_action_spaces_connectors
…roject#30459) Signed-off-by: Weichen Xu <[email protected]>
Action flattening was never actually happening in the agent collector. This pr introduces that
Why are these changes needed?
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.