[RLlib] MultiAgentEpisode: Fix various bugs in `slice()`. #44594

sven1977 · 2024-04-09T15:49:37Z

MultiAgentEpisode (MAE): Fix various bugs in slice(), mostly related to using a lookback buffer.
Enhanced test cases for MAE slicing.

Why are these changes needed?

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: sven1977 <[email protected]>

simonsays1980

LGTM. Great improvements in this tough field.

simonsays1980 · 2024-04-10T08:26:26Z

rllib/env/multi_agent_episode.py

            ):
                terminateds[aid] = sa_episode.is_terminated
                truncateds[aid] = sa_episode.is_truncated
+            # Determine this agent's t_started.
+            if start < len(mapping):
+                for i in range(start, len(mapping)):


agent_t_started[aid] = next(item for item in mapping[start:] if item != self.SKIP_ENV_TS_TAG)

?
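For illustration, here is a minimal sketch of the `next()`-based lookup suggested above. It is not the code that was merged: the helper name `first_agent_ts`, the `default` argument, and the sentinel value `"S"` are assumptions made for this example. A bare `next()` over a generator raises `StopIteration` when nothing matches, so the sketch passes a default instead.

```python
# Hypothetical stand-in for `self.SKIP_ENV_TS_TAG`; the real value may differ.
SKIP_ENV_TS_TAG = "S"


def first_agent_ts(mapping, start, skip_tag=SKIP_ENV_TS_TAG, default=None):
    """Return the first entry at or after `start` that is not the skip tag.

    Returns `default` if every remaining entry is the skip tag or if `start`
    lies beyond the end of `mapping` (the slice is then empty).
    """
    return next((item for item in mapping[start:] if item != skip_tag), default)


# Env steps 1, 2 and 5 were skipped by this (hypothetical) agent.
mapping = [0, "S", "S", 1, 2, "S"]
found = first_agent_ts(mapping, 1)
not_found = first_agent_ts(mapping, 5)
out_of_range = first_agent_ts(mapping, 10)
```

Because slicing past the end of a list yields an empty list, the explicit `start < len(mapping)` guard is not needed in this form.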

simonsays1980 · 2024-04-10T08:35:36Z

rllib/env/multi_agent_episode.py

@@ -1809,12 +1820,14 @@ def _init_single_agent_episodes(
                    len(observations_per_agent[agent_id]) - 1
                )

-            # Those agents that did NOT step get None added to their mapping.
+            # Those agents that did NOT step get self.SKIP_ENV_TS_TAG added to their


Btw SKIP_ENV_TS_TAG is super important to be explicitly documented. I was at first wondering what it meant :)

simonsays1980 · 2024-04-10T08:38:51Z

rllib/env/single_agent_episode.py

+
+        # Extend ourselves. In case, episode_chunk is already terminated (and finalized)
+        # we need to convert to lists (as we are ourselves still filling up lists).
+        self.observations.extend(other.get_observations())


I am unsure here, but does `extend` with all the observations of `other` duplicate the one observation that is in both?

Yes, it does, but before that, we do:

    self.observations.pop()
    self.infos.pop()

so, it's fine :)
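The pop-then-extend pattern described in this exchange can be sketched with plain lists. This is only an illustration under the assumption that the last observation/info of the first chunk equals the first observation/info of the successor chunk; `concat_chunks` is a hypothetical helper, not part of `SingleAgentEpisode`.

```python
def concat_chunks(observations, infos, other_observations, other_infos):
    """Append a successor chunk to `observations` and `infos` in place.

    Assumes the boundary element is shared: the last entry of the first
    chunk is the same as the first entry of the successor chunk.
    """
    # Drop the shared boundary entries so they are not duplicated.
    observations.pop()
    infos.pop()
    # Now extend with everything from the successor chunk.
    observations.extend(other_observations)
    infos.extend(other_infos)
    return observations, infos


obs, infos = concat_chunks(
    [0, 1, 2], [{}, {}, {"t": 2}],
    [2, 3, 4], [{"t": 2}, {}, {}],
)
```

Without the two `pop()` calls, the observation `2` would appear twice in the concatenated list.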

simonsays1980 · 2024-04-10T08:39:46Z

rllib/env/single_agent_episode.py

@@ -637,6 +584,59 @@ def finalize(self) -> "SingleAgentEpisode":

        return self

+    def concat_episode(self, other: "SingleAgentEpisode") -> None:


I like the `other` here :)

simonsays1980 · 2024-04-10T08:41:23Z

rllib/env/tests/test_multi_agent_episode.py

+                {"a1": 7},
+                {"a1": 8},
+                {"a0": 9},
+            ]
        )
        check(len(episode), 9)

        # Slice the episode in different ways and check results.
        # Empty slice.
        slice_ = episode[100:100]


In relation to this: InfiniteLookbackBuffer[start:stop] results in a list. Do we want to keep it?

You mean to return a new InfiniteLookbackBuffer instead?
Yeah, that could be an option, too. I'm not sure. If we can safely change that, it would be better, I think. The good thing is that this API is not user-facing, so we can still change it later.
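To make the two options in this exchange concrete, here is a toy sketch. `ToyLookbackBuffer` and `slice_as_buffer` are hypothetical names and deliberately simplified; this is not RLlib's `InfiniteLookbackBuffer`. In particular, the toy ignores negative indices that would reach into the lookback.

```python
class ToyLookbackBuffer:
    """Hypothetical stand-in: the first `lookback` items precede index 0."""

    def __init__(self, data=None, lookback=0):
        self.data = list(data or [])
        self.lookback = lookback

    def __len__(self):
        # The lookback portion does not count toward the length.
        return len(self.data) - self.lookback

    def __getitem__(self, item):
        # Option 1 (behavior discussed above): slicing returns a plain list.
        if not isinstance(item, slice):
            raise TypeError("This toy only supports slices.")
        start, stop, step = item.indices(len(self))
        return [self.data[self.lookback + i] for i in range(start, stop, step)]

    def slice_as_buffer(self, start, stop):
        # Option 2: wrap the result in a new buffer (without any lookback).
        return ToyLookbackBuffer(self[start:stop], lookback=0)


buf = ToyLookbackBuffer([10, 11, 0, 1, 2], lookback=2)
as_list = buf[0:2]
empty = buf[100:100]
as_buffer = buf.slice_as_buffer(0, 2)
```

Returning a buffer keeps the type consistent for callers, whereas returning a list is simpler; an out-of-range slice such as `[100:100]` is empty in both variants.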

simonsays1980 · 2024-04-10T08:42:34Z

rllib/env/tests/test_multi_agent_episode.py

        check(a1.observations, [2, 3])
        check(a1.actions, [2])
        check(a1.rewards, [0.2])
        check(a1.is_done, False)

+        # Test what happens if we have lookback buffers.
+        observations = [


Btw very good example to explain what a lookback buffer is and what it does.

simonsays1980 · 2024-04-10T08:43:30Z

rllib/env/utils/infinite_lookback_buffer.py

-                If a `InfiniteLookbackBuffer` the data gets
-                concatenated. If a `list` the list is concatenated to the
-                `self.data`.
+            other: Another `InfiniteLookbackBuffer` or a `list` or a number.


Yup, that was missing. I am sorry :/

No worries! It takes a village ... :)
Such a complex API. It's not done yet. I also left some things open when I was working on this. Come time ...

sven1977 added 3 commits April 9, 2024 16:59

wip

0f5dbbe

Signed-off-by: sven1977 <[email protected]>

wip

867b272

Signed-off-by: sven1977 <[email protected]>

wip

33aea0c

Signed-off-by: sven1977 <[email protected]>

sven1977 assigned simonsays1980 Apr 9, 2024

wip

7c72c8c

Signed-off-by: sven1977 <[email protected]>

sven1977 marked this pull request as ready for review April 9, 2024 19:16

sven1977 requested review from avnishn, ArturNiederfahrenhorst, maxpumperla, kouroshHakha and simonsays1980 as code owners April 9, 2024 19:16

sven1977 added the tests-ok label ("The tagger certifies test failures are unrelated and assumes personal liability.") Apr 9, 2024

simonsays1980 approved these changes Apr 10, 2024

sven1977 merged commit 3fea138 into ray-project:master Apr 10, 2024
5 checks passed

sven1977 deleted the multi_agent_episode_fix_slice branch April 10, 2024 11:54
