[RLlib] Add Segmentation Buffer for DT #27829
Conversation
# decision transformer
RETURNS_TO_GO = "returns_to_go"
ATTENTION_MASKS = "attention_masks"
Arguments could be made for putting it here, or putting it elsewhere, like in dt.py, which doesn't exist yet.
we're putting this here.
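For context, here is a sketch of a batch keyed by the new constants; the placeholder data and shapes are hypothetical, and note the extra return-to-go entry that serves as the target for the last action (discussed further down in this review):

import numpy as np

RETURNS_TO_GO = "returns_to_go"
ATTENTION_MASKS = "attention_masks"

# Hypothetical placeholder batch (T = 10 timesteps); shapes are illustrative.
batch = {
    "obs": np.zeros((10, 4), dtype=np.float32),
    "actions": np.zeros((10, 2), dtype=np.float32),
    RETURNS_TO_GO: np.zeros((11, 1), dtype=np.float32),  # one extra target entry
    ATTENTION_MASKS: np.ones(10, dtype=np.float32),
}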
Signed-off-by: Charles Sun <[email protected]>
Force-pushed from 9f8ce0b to d6d446c
minor changes
# TODO: sample proportional to episode length
# Sample a random episode from the buffer and then sample a random
# segment from that episode.
buffer_ind = np.random.randint(0, len(self._buffer))
this is not a seedable call. Can you instantiate your class with a numpy rng and set the seed of the rng using the seed in the global config?
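A minimal sketch of what this asks for, assuming the seed is passed down from the global config (the method and attribute names here are illustrative, not the PR's code):

import numpy as np

class SegmentationBuffer:
    def __init__(self, capacity: int, seed: int = None):
        self._buffer = []
        self.capacity = capacity
        # Own a seedable RNG instead of relying on numpy's global state.
        self._rng = np.random.default_rng(seed)

    def _sample_episode_index(self) -> int:
        # Seedable replacement for np.random.randint(0, len(self._buffer)).
        return int(self._rng.integers(0, len(self._buffer)))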
+1
resolved
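The TODO above could be addressed along these lines (a sketch, not the PR's code):

import numpy as np

def sample_episode_index(episode_lengths, rng):
    # Weight each episode by its length so every transition is equally
    # likely to end up in a sampled segment.
    lengths = np.asarray(episode_lengths, dtype=np.float64)
    return int(rng.choice(len(lengths), p=lengths / lengths.sum()))

rng = np.random.default_rng(42)
idx = sample_episode_index([10, 100, 5], rng)  # the length-100 episode is picked most often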
obs = episode[SampleBatch.OBS][si:ei]
actions = episode[SampleBatch.ACTIONS][si:ei]
# Note that returns-to-go needs an extra elem as the target for the last action.
ah nice
returns_to_go = np.concatenate(
    [returns_to_go, np.zeros((1, 1), dtype=returns_to_go.dtype)], axis=0
)
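For reference, returns-to-go are reverse cumulative rewards; a sketch of the undiscounted version used in the Decision Transformer paper, with the extra zero target appended as in the snippet above:

import numpy as np

def returns_to_go(rewards: np.ndarray) -> np.ndarray:
    # rtg[t] = sum of rewards from step t to the end of the episode.
    return np.cumsum(rewards[::-1])[::-1]

rtg = returns_to_go(np.array([1.0, 0.0, 2.0]))   # -> [3., 2., 2.]
rtg = np.concatenate([rtg, np.zeros(1)])         # extra target for the last action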
# Front-pad if at beginning of rollout. |
add note that this is about inference :)
resolved
offset = min(self.max_seq_len, ep_len)
# We allow si to be negative (for now) because we want segments that only
# contain the first few transitions (and pad the rest),
# for example [0, 0, 0, 0, 0, 0, R0, s0, a0].
typo
resolved
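In isolation, the front-padding that the comment above describes might look like this (the helper name and mask convention are assumptions):

import numpy as np

def front_pad(segment: np.ndarray, max_seq_len: int):
    # Left-pad a (T, ...) segment with zeros up to max_seq_len and build the
    # matching attention mask (0.0 marks padding, 1.0 marks real steps).
    pad_len = max_seq_len - segment.shape[0]
    padding = np.zeros((pad_len,) + segment.shape[1:], dtype=segment.dtype)
    mask = np.concatenate(
        [np.zeros(pad_len, dtype=np.float32), np.ones(segment.shape[0], dtype=np.float32)]
    )
    return np.concatenate([padding, segment], axis=0), mask

padded, mask = front_pad(np.arange(1.0, 4.0), 9)
# padded -> [0. 0. 0. 0. 0. 0. 1. 2. 3.], mask -> [0. 0. 0. 0. 0. 0. 1. 1. 1.]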
# add to buffer and check that only last one is kept (due to replacement)
buffer.add(batch)

assert len(_get_internal_buffer(buffer)) == 1
needs message
resolved
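One way the assertion above could carry a message (the exact wording is illustrative):

assert len(_get_internal_buffer(buffer)) == 1, (
    "Buffer should have kept only the most recently added episode "
    "after replacement."
)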
Signed-off-by: Charles Sun <[email protected]>
@charlesjsun Thanks for taking care of the todos :)
@richardliaw This can be merged. It passes all the tests.
Signed-off-by: Stefan van der Kleij <[email protected]>
Signed-off-by: Charles Sun <[email protected]>
Why are these changes needed?
Added the SegmentationBuffer that DT (Decision Transformer) needs.
Related issue number
Checks
- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.