[RLlib; Offline RL] Make data pipeline better configurable and tuneable for users. #46777

simonsays1980 · 2024-07-24T15:32:59Z

Why are these changes needed?

The new Offline RL API makes direct use of ray.data.Datasets and their map_batches and iter_batches methods to transform and iterate over batches. Both methods accept arguments that tune the data pipeline. This PR proposes a way for users to pass arguments to these methods through the `AlgorithmConfig`.
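
To illustrate the idea, here is a minimal sketch of forwarding two user-supplied keyword dictionaries to the mapping and iteration steps of a pipeline. `ToyDataset`, `ToyConfig`, and `run_pipeline` are hypothetical stand-ins written for this example and are not the RLlib or Ray Data classes; only the argument names `map_batches_kwargs` and `iter_batches_kwargs` come from this PR.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Batch = List[int]


class ToyDataset:
    """Hypothetical stand-in for a dataset exposing map_batches/iter_batches."""

    def __init__(self, rows: List[int]) -> None:
        self.rows = rows

    def map_batches(
        self, fn: Callable[[Batch], Batch], *, batch_size: int = 2
    ) -> "ToyDataset":
        # Apply `fn` to consecutive slices of `batch_size` rows.
        out: List[int] = []
        for i in range(0, len(self.rows), batch_size):
            out.extend(fn(self.rows[i : i + batch_size]))
        return ToyDataset(out)

    def iter_batches(
        self, *, batch_size: int = 2, drop_last: bool = False
    ) -> Iterator[Batch]:
        # Yield slices of `batch_size` rows, optionally dropping a short tail.
        for i in range(0, len(self.rows), batch_size):
            batch = self.rows[i : i + batch_size]
            if drop_last and len(batch) < batch_size:
                return
            yield batch


class ToyConfig:
    """Hypothetical config that only stores the two kwargs dictionaries."""

    def __init__(self) -> None:
        self.map_batches_kwargs: Dict[str, Any] = {}
        self.iter_batches_kwargs: Dict[str, Any] = {}

    def offline_data(
        self,
        *,
        map_batches_kwargs: Optional[Dict[str, Any]] = None,
        iter_batches_kwargs: Optional[Dict[str, Any]] = None,
    ) -> "ToyConfig":
        if map_batches_kwargs is not None:
            self.map_batches_kwargs = map_batches_kwargs
        if iter_batches_kwargs is not None:
            self.iter_batches_kwargs = iter_batches_kwargs
        return self


def run_pipeline(
    config: ToyConfig, dataset: ToyDataset, fn: Callable[[Batch], Batch]
) -> List[Batch]:
    # The config's dictionaries are unpacked straight into the two methods.
    mapped = dataset.map_batches(fn, **config.map_batches_kwargs)
    return list(mapped.iter_batches(**config.iter_batches_kwargs))


config = ToyConfig().offline_data(
    map_batches_kwargs={"batch_size": 4},
    iter_batches_kwargs={"batch_size": 3, "drop_last": True},
)
batches = run_pipeline(
    config, ToyDataset(list(range(10))), lambda b: [x * 2 for x in b]
)
print(batches)
```

With Ray Data itself, the same two dictionaries would carry real method arguments instead, for example `concurrency` for map_batches or `prefetch_batches` for iter_batches.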

In addition, the OfflinePreLearner is moved to its own file. Users can subclass it and override its _map_episodes and __call__ methods to define in detail how their data is converted into batches/episodes.
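
The override pattern described above might look roughly like the following sketch. `ToyPreLearner` is a hypothetical, simplified stand-in for this example; the real OfflinePreLearner's method signatures and return types are not shown in this PR description and may differ. Only the method names `_map_episodes` and `__call__` are taken from the text.

```python
from typing import Any, Dict, List


class ToyPreLearner:
    """Hypothetical stand-in base class; not the RLlib implementation."""

    def __call__(self, batch: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
        # Convert the raw batch into episodes and hand them on.
        return {"episodes": self._map_episodes(batch)}

    def _map_episodes(self, batch: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
        # Default behavior in this toy: one single-step episode per row.
        return [
            {"obs": obs, "action": action}
            for obs, action in zip(batch["obs"], batch["actions"])
        ]


class ScaledObsPreLearner(ToyPreLearner):
    """Customizes only the episode mapping; __call__ is inherited."""

    def _map_episodes(self, batch: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
        episodes = super()._map_episodes(batch)
        for episode in episodes:
            # Assumed preprocessing step for illustration: scale to [0, 1].
            episode["obs"] = episode["obs"] / 255.0
        return episodes


out = ScaledObsPreLearner()({"obs": [0.0, 255.0], "actions": [1, 0]})
print(out)
```

Overriding only `_map_episodes` keeps the surrounding call logic intact, while overriding `__call__` replaces the whole conversion step.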

Furthermore, some bug fixes have been made.

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(


rllib/algorithms/algorithm_config.py

sven1977 · 2024-07-24T15:44:55Z

rllib/algorithms/algorithm_config.py

@@ -2470,6 +2499,8 @@ def offline_data(
            self.input_read_method_kwargs = input_read_method_kwargs
        if input_read_schema is not NotProvided:
            self.input_read_schema = input_read_schema
+        if prelearner_class is not NotProvided:
+            self.prelearner_class = prelearner_class


Wait, where do these get assigned?
input_read_method_kwargs
map_batches_kwargs

Up in the file. Where all attributes get default values.

Ah, sorry, didn't see this. I think b/c it hadn't been changed in this PR. All good.
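
For readers following this thread, the pattern in the diff can be sketched in isolation: defaults are assigned once in the constructor, and the setter overwrites an attribute only when the caller actually passed a value. The `_NotProvided` sentinel and `ToyConfig` below are simplified stand-ins written for this example, not RLlib's code, and the default values are assumptions.

```python
from typing import Any


class _NotProvided:
    """Sentinel type, so that None stays usable as a real value."""

    def __repr__(self) -> str:
        return "NotProvided"


NotProvided = _NotProvided()


class ToyConfig:
    def __init__(self) -> None:
        # "Up in the file": every attribute gets its default value here.
        self.input_read_schema: Any = {}
        self.prelearner_class: Any = None

    def offline_data(
        self,
        *,
        input_read_schema: Any = NotProvided,
        prelearner_class: Any = NotProvided,
    ) -> "ToyConfig":
        # Only arguments the caller passed overwrite the defaults.
        if input_read_schema is not NotProvided:
            self.input_read_schema = input_read_schema
        if prelearner_class is not NotProvided:
            self.prelearner_class = prelearner_class
        return self


class MyPreLearner:
    """Placeholder for a user-defined pre-learner class."""


config = ToyConfig().offline_data(prelearner_class=MyPreLearner)
print(config.prelearner_class.__name__, config.input_read_schema)
```

Using a dedicated sentinel instead of None lets a caller explicitly reset an attribute to None without that being mistaken for "argument omitted".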

sven1977 · 2024-07-24T15:46:04Z

rllib/offline/offline_data.py


-class OfflinePreLearner:


Thanks for separating these!

rllib/tuned_examples/bc/cartpole_bc.py

Co-authored-by: Sven Mika <[email protected]> Signed-off-by: simonsays1980 <[email protected]>

rllib/offline/offline_prelearner.py

Signed-off-by: simonsays1980 <[email protected]>

sven1977

Looks good to me now! Thanks @simonsays1980 !! :)

…n class. Signed-off-by: simonsays1980 <[email protected]>

…defined. Signed-off-by: simonsays1980 <[email protected]>

simonsays1980 added 2 commits July 24, 2024 16:25

Added annotations and separated 'OfflineData' and 'OfflinePreLearner' …

4d40a17

…class into separate files. Signed-off-by: simonsays1980 <[email protected]>

Added 'map_batches_kwargs' and 'iter_batches_kwargs' arguments to 'Al…

147a512

…gorithmConfig' and 'OfflineData' to make the data pipeline better configurable and tuneable. Tested single- and multi-learner setups with BC. Signed-off-by: simonsays1980 <[email protected]>

sven1977 changed the title ~~[RLlib - Offline RL] - Make data pipeline better configurable and tuneable for users.~~ [RLlib; Offline RL] Make data pipeline better configurable and tuneable for users. Jul 24, 2024

sven1977 marked this pull request as ready for review July 24, 2024 15:41

sven1977 requested review from sven1977 and ArturNiederfahrenhorst as code owners July 24, 2024 15:41