[RLlib]: Off-Policy Evaluation fixes #25899

Rohan138 · 2022-06-17T21:53:13Z

Hotfixes for OPE methods:

- Compute mean and stddev for estimators
- Rename k_fold_cv to train_test_split and add a train_test_split_val parameter
- torch.squeeze(-1) fixes
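
For readers unfamiliar with the first two items, here is a minimal sketch of what an episode-level train/test split and a mean/stddev summary could look like. It is not the RLlib implementation: `split_episodes`, `summarize`, and the reading of the split value as the fraction of episodes used for training are assumptions made for illustration, and the numbers are placeholders rather than results.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def split_episodes(
    episodes: Sequence[object], train_fraction: float, seed: int = 0
) -> Tuple[List[object], List[object]]:
    """Hypothetical helper: shuffle episodes and split them into train/eval lists.

    `train_fraction` plays the role assumed here for a split value such as
    train_test_split_val: the share of episodes that goes to training.
    """
    if not 0.0 <= train_fraction < 1.0:
        raise ValueError("train_fraction must be in [0, 1)")
    shuffled = list(episodes)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle for the sketch
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def summarize(estimates: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, population standard deviation) of per-episode estimates."""
    return statistics.fmean(estimates), statistics.pstdev(estimates)


episodes = [f"episode_{i}" for i in range(10)]
train, evaluation = split_episodes(episodes, train_fraction=0.8)

# Placeholder per-episode value estimates, not real results.
mean, std = summarize([1.0, 2.0, 3.0, 4.0])
```

Unlike k-fold cross-validation, a single split evaluates each held-out episode only once, so reporting the standard deviation alongside the mean gives some indication of how noisy the estimate is.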

Next PR: #25911

Related issue number

- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(


rapotdar added 3 commits June 17, 2022 14:14

wip

6223a71

wip

1288837

split fixes

40451fa

Rohan138 added rllib RLlib related issues rllib-offline-rl Offline RL problems labels Jun 17, 2022

Rohan138 assigned avnishn Jun 17, 2022

Rohan138 requested review from sven1977, gjoliver, avnishn, ArturNiederfahrenhorst and smorad as code owners June 17, 2022 21:53

Rohan138 self-assigned this Jun 17, 2022

Rohan138 requested review from maxpumperla, kouroshHakha and krfricke as code owners June 17, 2022 21:53

Rohan138 mentioned this pull request Jun 17, 2022

[RLlib]: Off-Policy Estimation does not work with single timesteps #25872

Closed

rapotdar added 3 commits June 17, 2022 14:57

Fix k

5f9a599

lint

dd0683f

drop k to 2

cb127b1

avnishn approved these changes Jun 17, 2022


Rohan138 assigned sven1977 and unassigned Rohan138 Jun 18, 2022

rapotdar added 3 commits June 17, 2022 19:08

Merge branch 'master' of https://github.com/ray-project/ray into ope-…

90a21e2

…fixes

fix train_test_split

d1c5465

Minor fixes, drop assertions (they only apply for k_fold_cv not splits)

68ce0b4

Rohan138 mentioned this pull request Jun 20, 2022

[RLlib]: Move OPE to evaluation config #25911

Merged


sven1977 merged commit 28df3f3 into ray-project:master Jun 21, 2022