Accepted at the Conference on Robot Learning (CoRL) 2021.
Harshit Sikchi, Wenxuan Zhou, David Held
- PyTorch 1.5
- OpenAI Gym
- MuJoCo
- tqdm
- D4RL dataset
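A minimal environment-setup sketch is shown below. The exact package versions and the D4RL install source are assumptions based on the requirements list, not an official install script; the MuJoCo binaries and license must be set up separately.

```
# Assumed setup commands -- versions and sources are illustrative,
# not an official install script for this repository.
pip install torch==1.5.0 gym tqdm
pip install mujoco-py  # requires the MuJoCo binaries to be installed separately
pip install git+https://github.com/rail-berkeley/d4rl@master#egg=d4rl
```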
- LOOP (Core method)
  - Training code (Online RL): `train_loop_sac.py`
  - Training code (Offline RL): `train_loop_offline.py`
  - Training code (Safe RL): `train_loop_safety.py`
  - Policies (online/offline/safety): `policies.py`
  - ARC/H-step lookahead policy: `controllers/`
- Environments: `envs/`
- Configurations: `configs/`
- All experiments are to be run from the root folder.
- Config files in `configs/` specify the hyperparameters for the controllers and dynamics models. To reproduce the results in our paper, keep all other values in the yml files consistent with the hyperparameters given in the paper.
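As a purely hypothetical sketch of what such a yml file might contain (every key and value below is invented for illustration; consult the actual files in `configs/` and the paper's hyperparameter tables for the real fields):

```
# Hypothetical config sketch -- these field names and values are
# invented for illustration and do NOT come from the repository.
controller:
  horizon: 10         # H-step lookahead length
  num_particles: 100  # candidate action sequences per planning step
dynamics:
  ensemble_size: 5    # dynamics models in the ensemble
  hidden_dim: 200
```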
Online RL:

```
python train_loop_sac.py --env=<env_name> --policy=LOOP_SAC_ARC --start_timesteps=<initial_exploration_steps> --exp_name=<location_to_logs>
```
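For instance, a concrete invocation might look like the following; the environment name, step count, and log location are illustrative values, not repository defaults:

```
# Illustrative run: LOOP-SAC with ARC on HalfCheetah-v2.
# The environment, step count, and log path are example values.
python train_loop_sac.py --env=HalfCheetah-v2 --policy=LOOP_SAC_ARC \
    --start_timesteps=10000 --exp_name=logs/loop_sac_halfcheetah
```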
Environment wrappers, together with their termination conditions, can be found under `envs/`.
Offline RL: Download the CRR-trained models from Link into the root folder.
```
python train_loop_offline.py --env=<env_name> --policy=LOOP_OFFLINE_ARC --exp_name=<location_to_logs> --offline_algo=CRR --prior_type=CRR
```
Currently, only D4RL MuJoCo locomotion tasks are supported.
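A concrete invocation might look like the following; `halfcheetah-medium-v0` is one example of a D4RL locomotion task, chosen here for illustration:

```
# Illustrative run: offline LOOP with a CRR prior on the D4RL
# halfcheetah-medium-v0 dataset. The environment and log path are
# example values.
python train_loop_offline.py --env=halfcheetah-medium-v0 --policy=LOOP_OFFLINE_ARC \
    --exp_name=logs/loop_offline_halfcheetah --offline_algo=CRR --prior_type=CRR
```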
Safe RL:

```
python train_loop_safety.py --env=<env_name> --policy=safeLOOP_ARC --exp_name=<location_to_logs>
```
Safety environments can be found under `envs/safety_envs.py`.
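A concrete invocation might look like the following; the environment id below is an assumption in the style of Safety Gym task names, so check `envs/safety_envs.py` for the names this repository actually registers:

```
# Illustrative run: safeLOOP with ARC. 'Safexp-PointGoal1-v0' is an
# ASSUMED environment id -- see envs/safety_envs.py for the actual
# registered safety environments.
python train_loop_safety.py --env=Safexp-PointGoal1-v0 \
    --policy=safeLOOP_ARC --exp_name=logs/safeloop_pointgoal
```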
If you find this work useful, please use the following citation:
```
@inproceedings{sikchi2022learning,
  title={Learning off-policy with online planning},
  author={Sikchi, Harshit and Zhou, Wenxuan and Held, David},
  booktitle={Conference on Robot Learning},
  pages={1622--1633},
  year={2022},
  organization={PMLR}
}
```