[Homepage] [Documentation] [Study Paper] [Study Website] [ARISE Initiative]
This fork extends the robomimic framework to handle gym-grasp environments.
It requires that you have a working installation of gym-grasp and corresponding demonstrations (e.g. the ones provided in the gym-grasp repo itself).
To prepare a demonstration file for learning, split the demos into training and validation data:
python robomimic/robomimic/scripts/split_train_val.py --dataset <file> --ratio 0.1
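As a rough illustration of what a ratio-based split does, the sketch below shuffles a list of demonstration names and sets aside a validation fraction. It assumes that `--ratio` is the validation fraction; the helper `split_demo_keys` and the `demo_<i>` names are hypothetical and are not taken from the script, which may store its result differently.

```python
import random


def split_demo_keys(demo_keys, val_ratio=0.1, seed=0):
    """Split demonstration names into (train, valid) lists.

    Hypothetical helper for illustration only; it is not the
    implementation used by split_train_val.py.
    """
    if not 0.0 < val_ratio < 1.0:
        raise ValueError("val_ratio must be strictly between 0 and 1")
    if len(demo_keys) < 2:
        raise ValueError("need at least two demonstrations to split")

    keys = sorted(demo_keys)      # deterministic starting order
    rng = random.Random(seed)     # local RNG, global random state untouched
    rng.shuffle(keys)

    # Keep at least one demo on each side of the split.
    num_val = int(round(val_ratio * len(keys)))
    num_val = min(max(num_val, 1), len(keys) - 1)

    valid = sorted(keys[:num_val])
    train = sorted(keys[num_val:])
    return train, valid


demos = [f"demo_{i}" for i in range(20)]
train_keys, valid_keys = split_demo_keys(demos, val_ratio=0.1)
print(len(train_keys), len(valid_keys))
```

Fixing the seed keeps the split reproducible across runs, which matters when validation results from different experiments are compared.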
You will also likely need configuration files; see robomimic's config tutorial for more information.
We have added some configuration keys, most importantly experiment.gymgrasp_recording, which enables recordings in simulation. Please note that you cannot use experiment.render or experiment.render_video when learning in gym-grasp environments.
Rollouts are recorded in parallel environments (the number of environments equals the number of requested rollouts).
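The constraint on these keys can be checked before launching a run. The following is a minimal sketch, assuming the configuration is a JSON file with a nested `experiment` section and that `gymgrasp_recording` is a boolean flag; both helper functions are hypothetical and not part of this fork.

```python
import json


def make_gymgrasp_experiment_overrides(record=True):
    """Return illustrative overrides for the "experiment" config section.

    Assumption: gymgrasp_recording is a boolean. Check the fork's config
    defaults for the actual type and meaning.
    """
    return {
        "experiment": {
            "gymgrasp_recording": record,
            "render": False,        # must stay off for gym-grasp
            "render_video": False,  # must stay off for gym-grasp
        }
    }


def check_gymgrasp_config(config):
    """Raise ValueError if a rendering key is enabled, else return config."""
    experiment = config.get("experiment", {})
    offending = [k for k in ("render", "render_video") if experiment.get(k)]
    if offending:
        raise ValueError(
            "not usable with gym-grasp environments: "
            + ", ".join("experiment." + k for k in offending)
        )
    return config


config = check_gymgrasp_config(make_gymgrasp_experiment_overrides())
print(json.dumps(config, indent=2))
```

A real configuration file contains many more keys; this fragment covers only the ones mentioned above.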
Once the dataset and configuration are ready, start training with:
python train.py --config <file>
To run trained models, use run_trained_gymgrasp_agent.py.
This reports the success rate and return achieved in the rollouts.
Please ignore the Num Successes output; it is not correct when using parallel environments.
You can also choose to record the rollouts; see the available command line parameters for more information.
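If you want to double-check the reported numbers, the success rate and mean return can be recomputed from per-environment results. This is only a sketch: it assumes you have one success flag and one return per parallel environment, and the function name and input format are hypothetical rather than something the script exposes.

```python
def summarize_parallel_rollouts(successes, returns):
    """Aggregate per-environment results into rollout statistics.

    successes: one truthy/falsy flag per parallel environment
    returns:   one episode return per parallel environment
    """
    if len(successes) != len(returns):
        raise ValueError("successes and returns must have the same length")
    if not successes:
        raise ValueError("at least one rollout is required")

    num_rollouts = len(successes)
    num_successes = sum(1 for s in successes if s)
    return {
        "num_rollouts": num_rollouts,
        "success_rate": num_successes / num_rollouts,
        "mean_return": sum(returns) / num_rollouts,
    }


# Four parallel environments, i.e. four requested rollouts.
stats = summarize_parallel_rollouts(
    successes=[True, False, True, True],
    returns=[1.0, 0.0, 1.0, 0.5],
)
print(stats)
```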
- [12/16/2021] v0.2.0: Modular observation modalities and encoders 🔧, support for MOMART datasets 📂
- [08/09/2021] v0.1.0: Initial code and paper release
robomimic is a framework for robot learning from demonstration. It offers a broad set of demonstration datasets collected on robot manipulation domains, and learning algorithms to learn from these datasets. This project is part of the broader Advancing Robot Intelligence through Simulated Environments (ARISE) Initiative, with the aim of lowering the barriers of entry for cutting-edge research at the intersection of AI and Robotics.
Imitating human demonstrations is a promising approach to endow robots with various manipulation capabilities. While recent advances have been made in imitation learning and batch (offline) reinforcement learning, a lack of open-source human datasets and reproducible learning methods makes assessing the state of the field difficult. The overarching goal of robomimic is to provide researchers and practitioners with:
- a standardized set of large demonstration datasets across several benchmarking tasks to facilitate fair comparisons, with a focus on learning from human-provided demonstrations (see this link for a list of supported datasets)
- high-quality implementations of several learning algorithms for training closed-loop policies from offline datasets to make reproducing results easy and lower the barrier to entry
- a modular design that offers great flexibility in extending algorithms and designing new algorithms
This release of robomimic contains seven offline learning algorithms and standardized datasets collected across five simulated and three real-world multi-stage manipulation tasks of varying complexity. We highlight some features below (for a more thorough list of features, see this link):
- standardized datasets: a set of datasets collected from different sources (single proficient human, multiple humans, and machine-generated) across several simulated and real-world tasks, along with a plug-and-play Dataset class to easily use the datasets outside of this project
- algorithm implementations: several high-quality implementations of offline learning algorithms, including BC, BC-RNN, HBC, IRIS, BCQ, CQL, and TD3-BC
- multiple observation spaces: support for learning both low-dimensional and visuomotor policies, with support for observation tensor dictionaries throughout the codebase, making it easy to specify different subsets of observations to train a policy. This includes a set of useful tensor utilities to work with nested dictionaries of torch Tensors and numpy arrays.
- visualization utilities: utilities for visualizing demonstration data, playing back actions, visualizing trained policies, and collecting new datasets using trained policies
- train launching utilities: utilities for easily running hyperparameter sweeps, enabled by a flexible Config management system
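To make the idea of observation dictionaries more concrete, here is a minimal, library-free sketch of two operations mentioned in the list above: selecting a subset of observation keys and applying a function to every leaf of a nested dictionary. It uses plain Python numbers in place of torch Tensors or numpy arrays; `select_obs`, `map_nested`, and the observation key names are hypothetical and are not robomimic's API.

```python
def select_obs(obs, keys):
    """Return a new dict containing only the requested observation keys."""
    missing = [k for k in keys if k not in obs]
    if missing:
        raise KeyError("unknown observation keys: " + ", ".join(missing))
    return {k: obs[k] for k in keys}


def map_nested(fn, data):
    """Apply fn to every leaf of a nested dict/list/tuple structure."""
    if isinstance(data, dict):
        return {k: map_nested(fn, v) for k, v in data.items()}
    if isinstance(data, (list, tuple)):
        return type(data)(map_nested(fn, v) for v in data)
    return fn(data)  # leaf value (a number here, a tensor in practice)


# Illustrative batch; the key names are made up for this example.
batch = {
    "obs": {
        "eef_pos": [1.0, 2.0, 3.0],
        "gripper_qpos": [0.5],
        "object": [4.0, 5.0],
    },
    "actions": [0.0, 1.0],
}

low_dim_obs = select_obs(batch["obs"], ["eef_pos", "gripper_qpos"])
doubled = map_nested(lambda x: 2 * x, batch)
print(sorted(low_dim_obs), doubled["actions"])
```

With real tensors, the leaf function would typically be something like a device transfer or dtype conversion, applied once to the whole batch instead of key by key.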
This framework originally began development in late 2018. Researchers in the Stanford Vision and Learning Lab (SVL) used it as an internal tool for training policies from offline human demonstration datasets. Now it is actively maintained and used for robotics research projects across multiple labs. We welcome community contributions to this project. For details please check our contributing guidelines.
Please see the troubleshooting section for common fixes, or submit an issue on our github page.
The robomimic framework also makes reproducing the results from this study easy. See the results documentation for more information.
Please cite this paper if you use this framework in your work:
@inproceedings{robomimic2021,
title={What Matters in Learning from Offline Human Demonstrations for Robot Manipulation},
author={Ajay Mandlekar and Danfei Xu and Josiah Wong and Soroush Nasiriany and Chen Wang and Rohun Kulkarni and Li Fei-Fei and Silvio Savarese and Yuke Zhu and Roberto Mart\'{i}n-Mart\'{i}n},
booktitle={arXiv preprint arXiv:2108.03298},
year={2021}
}