Modular-HER

Modular-HER is adapted from OpenAI Baselines and implements many improvements to Hindsight Experience Replay (HER) as modules. We aim to provide a more modular, readable, and concise package for multi-goal reinforcement learning.

Suggestions and code contributions are welcome!

Functions

DDPG (https://arxiv.org/abs/1509.02971);
HER (future, episode, final, random) (https://arxiv.org/abs/1707.01495);
Cut HER (incrementally increase the future sample length);
SHER (https://arxiv.org/abs/2002.02089);
Prioritized HER (same as PHER in https://arxiv.org/abs/1905.08786);
Energy-based Prioritized HER (https://www.researchgate.net/publication/341776498_Energy-Based_Hindsight_Experience_Prioritization);
Curriculum-guided Hindsight Experience Replay (http://papers.nips.cc/paper/9425-curriculum-guided-hindsight-experience-replay);
nstep DDPG and nstep HER;
more to come...
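
The four HER goal-sampling strategies named in the list (future, episode, final, random) differ only in where the substitute goal is drawn from. The following is a minimal Python sketch of that choice, not the code used in this package. The function name `sample_hindsight_goal` and the episode layout (a list of per-step achieved goals) are assumptions made for illustration.

```python
import random


def sample_hindsight_goal(episodes, ep_idx, t, strategy, rng=random):
    """Return a substitute goal for the transition at step t of episode ep_idx.

    episodes[e][i] is the goal achieved after step i of episode e
    (a hypothetical layout chosen for this sketch).
    """
    achieved = episodes[ep_idx]
    last = len(achieved) - 1
    if strategy == "future":
        # A goal achieved at step t or later in the same episode.
        return achieved[rng.randint(t, last)]
    if strategy == "episode":
        # Any goal achieved in the same episode.
        return achieved[rng.randint(0, last)]
    if strategy == "final":
        # The goal achieved at the end of the same episode.
        return achieved[last]
    if strategy == "random":
        # Any goal achieved in any stored episode.
        other = episodes[rng.randrange(len(episodes))]
        return other[rng.randrange(len(other))]
    raise ValueError("unknown strategy: {}".format(strategy))
```

After relabeling, the reward is recomputed against the substitute goal, so that episodes which missed the original goal still provide a learning signal.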

Prerequisites

Requires Python 3 (>=3.5), TensorFlow (>=1.4, <=1.14), and the system packages CMake, OpenMPI, and zlib. These can be installed as follows:

Ubuntu:

sudo apt-get update && sudo apt-get install cmake libopenmpi-dev python3-dev zlib1g-dev

Mac OS X:

With Homebrew installed, run the following:

brew install cmake openmpi
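
To confirm that an existing environment falls inside the stated version ranges, a small check like the one below may help. It is only a sketch: the helper names are hypothetical, and it assumes that "<=1.14" is meant to include every 1.14.x patch release.

```python
import sys


def parse_version(text):
    """Return (major, minor) from a version string such as '1.14.0'."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def tensorflow_supported(version_text):
    # Assumption: any 1.14.x patch release counts as "<=1.14".
    return (1, 4) <= parse_version(version_text) <= (1, 14)


def python_supported(version_info=sys.version_info):
    # Only the major and minor numbers matter for the ">=3.5" requirement.
    return tuple(version_info[:2]) >= (3, 5)
```

With TensorFlow installed, the check can be run as `tensorflow_supported(tf.__version__)` after `import tensorflow as tf`.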

Installation

git clone https://github.com/YangRui2015/Modular_HER.git
cd Modular_HER
pip install -e .

Usage

Training DDPG and saving logs and models.

python -m mher.run --env=FetchReach-v1 --num_epoch 30 --num_env 1 --sampler random --play_episodes 5 --log_path=~/logs/fetchreach/ --save_path=~/logs/models/fetchreach_ddpg/

Training HER + DDPG with different samplers ('her_future', 'her_random', 'her_last', 'her_episode' are supported).

python -m mher.run --env=FetchReach-v1 --num_epoch 30 --num_env 1 --sampler her_future --play_episodes 5 --log_path=~/logs/fetchreach/ --save_path=~/logs/models/fetchreach_herfuture/
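
To compare the four HER samplers, the same command can be generated once per sampler. The sketch below only builds and prints the argument lists, using the flags from the commands above. The helper name `build_command` and the one-folder-per-sampler save path are assumptions, not part of the package.

```python
import os
import sys

HER_SAMPLERS = ["her_future", "her_random", "her_last", "her_episode"]


def build_command(sampler, env="FetchReach-v1", num_epoch=30):
    """Return the argv list for one training run with the given sampler."""
    # Hypothetical directory layout: one model folder per sampler.
    log_path = os.path.expanduser("~/logs/fetchreach/")
    save_path = os.path.expanduser("~/logs/models/fetchreach_{}/".format(sampler))
    return [
        sys.executable, "-m", "mher.run",
        "--env={}".format(env),
        "--num_epoch", str(num_epoch),
        "--num_env", "1",
        "--sampler", sampler,
        "--play_episodes", "5",
        "--log_path={}".format(log_path),
        "--save_path={}".format(save_path),
    ]


if __name__ == "__main__":
    for name in HER_SAMPLERS:
        # Print only; pass the list to subprocess.run(cmd, check=True) to launch.
        print(" ".join(build_command(name)))
```

Building an argument list rather than a shell string avoids quoting problems, and `os.path.expanduser` resolves `~` because no shell is involved to expand it.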

Training SAC + HER.

python -m mher.run  --env=FetchReach-v1 --num_epoch 50  --algo sac --sac_alpha 0.05 --sampler her_episode

All supported sampler flags:

Group	Samplers
Random sampler	random
HER	her_future, her_episode, her_last, her_random
Nstep	nstep, nstep_her_future, nstep_her_episode, nstep_her_last, nstep_her_random
Priority	priority, priority_her_future, priority_her_episode, priority_her_random, priority_her_last
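
The sampler names in the table follow a regular pattern: an optional group prefix (`nstep_` or `priority_`) followed by `her_` and the goal strategy. The sketch below composes a `--sampler` value from that pattern, which can help catch misspelled names before launching a run. The pattern is inferred from the table only, and `sampler_flag` is a hypothetical helper; how the package itself parses these names is not shown here.

```python
HER_STRATEGIES = ("future", "episode", "last", "random")
GROUP_PREFIXES = {"her": "", "nstep": "nstep_", "priority": "priority_"}


def sampler_flag(group, strategy=None):
    """Compose a --sampler value from a group and an optional HER strategy."""
    if group == "random":
        if strategy is not None:
            raise ValueError("the random sampler takes no strategy")
        return "random"
    if group not in GROUP_PREFIXES:
        raise ValueError("unknown group: {}".format(group))
    if strategy is None:
        if group == "her":
            raise ValueError("the HER group needs a strategy")
        # Plain "nstep" or "priority" without hindsight relabeling.
        return group
    if strategy not in HER_STRATEGIES:
        raise ValueError("unknown strategy: {}".format(strategy))
    return GROUP_PREFIXES[group] + "her_" + strategy
```

For example, `sampler_flag("priority", "last")` yields `priority_her_last`, and `sampler_flag("nstep")` yields `nstep`.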

Results

We use a set of test parameters in DEFAULT_ENV_PARAMS for performance comparison in the FetchReach-v1 environment.

Performance of HER with different goal sampling methods (future, random, episode, last).

Performance of Nstep HER and Nstep DDPG.

Performance of SHER (not yet good enough in the FetchReach environment; I will test more environments and report the results).

Update

9.27 V0.0: update readme;
10.3 V0.5: heavily revised the code framework, added support for DDPG and HER (future, last, final, random);
10.4 V0.6: update code framework, add rollouts and samplers packages;
10.6 add nstep sampler and nstep her sampler;
10.7 fix a bug in the nstep HER sampler;
10.16 add prioritized experience replay and Cut HER;
10.31 V1.0: add SHER support;
