Andrei Marin, Traian Rebedea, Ionel Hosu
Politehnica University of Bucharest
Games on the Atari 2600 platform have served as a benchmark for reinforcement learning algorithms in recent years. While deep reinforcement learning approaches have made progress on most games, the majority of these algorithms still struggle with a subset known as hard exploration games. We introduce two new developments for the Random Network Distillation (RND) architecture: we apply self-attention and an ego-motion mechanism to RND, and we evaluate both on three hard exploration tasks from the Atari platform. We find that the proposed ego network model improves on the RND baseline on these tasks.
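For readers new to RND, the core idea is that a predictor network is trained to match the output of a fixed, randomly initialized target network, and the prediction error on an observation serves as an intrinsic (exploration) reward. The sketch below illustrates this with tiny linear maps in plain Python. It is a toy under stated assumptions, not the code in this repository: the names `make_linear`, `forward`, `intrinsic_reward`, and `train_predictor` are hypothetical, and the real agent uses convolutional networks on Atari frames rather than linear maps on 4-dimensional vectors.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix (out_dim rows, in_dim columns) as nested lists."""
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def forward(weights, obs):
    """Apply the linear map to one observation vector."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


def intrinsic_reward(target_w, predictor_w, obs):
    """Mean squared error between predictor and frozen target features."""
    t = forward(target_w, obs)
    p = forward(predictor_w, obs)
    return sum((pi - ti) ** 2 for pi, ti in zip(p, t)) / len(t)


def train_predictor(target_w, predictor_w, obs, lr=0.1):
    """One gradient step on the predictor only; the target is never updated."""
    t = forward(target_w, obs)
    p = forward(predictor_w, obs)
    n = len(t)
    for i, row in enumerate(predictor_w):
        grad_scale = 2.0 * (p[i] - t[i]) / n  # d(MSE)/d(p[i])
        for j in range(len(row)):
            row[j] -= lr * grad_scale * obs[j]


rng = random.Random(0)
target_w = make_linear(4, 3, rng)     # fixed random target (assumed sizes)
predictor_w = make_linear(4, 3, rng)  # trainable predictor

familiar = [1.0, 0.0, 0.5, -0.5]  # observation the agent visits repeatedly
novel = [0.0, 1.0, 0.0, 0.0]      # observation the agent has never visited

before = intrinsic_reward(target_w, predictor_w, familiar)
for _ in range(200):
    train_predictor(target_w, predictor_w, familiar)
after = intrinsic_reward(target_w, predictor_w, familiar)
novel_reward = intrinsic_reward(target_w, predictor_w, novel)

print("familiar reward before training:", before)
print("familiar reward after training: ", after)
print("novel reward after training:    ", novel_reward)
```

After training on the familiar observation, its reward shrinks toward zero while the reward for the unvisited observation stays high, which is what pushes an agent toward unexplored states.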
First, create the conda environment:
conda create --name <env_name> --file conda_requirements.txt
Then install the dependencies that are not available through conda:
pip install -r pip_requirements.txt
To train an Ego RND agent on Montezuma's Revenge, run the following command:
python run_atari.py --save_model
This work is based on OpenAI's "Exploration by Random Network Distillation" by Yuri Burda, Harri Edwards, Amos Storkey, and Oleg Klimov.