This repository contains several notebooks with my attempts at the (programming) exercises from the first few chapters of the book Reinforcement Learning: An Introduction, second edition, by Richard S. Sutton and Andrew G. Barto (2018). The plan is to work through the book up to and including at least Chapter 6 (Temporal-Difference Learning), reproducing all the figures and completing all the exercises. The environments are implemented in Python using the OpenAI Gym toolkit.
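To give a flavour of what such an environment and exercise can look like, here is a minimal sketch of the five-state random walk from Chapter 6 of the book, evaluated with tabular TD(0). It is not code from this repository: the names `RandomWalk` and `td0` are hypothetical, and the class only imitates the classic Gym `reset`/`step` convention (with `step` returning observation, reward, done flag, and info dict) using the standard library, so it runs without Gym installed.

```python
import random


class RandomWalk:
    """Five-state random walk with a Gym-style reset/step interface (illustrative)."""

    n_states = 5  # non-terminal states 0..4; episodes start in the middle state

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.state = None

    def reset(self):
        self.state = self.n_states // 2
        return self.state

    def step(self, action=None):
        # The action is ignored: the walk moves left or right with equal probability.
        self.state += self.rng.choice((-1, 1))
        done = self.state < 0 or self.state >= self.n_states
        # Reward 1 only when the walk terminates on the right-hand side.
        reward = 1.0 if self.state >= self.n_states else 0.0
        return self.state, reward, done, {}


def td0(env, episodes=100, alpha=0.1, gamma=1.0):
    """Tabular TD(0) prediction for the fixed (random) policy of the walk."""
    V = [0.5] * env.n_states  # initial value estimates for non-terminal states
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            s_next, r, done, _ = env.step()
            # Terminal states have value 0, so the bootstrap term is dropped there.
            target = r + (0.0 if done else gamma * V[s_next])
            V[s] += alpha * (target - V[s])
            s = s_next
    return V


values = td0(RandomWalk(seed=0), episodes=1000, alpha=0.05)
print([round(v, 2) for v in values])
```

The true state values for this walk are 1/6, 2/6, ..., 5/6, so the printed estimates should increase from left to right and land near those fractions, with some noise because the step size is constant.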
- Python version 3.6 or higher
git clone https://github.com/rhalbersma/doctrina.git
cd doctrina
python3 -m venv .env
source .env/bin/activate
pip install --upgrade pip
pip install -e .
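After running the steps above, a quick sanity check of the interpreter can be useful. The snippet below is a small sketch, not part of this repository; `check_environment` is a hypothetical helper that uses only the standard library to confirm the Python version requirement and whether a virtual environment is active.

```python
import sys


def check_environment(min_version=(3, 6)):
    """Return (version_ok, in_venv) for the running interpreter."""
    version_ok = sys.version_info >= min_version
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix points at the base Python installation.
    in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    return version_ok, in_venv


version_ok, in_venv = check_environment()
print(f"Python >= 3.6: {version_ok}, virtual environment active: {in_venv}")
```

If the second value is reported as False, the `source .env/bin/activate` step was probably skipped in the current shell.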
Copyright Rein Halbersma 2020-2021.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)