Slides - here
- [main] David Silver lecture on exploration and exploitation - video
- Alternative lecture by J. Schulman - video
- Alternative lecture by N. de Freitas (with Bayesian optimization) - video
- Our lectures (Russian)
- Gittins Index - a less heuristic approach to bandit exploration - article
- "Deep" version: variational information maximizing exploration - video
- Same topics in Russian - video
- Lecture covering intrinsically motivated reinforcement learning - video
In this seminar, you'll be solving basic and contextual bandits with uncertainty-based exploration such as Bayesian UCB and Thompson Sampling.
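To give a feel for the two exploration strategies before opening the notebook, here is a minimal sketch on a Bernoulli bandit with Beta posteriors. It is not the notebook's code: the function names (`thompson_choice`, `bayesian_ucb_choice`, `run_bandit`), the Beta(1, 1) prior, and the `scale` parameter are all assumptions made for illustration. The Bayesian UCB variant here scores arms by posterior mean plus a multiple of the posterior standard deviation, which is a common simplification of the quantile-based rule.

```python
import math
import random


def thompson_choice(alpha, beta, rng):
    """Thompson Sampling: draw one sample per arm from its Beta posterior
    and pick the arm whose sample is largest."""
    samples = [rng.betavariate(a, b) for a, b in zip(alpha, beta)]
    return max(range(len(samples)), key=samples.__getitem__)


def bayesian_ucb_choice(alpha, beta, scale=2.0):
    """Simplified Bayesian UCB (assumption: mean + scale * std instead of
    an exact posterior quantile)."""
    scores = []
    for a, b in zip(alpha, beta):
        mean = a / (a + b)
        std = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
        scores.append(mean + scale * std)
    return max(range(len(scores)), key=scores.__getitem__)


def run_bandit(true_probs, n_steps, policy, seed=0):
    """Play a Bernoulli bandit and return how often each arm was pulled.

    true_probs: hypothetical success probability of each arm.
    policy: "thompson" or "bayesian_ucb".
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    alpha = [1.0] * n_arms  # Beta(1, 1) prior: 1 + number of successes
    beta = [1.0] * n_arms   # 1 + number of failures
    pulls = [0] * n_arms
    for _ in range(n_steps):
        if policy == "thompson":
            arm = thompson_choice(alpha, beta, rng)
        else:
            arm = bayesian_ucb_choice(alpha, beta)
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate update of the chosen arm's Beta posterior.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    probs = [0.1, 0.5, 0.9]  # made-up arm payoffs for the demo
    print("thompson:    ", run_bandit(probs, 2000, "thompson"))
    print("bayesian_ucb:", run_bandit(probs, 2000, "bayesian_ucb"))
```

Both policies keep the same posterior statistics and differ only in how they turn uncertainty into a choice: Thompson Sampling randomizes by sampling, while the UCB variant is deterministic and optimistic.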
You will also need Bayesian neural networks, which in this seminar require Theano and Lasagne:
# either
conda install Theano
# or
pip install --upgrade https://github.com/Theano/Theano/archive/master.zip
# and then lasagne
pip install --upgrade https://github.com/Lasagne/Lasagne/archive/master.zip
Everything else is in the notebook :)