forked from MushroomRL/mushroom-rl
TODO.txt
Algorithms:
* Conservative Q-Learning
* Policy Search:
  - Natural gradient
  - NES
  - PAPI
Policy:
* Add Boltzmann from logits for traditional policy gradient methods
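A rough sketch of what "Boltzmann from logits" could mean, assuming the policy receives one unnormalized score (logit) per discrete action and an inverse temperature beta. The names boltzmann_from_logits and sample_action are hypothetical and are not part of the existing MushroomRL API.

```python
import math
import random


def boltzmann_from_logits(logits, beta=1.0):
    """Return action probabilities proportional to exp(beta * logit).

    Hypothetical helper for illustration only.
    """
    scaled = [beta * x for x in logits]
    # Subtract the maximum before exponentiating: the probabilities are
    # unchanged, but large logits no longer overflow.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, beta=1.0, rng=random):
    """Draw one action index according to the Boltzmann probabilities."""
    probs = boltzmann_from_logits(logits, beta)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Working directly on logits avoids the intermediate Q-value interpretation, which is why it may suit policy gradient methods where the network outputs action preferences.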
Approximator:
* Support for LSTM
* Generalize LazyFrame to LazyState
* Add neural network generator
For Mushroom 2.0:
* Record method in environment and record option in the core
* Simplify Regressor interface: drop GenericRegressor, remove facade pattern
* Vectorize basis functions and simplify interface, simplify facade pattern
* Remove custom save for plotting, use Serializable
* Support multi-objective RL
* Support model-based RL
* Improve replay memory to allow storing arbitrary information in the replay buffer
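
One possible shape for a replay buffer that stores arbitrary information, sketched under the assumption that every sample is a dict with the same keys. DictReplayMemory is a hypothetical name for illustration and does not reflect the current replay memory implementation.

```python
import random
from collections import deque


class DictReplayMemory:
    """Hypothetical buffer: each sample is a dict of named fields.

    Extra information (for example a log-probability) is just another key,
    so adding it needs no change to the buffer interface.
    """

    def __init__(self, max_size):
        # A bounded deque drops the oldest sample once max_size is reached.
        self._data = deque(maxlen=max_size)

    def add(self, **fields):
        self._data.append(fields)

    def sample(self, n, rng=random):
        # Sample n distinct stored transitions uniformly at random.
        batch = rng.sample(list(self._data), n)
        # Assumes all samples share the keys of the first one.
        return {k: [s[k] for s in batch] for k in batch[0].keys()}

    def __len__(self):
        return len(self._data)
```

Usage would look like memory.add(state=s, action=a, reward=r, log_prob=lp) followed by memory.sample(32), which returns one list per field.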