[rllib] Should remove vf_clip param from PPO #8908
Labels: enhancement, P2, rllib
Describe your feature request
According to https://arxiv.org/pdf/2006.05990.pdf, we should remove value function (VF) clipping from PPO, since it doesn't help. VF clipping has historically been a common cause of user problems in RLlib anyway, so this might be a nice double win.
Note: it seems the paper only evaluates relatively small epoch sizes (at most 4096 steps), whereas many of our examples are tuned for much higher batch sizes (up to 320k steps), which tend to reach higher final rewards. We might want to re-benchmark these to make sure VF clipping still doesn't matter.
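For reference, here is a minimal sketch of the value loss clipping commonly used in PPO implementations. It is not RLlib's actual code: the function names `clipped_vf_loss` and `unclipped_vf_loss` are hypothetical, and `vf_clip_param` is assumed to bound how far the new value prediction may move from the old one.

```python
def unclipped_vf_loss(value, target):
    """Plain squared-error value loss for one sample."""
    return (value - target) ** 2


def clipped_vf_loss(value, old_value, target, vf_clip_param):
    """Per-sample value loss with PPO-style clipping (hypothetical helper).

    Assumes the common formulation: the new prediction is clipped to stay
    within vf_clip_param of the old prediction, and the larger of the
    clipped and unclipped squared errors is used.
    """
    unclipped = (value - target) ** 2
    # Limit how far the new prediction may move from the old one.
    delta = max(-vf_clip_param, min(vf_clip_param, value - old_value))
    clipped = (old_value + delta - target) ** 2
    # Pessimistic choice: keep the larger of the two losses.
    return max(unclipped, clipped)


# Returns on the order of 100 with a clip range of 10: the clipped term
# is selected, and it no longer depends on the new prediction.
print(clipped_vf_loss(50.0, 0.0, 100.0, 10.0))
print(unclipped_vf_loss(50.0, 100.0))
```

Under this formulation, once the prediction has moved more than `vf_clip_param` away from the old value, the selected loss term is constant in `value`, so it contributes no gradient. That is one plausible way the setting can cause trouble when returns are large relative to the clip range.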
cc @eugenevinitsky @sven1977