Improve stability of default hyperparameters #462

Closed
vwxyzjn opened this issue Jun 23, 2023 · 2 comments

@vwxyzjn
Contributor

vwxyzjn commented Jun 23, 2023

Oops, I accidentally pushed directly to main at b56e8b3. This issue explains the changes.

As discovered in #454 (comment), we were probably doing too many minibatch updates. So this commit checks what happens with no minibatch updates, similar to how OpenAI did it. In a benchmark, I found increased stability: only 1 out of 10 random seeds crashed.
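
To make the "too many minibatch updates" point concrete, here is a rough sketch of how the number of optimizer steps per batch of rollouts scales with the minibatch size. The function name, parameter names, and numbers are all hypothetical and are not the values from the commit.

```python
import math


def gradient_updates_per_batch(batch_size, mini_batch_size, ppo_epochs):
    """Count optimizer steps taken on one batch of rollouts.

    Hypothetical helper for illustration only: each PPO epoch walks over
    the batch once, taking one step per minibatch.
    """
    if not 0 < mini_batch_size <= batch_size:
        raise ValueError("mini_batch_size must be in (0, batch_size]")
    return ppo_epochs * math.ceil(batch_size / mini_batch_size)


# Illustrative numbers only (assumptions, not the repo defaults).
with_minibatches = gradient_updates_per_batch(256, 1, 4)   # 1024 steps
full_batch_only = gradient_updates_per_batch(256, 256, 4)  # 4 steps
```

With the minibatch size equal to the batch size, the policy takes only one step per epoch on each batch, so it has far fewer chances to drift away from the policy that generated the rollouts.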

Furthermore, I used a smaller batch size and a target KL of 6 (the default setting in OpenAI's repo).
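
For context, a minimal sketch of how a target KL is typically used, assuming it is the target of an adaptive KL penalty coefficient in the style of Ziegler et al. (2019). The class name, the initial coefficient, and the horizon below are illustrative assumptions; only the target of 6 comes from this issue.

```python
class AdaptiveKLController:
    """Proportional controller for the KL penalty coefficient.

    Sketch under the assumption stated above; not necessarily the exact
    code used in this repo.
    """

    def __init__(self, init_kl_coef, target, horizon):
        self.value = init_kl_coef  # current penalty coefficient
        self.target = target       # desired KL between policy and reference
        self.horizon = horizon     # larger horizon means slower adaptation

    def update(self, current_kl, n_steps):
        # Relative error to the target, clipped so one update cannot
        # change the coefficient too aggressively.
        error = max(-0.2, min(0.2, current_kl / self.target - 1.0))
        self.value *= 1.0 + error * n_steps / self.horizon
        return self.value


# init_kl_coef and horizon are assumed values for illustration.
controller = AdaptiveKLController(init_kl_coef=0.2, target=6.0, horizon=10000)
coef_after = controller.update(current_kl=12.0, n_steps=256)
```

When the measured KL is above the target, the coefficient grows and the penalty pulls the policy back toward the reference model; when it is below the target, the coefficient shrinks.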

image

@natolambert
Contributor

This makes sense as a reason the weird KLs could happen too.

@vwxyzjn
Contributor Author

vwxyzjn commented Jun 26, 2023

Thanks @natolambert!
