Oops, I accidentally pushed directly to main at b56e8b3. This issue explains the changes.
As discovered in #454 (comment), we were probably doing too many minibatch updates. This commit checks what happens when we do no minibatch updates, similar to how OpenAI did it. In a benchmark, stability improved: only 1 out of 10 random seeds crashed.
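To make the update count concrete, here is a minimal sketch of how splitting a rollout batch into minibatches multiplies the number of gradient updates. The function name `minibatch_indices` and the sizes used are hypothetical and are not taken from the commit. It also assumes "no minibatch updates" means the batch is not split, so each epoch makes one update over the full batch.

```python
import random


def minibatch_indices(batch_size, num_minibatches, num_epochs, seed=0):
    """Yield one list of sample indices per gradient update (hypothetical helper)."""
    if batch_size % num_minibatches != 0:
        raise ValueError("batch_size must be divisible by num_minibatches")
    rng = random.Random(seed)
    minibatch_size = batch_size // num_minibatches
    for _ in range(num_epochs):
        # Reshuffle the whole batch at the start of every epoch.
        order = list(range(batch_size))
        rng.shuffle(order)
        for start in range(0, batch_size, minibatch_size):
            yield order[start:start + minibatch_size]


# Illustrative sizes only: 8 minibatches x 4 epochs vs. the full batch x 4 epochs.
many = list(minibatch_indices(batch_size=64, num_minibatches=8, num_epochs=4))
single = list(minibatch_indices(batch_size=64, num_minibatches=1, num_epochs=4))
print(len(many), len(single))  # 32 4
```

With `num_minibatches=1`, every update sees all 64 samples, so the policy takes far fewer steps away from the one that collected the data.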
Furthermore, I used a smaller batch size and a target KL of 6 (the default setting in OAI's repo).
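If the target KL here feeds an adaptive KL penalty coefficient, as in proportional controllers commonly used for RLHF-style PPO, the update might look like the sketch below. The class name, the initial coefficient of 0.2, the horizon of 10000, and the clipping range are illustrative assumptions. Only the target of 6 comes from this issue.

```python
class AdaptiveKLController:
    """Proportional controller for a KL penalty coefficient (illustrative sketch)."""

    def __init__(self, init_coef, target, horizon):
        self.value = init_coef  # current KL penalty coefficient
        self.target = target    # desired KL between the policy and the reference
        self.horizon = horizon  # larger horizon means slower adaptation

    def update(self, current_kl, n_steps):
        # Relative error, clipped so that one noisy batch cannot swing the coefficient.
        error = max(-0.2, min(0.2, current_kl / self.target - 1.0))
        self.value *= 1.0 + error * n_steps / self.horizon
        return self.value


# Assumed values except target=6.0, which is the setting mentioned above.
controller = AdaptiveKLController(init_coef=0.2, target=6.0, horizon=10000)
raised = controller.update(current_kl=12.0, n_steps=100)   # KL too high: penalty grows
lowered = controller.update(current_kl=3.0, n_steps=100)   # KL too low: penalty shrinks
```

The coefficient rises when the measured KL exceeds the target and falls when it is below, which nudges the policy back toward the target divergence over time.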