chore: added initial version of PPOTrainer support #3549
Co-authored-by: David Berenstein <[email protected]>
It's in trainer_args instead
and chosen_rejected_func to formatting_func
Test failures also exist in |
Codecov Report
Patch coverage:
Additional details and impacted files
@@              Coverage Diff               @@
##   feat/integration_trl    #3549    +/-   ##
========================================================
+ Coverage    89.96%    90.00%    +0.04%
========================================================
  Files          256       256
  Lines        13777     13865       +88
========================================================
+ Hits         12394     12479       +85
- Misses        1383      1386        +3
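As a quick sanity check on the report above, the coverage percentages can be recomputed from the Hits and Lines rows. This is a minimal sketch, assuming coverage is simply hits divided by lines; `coverage_percent` is a hypothetical helper written for illustration, not part of Codecov or this project.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Return line coverage as a percentage (assumed to be hits / lines)."""
    return 100.0 * hits / lines


# Figures copied from the Hits and Lines rows of the diff above.
base = coverage_percent(12394, 13777)  # feat/integration_trl
head = coverage_percent(12479, 13865)  # this PR (#3549)

# Print base, head, and the signed change, each rounded to two decimals.
print(f"{base:.2f}% -> {head:.2f}% ({head - base:+.2f}%)")
```

Under that assumption the rounded values agree with the Coverage row of the report.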
Merged before documentation is complete, to allow @davidberenstein1957 to extend PPO further in |
Description
I added support for the PPOTrainer.
Closes #3522
Type of change
How Has This Been Tested
tests/integration/client/feedback/training/test_trl.py
Checklist
CHANGELOG.md file (See https://keepachangelog.com/)