
update create_model_card to properly save peft details when using Trainer with PEFT #27754

Merged: 3 commits merged into main from smangrul/fix-peft-model-card on Dec 7, 2023

Conversation

pacman100 (Contributor)

What does this PR do?

  1. When using Trainer with PEFT, model.save_pretrained in PEFT adds PEFT-specific details to the existing model card, or creates a new model card containing them. However, trainer.create_model_card is called after the model is saved and overwrites the entire file, thereby nullifying the PEFT-related additions such as the library name, the quantization used, and the PEFT version.
  2. This PR fixes the above issue by adding the PEFT details back to the model card. This helps with better organization and understanding of PEFT usage on the Hub (see the sketch after this list).
  3. Example of a repo produced with this PR: https://huggingface.co/smangrul/mistral_lora_clm_with_added_tokens
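
To make the mechanism concrete, here is a minimal sketch of the idea behind the fix, not the actual diff in this PR: after Trainer.create_model_card has overwritten README.md, the PEFT-specific metadata is added back. The helper name, its parameters, and the use of huggingface_hub.ModelCard are assumptions made for illustration.

```python
# Illustrative sketch only: the helper name and its arguments are assumptions,
# not the exact code merged in this PR.
import os

from huggingface_hub import ModelCard


def readd_peft_details(output_dir: str, base_model_name: str, peft_version: str) -> None:
    """Re-add the PEFT metadata that an overwritten README.md lost."""
    card_path = os.path.join(output_dir, "README.md")
    card = ModelCard.load(card_path)

    # Mark the repo as a PEFT artifact so the Hub can categorize it correctly.
    card.data.library_name = "peft"
    card.data.base_model = base_model_name

    # Restore the framework-version note that the overwrite removed.
    card.text += f"\n### Framework versions\n\n- PEFT {peft_version}\n"
    card.save(card_path)
```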

@muellerzr (Contributor) left a comment


Solution makes sense to me! Thanks!

@BenjaminBossan (Member) left a comment


I'm not very knowledgeable about Trainer, so I can't comment on the whole integration.

From my understanding, this seems a little bit hacky to me, because we first create the model card with PEFT, then completely overwrite it with Trainer, then re-add PEFT-related content. It feels like the proper solution would be for the Trainer to update the model card if it already exists. But I understand that this would be more work, so I'd be okay with this more hacky approach.

src/transformers/trainer.py: review comments (outdated, resolved)
@pacman100 (Contributor, Author)

> From my understanding, this seems a little bit hacky to me, because we first create the model card with PEFT, then completely overwrite it with Trainer, then re-add PEFT-related content. It feels like the proper solution would be for the Trainer to update the model card if it already exists. But I understand that this would be more work, so I'd be okay with this more hacky approach.

Yes, the Trainer should ideally update the README if it already exists, but it rewrites everything from the TrainerState, so appending or updating would be trickier. Open to ideas on making this cleaner.
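
For reference, here is a rough sketch of the "update instead of overwrite" alternative discussed above. This is not what Trainer or this PR does; merge_model_card and trainer_card_content are hypothetical names, and letting the existing (PEFT-written) metadata win is just one possible merge strategy.

```python
# Hypothetical illustration of merging into an existing model card instead of
# overwriting it; NOT the approach taken in Trainer or in this PR.
import os

from huggingface_hub import ModelCard, ModelCardData


def merge_model_card(output_dir: str, trainer_card_content: str) -> None:
    card_path = os.path.join(output_dir, "README.md")
    trainer_card = ModelCard(trainer_card_content)

    if not os.path.exists(card_path):
        trainer_card.save(card_path)
        return

    existing = ModelCard.load(card_path)
    # Let the PEFT-written metadata (library_name, base_model, ...) win, and only
    # fill in the keys that the Trainer would add on top of them.
    merged = {**trainer_card.data.to_dict(), **existing.data.to_dict()}
    existing.data = ModelCardData(**merged)
    # Append the Trainer-generated sections (hyperparameters, results, ...) rather
    # than discarding the PEFT-written body.
    existing.text = existing.text.rstrip() + "\n\n" + trainer_card.text.lstrip()
    existing.save(card_path)
```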

@BenjaminBossan (Member) left a comment


Thanks for the update.

> Yes, the Trainer should ideally update the README if it already exists, but it rewrites everything from the TrainerState, so appending or updating would be trickier. Open to ideas on making this cleaner.

I'm not very knowledgeable about Trainer, so I don't have any suggestion for a better solution.

@muellerzr muellerzr requested review from ArthurZucker and removed request for amyeroberts December 4, 2023 18:49
@muellerzr (Contributor)

I can't really see a better solution currently either, so what we have here works for now.

@pacman100 pacman100 merged commit 5324bf9 into main Dec 7, 2023
21 checks passed
@pacman100 pacman100 deleted the smangrul/fix-peft-model-card branch December 7, 2023 12:06
nevikw39 pushed a commit to NTHU-ML-2023-team19/transformers that referenced this pull request Dec 7, 2023
update `create_model_card` to properly save peft details when using Trainer with PEFT (huggingface#27754)

* update `create_model_card` to properly save peft details when using Trainer with PEFT

* nit

* Apply suggestions from code review

Co-authored-by: Benjamin Bossan <[email protected]>

---------

Co-authored-by: Benjamin Bossan <[email protected]>