
Add warning and info message for beta and gamma parameters #33192

Open · wants to merge 14 commits into main

Conversation

zly-idleness

@zly-idleness zly-idleness commented Aug 29, 2024

What does this PR do?

This PR adds a warning message to notify users that the gamma and beta parameters are renamed internally, both at initialisation and when loading weights.
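For context, the renaming this warning surfaces can be sketched as follows. This is an illustrative stand-in for the library's internal `_fix_key` behaviour (transformers maps checkpoint keys containing "gamma" to "weight" and "beta" to "bias"); the function name and sample keys here are made up:

```python
def fix_key(key: str) -> str:
    """Illustrative sketch of the internal gamma/beta renaming."""
    if "beta" in key:
        return key.replace("beta", "bias")
    if "gamma" in key:
        return key.replace("gamma", "weight")
    return key

loaded_keys = [
    "tts.dvae.decoder.decoder_block.0.gamma",  # hypothetical checkpoint keys
    "encoder.layer_norm.beta",
    "lm_head.weight",
]

# Keys whose names change are the ones the new warning would report.
renamed = {k: fix_key(k) for k in loaded_keys if fix_key(k) != k}
```

Before this PR, the renaming happened silently, and the renamed parameters were misleadingly reported as "newly initialized", as in the log above.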

before:

(vqa-audio) (base) jeeves@notebook-5064-cadence:~/ChatTTS/rhapsodyaudio$ python tmp_save_pretrain.py 
bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████| 8/8 [00:16<00:00,  2.02s/it]
Some weights of Qwen2AudioForConditionalChatTTS were not initialized from the model checkpoint at /mnt/data/user/tc_agi/luoyuanZ/ChatTTS_default and are newly initialized: 

['tts.dvae.decoder.decoder_block.0.gamma', 'tts.dvae.decoder.decoder_block.1.gamma', 'tts.dvae.decoder.decoder_block.10.gamma', 'tts.dvae.decoder.decoder_block.11.gamma', 'tts.dvae.decoder.decoder_block.2.gamma', 'tts.dvae.decoder.decoder_block.3.gamma', 'tts.dvae.decoder.decoder_block.4.gamma', 'tts.dvae.decoder.decoder_block.5.gamma', 'tts.dvae.decoder.decoder_block.6.gamma', 'tts.dvae.decoder.decoder_block.7.gamma', 'tts.dvae.decoder.decoder_block.8.gamma', 'tts.dvae.decoder.decoder_block.9.gamma', 'tts.dvae.encoder.decoder_block.0.gamma', 'tts.dvae.encoder.decoder_block.1.gamma', 'tts.dvae.encoder.decoder_block.10.gamma', 'tts.dvae.encoder.decoder_block.11.gamma', 'tts.dvae.encoder.decoder_block.2.gamma', 'tts.dvae.encoder.decoder_block.3.gamma', 'tts.dvae.encoder.decoder_block.4.gamma', 'tts.dvae.encoder.decoder_block.5.gamma', 'tts.dvae.encoder.decoder_block.6.gamma', 'tts.dvae.encoder.decoder_block.7.gamma', 'tts.dvae.encoder.decoder_block.8.gamma', 'tts.dvae.encoder.decoder_block.9.gamma']

after:

(vqa-audio) (base) jeeves@notebook-5064-cadence:~/ChatTTS/rhapsodyaudio$ python tmp_save_pretrain.py 
bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
This model <class 'muffin.model.infer_qwen2tts.Qwen2AudioForConditionalChatTTS'>contains parameters that have been renamed internally (a few are listed below but more are present in the model):

Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████| 8/8 [00:14<00:00,  1.81s/it]

Fixes #29554 and #33190.

Before submitting

  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@zly-idleness
Author

cc @amyeroberts

Collaborator

@amyeroberts left a comment

Thanks for adding this @zly-idleness!

At the moment, this involves iterating over the loaded keys twice - once on L4116 and again on L4146. It also involves a redefinition of original_loaded_keys. We should rework the code to remove the duplicated logic.

@zly-idleness
Author

Thanks for adding this @zly-idleness!

At the moment, this involves iterating over the loaded keys twice - once on L4116 and again on L4146. It also involves a redefinition of original_loaded_keys. We should rework the code to remove the duplicated logic.

Thank you for pointing that out. I'll refactor the code to eliminate redundancy ☺️

old_keys.append(key)
new_keys.append(new_key)
renamed_keys[key] = new_key
loaded_keys[i] = new_key
Collaborator

This doesn't work - it will modify both loaded_keys and original_loaded_keys:

In [1]: li_0 = list(range(10))

In [2]: li_1 = li_0

In [3]: for i in range(10):
   ...:     if i % 2:
   ...:         li_0[i] = -1
   ...:

In [4]: li_0
Out[4]: [0, -1, 2, -1, 4, -1, 6, -1, 8, -1]

In [5]: li_1
Out[5]: [0, -1, 2, -1, 4, -1, 6, -1, 8, -1]

Author

Thank you! You are totally right, I forgot to add a copy() to ensure that original_loaded_keys remains unaltered.
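The fix described above can be sketched as follows - a minimal, self-contained illustration of the aliasing issue and the copy() fix; the key names are hypothetical:

```python
loaded_keys = ["block.0.gamma", "norm.beta", "head.weight"]

# Without .copy(), both names would point at the same list object,
# and the in-place renames below would also show up in original_loaded_keys
# (exactly the aliasing pitfall demonstrated in the IPython session above).
original_loaded_keys = loaded_keys.copy()

for i, key in enumerate(loaded_keys):
    if "gamma" in key:
        loaded_keys[i] = key.replace("gamma", "weight")
    elif "beta" in key:
        loaded_keys[i] = key.replace("beta", "bias")
```

A shallow copy is enough here because the elements are immutable strings; rebinding `loaded_keys[i]` never touches the copied list.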

Collaborator

@amyeroberts left a comment

Thanks for iterating!

At the moment, the logic is being forced around the old renaming code. There are also changes to the olmoe model in this PR, which should be removed.

Collaborator

There shouldn't be any changes to this file in the PR

Comment on lines +4149 to +4154
warning_msg += 'contains parameters that have been renamed internally ("gamma" and "beta" in parameters) (a few are listed below but more are present in the model):\n'
logger.warning(warning_msg)
for old_key, new_key in renamed_keys.items():
    warning_msg += f"* `{old_key}` -> `{new_key}`\n"
warning_msg += "If you are using a model from the Hub, consider submitting a PR to adjust these weights and help future users."
logger.info(warning_msg)
Collaborator

This message isn't consistent - at the moment all of the renamed keys will be listed. Like in the other places where this logic is added, let's just take the first renames.

return None

for i, key in enumerate(loaded_keys):
    new_key = _fix_key(key)
Collaborator

Rather than trying to reuse the existing _fix_key logic, it would be better to rework this to:

  • Not use copy
  • Only add the first renaming case for gamma and beta respectively
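A minimal sketch of what such a rework might look like, assuming the gamma -> weight / beta -> bias mapping and recording only the first rename of each kind. The function name and structure are hypothetical, not the PR's actual code:

```python
def collect_first_renames(loaded_keys):
    # Record at most one example rename each for "gamma" and "beta" -
    # that is all the warning message needs to show, so there is no
    # second pass over the keys and no copied list.
    renamed_keys = {}
    for key in loaded_keys:
        if "gamma" in key and "gamma" not in renamed_keys:
            renamed_keys["gamma"] = (key, key.replace("gamma", "weight"))
        elif "beta" in key and "beta" not in renamed_keys:
            renamed_keys["beta"] = (key, key.replace("beta", "bias"))
        if len(renamed_keys) == 2:
            break  # one example of each is enough
    return renamed_keys
```

This keeps the warning short regardless of how many gamma/beta parameters the checkpoint contains, addressing both review points at once.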


Successfully merging this pull request may close these issues.

Can't load models with a gamma or beta parameter