Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Min-SNR Gamma: correct the fix for SNR weighted loss in v-prediction … #5238

Merged
merged 2 commits into from
Oct 5, 2023

Conversation

bghira
Copy link
Contributor

@bghira bghira commented Sep 29, 2023

…by adding 1 to SNR rather than the resulting loss weights

What does this PR do?

Fixes issue as mentioned by @parlance-zz here.

Sorry for all the back and forth on this stuff, I did about 6000 steps of training with the previous fix and all was well. If this works better, awesome!

…by adding 1 to SNR rather than the resulting loss weights
Copy link
Member

@sayakpaul sayakpaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating!

@bghira
Copy link
Contributor Author

bghira commented Oct 5, 2023

Training two models from scratch to test this one, concurrently. They are using the exact same hyperparameters, with the exception of min-SNR gamma.

With SNR gamma:

image

Without SNR gamma:

image

@bghira
Copy link
Contributor Author

bghira commented Oct 5, 2023

The loss function is very similar but, it has about .1 lower on avg at each step for the gamma model vs the vanilla model. The results themselves seem more substantially-improved than .1 loss would usually indicate.

With gamma:

image

Without gamma:

image

@bghira
Copy link
Contributor Author

bghira commented Oct 5, 2023

the woman class:

image

image

@sayakpaul sayakpaul merged commit 02a8d66 into huggingface:main Oct 5, 2023
10 of 11 checks passed
@bghira bghira deleted the bugfix/snr-addition-vprediction branch October 5, 2023 18:58
@bghira
Copy link
Contributor Author

bghira commented Oct 6, 2023

just for completeness, at 1k steps on both models, the gap is really beginning to widen, where the min-SNR gamma model seriously pulls ahead in terms of representing details and coherence.

image

image

@feffy380
Copy link

feffy380 commented Nov 2, 2023

@bghira Doesn't this change reintroduce the division by zero issue when using zero terminal SNR + epsilon prediction? As is, the weight will be 0/0 = NaN at t=1000

@bghira
Copy link
Contributor Author

bghira commented Nov 3, 2023

in my opinion, you shouldn't be combining zero terminal SNR with epsilon anyway. i'm sure there's an additional fix that can be applied, but this trainer uses the simplest implementations of the logic from the research paper. using epsilon would be out of scope for that? @sayakpaul thoughts on that?

@sayakpaul
Copy link
Member

Yeah I agree.

AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
huggingface#5238)

Min-SNR Gamma: correct the fix for SNR weighted loss in v-prediction by adding 1 to SNR rather than the resulting loss weights

Co-authored-by: bghira <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants