Some notable differences with official implementation #17
Comments
Thanks for the information! I could also observe the following differences:
Working on this at
A few things I noticed when looking:
Working on this in my fork's fix/17 branch, then I will make a PR.
I changed everything to match, but for some reason the official generator still has 1440 more parameters than this one.
The weight norm for I just added a commit to fix that in
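A rough way to check whether missing weight normalization on the residual shortcuts could account for a gap of this size is to count parameters by hand. The sketch below is only an illustration: the stage widths (256, 128, 64, 32) and the three residual blocks per stage are assumptions about the official generator's defaults and are not stated in this thread, and `conv1d_params` and `shortcut_params` are hypothetical helpers. It relies on the fact that weight normalization (as in PyTorch's `weight_norm` with its default `dim=0`) stores one extra magnitude value per output channel.

```python
# Sketch: extra parameters introduced by weight normalization on 1x1 shortcut convs.
# ASSUMPTIONS (not stated in the thread): stage widths and blocks per stage below.

STAGE_WIDTHS = [256, 128, 64, 32]  # assumed channel width of each upsampling stage
BLOCKS_PER_STAGE = 3               # assumed number of residual blocks per stage


def conv1d_params(in_ch, out_ch, kernel_size, weight_norm=False, bias=True):
    """Count trainable parameters of a 1-D convolution."""
    n = out_ch * in_ch * kernel_size  # weight tensor (direction tensor v under weight norm)
    if bias:
        n += out_ch                   # one bias per output channel
    if weight_norm:
        n += out_ch                   # one magnitude g per output channel
    return n


def shortcut_params(weight_norm):
    """Total parameters of all 1x1 shortcut convolutions in the residual stacks."""
    return sum(
        BLOCKS_PER_STAGE * conv1d_params(w, w, 1, weight_norm=weight_norm)
        for w in STAGE_WIDTHS
    )


extra = shortcut_params(True) - shortcut_params(False)
print(extra)  # 3 * (256 + 128 + 64 + 32) = 1440 under the assumed widths
```

Under those assumed widths the difference comes out to 1440, which matches the reported gap, but the real layer sizes should be checked against both repositories before drawing a conclusion.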
The difference in padding method looks significant.
I was doing a training run on part of VoxCeleb2 before you added the weight_norm to the shortcut, so I left it running, and the difference at 800K steps on that branch vs 1M steps at
This is from the validation data, which are separate speakers in VoxCeleb2.
Original audio:
1M steps on older commit:
800K steps on 1 commit old fix/17 branch:
I am now doing another run from the latest commit on
What are the differences between reflection padding and zero padding in MelGAN? Does it give better audio quality?
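To make the question concrete, here is a minimal pure-Python sketch of what the two padding schemes do at the edges of a 1-D signal. The helper names `zero_pad` and `reflect_pad` are hypothetical and not taken from either repository. `reflect_pad` mirrors the signal around its edge samples without repeating the edge itself, which is the usual meaning of reflection padding. It only shows the mechanical difference and does not settle whether one sounds better.

```python
def zero_pad(x, pad):
    """Pad `pad` zeros on each side of the sequence."""
    return [0.0] * pad + list(x) + [0.0] * pad


def reflect_pad(x, pad):
    """Mirror the sequence around its edge samples (the edge itself is not repeated)."""
    x = list(x)
    if pad >= len(x):
        raise ValueError("pad must be smaller than the signal length")
    left = x[pad:0:-1]          # x[pad], ..., x[1]
    right = x[-2:-pad - 2:-1]   # x[-2], ..., x[-pad-1]
    return left + x + right


signal = [1, 2, 3, 4]
print(zero_pad(signal, 2))     # [0.0, 0.0, 1, 2, 3, 4, 0.0, 0.0]
print(reflect_pad(signal, 2))  # [3, 2, 1, 2, 3, 4, 3, 2]
```

With zero padding, a convolution near the boundary sees an abrupt jump to zero whenever the signal is non-zero at the edge. Reflection padding keeps the values continuous there, which is one plausible reason it might matter for audio.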
Hi,
Just FYI, the authors use hinge losses in the official MelGAN repo, whereas the paper describes an L2 loss, so this repo is consistent with the paper. I am setting up some experiments with the hinge loss to see the differences. Another note: the default segment_length in the official repo is 8912 (vs. 16k in this repo).
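For reference, the two objectives being compared can be sketched in plain Python as follows. These are the standard textbook forms of the hinge GAN loss and the least-squares (L2) GAN loss, operating on lists of raw discriminator scores. They are not copied from either repository, and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def hinge_d_loss(real_scores, fake_scores):
    """Discriminator hinge loss: push real scores above +1 and fake scores below -1."""
    real_term = mean([max(0.0, 1.0 - s) for s in real_scores])
    fake_term = mean([max(0.0, 1.0 + s) for s in fake_scores])
    return real_term + fake_term


def hinge_g_loss(fake_scores):
    """Generator loss paired with the hinge discriminator: raise the fake scores."""
    return -mean(fake_scores)


def l2_d_loss(real_scores, fake_scores):
    """Least-squares discriminator loss: regress real scores to 1 and fake scores to 0."""
    real_term = mean([(s - 1.0) ** 2 for s in real_scores])
    fake_term = mean([s ** 2 for s in fake_scores])
    return real_term + fake_term


def l2_g_loss(fake_scores):
    """Least-squares generator loss: regress fake scores to 1."""
    return mean([(s - 1.0) ** 2 for s in fake_scores])


print(hinge_d_loss([2.0], [-2.0]))  # 0.0: scores beyond the margin incur no loss
print(l2_d_loss([2.0], [-2.0]))     # 5.0: L2 still penalises overshooting the targets
```

One practical difference visible here is that the hinge loss stops penalising the discriminator once scores clear the margin, while the L2 loss keeps pulling scores towards fixed targets.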