Some notable differences with official implementation #17

chapter544 · 2019-10-28T03:47:39Z

Hi,
Just FYI: the official MelGAN repo uses hinge losses, whereas the paper describes an L2 loss, so this repo is consistent with the paper! I am setting up some experiments with the hinge loss to see how the results differ. Also note that the default segment_length in the official repo is 8192 (vs. 16k in this repo).
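
For readers comparing the two objectives, here is a minimal sketch of a hinge loss next to a least-squares (L2) loss for a single discriminator output. The function names are hypothetical and are not taken from either repo; a multi-scale discriminator would typically sum these terms over its scales.

```python
import torch
import torch.nn.functional as F


def d_loss_hinge(d_real, d_fake):
    # Hinge: penalize real scores below +1 and fake scores above -1.
    return F.relu(1.0 - d_real).mean() + F.relu(1.0 + d_fake).mean()


def g_loss_hinge(d_fake):
    # Generator tries to raise the discriminator score on fake audio.
    return -d_fake.mean()


def d_loss_l2(d_real, d_fake):
    # Least-squares: regress real scores to 1 and fake scores to 0.
    return ((d_real - 1.0) ** 2).mean() + (d_fake ** 2).mean()


def g_loss_l2(d_fake):
    # Generator tries to make fake scores look like the "real" target 1.
    return ((d_fake - 1.0) ** 2).mean()


# Toy discriminator scores (illustrative values only).
d_real = torch.tensor([2.0, 0.5])
d_fake = torch.tensor([-2.0, 0.0])
print(d_loss_hinge(d_real, d_fake).item(), d_loss_l2(d_real, d_fake).item())
```

Note that the hinge loss stops penalizing samples once they are past the margin, while the L2 loss keeps pulling every score toward its target.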

seungwonpark · 2019-10-28T04:58:53Z

Thanks for the information!

I also noticed the following differences:

slope of LeakyReLU: 0.2 (official) / 0.01 (here)
mel_fmax: 11025.0 (official) / 8000.0 (here), discussed in "mel_fmax does not cover all frequency" (#7)
padding: reflection padding (official) / zero padding (here)
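
To make the padding and slope differences concrete, the following small sketch applies both padding modes and both LeakyReLU slopes to toy tensors. The input values and the pad width of 2 are arbitrary choices for illustration, not settings from either repo.

```python
import torch
import torch.nn as nn

# Toy signal with shape (batch, channels, time).
x = torch.tensor([[[1.0, 2.0, 3.0, 4.0]]])

# Reflection padding mirrors the signal around its edges (edge sample not repeated).
reflect = nn.ReflectionPad1d(2)(x)
# Zero padding inserts silence at both ends.
zero = nn.ConstantPad1d(2, 0.0)(x)

print(reflect.flatten().tolist())  # [3.0, 2.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0]
print(zero.flatten().tolist())     # [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 0.0]

# LeakyReLU slope: 0.2 lets through more of a negative input than
# PyTorch's default of 0.01.
neg = torch.tensor([-1.0])
leaky_02 = nn.LeakyReLU(0.2)(neg)
leaky_001 = nn.LeakyReLU(0.01)(neg)
print(leaky_02.item(), leaky_001.item())
```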

seungwonpark · 2019-11-11T09:29:58Z

Working on this at fix/17.

bob80333 · 2019-11-16T18:31:31Z

A few things I noticed when comparing with the official code:

Reflection padding is used only in the generator and the first layer of the discriminator, not in any of the other discriminator layers
The official discriminator downsamples with:
nn.AvgPool1d(4, stride=2, padding=1, count_include_pad=False)
not
nn.AvgPool1d(kernel_size=4, stride=2, padding=2)
The shortcut connection of each resblock has a convolution: Conv1d(256, 256, kernel_size=1)
The official generator has 4266050 parameters; on the master branch the generator here has 4262658 parameters, 3392 fewer (probably the shortcut convolutions)
The last convolution in the resblock has no padding and has kernel_size=1, not 3
Reflection padding is not used for the ConvTranspose1d layers
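
As a rough illustration of why the two AvgPool1d configurations listed above are not interchangeable, this sketch runs both on a constant signal of length 8 (an arbitrary toy input). They differ in output length and in how the edges are averaged.

```python
import torch
import torch.nn as nn

# Constant toy input with shape (batch, channels, time).
x = torch.ones(1, 1, 8)

# padding=1, padded zeros excluded from the average.
pool_a = nn.AvgPool1d(4, stride=2, padding=1, count_include_pad=False)
# padding=2, padded zeros included in the average (PyTorch default).
pool_b = nn.AvgPool1d(kernel_size=4, stride=2, padding=2)

out_a = pool_a(x)
out_b = pool_b(x)

print(out_a.shape, out_a.flatten().tolist())
print(out_b.shape, out_b.flatten().tolist())
```

With `count_include_pad=False` the constant input stays constant after pooling, whereas the second variant attenuates the first and last output samples because the padded zeros are averaged in.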

bob80333 · 2019-11-16T18:39:56Z

Working on this in the fix/17 branch of my fork; I will open a PR afterwards.

bob80333 · 2019-11-16T19:02:25Z

Changed everything to match, but for some reason, the official generator still has 1440 more params than this one.

seungwonpark · 2019-11-18T02:38:05Z

The weight norm for the shortcut was missing. When I added it, the number of parameters became exactly the same as the official one: 4266050.

I just added a commit to fix that in the fix/17 branch.
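
A short sketch of why weight norm changes the parameter count: wrapping a convolution in `weight_norm` splits its weight into a direction tensor and a per-output-channel magnitude, which adds `out_channels` parameters. The channel widths and the number of resblocks per stage below are assumptions about the generator layout, not values stated in this thread; under those assumptions the total matches the 1440 gap reported above.

```python
import torch.nn as nn
from torch.nn.utils import weight_norm


def n_params(module):
    # Count all learnable parameters of a module.
    return sum(p.numel() for p in module.parameters())


plain = nn.Conv1d(256, 256, kernel_size=1)
normed = weight_norm(nn.Conv1d(256, 256, kernel_size=1))

# weight_norm adds one magnitude scalar per output channel.
extra_per_conv = n_params(normed) - n_params(plain)
print(extra_per_conv)

# ASSUMPTION: one shortcut conv per resblock, three resblocks per stage,
# with stage widths 256, 128, 64 and 32.
assumed_widths = [256, 128, 64, 32]
assumed_blocks_per_stage = 3
total_extra = assumed_blocks_per_stage * sum(assumed_widths)
print(total_extra)
```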

seungwonpark · 2019-11-18T02:42:10Z

The difference in padding method looks significant.
I'll be testing our fix/17 branch on both LJSpeech and private data.

bob80333 · 2019-11-23T01:46:25Z

I was doing a training run on part of voxceleb2 before you added the weight_norm to the shortcut, so I left it running. The difference between 800K steps on that branch and 1M steps at 1725a7508f4f3a2de0bbc0ec83d33deaa40c3255 (the "reached 3200 epochs" commit on the master branch) is quite large.

These samples are from the validation data, which consists of separate speakers in voxceleb2.

Original audio:
https://voca.ro/imS3kiSQ4ik

1M steps on older commit:
https://voca.ro/eOJv99KjtZO

800K steps on the fix/17 branch, one commit behind the latest:
https://voca.ro/fMTYWO4vUvX

I am now doing another run from the latest commit on fix/17 to see if the shortcut weight norm will help even more.

nukes · 2021-02-23T11:33:29Z

What are the differences between reflection padding and zero padding in MelGAN? Better audio quality?

seungwonpark changed the title ~~L2 vs Hinge Loss~~ Some notable differences with official implementation Oct 28, 2019

seungwonpark mentioned this issue Nov 11, 2019

strange noises in your samples && error when running inference.py #30

Closed

bob80333 mentioned this issue Nov 16, 2019

Modify generator and discriminator to match official repo #31

Merged
