
Experiment of using Instance Normalization vs Layer Normalization on Decoder #107

Open
tom99763 opened this issue Nov 12, 2022 · 8 comments

tom99763 commented Nov 12, 2022

[Image 1]

Based on the computations performed by each normalization method, the MUNIT architecture can be summarized as follows.

[Image 2: summary of the MUNIT architecture]

This means that since the upsampling layers do not tune the channel correlation (unlike Adaptive Instance Normalization, as used in StyleGAN), using instance normalization during upsampling will destroy the channel correlation that was tuned by the ResNet + Adaptive Instance Normalization blocks.
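
As a minimal TensorFlow sketch of this point (the shapes and values below are made up purely for illustration): a per-channel scaling, like the one AdaIN injects, is exactly cancelled when instance normalization re-normalizes each channel by its own spatial statistics.

import tensorflow as tf

def instance_norm(x, eps=1e-5):
    # normalize each channel by its own spatial mean/variance (NHWC: axes 1 and 2)
    mean, var = tf.nn.moments(x, axes=[1, 2], keepdims=True)
    return (x - mean) / tf.sqrt(var + eps)

tf.random.set_seed(0)
x = tf.random.normal([1, 8, 8, 4])            # (batch, H, W, C)
gamma = tf.constant([0.5, 1.0, 2.0, 4.0])     # per-channel scales, as AdaIN would apply

plain = instance_norm(x)
restyled = instance_norm(x * gamma)           # IN applied after the per-channel scaling

# the scaling is cancelled, i.e. the tuned per-channel statistics are lost
print(tf.reduce_max(tf.abs(plain - restyled)).numpy())   # ~0 up to numerical error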

@Sarmadfismael

Hi,
I have a question about the decoder normalization step. Is the normalization done in the ResidualBlock parts of the decoder or in the upsampling part? I am having trouble understanding the difference between AdaptiveInstanceNorm2d and LayerNorm in the code.


tom99763 commented Nov 23, 2022

In their model series (NVlabs imaginaire), the adaptive step is applied only in the ResBlocks, which means the resolution does not change while the conditional information is being added.

Sarmadfismael commented Nov 23, 2022

Thank you for your reply. But what is the purpose of the LayerNorm, and how can I get the beta and gamma in the LayerNorm function?

Especially in this line:
x = x * self.gamma.view(*shape) + self.beta.view(*shape)
Are gamma and beta the same as in Equation 7 of the paper?


tom99763 commented Nov 24, 2022

LN normalizes the feature across all dimensions (spatial & channel). In contrast, IN normalizes only across the spatial dimensions, which means each channel is normalized by its own spatial statistics. That's why the channel correlation is destroyed. You can refer to the Gram matrix (style transfer) and U-GAT-IT (which combines instance and layer norm). (You can think of channels as the units in the MLP case, where there is no spatial information.)
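
As a quick sketch of that axis difference (assuming NHWC layout; the shapes are only illustrative):

import tensorflow as tf

x = tf.random.normal([2, 16, 16, 64])   # (batch, H, W, C)

# IN: one mean/var per (sample, channel) -> spatial statistics only
in_mean, in_var = tf.nn.moments(x, axes=[1, 2], keepdims=True)      # shape (2, 1, 1, 64)

# LN: one mean/var per sample, pooled over spatial AND channel dims
ln_mean, ln_var = tf.nn.moments(x, axes=[1, 2, 3], keepdims=True)   # shape (2, 1, 1, 1)

print(in_mean.shape, ln_mean.shape)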

Gamma and beta are vectors predicted from the input style feature. In the official implementation, all gamma and beta are predicted by a single wide MLP, i.e., style feature (batch, 8) --> (batch, 256 * 2 * 2 * 9). That is, there are 9 residual blocks, each containing 2 convolutional blocks, and each convolutional block needs one gamma & beta pair of dimension 256. Since a deep structure can fit more complex functions than a wide one, instead of predicting all gamma & beta at once, you can equip each layer with its own gamma & beta predictors, as follows.

import tensorflow as tf
from tensorflow.keras import layers


class InstanceNorm(layers.Layer):
    # Instance normalization: each channel is normalized by its own spatial statistics.
    def __init__(self, epsilon=1e-5, affine=False, **kwargs):
        super(InstanceNorm, self).__init__(**kwargs)
        self.epsilon = epsilon
        self.affine = affine

    def build(self, input_shape):
        # optional learned per-channel scale (gamma) and shift (beta)
        if self.affine:
            self.gamma = self.add_weight(name='gamma',
                                         shape=(input_shape[-1],),
                                         initializer=tf.random_normal_initializer(0, 0.02),
                                         trainable=True)
            self.beta = self.add_weight(name='beta',
                                        shape=(input_shape[-1],),
                                        initializer=tf.zeros_initializer(),
                                        trainable=True)

    def call(self, inputs, training=None):
        # statistics over the spatial axes only (NHWC: axes 1 and 2)
        mean, var = tf.nn.moments(inputs, axes=[1, 2], keepdims=True)
        x = (inputs - mean) / tf.sqrt(var + self.epsilon)
        if self.affine:
            return self.gamma * x + self.beta
        return x
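
A rough sketch of that per-layer predictor idea, reusing the imports and the InstanceNorm class above. The class name AdaINPerLayer, the Dense heads, and the (feature, style) input convention are my own choices for illustration, not the official implementation.

class AdaINPerLayer(layers.Layer):
    # One small gamma/beta predictor per layer instead of one wide MLP for all layers.
    def __init__(self, channels, **kwargs):
        super(AdaINPerLayer, self).__init__(**kwargs)
        self.norm = InstanceNorm(affine=False)   # reuse the InstanceNorm defined above
        self.to_gamma = layers.Dense(channels)   # per-layer predictor heads
        self.to_beta = layers.Dense(channels)

    def call(self, inputs, training=None):
        x, style = inputs                                # x: (B, H, W, C), style: (B, style_dim)
        gamma = self.to_gamma(style)[:, None, None, :]   # broadcast over the spatial dims
        beta = self.to_beta(style)[:, None, None, :]
        return gamma * self.norm(x) + beta               # AdaIN-style per-channel modulation

Each decoder block would then take both the feature map and the style code, e.g. y = AdaINPerLayer(256)([x, style]).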

@Sarmadfismael

Thank you very much for your detailed explanation.
I just wonder about the LN in the upsampling layer: is it the same as in Equation 7 of the paper, or is it an additional normalization layer, different from AdaIN?

tom99763 commented Nov 24, 2022

No, the tuned gamma and beta of Equation 7 are not used in LN. They just cite that paper to refer to the LN method.
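
For contrast, here is a rough re-implementation of such a plain LN layer in the same style as the snippet above (reusing those TensorFlow imports; it mirrors the behaviour of the LayerNorm in the code you quoted rather than copying it): its gamma and beta are ordinary trainable weights, not outputs of the style MLP.

class PlainLayerNorm(layers.Layer):
    def __init__(self, epsilon=1e-5, **kwargs):
        super(PlainLayerNorm, self).__init__(**kwargs)
        self.epsilon = epsilon

    def build(self, input_shape):
        # gamma/beta are learned directly, independent of the style code
        self.gamma = self.add_weight(name='gamma', shape=(input_shape[-1],),
                                     initializer=tf.ones_initializer(), trainable=True)
        self.beta = self.add_weight(name='beta', shape=(input_shape[-1],),
                                    initializer=tf.zeros_initializer(), trainable=True)

    def call(self, inputs, training=None):
        # normalize each sample over the spatial AND channel dimensions
        mean, var = tf.nn.moments(inputs, axes=[1, 2, 3], keepdims=True)
        x = (inputs - mean) / tf.sqrt(var + self.epsilon)
        return self.gamma * x + self.beta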

@Sarmadfismael

So this means that the LN is a different normalization layer and is not mentioned in the paper.
Can we remove the LN from the upsampling layer?

@tom99763

Yes, you can, but you will find that training becomes unstable as the magnitude of the feature values grows large.
