Losses
The wavelet_guided loss enables the use of WGSR. As explained in the paper, its purpose is to stabilize GAN training and reduce artifacts.
The option wavelet_init specifies the number of iterations to run before enabling wavelet_guided.
[train]
wavelet_guided = true
wavelet_init = 80000
Note
This loss works better for finetuning than for training from scratch. It is recommended to train the model for at least ~40k iterations before enabling it.
The pixel_opt option defines the pixel loss.
[train.pixel_opt]
type = "L1Loss"
loss_weight = 1.0
reduction = "mean"
The above options set the pixel loss to the L1 criterion with a weight of 1.0.
Possible values for type are: L1Loss, MSELoss (also known as L2), HuberLoss, and chc (Clipped Huber with Cosine Similarity Loss, which can improve color consistency and decrease noise; reduction is done using Huber loss).
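As an illustration of how these criteria differ, here is a minimal sketch of L1, L2, and Huber on scalar values. This is not the neosr implementation, only the underlying formulas:

```python
# Minimal sketch of the L1, L2 (MSE) and Huber criteria on scalars.
# Illustrative only; neosr applies these elementwise over image tensors.

def l1(pred: float, target: float) -> float:
    return abs(pred - target)

def l2(pred: float, target: float) -> float:
    return (pred - target) ** 2

def huber(pred: float, target: float, delta: float = 1.0) -> float:
    # Quadratic near zero, linear for large errors:
    # less sensitive to outliers than L2, smoother than L1 near zero.
    err = abs(pred - target)
    if err <= delta:
        return 0.5 * err ** 2
    return delta * (err - 0.5 * delta)
```

For small errors Huber behaves like a scaled L2, while large errors grow only linearly, which is why it is often preferred when training data contains outliers.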
The mssim_opt option defines the Multi-Scale SSIM loss. The implementation in neosr has been adapted from "A better pytorch-based implementation for the mean structural similarity. Differentiable simpler SSIM and MS-SSIM.".
The options below are the defaults when calling the mssim function by itself:
[train.mssim_opt]
type = "mssim_loss"
loss_weight = 1.0
window_size = 11
sigma = 1.5
in_channels = 3
K1 = 0.01
K2 = 0.03
L = 1
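To show the role of the K1, K2, and L constants from the config above, here is a sketch of the per-window SSIM formula (single scale, no Gaussian window). It is illustrative only, not the neosr code:

```python
# Per-window SSIM formula sketch. C1 and C2 are stabilizing constants
# derived from K1, K2 and the dynamic range L (1.0 for normalized images).
# Illustrative only; the real mssim_loss uses Gaussian windows and scales.
from statistics import mean

def ssim_window(x, y, K1=0.01, K2=0.03, L=1.0):
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    mx, my = mean(x), mean(y)
    vx = mean((a - mx) ** 2 for a in x)            # variance of x
    vy = mean((b - my) ** 2 for b in y)            # variance of y
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))  # covariance
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx * mx + my * my + C1) * (vx + vy + C2))
```

Identical windows score 1.0; the loss is then typically taken as 1 minus the (multi-scale) SSIM value.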
This option sets the NCC loss. It uses Normalized Cross-Correlation.
[train.ncc_opt]
type = "ncc_loss"
loss_weight = 1.0
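As a sketch of what normalized cross-correlation measures, the following hypothetical helper computes it over two flattened images (this is not the neosr implementation):

```python
# Normalized cross-correlation between two flattened signals.
# Returns ~1.0 for perfectly correlated inputs, ~-1.0 for anti-correlated.
# A loss would typically be formed as 1 - ncc.
import math

def ncc(a, b, eps=1e-8):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = math.sqrt(sum((x - ma) ** 2 for x in a))
    db = math.sqrt(sum((y - mb) ** 2 for y in b))
    return num / (da * db + eps)
```

Because both mean and scale are normalized out, NCC is insensitive to global brightness and contrast shifts, measuring only structural agreement.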
This option sets the Frequency Distribution Loss, which is a perceptual loss.
[train.fdl_opt]
type = "fdl_loss"
model = "vgg" # "resnet", "effnet", "inception"
num_proj = 24
phase_weight = 1.0
loss_weight = 1.0
patch_size = 5
stride = 1
s1_w = 1.0
s2_w = 1.0
s3_w = 1.0
s4_w = 1.0
s5_w = 1.0
This loss uses pretrained network features. Possible networks are "vgg" (VGG19), "resnet" (ResNet-101), "effnet" (EfficientNet v1), and "inception" (Inception v3). The default value for num_proj is set to 24 due to its heavy impact on training performance; the official implementation uses 256. You may increase it at the end of a finetuning process to achieve better perceptual quality.
The sx_w parameters are the weights for each stage (layer) when using VGG.
This option sets the perceptual loss. It uses the VGG19 network to extract features from images.
[train.perceptual_opt]
type = "vgg_perceptual_loss"
loss_weight = 1.0
criterion = "huber"
patchloss = true
ipk = true
patch_weight = 1.0
vgg_type = "vgg19"
use_input_norm = true
range_norm = false
[train.perceptual_opt.layer_weights]
conv1_2 = 0.1
conv2_2 = 0.1
conv3_4 = 1.0
conv4_4 = 1.0
conv5_4 = 1.0
Possible values for criterion are: l1, l2, huber, and chc.
The options patchloss, ipk, and patch_weight configure Patch Loss. By default, these options are disabled. The option patchloss enables the Feature Patch Kernel, as described in the paper, while ipk enables the Image Patch Kernel.
This option enables DISTS (VGG16) as a perceptual loss. It can be used in combination with perceptual_opt.
[train.dists_opt]
type = "dists_loss"
loss_weight = 0.5
This option enables GAN training.
[train.gan_opt]
type = "gan_loss"
gan_type = "bce"
loss_weight = 0.3
real_label_val = 1.0
fake_label_val = 0.0
Possible values for gan_type are: bce, mse, or huber.
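To illustrate how the bce and mse criteria score a discriminator output against the real/fake label values from the config, here is a scalar sketch (not the neosr implementation):

```python
# Scalar sketch of GAN criteria. `target` would be real_label_val (1.0)
# or fake_label_val (0.0). Illustrative only; neosr applies these to
# full discriminator prediction maps.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce_gan(logit, target):
    # Binary cross-entropy on the sigmoid of the discriminator logit.
    p = sigmoid(logit)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def mse_gan(pred, target):
    # Least-squares (LSGAN-style) criterion.
    return (pred - target) ** 2
```

The mse criterion penalizes predictions quadratically and tends to produce smoother gradients than bce when the discriminator is very confident.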
This option sets the LDL loss. See the research paper for details.
[train.ldl_opt]
type = "ldl_loss"
loss_weight = 1.0
criterion = "huber"
ksize = 7
Possible values for criterion are: l1, l2, and huber.
This option sets the Focal-Frequency Loss. See the research paper for details.
[train.ff_opt]
type = "ff_loss"
loss_weight = 1.0
alpha = 1.0
patch_factor = 1
ave_spectrum = true
log_matrix = false
batch_matrix = false
Note
Focal Frequency loss can cause instabilities if enabled without using a pretrained model.
This option enables the Gradient-Weighted Loss from the CDC research. In practice, this loss makes the network focus more on high frequencies.
[train.gw_opt]
type = "gw_loss"
loss_weight = 1.0
criterion = "chc_loss"
corner = true
Possible values for criterion are: l1, l2, huber, and chc.
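In spirit, gradient weighting scales the per-pixel error by the local gradient magnitude, so edges contribute more to the loss. The following 1D sketch illustrates the idea only; it is not the CDC implementation:

```python
# Sketch of gradient weighting on a 1D signal: errors at high-gradient
# (edge) positions receive larger weights. Illustrative only; the real
# gw_loss operates on 2D image gradients with a configurable criterion.

def gw_l1(pred, target):
    loss = 0.0
    for i in range(1, len(target) - 1):
        grad = abs(target[i + 1] - target[i - 1]) / 2.0  # central difference
        w = 1.0 + grad                                   # heavier on edges
        loss += w * abs(pred[i] - target[i])
    return loss / (len(target) - 2)
```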
This option enables the Kullback-Leibler divergence loss.
[train.kl_opt]
type = "kl_loss"
loss_weight = 1.0
Note
KL-loss should only be enabled if using a pretrained model. Enabling it from scratch may cause incorrect results or NaN.
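For reference, the Kullback-Leibler divergence between two discrete distributions can be sketched as follows (illustrative only, not the neosr loss itself):

```python
# D_KL(P || Q) = sum_i p_i * log(p_i / q_i).
# Zero when P == Q, positive otherwise; undefined if q_i == 0 where p_i > 0,
# which is one reason it can produce NaN on poorly initialized models.
import math

def kl_div(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```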
This option specifies matching color and luma to your LQ images instead of the GT images. It can increase stability if your dataset has too much variation in color/luma. Only applicable if consistency_loss is enabled.
[train]
match_lq_colors = true
This option sets the color and luma consistency loss. It allows matching the brightness and colors of your generated images to GT or LQ (see the match_lq_colors option). The loss uses Oklab and CIE L* color space transforms, as well as Cosine Similarity.
[train.consistency_opt]
type = "consistency_loss"
loss_weight = 1.0
criterion = "chc" # "l1"
blur = true
cosim = true
saturation = 1.0
brightness = 1.0
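To illustrate the cosine-similarity component: the angle between two RGB vectors ignores their magnitude (brightness), isolating hue differences. A minimal sketch, not the neosr consistency_loss:

```python
# Cosine similarity between two RGB pixel vectors.
# 1.0 means same hue direction regardless of brightness; 0.0 means
# orthogonal colors. Illustrative only.
import math

def cosim(a, b, eps=1e-8):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + eps)
```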
This option sets Multiscale Sliced Wasserstein Distance loss. It is a color consistency loss.
[train.msswd_opt]
type = "msswd_loss"
num_scale = 3
num_proj = 24
loss_weight = 1.0
patch_size = 11
stride = 1
c = 3
The parameters num_proj and num_scale default to 24 and 3, respectively, due to their heavy impact on training performance; the official implementation uses 128 and 5. You may increase them at the end of a finetuning process to achieve better perceptual quality.
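The core idea of a sliced Wasserstein distance can be sketched in a few lines: project each color onto a random direction, sort both projected sets, and compare the sorted marginals. This is illustrative only; the real msswd_loss operates on blurred multi-scale patches:

```python
# Single-scale sliced Wasserstein-2 sketch over lists of RGB tuples.
# Each projection reduces the 3D color distribution to 1D, where the
# optimal transport cost is just the distance between sorted samples.
# Illustrative only, not the neosr implementation.
import random

def sliced_w2(colors_a, colors_b, num_proj=4, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_proj):
        d = [rng.gauss(0, 1) for _ in range(3)]  # random 3D direction
        pa = sorted(sum(c * w for c, w in zip(px, d)) for px in colors_a)
        pb = sorted(sum(c * w for c, w in zip(px, d)) for px in colors_b)
        total += sum((x - y) ** 2 for x, y in zip(pa, pb)) / len(pa)
    return total / num_proj
```

Because the comparison happens after sorting, the distance depends only on the color distributions, not on pixel positions, which is what makes it useful as a color consistency loss.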