This repo is built on grounds of exploring and developing various deep generative models to perform super-resolution (SR) on limited set of unaligned/unpaired face images.
Following models are part of my Masters' thesis (pdf)
- CycleGAN-with-EDSR
- Image-Degrade
- AdaIN for degradation
- Based on Adaptive Instance normalisation from style transfer literature
- degradeSR
- Degrade an image using AdaIN and then perform super-resolution.
Map two different domains and generate high-quality images.
-
CYCADA: Cycle consistent Adversarial Domain Adaptation
-
Adversarial adaptation models: Feature spaces discover domain invariant representation, but are different to visualise and sometimes fail to capture pixel-level and low-level domain shifts.
-
Alignment typically involves minimizing some measure of distance between the source and target distributions such as
- Maximum mean discrepancy - Correlation distance - Adversarial discriminator accuracy
This method is only for domain adaptation, however the conjecture is that we can do super-resolution along with domain adaptation.
-
-
Attribute-Guided Face Generation Using Condition CycleGAN
-
StyleVAE: Style basedVAE for Real-World SR StyleVAE + SR Network
** Yet to Come **
FIDs scores between model and interpolation technique
How can we evaluate our model if we don't have groud-truth high resolution images?
We can have a decent workaround for this problem, by calculating the FID of our generated images with frame of reference dataset such as Celeb-A, FFHQ, AIT3D etc. Idea is that we calculate FID between the interpolated high-res and frame of reference dataset. We compare this value with FID between synthesized high res image from our model and frame of reference dataset
Reference Dataset | Input Dataset | Upsampling Process | FID |
---|---|---|---|
celebA | DIV2k-faces | NEAREST | 221.990232219738 |
celebA | DIV2k-faces | BILINEAR | 231.9694814498381 |
celebA | DIV2k-faces | BICUBIC | 252.76578951346124 |
celebA | DIV2k-faces | LANCZOS | 242.29334933169804 |
celebA | DIV2k-faces | HAMMING | 221.990232219738 |
celebA | DIV2k-faces | BOX | 199.87696132053242 |
Reference Dataset CelebA
(from Kaggle)
Upsample | Input | FID | Experiment Notes |
---|---|---|---|
EDSR* | DIV2k-faces | 315.56615386029983 | Scale: 4x, lr: 3x16x16, maybe normalization? |
EDSR* | DIV2k-faces | 310.02485583684836 | Scale: 4x, lr: 3x16x16, Scaled |
EDSR* | DIV2k-faces | 307.22565249983984 | Scale: 4x, lr: 3x16x16, ImageNet normalization |
CycleGAN (G1: EDSR*) | DIV2k-faces | 252.2 | Epoch:31, D from PatchGAN, SpectralNorm, ForwardCycle |
*With official pretrained weights.