[Project Website] [Google Colab] [Paper]
LARGE: Latent-Based Regression through GAN Semantics
Yotam Nitzan*, Rinon Gal*, Ofir Brenner, and Daniel Cohen-Or
Abstract: We propose a novel method for solving regression tasks using few-shot or weak supervision. At the core of our method is the fundamental observation that GANs are incredibly successful at encoding semantic information within their latent space, even in a completely unsupervised setting. For modern generative frameworks, this semantic encoding manifests as smooth, linear directions which affect image attributes in a disentangled manner. These directions have been widely used in GAN-based image editing. We show that such directions are not only linear, but that the magnitude of change induced on the respective attribute is approximately linear with respect to the distance traveled along them. By leveraging this observation, our method turns a pre-trained GAN into a regression model, using as few as two labeled samples. This enables solving regression tasks on datasets and attributes which are difficult to produce quality supervision for. Additionally, we show that the same latent-distances can be used to sort collections of images by the strength of given attributes, even in the absence of explicit supervision. Extensive experimental evaluations demonstrate that our method can be applied across a wide range of domains, leverage multiple latent direction discovery frameworks, and achieve state-of-the-art results in few-shot and low-supervision settings, even when compared to methods designed to tackle a single task.
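The core idea above can be sketched in a few lines: project latent codes onto a semantic direction, and treat the signed distance along that direction as (approximately) linear in the attribute value. The snippet below is a minimal illustrative sketch, not the repository's actual API; all names (`latent_distance`, `predict_attribute`) and the random "latent codes" are hypothetical stand-ins.

```python
import numpy as np

# Illustrative sketch only -- names and data are hypothetical, not the repo's API.
rng = np.random.default_rng(0)
latent_dim = 512

# A normalized semantic direction (e.g. "age"), found by any
# direction-discovery method, supervised or unsupervised.
direction = rng.normal(size=latent_dim)
direction /= np.linalg.norm(direction)

def latent_distance(w, direction):
    """Signed distance of latent code(s) w traveled along the direction."""
    return np.asarray(w) @ direction

# Few-shot calibration: with as few as two labeled samples, fit the
# linear map from latent distance to attribute value.
w_labeled = rng.normal(size=(2, latent_dim))   # latent codes of labeled images
labels = np.array([20.0, 60.0])                # e.g. the two known attribute values
d = latent_distance(w_labeled, direction)
slope, intercept = np.polyfit(d, labels, 1)    # exact fit through two points

def predict_attribute(w):
    """Regress the attribute for unlabeled latent code(s) w."""
    return slope * latent_distance(w, direction) + intercept

# Even with no labels at all, distances alone sort a collection
# by the strength of the attribute.
w_collection = rng.normal(size=(5, latent_dim))
order = np.argsort(latent_distance(w_collection, direction))
```

In practice the latent codes would come from inverting real images with an encoder such as pSp, e4e, or ReStyle (credited below), rather than from random sampling.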
StyleGAN2 implementation:
https://github.com/rosinality/stylegan2-pytorch
Copyright (c) 2019 Kim Seonghyeon
License (MIT) https://github.com/rosinality/stylegan2-pytorch/blob/master/LICENSE
StyleGAN2 Models:
https://github.com/NVlabs/stylegan2-ada/
https://github.com/NVlabs/stylegan2
Copyright (c) 2021, NVIDIA Corporation
Nvidia Source Code License-NC
pSp model and implementation:
https://github.com/eladrich/pixel2style2pixel
Copyright (c) 2020 Elad Richardson, Yuval Alaluf
License (MIT) https://github.com/eladrich/pixel2style2pixel/blob/master/LICENSE
e4e model and implementation:
https://github.com/omertov/encoder4editing
Copyright (c) 2021 omertov
License (MIT) https://github.com/omertov/encoder4editing/blob/main/LICENSE
ReStyle model and implementation:
https://github.com/yuval-alaluf/restyle-encoder/
Copyright (c) 2021 Yuval Alaluf
License (MIT) https://github.com/yuval-alaluf/restyle-encoder/blob/main/LICENSE
We would like to thank Raja Giryes, Yangyan Li, Or Patashnik, Yuval Alaluf, Amit Attia, Noga Bar and Zongze Wu for helpful comments. We additionally thank Zongze Wu for the trained e4e models for AFHQ cats and dogs.
If you use this code for your research, please cite our paper:
@misc{nitzan2021large,
title={LARGE: Latent-Based Regression through GAN Semantics},
author={Yotam Nitzan and Rinon Gal and Ofir Brenner and Daniel Cohen-Or},
year={2021},
eprint={2107.11186},
archivePrefix={arXiv},
primaryClass={cs.CV}
}