zcain117/taming-transformers-tpu

Taming Transformers for High-Resolution Image Synthesis, CVPR 2021 (Oral)

A refactoring of Taming Transformers to run on TPU VMs.

Taming Transformers for High-Resolution Image Synthesis
Patrick Esser*, Robin Rombach*, Björn Ommer
* equal contribution

tl;dr We combine the efficiency of convolutional approaches with the expressivity of transformers by introducing a convolutional VQGAN, which learns a codebook of context-rich visual parts, whose composition is modeled with an autoregressive transformer.

arXiv | BibTeX | Project Page

Requirements

pip install -r requirements.txt

Data Preparation

Place any image dataset in an ImageNet-style directory structure (a root folder with at least one subfolder that contains the images) so that it can be loaded with torchvision's ImageFolder.
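
Before launching a run, it can help to confirm that a dataset root really has this layout. The following is a minimal, standard-library-only sketch, not part of this repository: the helper name summarize_image_folder and the extension list are assumptions made for illustration, and the scan only approximates how ImageFolder treats each top-level subfolder as one class.

```python
from pathlib import Path

# Assumed list of image extensions for this sketch; the set accepted by
# torchvision's ImageFolder may differ between versions.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".ppm", ".bmp", ".pgm",
                    ".tif", ".tiff", ".webp"}


def summarize_image_folder(root):
    """Return {class_name: image_count} for an ImageNet-style directory.

    Hypothetical helper: every direct subfolder of `root` is treated as one
    class, and image files are counted recursively inside it.
    """
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} is not a directory")

    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )

    if not counts:
        raise ValueError(f"{root} has no subfolders; at least one is required")
    return counts
```

Calling summarize_image_folder on the training and validation roots gives a quick per-class image count; a class with a count of zero usually points to a misplaced folder or an unexpected file extension.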

Training models

To smoke-test main.py without a dataset, run it on randomly generated fake data:

python main.py --use_tpus --fake_data

For actual training, provide directories for train_dir, val_dir, and log_dir:

python main.py --use_tpus --train_dir [training_set] --val_dir [val_set] --log_dir [where to save results]
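
When the same command is launched repeatedly with different paths, it may be convenient to assemble it programmatically. The sketch below is an illustration only: build_train_command is a hypothetical helper, the example paths are placeholders, and it uses only the flags shown in the command above.

```python
import shlex
import sys


def build_train_command(train_dir, val_dir, log_dir, use_tpus=True):
    """Build the argument list for a training run of main.py.

    Hypothetical helper; it only assembles the flags documented above.
    """
    cmd = [sys.executable, "main.py"]
    if use_tpus:
        cmd.append("--use_tpus")
    cmd += [
        "--train_dir", str(train_dir),
        "--val_dir", str(val_dir),
        "--log_dir", str(log_dir),
    ]
    return cmd


if __name__ == "__main__":
    # Placeholder paths; replace them with real dataset and log locations.
    cmd = build_train_command("data/train", "data/val", "logs/run1")
    # Print the command for inspection. To launch it, pass `cmd` to
    # subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(cmd))
```

Passing the command as a list rather than a single string avoids shell-quoting problems when a directory name contains spaces.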

BibTeX

@misc{esser2020taming,
      title={Taming Transformers for High-Resolution Image Synthesis}, 
      author={Patrick Esser and Robin Rombach and Björn Ommer},
      year={2020},
      eprint={2012.09841},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
