This project is a Paddle implementation of few-shot photorealistic video-to-video translation. Go to this link for more details. This project is heavily based on the original projects Few-shot vid2vid and Imaginaire. This work is made available under the Nvidia Source Code License (1-Way Commercial). To view a copy of this license, visit License. If this work benefits you, please cite:
@inproceedings{wang2019fewshotvid2vid,
author = {Ting-Chun Wang and Ming-Yu Liu and Andrew Tao and Guilin Liu and Jan Kautz and Bryan Catanzaro},
title = {Few-shot Video-to-Video Synthesis},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
year = {2019},
}
This project was developed entirely on AI Studio. Because of AI Studio's limited ecosystem, several dependencies had to be implemented first to support this project. Here is what I implemented:
YouTube Dancing Videos Dataset
The YouTube Dancing Videos Dataset is a large-scale dancing video dataset collected from YouTube. Note that the dataset used in this project differs slightly from the original project's, with augmentation data collected from bilibili. In the end, 700 raw videos form the raw video dataset. Then, I used openpose-paddle and densepose-paddle to extract pose annotations from the raw videos. Finally, I obtained 4240 video sequences and 1,382,329 raw frames with corresponding pose annotations. This dataset is split into 4 subsets for convenience of training and because of storage constraints. For more details, please go to preprocess and YouTube Dancing Videos Dataset.
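How the frames and pose annotations are organized on disk is defined by the preprocess scripts; as a minimal sketch only, a loader that pairs frames with their pose maps might look like the snippet below (the `images/` and `poses/` folder names, file extensions, and per-sequence layout are assumptions for illustration, not the actual format).

```python
from pathlib import Path

def collect_sequences(data_root):
    """Pair raw frames with pose annotations, one list per video sequence.

    Assumes a layout like <data_root>/<sequence_id>/images/*.jpg and
    <data_root>/<sequence_id>/poses/*.png; the real layout is defined by
    the preprocess scripts and may differ.
    """
    sequences = {}
    for seq_dir in sorted(Path(data_root).iterdir()):
        if not seq_dir.is_dir():
            continue
        frames = sorted((seq_dir / "images").glob("*.jpg"))
        poses = sorted((seq_dir / "poses").glob("*.png"))
        if len(frames) != len(poses):
            # Skip sequences where pose extraction failed on some frames.
            continue
        sequences[seq_dir.name] = list(zip(frames, poses))
    return sequences
```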
For other datasets
Not implemented here!
! cd /home/aistudio/vid2vid/ && python ./train.py --logdir /path/to/log/directory/ \
--max_epoch 20 \
--max_iter_per_epoch 10000 \
--num_epochs_temporal_step 4 \
--train_data_root /path/to/dancing/video/dataset \
--val_data_root /path/to/evaluation/dataset
To train on your own dataset, please follow the instructions in preprocess to prepare your dataset first, then run the command above. Be careful of mode collapse when training on a dataset with high internal variance.
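As a rough sketch of the first preprocessing step, frame extraction from a raw video could look like the snippet below; pose annotations would then be extracted from these frames with openpose-paddle and densepose-paddle as described in preprocess. The output naming scheme and the stride are illustrative assumptions.

```python
import cv2
from pathlib import Path

def extract_frames(video_path, out_dir, stride=1):
    """Dump every `stride`-th frame of a video as a JPEG image.

    Illustrative only: the preprocess scripts define the actual pipeline,
    including the pose extraction that follows this step.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(str(video_path))
    index = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % stride == 0:
            cv2.imwrite(str(out_dir / f"{saved:06d}.jpg"), frame)
            saved += 1
        index += 1
    cap.release()
    return saved
```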
Here I trained four models, one for each of the 4 subsets described above. Note that these four models are trained on the YouTube Dancing Videos Dataset; you need to fine-tune them on your own dataset to synthesize your own videos.
The pretrained models are available here. The model trained on set 4 has not been released yet.
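A minimal sketch of loading one of these checkpoints for fine-tuning in Paddle is shown below. The checkpoint filename and the stand-in generator class are placeholders; in practice the network must be constructed exactly as train.py does before calling set_state_dict.

```python
import paddle
import paddle.nn as nn

# Stand-in for the actual few-shot vid2vid generator; build the real network
# the same way train.py does before loading a checkpoint into it.
class DummyGenerator(nn.Layer):
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2D(3, 3, kernel_size=3, padding=1)

    def forward(self, x):
        return self.net(x)

generator = DummyGenerator()

# Path and filename are examples only; point this at the downloaded checkpoint.
state_dict = paddle.load("/home/aistudio/work/logs/checkpoints/generator.pdparams")
generator.set_state_dict(state_dict)

# Fine-tune with a small learning rate so the pretrained weights are not wiped out.
optimizer = paddle.optimizer.Adam(learning_rate=1e-5,
                                  parameters=generator.parameters())
```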
! cd /home/aistudio/vid2vid/ && python ./evaluate.py --logdir /path/to/evaluation/results/output/directory \
--checkpoint_logdir /path/to/checkpoints/directory \
--eval_data_dir /path/to/eval/data/directory
For example, to evaluate the model trained on set 1, I can run
python ./evaluate.py --logdir /home/aistudio/vid2vid/outputs/evaluation/ \
--checkpoint_logdir /home/aistudio/work/logs/checkpoints/ \
--eval_data_dir /home/aistudio/data/data68795/home/aistudio/test_pose/1_pose/images/
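The evaluation script writes synthesized frames into the output directory; to inspect the result as a single video, a small helper like the one below can stitch the frames together (the `*.jpg` frame pattern and the frame rate are assumptions).

```python
import cv2
from pathlib import Path

def frames_to_video(frame_dir, out_path, fps=30):
    """Stitch synthesized frames (assumed *.jpg, lexically ordered) into an mp4."""
    frames = sorted(Path(frame_dir).glob("*.jpg"))
    if not frames:
        raise FileNotFoundError(f"no frames found in {frame_dir}")
    height, width = cv2.imread(str(frames[0])).shape[:2]
    writer = cv2.VideoWriter(str(out_path),
                             cv2.VideoWriter_fourcc(*"mp4v"),
                             fps, (width, height))
    for frame_path in frames:
        writer.write(cv2.imread(str(frame_path)))
    writer.release()

# Example (path is illustrative):
# frames_to_video("/home/aistudio/vid2vid/outputs/evaluation/", "result.mp4")
```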
Thanks to AI Studio for providing the GPU resources used in this project.