Pose-Guided Person Image Animation

The details of the person image animation task are provided here.

Person image animation generates a video clip from a source person image and a sequence of target pose skeletons. Compared with the pose-guided person image generation task, this task additionally requires modeling temporal consistency. Therefore, we modify the model in two ways: the noisy poses produced by popular pose extraction methods are first preprocessed by a Motion Extraction Network to obtain clean poses; the final animation results are then generated in a recurrent manner. The technical details are provided in this paper.

From Left to Right: Skeleton Sequences; Preprocessed Skeleton Sequences; Animation Results.
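The pipeline can be summarized with a short sketch. The names below are illustrative stand-ins rather than the repository's actual API; the sketch only shows how the Motion Extraction Network and the recurrent generator fit together.

import torch

# Hypothetical stubs for the two components described above;
# the real networks live in the repository's model code.
def motion_extraction_net(noisy_poses):           # (T, K, 2) noisy skeletons
    return noisy_poses                            # stub: would return cleaned poses

def generator(source_image, pose, prev_frame):    # one recurrent generation step
    return prev_frame                             # stub: would return the next frame

def animate(source_image, noisy_poses):
    # 1. Denoise the extracted skeleton sequence.
    clean_poses = motion_extraction_net(noisy_poses)
    # 2. Generate the video frame by frame, conditioning on the previous frame.
    frames, prev = [], source_image
    for pose in clean_poses:
        prev = generator(source_image, pose, prev)
        frames.append(prev)
    return torch.stack(frames)                    # (T, C, H, W) video clip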

Dataset

Two datasets are used in this task: The FashionVideo dataset and the iPER dataset.

  • Download the videos of the datasets.

  • We provide the AlphaPose extraction results for these datasets; the preprocessed clean poses are also available. Please use the following script to download these resources.

    ./script/download_animation_skeletons.sh
  • Extract the video frames and resize them to 256 × 256 using the following script:

    python ./script/extract_video_frames.py \
    --frame_root=[path to write the video frames] \
    --video_path=[path to the mp4 files] \
    --anno_path=[path to the previously downloaded skeletons]

Note: you can also extract the skeletons yourself. Please use the AlphaPose algorithm and output the results in the OpenPose format.
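If you extract the skeletons yourself, a quick way to verify the output is to check that each file follows the OpenPose-style layout: a "people" list whose entries carry flat [x, y, confidence, ...] keypoint arrays. A minimal check (the file name is hypothetical):

import json

with open("frame_000001_keypoints.json") as f:    # hypothetical file name
    anno = json.load(f)

for person in anno["people"]:
    kpts = person["pose_keypoints_2d"]
    assert len(kpts) % 3 == 0, "expected (x, y, confidence) triples"
    print(f"person with {len(kpts) // 3} keypoints")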

Testing

Download the trained weights from FashionVideo and iPER. Put the obtained checkpoints under ./result/dance_fashion_checkpoints and ./result/dance_iper_checkpoints respectively.
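For example, assuming the downloaded files are named fashion_weights.pth and iper_weights.pth (hypothetical names), they can be placed as follows:

from pathlib import Path
import shutil

# Put each checkpoint where test.py will look for it (./result/<name>).
for name, ckpt in [("dance_fashion_checkpoints", "fashion_weights.pth"),
                   ("dance_iper_checkpoints", "iper_weights.pth")]:
    target = Path("result") / name
    target.mkdir(parents=True, exist_ok=True)
    shutil.copy(ckpt, target)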

Run the following commands to obtain the animation results.

# test on the FashionVideo dataset
python test.py \
--name=dance_fashion_checkpoints \
--model=dance \
--attn_layer=2,3 \
--kernel_size=2=5,3=3 \
--gpu_id=0 \
--dataset_mode=dance \
--sub_dataset=fashion \
--dataroot=./dataset/danceFashion \
--results_dir=./eval_results/dance_fashion \
--checkpoints_dir=result

# test on the iPER dataset
python test.py \
--name=dance_iper_checkpoints \
--model=dance \
--attn_layer=2,3 \
--kernel_size=2=5,3=3 \
--gpu_id=0 \
--dataset_mode=dance \
--sub_dataset=iper \
--dataroot=./dataset/iPER \
--results_dir=./eval_results/dance_iper \
--checkpoints_dir=result

Training on your custom dataset

If you want to train the model on your own dataset, you first need to extract the skeletons using the pose extraction algorithm AlphaPose, and then obtain clean skeletons from the noisy results using the Motion Extraction Net.

Motion Extraction Net

This network is used to preprocess the noisy skeletons extracted by pose estimation models. We train this model on the Human3.6M dataset. The training ground-truth labels data_2d_h36m_gt.npz can be downloaded from here, and the corresponding input labels data_2d_h36m_detectron_pt_coco.npz can be downloaded from here.
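To verify the downloads, the two archives can be inspected with NumPy. The key names below follow the VideoPose3D-style packaging these files usually come in; treat them as an assumption and adjust if your copies differ.

import numpy as np

gt = np.load("data_2d_h36m_gt.npz", allow_pickle=True)
noisy = np.load("data_2d_h36m_detectron_pt_coco.npz", allow_pickle=True)

print(gt.files, noisy.files)             # typically ['positions_2d', 'metadata']
positions = gt["positions_2d"].item()    # dict: subject -> action -> per-camera 2D keypoints
print(list(positions.keys()))            # subjects such as 'S1', 'S5', ...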

Use the following command to train this model:

python train.py \
--name=keypoint \
--model=keypoint \
--gpu_id=2 \
--dataset_mode=keypoint \
--continue_train

We also provide trained weights. Assuming that you want to smooth the skeleton sequences of the iPER training set, you can use the following command:

python test.py \
--name=dance_keypoint_checkpoints \
--model=keypoint \
--gpu_id=2 \
--dataset_mode=keypointtest \
--dataroot=[root path of your dataset] \
--sub_dataset=iper \
--results_dir=[path to save the results] \
--eval_set=[train/test/val]

After obtaining the clean skeletons, you can train our model on your dataset using the following command. (Note: you need to modify dance_dataset.py to add your dataset as a sub_dataset; a hypothetical sketch is given after the command.)

python train.py \
--name=[name_of_the_experiment] \
--model=dance \
--attn_layer=2,3 \
--kernel_size=2=5,3=3 \
--gpu_id=0,1 \
--dataset_mode=dance \
--sub_dataset=[iper/fashion/your_dataset_name] \
--dataroot=[your_dataset_root] \
--continue_train
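As noted above, the dataset class has to know about your data. The snippet below is a hypothetical illustration only; the real dance_dataset.py is organized differently, but the change amounts to adding a branch that tells the dataset where your frames and cleaned skeletons live when --sub_dataset matches your name.

import os

def resolve_paths(dataroot, sub_dataset, phase="train"):
    # Hypothetical helper: 'iper' and 'fashion' are already handled by the
    # repository; add one branch for your own data.
    if sub_dataset == "my_dataset":                          # your new branch
        return (os.path.join(dataroot, phase, "frames"),     # extracted 256x256 frames
                os.path.join(dataroot, phase, "skeletons"))  # cleaned skeleton files
    raise ValueError(f"unknown sub_dataset: {sub_dataset}")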