The official PyTorch implementation of the paper "Self-Correcting Self-Consuming Loops for Generative Model Training", which has been accepted at ICML 2024. Please visit our webpage for more details.
Environment setup
The main building blocks for this repo include Human Motion Diffusion Model, Universal Humanoid Controller, VPoser. Please visit their webpages for more details, including license info. Note that their code depends on other libraries, including CLIP, SMPL, SMPL-X, PyTorch3D, and uses datasets that each have their own respective licenses that must also be followed.
Step 1: build conda env
Run the script:
cd utils/setup
./setup.sh
This will create a conda virtual environment and run a basic test (test_environment.py) to verify that everything succeeded.
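As a rough illustration of what such a check does (the repo's actual test_environment.py may differ), a dependency probe can be sketched as:

```python
import importlib.util

def missing_packages(names):
    """Return the subset of `names` whose modules cannot be found on this machine."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Example usage: package names here are illustrative, not the repo's exact list.
print(missing_packages(["math", "json"]))  # stdlib modules always resolve
```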
The environment setup has several major steps which depend greatly on the host machine. While setup.sh aspires to be robust and 'just work', there will be differences from system to system.
For completeness, those steps are:
- Create a Python 3.8.12 conda virtual environment named "scsc"
- Install the dependencies of MDM
- Install the dependencies of UHC (including Mujoco, which requires Boost to cythonize.)
- Install visualization dependencies (Body Visualizer, VPoser)
Optionally, the last step of ./setup.sh will facilitate moving the SMPL, SMPL+H, and SMPL-X models into their expected locations.
You must have an account on the following websites AND AGREE TO THEIR TERMS AND CONDITIONS:
- SMPL-X: https://smpl-x.is.tue.mpg.de/register.php
- SMPL: https://smpl.is.tue.mpg.de/modellicense.html
- SMPL-H: https://mano.is.tue.mpg.de/register.php
The data download script may also be run independently of the model setup. One can run: ./get_smpl_data.sh.
More detail on the data dependencies can be found in Step 3.
Step 2: obtain HumanML3D dataset, and filter it to obtain our subset
First, you must build the HumanML3D dataset.
Instructions on how to build the dataset may be found here: LINK
To obtain the AMASS data required to build HumanML3D, one can use the get_smpl_data.sh script and then the extract_humanml_3d.sh script. Together, these will download the required AMASS datasets and put them in a convenient location to proceed with HumanML3D's instructions.
Again, usage of this script requires one have an active account on the AMASS website and agree to the license of all individual datasets.
git clone https://github.com/EricGuo5513/HumanML3D.git
# follow HumanML3D setup instructions at the above repo, then
cp -r HumanML3D/HumanML3D ./dataset/HumanML3D
cp HumanML3D/index.csv ./dataset/HumanML3D/index.csv
Then, at BMLMoVi, you need to download and unpack the files, and put their contents together inside the folder dataset/F_Subjects_1_90 at the root of the repository.
We use this when we run the following script, which filters the HumanML3D dataset into smaller subsets of sizes 64, 128, 256, and 2794:
python exp_scripts/filter_dataset.py
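As a hedged sketch of the general idea behind such filtering (the function name and sampling scheme below are illustrative, not the repo's actual implementation), one can carve nested subsets out of a larger dataset so that each smaller subset is contained in the larger ones:

```python
import random

def nested_subsets(ids, sizes, seed=0):
    """Sample nested subsets of the given sizes: shuffling once and taking
    prefixes guarantees that each smaller subset is contained in the larger ones."""
    rng = random.Random(seed)
    shuffled = ids[:]
    rng.shuffle(shuffled)
    return {n: shuffled[:n] for n in sorted(sizes)}

# Example usage with the subset sizes used in this repo
subsets = nested_subsets(list(range(3000)), [64, 128, 256, 2794])
```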
Step 3: Download dependencies for MDM, UHC, and inverse kinematics engine
# from original MDM repo
pip install gdown
bash prepare/download_glove.sh
bash prepare/download_smpl_files.sh
bash prepare/download_t2m_evaluators.sh
The download_smpl_files.sh script will place files inside body_models/smpl.
Then, download and place these files in the repo as indicated:
- DMPL model (go to downloads, then "Download DMPLs compatible with SMPL", then put the dmpls folder inside the body_models directory)
- VPoser v2.0 (sign up for an account and find the VPoser v2 download in the 'Downloads' tab), then unzip and place it in body_models/vposer_v2_05 (i.e. rename the downloaded folder to vposer_v2_05)
- SMPL-H model (find the Extended SMPL+H model download in the 'Downloads' tab) and place the smplh folder in body_models
After all this, the body_models directory should look like this:
body_models
├── dmpls
│   ├── female
│   │   └── model.npz
│   ├── male
│   │   └── model.npz
│   └── neutral
│       └── model.npz
├── smpl
│   ├── J_regressor_extra.npy
│   ├── kintree_table.pkl
│   ├── SMPL_NEUTRAL.pkl
│   └── smplfaces.npy
├── smplh
│   ├── female
│   │   └── model.npz
│   ├── male
│   │   └── model.npz
│   └── neutral
│       └── model.npz
└── vposer_v2_05
    ├── snapshots
    │   ├── V02_05_epoch=08_val_loss=0.03.ckpt
    │   └── V02_05_epoch=13_val_loss=0.03.ckpt
    └── V02_05.yaml
We also need data for UHC:
cd UniversalHumanoidControl
bash download_data.sh
Toy example #1: Gaussian
python exp_scripts/gaussian_toy_example.py
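To build intuition for the phenomenon this toy example studies, here is a minimal, self-contained sketch (not the repo's script) of a self-consuming loop on a Gaussian: each generation refits the mean and standard deviation to samples drawn from the previous generation's fit, so the fitted parameters drift over generations rather than staying pinned to the ground truth.

```python
import random
import statistics

def self_consuming_gaussian(n=500, generations=30, seed=0):
    """Repeatedly refit a Gaussian to its own samples. Without any correction,
    the fitted parameters perform a random walk away from the originals."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # ground-truth starting distribution
    for _ in range(generations):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]  # sample from current fit
        mu, sigma = statistics.fmean(xs), statistics.pstdev(xs)  # refit
    return mu, sigma
```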
Toy example #2: MNIST
Create a separate conda env for these experiments:
conda create -n mnist_toy python=3.11
conda activate mnist_toy
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=12.1 -c pytorch -c nvidia
conda install tqdm matplotlib -y
conda install scikit-learn -y
Train an image classifier for MNIST digits; the learned embeddings will be used to compute the FID scores later.
mkdir -p exp_outputs/mnist
python exp_scripts/mnist/fid_lenet.py
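For reference, the FID between two sets of embeddings is the Fréchet distance between Gaussians fitted to them. A minimal NumPy sketch (the repo's script may compute this differently, e.g. via a matrix square root routine):

```python
import numpy as np

def fid(feat_real, feat_fake):
    """Frechet distance between Gaussians fitted to two feature sets (rows = samples):
    ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 sqrtm(C1 @ C2))."""
    mu1, mu2 = feat_real.mean(0), feat_fake.mean(0)
    c1 = np.cov(feat_real, rowvar=False)
    c2 = np.cov(feat_fake, rowvar=False)
    # Tr(sqrtm(c1 @ c2)) equals the sum of square roots of the eigenvalues
    # of c1 @ c2, which are nonnegative for a product of PSD matrices.
    eig = np.linalg.eigvals(c1 @ c2)
    tr_sqrt = np.sqrt(np.clip(eig.real, 0, None)).sum()
    return float(((mu1 - mu2) ** 2).sum() + np.trace(c1) + np.trace(c2) - 2 * tr_sqrt)
```

A quick sanity check: the FID of a feature set against itself is zero, and it grows with the distance between the two distributions' means.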
This first training script trains the baseline, which is generations 0 through 50. The last checkpoint from Generation 0 will be used to seed all of the self-consuming experiments. Don't start the other runs until this run finishes.
NUM_EPOCH=20
python exp_scripts/mnist/self_consuming_ddpm_mini.py \
--n_epoch_for_training_from_scratch ${NUM_EPOCH} \
--train_type baseline \
--synth_aug_percent 0.0 \
--fraction_of_train_set_to_train_on 0.2 \
--save_dir_parent ./exp_outputs/mnist/ \
--lr_divisor 20 \
--resume_starting_at_generation 0
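The per-generation structure shared by all of these training types can be sketched abstractly as follows (train_step is a hypothetical callback standing in for one generation of training, not a function from this repo):

```python
def run_generations(train_step, n_generations=50):
    """Sketch of the generation loop: generation 0 trains from scratch,
    and each later generation starts from the previous checkpoint."""
    ckpt = None
    history = []
    for g in range(n_generations + 1):
        ckpt = train_step(g, ckpt)  # returns the new checkpoint
        history.append(ckpt)
    return history
```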
This script trains the self-consuming loop.
To recreate the results from the paper, you should run this script four times, once for each SYNTH_AUG_PERCENT in {0.2, 0.5, 1.0, 1.5}.
These can all be run in parallel.
NUM_EPOCH=20
SYNTH_AUG_PERCENT=0.2
python exp_scripts/mnist/self_consuming_ddpm_mini.py \
--n_epoch_for_training_from_scratch ${NUM_EPOCH} \
--train_type iterative_finetuning \
--synth_aug_percent ${SYNTH_AUG_PERCENT} \
--fraction_of_train_set_to_train_on 0.2 \
--save_dir_parent ./exp_outputs/mnist/ \
--lr_divisor 20 \
--resume_starting_at_generation 0
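Conceptually, the synthetic augmentation mixes round(SYNTH_AUG_PERCENT × |real data|) self-generated samples into each generation's training set. A hedged sketch (names and sampling are illustrative, not the repo's exact logic):

```python
import random

def augment_with_synthetic(real, synth_pool, aug_percent, seed=0):
    """Append round(aug_percent * len(real)) samples drawn without replacement
    from a pool of model-generated data to the real training set."""
    rng = random.Random(seed)
    k = round(aug_percent * len(real))
    return real + rng.sample(synth_pool, k)
```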
And this script trains the self-consuming loop with self-correction.
Again, to recreate the results from the paper, you should run this script four times, once for each SYNTH_AUG_PERCENT in {0.2, 0.5, 1.0, 1.5}.
These can also all be run in parallel.
NUM_EPOCH=20
SYNTH_AUG_PERCENT=0.2
python exp_scripts/mnist/self_consuming_ddpm_mini.py \
--n_epoch_for_training_from_scratch ${NUM_EPOCH} \
--train_type iterative_finetuning_with_correction \
--synth_aug_percent ${SYNTH_AUG_PERCENT} \
--fraction_of_train_set_to_train_on 0.2 \
--n_clusters_per_digit 16 \
--save_dir_parent ./exp_outputs/mnist/ \
--lr_divisor 20 \
--resume_starting_at_generation 0
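In the MNIST setting, one simple correction operator in the spirit of the --n_clusters_per_digit flag is to snap each synthetic sample to the nearest of k per-digit cluster centroids. A minimal sketch of that projection step (illustrative; the repo's corrector may differ):

```python
import numpy as np

def correct_to_nearest_centroid(samples, centroids):
    """Replace each sample (row) with its nearest centroid under squared
    Euclidean distance, pulling off-manifold samples back toward the data."""
    d = ((samples[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return centroids[d.argmin(1)]
```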
At any point during training, you can check on progress by running the script below. It will generate graphs and write them to exp_outputs/mnist/graphs.
python exp_scripts/mnist/generate_graphs.py ./exp_outputs/mnist
Human motion generation
The bash scripts below can be run without any changes.
If your compute resources are managed by Slurm, you might consider taking a look at the Slurm script that we used, which is provided at exp_scripts/slurm.sh. You would need to change the resource requests and environment to match your Slurm setup, and of course you would need to change the last line, which executes the bash script listed below.
dataset size = 64
$n = 64$ , training from scratch
# STEP 1: we train generation 0 on just ground truth data
bash exp_scripts/dataset_0064/train_generation_0.sh
# STEP 2: copy the checkpoint from that experiment to seed all the other experiments
python exp_scripts/dataset_0064/copy_generation_0.py
# STEP 3: After the above scripts finish, each of following 9 scripts can run in parallel
# STEP 3A: we train the baseline model
bash exp_scripts/dataset_0064/train_baseline.sh
# STEP 3B: train the iterative finetuning models
bash exp_scripts/dataset_0064/train_iterative_finetuning.sh 025
bash exp_scripts/dataset_0064/train_iterative_finetuning.sh 050
bash exp_scripts/dataset_0064/train_iterative_finetuning.sh 075
bash exp_scripts/dataset_0064/train_iterative_finetuning.sh 100
# STEP 3C: train the iterative finetuning models with correction
bash exp_scripts/dataset_0064/train_iterative_finetuning_with_correction.sh 025
bash exp_scripts/dataset_0064/train_iterative_finetuning_with_correction.sh 050
bash exp_scripts/dataset_0064/train_iterative_finetuning_with_correction.sh 075
bash exp_scripts/dataset_0064/train_iterative_finetuning_with_correction.sh 100
# STEP 4: we can graph our results; to see intermediate results, this script can be run
# while the above 9 scripts are still running
python exp_scripts/generate_graphs.py 0064
$n = 64$ , synthesizing motions using those trained weights
These scripts randomly select prompts from the test split for visualization, then sample from the checkpoint and render them. The second step takes a while, but note that you can execute the same script
# STEP 1: run this script to copy over the relevant checkpoints into a new folder.
# command line arg #1: dataset size
# command line arg #2: quantity of prompts to sample from the test split
python exp_scripts/prep_for_visualization.py 0064 16
# STEP 2: sample motions from checkpoints, then render motions.
# command line arg #1: the path output from previous script
# command line arg #2: quantity of samples to synthesize for each prompt
python sample/checkpoint_visual_sampler.py exp_outputs/dataset_0064/visualization 4
dataset size = 128
$n = 128$ , training from scratch
The logic for the case where the dataset has size $n = 128$ is analogous to the $n = 64$ case:
# train generation 0, then use it to seed other results
bash exp_scripts/dataset_0128/train_generation_0.sh
python exp_scripts/dataset_0128/copy_generation_0.py
# train generations 1 through 50
bash exp_scripts/dataset_0128/train_baseline.sh
bash exp_scripts/dataset_0128/train_iterative_finetuning.sh 025
bash exp_scripts/dataset_0128/train_iterative_finetuning.sh 050
bash exp_scripts/dataset_0128/train_iterative_finetuning.sh 075
bash exp_scripts/dataset_0128/train_iterative_finetuning.sh 100
bash exp_scripts/dataset_0128/train_iterative_finetuning_with_correction.sh 025
bash exp_scripts/dataset_0128/train_iterative_finetuning_with_correction.sh 050
bash exp_scripts/dataset_0128/train_iterative_finetuning_with_correction.sh 075
bash exp_scripts/dataset_0128/train_iterative_finetuning_with_correction.sh 100
# graph the results
python exp_scripts/generate_graphs.py 0128
$n = 128$ , synthesizing motions using those trained weights
# copy the checkpoints into a new folder, randomly choose 16 prompts from test split
python exp_scripts/prep_for_visualization.py 0128 16
# synthesize motions from checkpoints, then render 4 samples for each one
python sample/checkpoint_visual_sampler.py exp_outputs/dataset_0128/visualization 4
dataset size = 256
$n = 256$ , training from scratch
The logic for the case where the dataset has size $n = 256$ is analogous to the $n = 64$ case:
# train generation 0, then use it to seed other results
bash exp_scripts/dataset_0256/train_generation_0.sh
python exp_scripts/dataset_0256/copy_generation_0.py
# train generations 1 through 50
bash exp_scripts/dataset_0256/train_baseline.sh
bash exp_scripts/dataset_0256/train_iterative_finetuning.sh 025
bash exp_scripts/dataset_0256/train_iterative_finetuning.sh 050
bash exp_scripts/dataset_0256/train_iterative_finetuning.sh 075
bash exp_scripts/dataset_0256/train_iterative_finetuning.sh 100
bash exp_scripts/dataset_0256/train_iterative_finetuning_with_correction.sh 025
bash exp_scripts/dataset_0256/train_iterative_finetuning_with_correction.sh 050
bash exp_scripts/dataset_0256/train_iterative_finetuning_with_correction.sh 075
bash exp_scripts/dataset_0256/train_iterative_finetuning_with_correction.sh 100
# graph the results
python exp_scripts/generate_graphs.py 0256
$n = 256$ , synthesizing motions using those trained weights
# copy the checkpoints into a new folder, randomly choose 16 prompts from test split
python exp_scripts/prep_for_visualization.py 0256 16
# synthesize motions from checkpoints, then render 4 samples for each one
python sample/checkpoint_visual_sampler.py exp_outputs/dataset_0256/visualization 4
dataset size = 2794
$n = 2794$ , training from scratch
The logic for the case where the dataset has size $n = 2794$ is analogous to the $n = 64$ case:
# train generation 0, then use it to seed other results
bash exp_scripts/dataset_2794/train_generation_0.sh
python exp_scripts/dataset_2794/copy_generation_0.py
# train generations 1 through 50
bash exp_scripts/dataset_2794/train_baseline.sh
bash exp_scripts/dataset_2794/train_iterative_finetuning.sh 025
bash exp_scripts/dataset_2794/train_iterative_finetuning.sh 050
bash exp_scripts/dataset_2794/train_iterative_finetuning.sh 075
bash exp_scripts/dataset_2794/train_iterative_finetuning.sh 100
bash exp_scripts/dataset_2794/train_iterative_finetuning_with_correction.sh 025
bash exp_scripts/dataset_2794/train_iterative_finetuning_with_correction.sh 050
bash exp_scripts/dataset_2794/train_iterative_finetuning_with_correction.sh 075
bash exp_scripts/dataset_2794/train_iterative_finetuning_with_correction.sh 100
# graph the results
python exp_scripts/generate_graphs.py 2794
$n = 2794$ , synthesizing motions using those trained weights
# copy the checkpoints into a new folder, randomly choose 16 prompts from test split
python exp_scripts/prep_for_visualization.py 2794 16
# synthesize motions from checkpoints, then render 4 samples for each one
python sample/checkpoint_visual_sampler.py exp_outputs/dataset_2794/visualization 4
After a training run of iterative fine-tuning with correction finishes, you have the option of visualizing synthetic motions for a given generation before and after the UHC self-correction.
To do this, you can run the following script.
The command line argument should point to a folder which contains the synthetic_motions subdirectory.
python sample/visualize_motion_sequences.py exp_outputs/dataset_0064_seed_47/synthetic_percent_100_iterative_finetuning_with_correction/generation_50
We thank the authors of the works we build upon:
- Human Motion Diffusion Model
- Universal Humanoid Controller
- VPoser aka Human Body Prior
- Body Visualizer
- HumanML3D
- Conditional_Diffusion_MNIST
Our visualizations are inspired by
- PhysDiff: Physics-Guided Human Motion Diffusion Model
- RoHM: Robust Human Motion Reconstruction via Diffusion
If you find this code useful in your research, please cite:
@InProceedings{gillman2024selfcorrecting,
title={Self-Correcting Self-Consuming Loops for Generative Model Training},
author={Gillman, Nate and Freeman, Michael and Aggarwal, Daksh and Hsu, Chia-Hong and Luo, Calvin and Tian, Yonglong and Sun, Chen},
booktitle={Proceedings of the 41st International Conference on Machine Learning},
pages={15646--15677},
year={2024},
volume={235},
publisher={PMLR}
}