Wangbo Yu*, Jinbo Xing*, Li Yuan*, Wenbo Hu†, Xiaoyu Li, Zhipeng Huang,
Xiangjun Gao, Tien-Tsin Wong, Ying Shan, Yonghong Tian†
ViewCrafter can generate high-fidelity novel views from a single reference image or from sparse reference images, while also supporting highly precise camera pose control. Some examples are shown below:
Reference image | Camera trajectory | Generated novel view video |
Reference image 1 | Reference image 2 | Generated novel view video |
- [2024-10-15]: 🔥🔥 Release the code for sparse view novel view synthesis.
- [2024-09-01]: Launch the project page and update the arXiv preprint.
- [2024-09-01]: Release pretrained models and the code for single-view novel view synthesis.
| Model | Resolution | Frames | GPU Mem. & Inference Time (A100, DDIM 50 steps) | Checkpoint | Description |
|---|---|---|---|---|---|
| ViewCrafter_25 | 576x1024 | 25 | 23.5GB & 120s (`perframe_ae=True`) | Hugging Face | Used for single-view NVS; can also adapt to sparse-view NVS |
| ViewCrafter_25_sparse | 576x1024 | 25 | 23.5GB & 120s (`perframe_ae=True`) | Hugging Face | Used for sparse-view NVS |
| ViewCrafter_16 | 576x1024 | 16 | 18.3GB & 75s (`perframe_ae=True`) | Hugging Face | 16-frame model, used for ablation |
| ViewCrafter_25_512 | 320x512 | 25 | 13.8GB & 50s (`perframe_ae=True`) | Hugging Face | 512-resolution model, used for ablation |
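The checkpoints can also be fetched from the command line with `huggingface-cli`; this is a hedged sketch, assuming you substitute the repo ID and filename from the Hugging Face links in the table above:

```bash
# Sketch: scripted checkpoint download. <repo_id> and <filename> are
# placeholders; take the real values from the Hugging Face links above.
huggingface-cli download <repo_id> <filename> --local-dir checkpoints/
```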
```bash
git clone https://github.com/Drexubery/ViewCrafter.git
cd ViewCrafter

# Create conda environment
conda create -n viewcrafter python=3.9.16
conda activate viewcrafter
pip install -r requirements.txt

# Install PyTorch3D
conda install https://anaconda.org/pytorch3d/pytorch3d/0.7.5/download/linux-64/pytorch3d-0.7.5-py39_cu117_pyt1131.tar.bz2

# Download the DUSt3R checkpoint
mkdir -p checkpoints/
wget https://download.europe.naverlabs.com/ComputerVision/DUSt3R/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth -P checkpoints/
```
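As an optional sanity check (not part of the official instructions), you can confirm that PyTorch and PyTorch3D import cleanly and that CUDA is visible before running inference:

```bash
# Optional: verify the environment. Prints the torch version and whether
# a CUDA device is available for inference.
python -c "import torch, pytorch3d; print(torch.__version__, torch.cuda.is_available())"
```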
**Note:** Using a newer PyTorch version (e.g., torch 2.4) may cause CUDA OOM errors. Please refer to these issues for solutions.
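If you hit the OOM issue, one workaround (a sketch rather than an official fix) is to pin PyTorch to the build that the PyTorch3D wheel above targets: torch 1.13.1 with CUDA 11.7, per the `py39_cu117_pyt1131` tag in its filename:

```bash
# Pin torch/torchvision to match the PyTorch3D wheel (torch 1.13.1, cu117).
# The extra index URL is PyTorch's standard CUDA 11.7 wheel index.
pip install torch==1.13.1 torchvision==0.14.1 --extra-index-url https://download.pytorch.org/whl/cu117
```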
(1) Download the pretrained ViewCrafter_25 model and put `model.ckpt` at `checkpoints/model.ckpt`.

(2) Run `inference.py` using the following script. Please refer to the configuration document and the render document to set up the inference parameters and camera trajectory.

```bash
sh run.sh
```
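For orientation, `run.sh` is a thin wrapper around `inference.py`. The sketch below shows the rough shape of that call; the flag names and values are illustrative assumptions, so consult the configuration document for the actual parameters:

```bash
# Hypothetical sketch of the call inside run.sh. All flags shown are
# assumptions; see the configuration document for the real parameters
# and the render document for camera trajectory setup.
python inference.py \
  --ckpt_path checkpoints/model.ckpt \
  --image_dir <path/to/reference_image> \
  --traj_txt <path/to/camera_trajectory.txt> \
  --video_length 25 \
  --ddim_steps 50
```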
(1) Download the pretrained ViewCrafter_25_sparse model and put `model_sparse.ckpt` at `checkpoints/model_sparse.ckpt`. (ViewCrafter_25_sparse is trained specifically for the sparse-view NVS task and performs better on it than ViewCrafter_25.)

(2) Run `inference.py` using the following script. Adjust the `--bg_trd` parameter to clean the point cloud: higher values produce a cleaner point cloud but may create holes in the background (see the sketch after the command).

```bash
sh run_sparse.sh
```
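The `--bg_trd` flag lives inside `run_sparse.sh`. A hedged example of tuning it, where the value 0.2 and the other flags are illustrative assumptions rather than recommended defaults:

```bash
# Sketch: raise --bg_trd for a cleaner point cloud, at the risk of holes
# in the background. The value 0.2 and the other flags are assumptions.
python inference.py \
  --ckpt_path checkpoints/model_sparse.ckpt \
  --image_dir <path/to/sparse_reference_images> \
  --bg_trd 0.2
```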
Download the pretrained ViewCrafter_25 model and put `model.ckpt` at `checkpoints/model.ckpt`, then run:

```bash
python gradio_app.py
```
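Gradio serves on port 7860 by default; to change the port or expose the demo on all interfaces, Gradio's standard environment variables apply (these are Gradio defaults, not project-specific settings):

```bash
# Serve the demo on all interfaces on a custom port using Gradio's
# standard environment variables.
GRADIO_SERVER_NAME=0.0.0.0 GRADIO_SERVER_PORT=7860 python gradio_app.py
```

Then open `http://localhost:7860` in your browser.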
Please consider citing our paper if you find our code useful:

```bibtex
@article{yu2024viewcrafter,
  title={ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis},
  author={Yu, Wangbo and Xing, Jinbo and Yuan, Li and Hu, Wenbo and Li, Xiaoyu and Huang, Zhipeng and Gao, Xiangjun and Wong, Tien-Tsin and Shan, Ying and Tian, Yonghong},
  journal={arXiv preprint arXiv:2409.02048},
  year={2024}
}
```