This is the official implementation of LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.
Project Page | arXiv | Weights
[2024.4.3] Thanks to @yxymessi and @florinshen, we have fixed a severe bug in rotation normalization. We then finetuned the model with the correct normalization for 30 more epochs and uploaded new checkpoints.
Thanks to @camenduru!
# xformers is required! please refer to https://github.com/facebookresearch/xformers for details.
# for example, we use torch 2.1.0 + cuda 11.8
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install -U xformers --index-url https://download.pytorch.org/whl/cu118
# a modified gaussian splatting (+ depth, alpha rendering)
git clone --recursive https://github.com/ashawkey/diff-gaussian-rasterization
pip install ./diff-gaussian-rasterization
# for mesh extraction
pip install git+https://github.com/NVlabs/nvdiffrast
# other dependencies
pip install -r requirements.txt
Our pretrained weights can be downloaded from Hugging Face.
For example, to download the fp16 model for inference:
mkdir pretrained && cd pretrained
wget https://huggingface.co/ashawkey/LGM/resolve/main/model_fp16_fixrot.safetensors
cd ..
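If `wget` is not available, the same download can be sketched in Python with only the standard library. The helper names `weight_url` and `download_weight` are illustrative and not part of the LGM codebase; the URL pattern is copied from the `wget` command above.

```python
import urllib.request
from pathlib import Path

# Repository and file name taken from the wget command above.
HF_REPO = "ashawkey/LGM"
FP16_WEIGHT = "model_fp16_fixrot.safetensors"


def weight_url(filename, repo=HF_REPO, revision="main"):
    """Build the direct-download URL for a file in the model repository."""
    return f"https://huggingface.co/{repo}/resolve/{revision}/{filename}"


def download_weight(filename=FP16_WEIGHT, out_dir="pretrained"):
    """Download a checkpoint into out_dir, skipping files that already exist."""
    target = Path(out_dir) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        urllib.request.urlretrieve(weight_url(filename), str(target))
    return target
```

Calling `download_weight()` with no arguments would place the fp16 checkpoint in `pretrained/`, which mirrors the three shell commands above.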
For MVDream and ImageDream, we use a diffusers implementation. Their weights will be downloaded automatically.
Inference takes about 10 GB of GPU memory (with ImageDream, MVDream, and our LGM all loaded).
### gradio app for both text/image to 3D
python app.py big --resume pretrained/model_fp16_fixrot.safetensors
### test
# --workspace: folder to save output (*.ply and *.mp4)
# --test_path: path to a folder containing images, or a single image
python infer.py big --resume pretrained/model_fp16_fixrot.safetensors --workspace workspace_test --test_path data_test
### local gui to visualize saved ply
python gui.py big --output_size 800 --test_path workspace_test/saved.ply
### mesh conversion
python convert.py big --test_path workspace_test/saved.ply
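Before opening a saved `.ply` in the GUI or converting it to a mesh, it can help to check that the file is intact. The following is a minimal sketch that reads only the standard PLY header; `read_ply_header` is a hypothetical helper, and it assumes the saved file follows the usual Gaussian splatting layout with one `vertex` element per Gaussian.

```python
def read_ply_header(path):
    """Return (vertex_count, property_names) parsed from a PLY header."""
    vertex_count = 0
    properties = []
    in_vertex = False
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} is not a PLY file")
        for raw in f:
            tokens = raw.decode("ascii").split()
            if not tokens:
                continue
            if tokens[0] == "end_header":
                break  # stop before the binary payload
            if tokens[0] == "element":
                # Track whether following properties belong to the vertex element.
                in_vertex = tokens[1] == "vertex"
                if in_vertex:
                    vertex_count = int(tokens[2])
            elif tokens[0] == "property" and in_vertex:
                properties.append(tokens[-1])
    return vertex_count, properties
```

For example, `read_ply_header("workspace_test/saved.ply")` would report how many Gaussians were saved and which per-Gaussian attributes the file stores.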
For more options, please check options.
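To run inference over several input folders, the `infer.py` command above can be wrapped in a small script. This is a minimal sketch: `build_infer_cmd` and `run_batch` are hypothetical helpers, the per-input workspace naming is an assumption, and only the flags shown above (`--resume`, `--workspace`, `--test_path`) are used.

```python
import subprocess
import sys


def build_infer_cmd(test_path, workspace,
                    resume="pretrained/model_fp16_fixrot.safetensors",
                    config="big"):
    """Assemble the infer.py command line for one input folder or image."""
    return [sys.executable, "infer.py", config,
            "--resume", resume,
            "--workspace", workspace,
            "--test_path", test_path]


def run_batch(test_paths, workspace_root="workspace_test", dry_run=True):
    """Build (and optionally run) one infer.py command per input path."""
    cmds = []
    for i, path in enumerate(test_paths):
        # Give each input its own numbered workspace so outputs do not collide.
        cmd = build_infer_cmd(path, f"{workspace_root}/{i:03d}")
        cmds.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return cmds
```

With `dry_run=True` the function only returns the command lists, which makes it easy to inspect them before launching anything on the GPU.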
NOTE: Since the dataset used in our training is hosted on AWS, it cannot be used directly for training in a new environment. We provide the necessary training code framework; please check and modify the dataset implementation to fit your own data!
We also provide the ~80K subset of Objaverse used to train LGM in objaverse_filter.
# debug training
accelerate launch --config_file acc_configs/gpu1.yaml main.py big --workspace workspace_debug
# training (use slurm for multi-node training)
accelerate launch --config_file acc_configs/gpu8.yaml main.py big --workspace workspace
This work is built on many amazing research works and open-source projects. Thanks a lot to all the authors for sharing!
@article{tang2024lgm,
title={LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation},
author={Tang, Jiaxiang and Chen, Zhaoxi and Chen, Xiaokang and Wang, Tengfei and Zeng, Gang and Liu, Ziwei},
journal={arXiv preprint arXiv:2402.05054},
year={2024}
}