3DTopia-XL: High-Quality 3D PBR Asset Generation via Primitive Diffusion

TL;DR

3DTopia-XL is a 3D diffusion transformer (DiT) operating on a primitive-based representation.
It can generate 3D assets with smooth geometry and PBR materials from a single image or text.

Paper | Project Page | Video | Weights | Hugging Face 🤗


News

[09/2024] Technical report released!

[09/2024] Hugging Face demo released!

[08/2024] Inference code released!

Citation

If you find our work useful for your research, please consider citing this paper:

@article{chen2024primx,
  title={3DTopia-XL: High-Quality 3D PBR Asset Generation via Primitive Diffusion},
  author={Chen, Zhaoxi and Tang, Jiaxiang and Dong, Yuhao and Cao, Ziang and Hong, Fangzhou and Lan, Yushi and Wang, Tengfei and Xie, Haozhe and Wu, Tong and Saito, Shunsuke and Pan, Liang and Lin, Dahua and Liu, Ziwei},
  journal={arXiv preprint arXiv:2409.12957},
  year={2024}
}

Installation

We highly recommend using Anaconda to manage your Python environment. You can set up the required environment with the following commands:

# create and activate the environment
conda create -n primx python=3.9
conda activate primx
# install PyTorch
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=11.8 -c pytorch -c nvidia
# xformers is required for efficient attention
conda install xformers::xformers
# install other dependencies
pip install -r requirements.txt
# compile third-party libraries
bash install.sh
# Now, all done!
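
After installation, a quick sanity check can confirm that the main packages are visible in the active environment. The snippet below is a minimal sketch and not part of this repository: the helper name `check_environment` and the default package list are assumptions for illustration. It only reads installed package metadata and does not import the packages, so it cannot tell whether CUDA itself works.

```python
from importlib import metadata


def check_environment(packages=("torch", "torchvision", "torchaudio", "xformers")):
    """Return {package name: installed version, or None if it is not found}."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package has no metadata in the active environment.
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in check_environment().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If any entry is reported as not installed, the most likely cause is that the `primx` environment was not active when the install commands were run.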

Pretrained Weights

Our pretrained weights can be downloaded from Hugging Face.

For example, to download the single-view-conditioned model in fp16 precision for inference:

mkdir pretrained && cd pretrained
# download DiT
wget https://huggingface.co/FrozenBurning/3DTopia-XL/resolve/main/model_sview_dit_fp16.pt
# download VAE
wget https://huggingface.co/FrozenBurning/3DTopia-XL/resolve/main/model_vae_fp16.pt
cd ..
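
On systems without `wget`, the same two files can be fetched from Python. The following is a minimal sketch using only the standard library; the URLs and file names are taken from the commands above, while the helper names `weight_url` and `download_weights` are hypothetical and not part of this repository.

```python
import os
import urllib.request

# Base URL and file names copied from the wget commands above.
BASE_URL = "https://huggingface.co/FrozenBurning/3DTopia-XL/resolve/main"
WEIGHT_FILES = ("model_sview_dit_fp16.pt", "model_vae_fp16.pt")


def weight_url(filename):
    """Build the direct download URL for one weight file."""
    return f"{BASE_URL}/{filename}"


def download_weights(target_dir="pretrained", files=WEIGHT_FILES):
    """Download each file into target_dir, skipping files that already exist."""
    os.makedirs(target_dir, exist_ok=True)
    paths = []
    for name in files:
        path = os.path.join(target_dir, name)
        if not os.path.exists(path):
            urllib.request.urlretrieve(weight_url(name), path)
        paths.append(path)
    return paths
```

Calling `download_weights()` from the repository root should leave both files under `pretrained/`, matching the layout produced by the shell commands.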

We will release the multiview-conditioned model and text-conditioned model in the near future!

Inference

Gradio Demo

The Gradio demo will automatically download the pretrained weights using huggingface_hub.

You can launch our demo locally with the Gradio UI by running:

python app.py

Alternatively, you can run the demo online.

CLI Test

Run the following command for inference:

python inference.py ./configs/inference_dit.yml

You can also adjust the inference parameters in inference_dit.yml, described below:

Parameter	Recommended	Description
`input_dir`	-	Path to the folder that stores all input images.
`ddim`	25, 50, 100	Total number of DDIM steps. More steps give more robust results; fewer steps run faster.
`cfg`	4 - 7	The scale for Classifier-free Guidance (CFG).
`seed`	Any	Random seed. Different seeds produce different results.
`export_glb`	True	Whether to export a textured mesh in GLB format after DDIM sampling finishes.
`fast_unwrap`	False	Whether to enable the fast UV unwrapping algorithm.
`decimate`	100000	The maximum number of faces for mesh extraction.
`mc_resolution`	256	The resolution of the unit cube for marching cubes.
`remesh`	False	Whether to run retopology after mesh extraction.
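
To make the recommended values above concrete, here is a minimal sketch that checks a set of parameters against the table before a run. It is an illustration only: the function `check_inference_params` is hypothetical, the flat dictionary is an assumption (the actual layout of inference_dit.yml may nest these keys), and `./my_images` is a placeholder path.

```python
# Recommended values copied from the table above.
RECOMMENDED_DDIM_STEPS = (25, 50, 100)
RECOMMENDED_CFG_RANGE = (4, 7)


def check_inference_params(params):
    """Return warnings for values that fall outside the recommended settings."""
    warnings = []
    ddim = params.get("ddim")
    if ddim is not None and ddim not in RECOMMENDED_DDIM_STEPS:
        warnings.append(f"ddim={ddim} is not one of {RECOMMENDED_DDIM_STEPS}")
    cfg = params.get("cfg")
    low, high = RECOMMENDED_CFG_RANGE
    if cfg is not None and not low <= cfg <= high:
        warnings.append(f"cfg={cfg} is outside the recommended range {low} - {high}")
    return warnings


# Hypothetical parameter set using the recommended values.
example_params = {
    "input_dir": "./my_images",  # placeholder path
    "ddim": 50,
    "cfg": 6,
    "seed": 42,
    "export_glb": True,
    "fast_unwrap": False,
    "decimate": 100000,
    "mc_resolution": 256,
    "remesh": False,
}

if __name__ == "__main__":
    for message in check_inference_params(example_params):
        print("warning:", message)
```

Values outside the recommended settings are not necessarily invalid; the check only flags them so that unusual results can be traced back to the configuration.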

Training

We will release the training code and details in the future!

Acknowledgement

This work is built on many amazing research works and open-source projects. Thanks to all the authors for sharing!
