Self-training Room Layout Estimation via Geometry-aware Ray-casting
Bolivar Solarte, Chin-Hsuan Wu, Jin-Cheng Jhang, Jonathan Lee, Yi-Hsuan Tsai, Min Sun
National Tsing Hua University, Industrial Technology Research Institute (ITRI, Taiwan), and Google
This is the implementation of our proposed algorithm, multi-cycle ray-casting, for creating pseudo-labels to self-train 360 room layout models.
For convenience, we recommend using conda to create a new environment for this project.
conda create -n ray-casting-mlc python=3.9
conda activate ray-casting-mlc
For reproducibility, we recommend creating a workspace directory where the datasets, pre-trained models, training results, and other files will be stored. In this description, we assume that ${HOME}/ray_casting_mlc_ws is the workspace.
mkdir -p ${HOME}/ray_casting_mlc_ws
cd ${HOME}/ray_casting_mlc_ws
git clone https://github.com/EnriqueSolarte/ray_casting_mlc
cd ray_casting_mlc
pip install .
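As a quick sanity check of the installation, you can try importing the package from the activated environment. The snippet below is not part of the repository and assumes the package is importable as ray_casting_mlc; adjust the module name if the actual import path differs.
# sanity_check.py -- verify the installation (the module name `ray_casting_mlc`
# is an assumption; adjust it if the package uses a different import path).
import importlib
pkg = importlib.import_module("ray_casting_mlc")
print("ray_casting_mlc installed at:", pkg.__file__)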
With the publication of this project, we release a new dataset called hm3d_mvl. This dataset complements previous panorama datasets by adding multiple registered views as inputs for the task of self-training room layout estimation. It can be downloaded using the following command:
# Downloading the dataset hm3d_mvl
python experiments/download/mvl_datasets.py dataset=hm3d_mvl
Using the same format, we also provide the mp3d_fpe_mvl and zind_mvl datasets, processed from the official 360-MLC and ZiND repositories, respectively. For convenience, we call these the multi-view-layout datasets (mvl-datasets). They are publicly available at MVL-Datasets (Hugging Face 🤗). To download them, use the following commands:
# Downloading the dataset mp3d_fpe_mvl
python experiments/download/mvl_datasets.py dataset=mp3d_fpe_mvl
# Downloading the dataset zind_mvl
python experiments/download/mvl_datasets.py dataset=zind_mvl
# Downloading all the mvl-datasets
python experiments/download/mvl_datasets.py dataset=main
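The sketch below is a convenience check (not part of the repository) that lists what was downloaded; it assumes the download scripts place the mvl-datasets somewhere under the workspace directory created above, and the exact sub-directory layout may differ.
# list_mvl_datasets.py -- quick look at the downloaded data (assumes the
# datasets live under ${HOME}/ray_casting_mlc_ws).
from pathlib import Path

ws = Path.home() / "ray_casting_mlc_ws"
for d in sorted(p for p in ws.rglob("*_mvl") if p.is_dir()):
    n_files = sum(1 for f in d.rglob("*") if f.is_file())
    print(f"{d.relative_to(ws)}: {n_files} files")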
We provide pre-trained models for LGTNet and HorizonNet. These pre-trained weights are taken from the official repositories provided by the authors. They can be downloaded using the following command:
python experiments/download/pre_trained_models.py
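To verify the download, a small helper like the one below can be used; it assumes the weights are stored inside the workspace as .pth or .pkl files, which may not match the actual layout used by the script.
# list_checkpoints.py -- report the pre-trained weights found in the workspace
# (file locations and extensions are assumptions, not guaranteed by the repo).
from pathlib import Path

ws = Path.home() / "ray_casting_mlc_ws"
for ckpt in sorted(ws.rglob("*.pth")) + sorted(ws.rglob("*.pkl")):
    size_mb = ckpt.stat().st_size / 1e6
    print(f"{ckpt.relative_to(ws)} ({size_mb:.1f} MB)")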
After downloading the datasets, we can create the pseudo-labels using the LGTNet model as follows:
# To pre-compute 360-MLC pseudo-labels. This is needed to initialize the multi-cycle ray-casting process.
python experiments/lgt_net/pre_compute_mlc.py dataset=hm3d_mvl
# To pre-compute estimations from a pre-trained model
python experiments/lgt_net/pre_compute_xyz.py dataset=hm3d_mvl
# To run the multi-cycle ray-casting process
python experiments/lgt_net/multi_cycle_ray_casting.py dataset=hm3d_mvl
# To build pseudo-labels for the self-training process
python experiments/lgt_net/sampling_ray_casting_pseudo_labels.py dataset=hm3d_mvl
For convenience, we also provide similar scripts for the HorizonNet model.
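For intuition only, the toy sketch below illustrates the general idea behind ray-casting aggregation of multi-view layout estimates: boundary points projected onto the floor plane from several registered views are binned by ray direction around a reference camera, and a robust per-ray statistic forms the pseudo-boundary. This is a simplified illustration, not the implementation used by the scripts above, and all names in it are hypothetical.
# toy_ray_casting.py -- simplified illustration of per-ray aggregation of
# multi-view boundary points into a pseudo-label (NOT the paper's code).
import numpy as np

def ray_cast_aggregate(boundary_xy, cam_xy, num_rays=1024):
    """boundary_xy: (N, 2) floor-plane boundary points from all registered views.
    cam_xy: (2,) reference camera position on the floor plane.
    Returns a (num_rays,) array of per-ray boundary distances (NaN if a ray is empty)."""
    rel = boundary_xy - cam_xy                         # points relative to the camera
    angles = np.arctan2(rel[:, 1], rel[:, 0])          # ray direction of each point
    dists = np.linalg.norm(rel, axis=1)                # distance of each point
    bins = ((angles + np.pi) / (2 * np.pi) * num_rays).astype(int) % num_rays
    pseudo = np.full(num_rays, np.nan)
    for r in range(num_rays):
        d = dists[bins == r]
        if d.size:
            pseudo[r] = np.median(d)                   # robust aggregate per ray
    return pseudo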
After creating the pseudo-labels, we can run the self-training process using the following command:
python experiments/lgt_net/train/train_lgt_net_xyz.py dataset=hm3d_mvl
Results will be saved in the `${HOME}/ray_casting_mlc_ws/train_results` directory.
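After training, the saved runs can be inspected directly from the workspace. The snippet below only lists the run directories under the path stated above and assumes nothing about the files each run contains.
# list_train_results.py -- enumerate training runs saved in the workspace.
from pathlib import Path

results_dir = Path.home() / "ray_casting_mlc_ws" / "train_results"
if results_dir.exists():
    for run in sorted(results_dir.iterdir()):
        print(run.name)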
For the hm3d-mvl dataset and multi-cycle ray-casting, please cite the following paper:
@article{solarte2024_ray_casting_mlc,
title ={Self-training Room Layout Estimation via Geometry-aware Ray-casting},
author ={Bolivar Solarte and Chin-Hsuan Wu and Jin-Cheng Jhang and Jonathan Lee and Yi-Hsuan Tsai and Min Sun},
year ={2024},
url ={https://arxiv.org/abs/2407.15041},
}
For the mp3d-fpe-mvl dataset, please cite the following paper:
@article{Solarte2022_360_MLC,
title ={360-mlc: Multi-view layout consistency for self-training and hyper-parameter tuning},
author ={Solarte, Bolivar and Wu, Chin-Hsuan and Liu, Yueh-Cheng and Tsai, Yi-Hsuan and Sun, Min},
journal ={Advances in Neural Information Processing Systems (NeurIPS)},
volume ={35},
pages ={6133--6146},
year ={2022}
}
For the zind-mvl dataset, please cite the following paper:
@inproceedings{ZInD,
title = {Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts},
author = {Cruz, Steve and Hutchcroft, Will and Li, Yuguang and Khosravan, Naji and Boyadzhiev, Ivaylo and Kang, Sing Bing},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2021},
pages = {2133--2143}
}