by Jingwen Wang, Juan Tarrio, Lourdes Agapito, Pablo F. Alcantarilla, Alexander Vakhitov.
Jingwen Wang ([email protected]) is the original author of the core of the method and the evaluaton scripts. Alexander Vakhitov ([email protected]) is the author of the QPOS over-segmentation method implementation.
This repository contains the code to train and evaluate the method and the link to the Semantic Mapping with Realsense dataset.
python=3.9 pytorch=1.11.0 cuda=11.3
conda env create -f environment.yml
conda activate semlaps
pytorch3d
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py39_cu113_pyt1110/download.html
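To quickly verify the installation, a minimal sanity check (not part of the repository) such as the following can be run:

```python
# Minimal environment sanity check (not part of the repository).
import torch
import pytorch3d

print("PyTorch:", torch.__version__)          # expected: 1.11.0
print("CUDA available:", torch.cuda.is_available())
print("CUDA build:", torch.version.cuda)      # expected: 11.3
print("PyTorch3D:", pytorch3d.__version__)
```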
You can download ScanNet by following their official instructions. Apart from the basic data, you will also need the 2D semantic GT and the 3D meshes with GT labels. You should expect to have the following file types:
- `.sens`: extracted to `depth`, `pose`, `color`, `intrinsic`
- `.txt`: some meta-data
- `_vh_clean_2.labels.ply`: 3D GT mesh
- `_2d-label-filt.zip`: 2D semantic data
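For reference, the `.sens` files are usually extracted with the `SensorData` helper from the official ScanNet SensReader; a minimal sketch, assuming the python `SensorData.py` from the ScanNet repository is on your `PYTHONPATH` and using a hypothetical scene path:

```python
# Sketch of .sens extraction, assuming the official ScanNet
# SensReader (python/SensorData.py) is importable.
import os
from SensorData import SensorData

scene_dir = "scans/scene0645_00"  # hypothetical scene directory
sd = SensorData(os.path.join(scene_dir, "scene0645_00.sens"))
sd.export_depth_images(os.path.join(scene_dir, "depth"))
sd.export_color_images(os.path.join(scene_dir, "color"))
sd.export_poses(os.path.join(scene_dir, "pose"))
sd.export_intrinsics(os.path.join(scene_dir, "intrinsic"))
```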
You will also need to process the raw 2D semantic images: resize them to 640x480 and convert the semantic encoding from NYU-40 to ScanNet-20.
Unfortunately, we are not able to provide the code for this part. Please refer to the official ScanNet GitHub or raise an issue if you have questions.
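As a rough illustration of what this preprocessing involves, here is a hedged sketch; the NYU-40 to ScanNet-20 id list below is the standard ScanNet benchmark mapping and should be double-checked against the official label mapping files:

```python
# Sketch: resize 2D label images to 640x480 and remap NYU-40 ids to the
# 20-class ScanNet benchmark ids (0 = unlabelled / ignored).
import numpy as np
from PIL import Image

# Standard ScanNet benchmark class ids in NYU-40 space (verify against
# the official label mapping before use).
VALID_CLASS_IDS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
                   11, 12, 14, 16, 24, 28, 33, 34, 36, 39]

lut = np.zeros(41, dtype=np.uint8)          # NYU-40 id -> ScanNet-20 id
for new_id, nyu_id in enumerate(VALID_CLASS_IDS, start=1):
    lut[nyu_id] = new_id

def convert_label_image(in_path, out_path):
    label = Image.open(in_path)
    # nearest-neighbour resize so that class ids are not interpolated
    label = label.resize((640, 480), Image.NEAREST)
    label = np.asarray(label)
    Image.fromarray(lut[np.clip(label, 0, 40)]).save(out_path)
```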
Please find the Semantic Mapping with Realsense (SMR) dataset here.
python create_fragments_n_views.py --scannet_root ${scannet_root} --save_files_root image_pairs
This will create the multi-view training indices (triplets of camera frames) for all 1513 scenes in the ScanNet train/val splits.
Training:
python train_lpn.py --config configs/config_lpn.yaml --scannet_root /media/jingwen/Data2/scannet/scans --log_dir exps/LPN
LPN supports 4 different modes for you to explore:
- Multi-view RGBD (default): rgb and depth fusion with SSMA + feature warping (w/ depth, camera poses and K); see the warping sketch after this list
  `modality=rgbd, use_ssma=True, reproject=True`
- Multi-view RGBD with RGB feature: rgb-only encoder, no SSMA, depth only used in feature warping (w/ depth, camera poses and K)
  `modality=rgbd, use_ssma=False, reproject=True`
- Single-view RGBD: rgb and depth fusion with SSMA
  `modality=rgbd, use_ssma=True, reproject=False`
- Single-view RGB: rgb input only
  `modality=rgb, use_ssma=False, reproject=False`
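The feature warping used in the multi-view modes reprojects encoder features from a neighbouring (source) view into the reference view using the reference depth, the relative camera pose and the intrinsics K. The following is a conceptual sketch of that operation with `grid_sample`; it is not the repository's implementation and all names in it are hypothetical:

```python
# Conceptual sketch of depth-based feature warping from a source view
# into the reference view using depth, relative pose and intrinsics
# (not the repository's implementation).
import torch
import torch.nn.functional as F

def warp_src_to_ref(feat_src, depth_ref, K, T_src_ref):
    """
    feat_src:  (1, C, H, W) encoder features of the source view
    depth_ref: (1, 1, H, W) depth map of the reference view
    K:         (3, 3) camera intrinsics (assumed shared by both views)
    T_src_ref: (4, 4) transform taking reference-camera points to the source camera
    Returns the source features resampled into the reference view.
    """
    _, _, H, W = depth_ref.shape
    device = depth_ref.device

    # pixel grid of the reference view in homogeneous coordinates
    v, u = torch.meshgrid(torch.arange(H, device=device),
                          torch.arange(W, device=device), indexing="ij")
    pix = torch.stack([u, v, torch.ones_like(u)], dim=0).float().reshape(3, -1)

    # unproject reference pixels to 3D with the reference depth
    pts_ref = torch.linalg.inv(K) @ pix * depth_ref.reshape(1, -1)
    pts_ref_h = torch.cat([pts_ref, torch.ones(1, H * W, device=device)], dim=0)

    # transform into the source camera and project with K
    pts_src = (T_src_ref @ pts_ref_h)[:3]
    proj = K @ pts_src
    uv = proj[:2] / proj[2:].clamp(min=1e-6)

    # normalise pixel coordinates to [-1, 1] for grid_sample
    grid = torch.stack([2 * uv[0] / (W - 1) - 1,
                        2 * uv[1] / (H - 1) - 1], dim=-1).reshape(1, H, W, 2)
    return F.grid_sample(feat_src, grid, align_corners=True)
```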
Evaluation script for LPN (2D):
python eval_lpn.py --log_dir exps/LPN --dataset_type scannet --dataset_root /media/jingwen/Data3/scannet/scans --save_dir exps/LPN/eval/scannet_val --eval
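The 2D metric reported for this step is mean IoU over the ScanNet-20 classes. If you want to re-compute it from saved predictions yourself, a minimal confusion-matrix sketch (not the repository's evaluation code; it assumes predictions use ids 1..20 and 0 marks unlabelled ground truth) looks like this:

```python
# Minimal mean-IoU computation from per-pixel labels (not the repo script).
# Assumes ground truth uses 0 for unlabelled pixels and predictions are 1..20.
import numpy as np

NUM_CLASSES = 20

def confusion_matrix(gt, pred, num_classes=NUM_CLASSES):
    mask = gt > 0                                   # ignore unlabelled pixels
    idx = (gt[mask] - 1) * num_classes + (pred[mask] - 1)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes,
                                                                num_classes)

def mean_iou(conf):
    inter = np.diag(conf).astype(np.float64)
    union = conf.sum(0) + conf.sum(1) - np.diag(conf)
    iou = inter / np.maximum(union, 1)
    return iou, iou.mean()
```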
segment_suffix=segments/QPOS
python run_qpos.py --segment_suffix ${segment_suffix} --dataset_type scannet --dataset_root ${scannet_root} --small_segment_size 30 --expected_segment_size 60
Segments will be saved under ${scannet_root}/${scene}/${segment_suffix}
for each scene. Note that for the slamcore sequences the segment sizes have to be adjusted: --small_segment_size 120 --expected_segment_size 240
log_dir=exps/LPN
label_fusion_dir=exps/LPN_labels_3D
python eval_lpn_bayesian_label.py --log_dir ${log_dir} --dataset_type scannet --dataset_root ${scannet_root} --save_dir ${label_fusion_dir}
Labelled meshes will be saved under ${label_fusion_dir}/${scene}
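Conceptually, the Bayesian label fusion accumulates the per-class probabilities predicted for each frame onto the mesh vertices they project to, and the fused vertex label is the argmax of the accumulated log-probabilities. A simplified sketch of this accumulation (not the repository implementation; `project_vertices` is a hypothetical visibility/projection helper):

```python
# Conceptual sketch of multi-view Bayesian label fusion onto mesh vertices.
import numpy as np

def fuse_labels(num_vertices, num_classes, frames, project_vertices):
    """
    frames: iterable of (prob_map, pose, K), with prob_map of shape (C, H, W)
            holding the per-pixel class probabilities of the 2D network.
    project_vertices: hypothetical helper returning the indices of the mesh
            vertices visible in the frame and their pixel coordinates (u, v).
    """
    log_prob = np.zeros((num_vertices, num_classes), dtype=np.float64)
    for prob_map, pose, K in frames:
        vis_idx, u, v = project_vertices(pose, K, prob_map.shape[1:])
        # Bayesian update: multiplying probabilities == adding log-probabilities
        log_prob[vis_idx] += np.log(prob_map[:, v, u].T + 1e-12)
    return log_prob.argmax(axis=1)          # fused per-vertex labels
```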
python prepare_3d_training_data.py --label_fusion_dir ${label_fusion_dir} --segment_suffix ${segment_suffix} --dataset_type scannet --dataset_root ${scannet_root} --save_mesh
The training data will be saved under ${label_fusion_dir}/${scene}/${segment_suffix}
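SegConvNet classifies the QPOS segments rather than individual vertices, so preparing the 3D training data essentially means pooling the fused per-vertex information into one descriptor and one label per segment. A hedged sketch of that pooling idea (the exact features and file format used by the repository may differ):

```python
# Sketch: pool fused per-vertex probabilities and GT labels into
# per-segment training samples (assumed format, not the repository's).
import numpy as np

def pool_segments(vertex_probs, vertex_gt, segment_ids):
    """
    vertex_probs: (V, C) fused class probabilities per vertex
    vertex_gt:    (V,)   ground-truth label per vertex
    segment_ids:  (V,)   QPOS segment id per vertex
    """
    seg_feats, seg_labels = {}, {}
    for seg in np.unique(segment_ids):
        mask = segment_ids == seg
        seg_feats[seg] = vertex_probs[mask].mean(axis=0)         # mean-pooled feature
        seg_labels[seg] = np.bincount(vertex_gt[mask]).argmax()  # majority GT label
    return seg_feats, seg_labels
```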
python train_segconvnet.py --config configs/config_segconvnet.yaml --log_dir exps/SegConvNet --label_fusion_dir ${label_fusion_dir} --segment_suffix ${segment_suffix}
Evaluation script for the SegConvNet:
segconv_logdir=exps/SegConvNet
python eval_segconvnet.py --log_dir ${segconv_logdir} --dataset_type scannet --dataset_root ${scannet_root} --label_fusion_dir ${label_fusion_dir} --segment_suffix ${segment_suffix} --save_dir exps/SegConvNet_labels
To reproduce results on the SMR dataset from the paper, please do steps 1, 2, 3 and then run the evaluation script for the SegConvNet eval_segconvnet.py.
First download the example ScanNet scene scene0645_00 from here and extract it under $SCANNET_ROOT. You should then expect the following directory structure:
$SCANNET_ROOT
├── scene0645_00
│   ├── color
│   │   ├── 0.jpg
│   │   ├── 1.jpg
│   │   ├── ...
│   ├── depth
│   │   ├── 0.png
│   │   ├── 1.png
│   │   ├── ...
│   ├── intrinsic
│   ├── pose
│   ├── scene0645_00_vh_clean_2.ply
│   ├── ...
Then you need to update the ScanNet root path mapping here: simply put your hostname and $SCANNET_ROOT as the key and value.
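The mapping is just a hostname-to-path dictionary; for example (the variable name below is hypothetical, edit the dictionary in the linked file instead):

```python
# Hostname -> ScanNet root mapping (hypothetical variable name; edit the
# dictionary in the linked file).
SCANNET_ROOT_MAPPING = {
    "my-machine": "/data/scannet/scans",   # output of `hostname` -> $SCANNET_ROOT
}
```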
Then you also need to download the checkpoint files and extract them under $EXP_DIR. Then run the following command:
python sequential_runner_scannet.py --exp_dir $EXP_DIR --scene scene0645_00 --mapping_every 20 --skip 1
This will save the results under $EXP_DIR/scannet/scene0645_00_skip20
First download the example RealSense sequence kitchen1 from here and extract it under $SLAMCORE_ROOT. You should then expect the following directory structure:
$SLAMCORE_ROOT
├── kitchen1
│   ├── color
│   │   ├── 0.png
│   │   ├── 10.png
│   │   ├── 20.png
│   │   ├── ...
│   ├── depth
│   │   ├── 0.png
│   │   ├── 10.png
│   │   ├── 20.png
│   │   ├── ...
│   ├── pose
│   │   ├── 0.txt
│   │   ├── 10.txt
│   │   ├── 20.txt
│   ├── align.txt
│   ├── K.txt
│   ├── global_map_mesh.clean.ply
│   ├── ...
Then you need to update the SMR root path mapping here, in the same way as for ScanNet.
Note that `align.txt` is a transformation matrix (translation only) that shifts the origin to approximately `np.min(verts, axis=0)`. You can simply save it when creating the mesh.
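For example, a translation-only 4x4 matrix of this form can be written alongside the mesh (a sketch, assuming trimesh is used to load the mesh; the sign convention depends on how align.txt is applied in the runner, so verify it against the code):

```python
# Sketch: write align.txt as a translation-only 4x4 matrix derived from
# the minimum corner of the mesh.
import numpy as np
import trimesh

mesh = trimesh.load("global_map_mesh.clean.ply", process=False)
t = np.asarray(mesh.vertices).min(axis=0)   # approximately np.min(verts, axis=0)

align = np.eye(4)
align[:3, 3] = t    # or -t, depending on how align.txt is applied in the runner
np.savetxt("align.txt", align)
```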
Then download the checkpoint files and extract them under $EXP_DIR, as above. Then run the following command:
python sequential_runner_slamcore.py --exp_dir $EXP_DIR --scene kitchen1 --model_type LPN_rgb --mapping_every 20
This will save the results under $EXP_DIR/slamcore/kitchen1