Repository providing the source code for the paper
Perception Matters: Enhancing Embodied AI with Uncertainty-Aware Semantic Segmentation
Sai Prasanna, Daniel Honerkamp*, Kshitij Sirohi*, Tim Welschehold, Wolfram Burgard, and Abhinav Valada
Please cite the paper as follows:
@article{prasanna2024perception,
title={Perception Matters: Enhancing Embodied AI with Uncertainty-Aware Semantic Segmentation},
author={Sai Prasanna and Daniel Honerkamp and Kshitij Sirohi and Tim Welschehold and Wolfram Burgard and Abhinav Valada},
journal={Proceedings of the International Symposium on Robotics Research (ISRR)},
year={2024}
}
- Obtain the API username and token for the HM3D dataset from Matterport by following their instructions, and set them as environment variables:
export USERNAME=<API_TOKEN_USER_ID>
export PASSWORD=<API_TOKEN>
- Run setup.sh to create the conda environment.
- Download the EMSANet checkpoint from
https://drive.google.com/uc?id=1LD4_g-jL4KJPRUmCGgXxx2xGQ7TNZ_o2
and extract it:
tar -xvf checkpoint.tar.gz -C ./third_party/trained_models/
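Before running the scripts, it can help to confirm that the prerequisites above are in place. The following is a minimal sketch and not part of this repository: the helper name `check_setup` is hypothetical, and it assumes only the `USERNAME` and `PASSWORD` variables and the `./third_party/trained_models/` directory named in the steps above.

```python
import os
from pathlib import Path


def check_setup(env=None, model_dir="./third_party/trained_models"):
    """Return a list of problems found; an empty list means all checks passed.

    Hypothetical helper for illustration, not part of this repository.
    """
    env = os.environ if env is None else env
    problems = []
    # The HM3D download credentials are expected as environment variables.
    for name in ("USERNAME", "PASSWORD"):
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")
    # The checkpoint archive should have been extracted into model_dir.
    path = Path(model_dir)
    if not path.is_dir() or not any(path.iterdir()):
        problems.append(f"no extracted checkpoint found in {path}")
    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print("WARNING:", problem)
```

Note that `USERNAME` is often already defined by the operating system or shell, so a passing check does not prove that the Matterport credentials were exported correctly.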
To evaluate the aggregation approaches with the shortest path policy, run
./scripts/eval_sp_policy_emsanet.sh
./scripts/eval_sp_policy_maskrcnn.sh
./scripts/eval_sp_policy_segformer.sh
To train the RL policy on ground truth semantics and evaluate it with different semantic models and aggregation approaches, run
./scripts/train_rl_policy.sh
./scripts/eval_rl_policy_emsanet.sh
./scripts/eval_rl_policy_maskrcnn.sh
./scripts/eval_rl_policy_segformer.sh
- Collect the data for calibrating the perception model. Run
python -m sem_objnav.obj_nav.collect_seg_data --output_dir calibation_dataset
- Check the notebooks
sem_objnav/notebooks/emsanet_scaling_temp.ipynb
and sem_objnav/notebooks/segformer_scaling_temp.ipynb
for calibration.
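The notebook names suggest temperature scaling, in which the logits are divided by a single scalar T fitted on held-out data to minimize the negative log-likelihood. The following is a generic sketch of that technique under that assumption, not the notebooks' code: the function names, the grid-search fit, and the toy data are illustrative choices, and a real implementation would operate on per-pixel logits with a tensor library rather than Python lists.

```python
import math


def nll(logits, labels, temperature):
    """Mean negative log-likelihood of labels under softmax(logits / temperature)."""
    total = 0.0
    for row, label in zip(logits, labels):
        scaled = [z / temperature for z in row]
        peak = max(scaled)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
        total += log_norm - scaled[label]
    return total / len(labels)


def fit_temperature(logits, labels, low=0.05, high=10.0, steps=200):
    """Grid-search the scalar temperature that minimizes NLL on held-out data."""
    candidates = [low + i * (high - low) / (steps - 1) for i in range(steps)]
    return min(candidates, key=lambda t: nll(logits, labels, t))


if __name__ == "__main__":
    # Toy held-out set: the model always predicts class 0 with a large margin
    # but is right only three times out of four, so it is overconfident.
    logits = [[10.0, 0.0]] * 4
    labels = [0, 0, 0, 1]
    t = fit_temperature(logits, labels)
    print(f"fitted temperature: {t:.2f}")
    print(f"NLL before: {nll(logits, labels, 1.0):.3f}, after: {nll(logits, labels, t):.3f}")
```

A fitted temperature above 1 softens overconfident predictions without changing the predicted class, since dividing all logits by the same positive scalar preserves their ordering.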
To collect data and train the models used in stubborn, run
./scripts/train_stubborn.sh
To find optimal hyperparameters for the aggregation strategies, run
./scripts/htune.sh