MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer (CVPR 2022) [paper]
Kuan-Chih Huang, Tsung-Han Wu, Hung-Ting Su, Winston H. Hsu.
The code for the KITTI-360 dataset is now available in the kitti360 branch, and the results can be viewed on the KITTI-360 leaderboard.
Please refer to INSTALL.md for installation and to DATA.md for data preparation.
Move to root and train the network with $EXP_NAME:
cd $MonoDTR_ROOT
./launcher/train.sh config/config.py 0 $EXP_NAME
Note: this repo only supports single-GPU training. Also, training randomness in monocular 3D object detection may cause results to vary by ±1 AP3D.
To evaluate on the validation set using checkpoint $CHECKPOINT_PATH:
./launcher/eval.sh config/config.py 0 $CHECKPOINT_PATH validation
We provide a good checkpoint for the car category on the train/val split here.
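The training and evaluation commands above take the same leading arguments, so it can be handy to build them from shared variables. The following is a minimal sketch, not part of the repository: the helper names `train_cmd` and `eval_cmd`, the variable names, and the example experiment name and checkpoint path are all hypothetical, and reading the second positional argument (`0`) as the GPU index is an assumption. The helpers only print the commands so they can be checked before running anything.

```shell
#!/usr/bin/env bash
# Sketch only: helper and variable names are hypothetical, not from the repo.

CONFIG="config/config.py"   # config path used in the commands above
GPU_ID=0                    # assumption: the second argument is the GPU index

# Print the training command for the experiment name given as $1.
train_cmd() {
    printf './launcher/train.sh %s %s %s\n' "$CONFIG" "$GPU_ID" "$1"
}

# Print the validation-set evaluation command for the checkpoint path given as $1.
eval_cmd() {
    printf './launcher/eval.sh %s %s %s validation\n' "$CONFIG" "$GPU_ID" "$1"
}

# Preview both commands with placeholder values.
train_cmd "my_experiment"
eval_cmd "path/to/checkpoint.pth"
```

Once the printed lines look right, they can be copied into a shell opened at the repository root.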
If you find our work useful in your research, please consider citing:
@inproceedings{huang2022monodtr,
author = {Kuan-Chih Huang and Tsung-Han Wu and Hung-Ting Su and Winston H. Hsu},
title = {MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer},
booktitle = {CVPR},
year = {2022}
}
Our code is mainly based on visualDet3D and also benefits from CaDDN, MonoDLE, and LoFTR. Thanks for their contributions!
This project is released under the MIT License.