Note that the mIoU is slightly higher than the values reported in our paper (see the logs for more details). Below are the changes made in the training phase:
- We use SyncBN instead of BN/GN, which improves the mIoU by ~0.3.
- The total number of training epochs is 15 instead of 24, which significantly reduces the training time.
- The models are trained with annotation-v0.1 (with fewer occupancy artifacts).
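
As a rough illustration of the first two changes, the sketch below shows how they might look in an MMDetection-style Python config. The key names (`norm_cfg`, `runner`, `EpochBasedRunner`) follow the common MMCV convention and are assumptions here, not values copied from this repository's config files; only the SyncBN choice and the 15 vs. 24 epoch counts come from the list above.

```python
# Hypothetical MMDetection-style config fragment. The key names are
# assumptions for illustration, not taken from this repository's configs.

# Synchronized BatchNorm shares batch statistics across GPUs (replaces BN/GN).
norm_cfg = dict(type="SyncBN", requires_grad=True)

# Shortened schedule: 15 epochs instead of the 24 used for the paper results.
runner = dict(type="EpochBasedRunner", max_epochs=15)

# Fraction of epochs saved relative to the 24-epoch schedule.
PAPER_EPOCHS = 24
epoch_reduction = 1 - runner["max_epochs"] / PAPER_EPOCHS
print(f"{epoch_reduction:.1%} fewer epochs")  # prints "37.5% fewer epochs"
```

The epoch count is only a proxy for wall-clock time; the actual speedup depends on the hardware listed in the table below.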

| Subset | Checkpoint | Logs | Note |
|---|---|---|---|
| Camera-based baseline | link (code:tlif) | link (code:ahqs) | Trained on 8 RTX 3090 GPUs |
| LiDAR-based baseline | link (code:qdsl) | link (code:p3ra) | Trained on 8 RTX 3090 GPUs |
| Multimodal baseline | link (code:d3vl) | link (code:f5qq) | Trained on 8 RTX 3090 GPUs |
| Camera-based CONet | link (code:630w) | link (code:jb9o) | Trained on 8 A100 GPUs |
| LiDAR-based CONet | link (code:hnaf) | link (code:hqto) | Trained on 8 RTX 3090 GPUs |
| Multimodal CONet | link (code:k9p9) | link (code:t7c5) | Trained on 8 A100 GPUs |

Google Drive link