This is the implementation of LargeKernel3D (CVPR 2023). Large kernels are important but expensive in 3D CNNs. We propose spatial-wise partition convolution to enable large kernels in 3D. The method achieves high performance on 3D semantic segmentation and object detection. For more details, please refer to:
LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs [Paper]
Yukang Chen, Jianhui Liu, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia
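
As a rough illustration of the spatial-wise partition idea mentioned above, the sketch below expands a small set of shared weights into a dense large kernel, so that spatially adjacent kernel positions reuse the same parameter. It is not the code in this repository: the function names, the 7x7x7 kernel size, and the 2/3/2 split per axis are assumptions chosen for the example, and a real sparse-convolution layer would also handle channels and sparse indexing.

```python
def axis_groups(splits):
    """Return the group id of every position along one kernel axis.

    `splits` is a hypothetical per-axis partition, e.g. (2, 3, 2) for width 7.
    """
    return [g for g, width in enumerate(splits) for _ in range(width)]


def expand_shared_kernel(shared, splits=(2, 3, 2)):
    """Expand a (g, g, g) nested list of shared weights into a dense
    (k, k, k) kernel, where k = sum(splits) and g = len(splits).

    Every dense position copies the weight of the group it falls into.
    """
    idx = axis_groups(splits)
    return [[[shared[i][j][k] for k in idx] for j in idx] for i in idx]


# 27 distinct shared weights, laid out as a 3x3x3 grid (illustrative values).
n = 3
shared = [[[float(9 * i + 3 * j + k) for k in range(n)]
           for j in range(n)]
          for i in range(n)]

dense = expand_shared_kernel(shared)
print(len(dense), len(dense[0]), len(dense[0][0]))  # 7 7 7
```

Under these assumptions the dense kernel covers 343 positions while only 27 weights are stored per channel pair, which is the kind of saving that weight sharing across neighbouring positions is meant to provide.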

| nuScenes Object Detection | Set | mAP | NDS | Download |
|---|---|---|---|---|
| LargeKernel3D | val | 63.3 | 69.1 | Pre-trained |
| LargeKernel3D | test | 65.4 | 70.6 | Pre-trained Submission |
| +test aug | test | 68.7 | 72.8 | Submission |
| LargeKernel3D-F | test | - | - | Pre-trained |
| +test aug | test | 71.1 | 74.2 | Submission |

| ScanNetv2 Semantic Segmentation | Set | mIoU | Download |
|---|---|---|---|
| LargeKernel3D | val | 73.5 | [Pre-trained] |
| LargeKernel3D | test | 73.9 | [Submission] |

If you find this project useful in your research, please consider citing:
@inproceedings{chen2023largekernel3d,
title={LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs},
author={Yukang Chen and Jianhui Liu and Xiangyu Zhang and Xiaojuan Qi and Jiaya Jia},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2023}
}
- This work is built upon FocalsConv for object detection.
- This work is built upon Stratified-Transformer for semantic segmentation.
- VoxelNeXt (CVPR 2023) [Paper] [Code] Fully Sparse VoxelNet for 3D Object Detection and Tracking.
- Focal Sparse Conv (CVPR 2022 Oral) [Paper] [Code] Dynamic sparse convolution for high performance.
- Spatial Pruned Conv (NeurIPS 2022) [Paper] [Code] 50% FLOPs saving for efficient 3D object detection.
- LargeKernel3D (CVPR 2023) [Paper] [Code] Large-kernel 3D sparse CNN backbone.
- SphereFormer (CVPR 2023) [Paper] [Code] Spherical window 3D transformer backbone.
- spconv-plus: a library that integrates our works into spconv.
- SparseTransformer: a library of high-efficiency transformer implementations for sparse point cloud or voxel data.
This project is released under the Apache 2.0 license.