"DenoDet: Attention as Deformable Multi-Subspace Feature Denoising for Target Detection in SAR Images" at: https://arxiv.org/pdf/2406.02833
"SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection" at: https://arxiv.org/pdf/2403.06534.pdf
Yuxuan Li, Xiang Li*, Weijie Li, Qibin Hou, Li Liu, Ming-ming Cheng, Jian Yang*
Facing the scientific research battlefield of 2024, do you feel unprecedented research pressure? The major tasks in Computer Vision seem to have reached a saturation point, with the leaderboards dominated time and again by large models backed by massive data and computational resources. For those of us still striving in academia, our resources always seem insignificant, as if we could never compete with the giants monopolizing research. Every direction seems thoroughly explored, making it increasingly difficult to publish a paper. As graduation approaches, are you also anxiously looking for a research direction that truly belongs to you? How many times have we fantasized about going back to the night ImageNet was released 15 years ago, when there was no chaotic dance of attention mechanisms, no models with billions of parameters, and every task was an uncharted blue ocean...
But today, let me tell you, my friend, about a direction that once lingered in obscurity due to the lack of large-scale datasets and scarce open-source code; its development always seemed to lag. With the emergence of open-source large-scale datasets and comprehensive code libraries, all those old problems have vanished like smoke. Now, as if struck by lightning and transported back several years, an undeveloped blue ocean lies in front of you, waiting to be explored and conquered.
So, do you want to know what field I am talking about? Today, you don't need to pay 998, not even 98: just give us a Star on GitHub, and the SAR object detection gift pack is yours for free! Here is everything you need, from large-scale datasets to detailed implementation code; we have prepared it all for you. Yes, I am talking about the jewel in the crown of remote sensing object detection: SAR (Synthetic Aperture Radar) object detection! It might sound mysterious, but the potential and value of this field are well understood by those in the know. It not only matches current national strategic needs but also has a wide range of applications and limitless possibilities in both the scientific and industrial communities. In recent years, publishing articles in the SAR detection field has become easier, a sign of the field's rapid development and its thirst for new ideas and technologies. Join the ranks of SAR object detection now! It's like Bitcoin 15 years ago or real estate 20 years ago: get in now and you won't regret it! Let's gallop together across this vast blue ocean, explore its unknowns, and jointly tap into its limitless possibilities!
Synthetic Aperture Radar (SAR) object detection has gained significant attention recently due to its irreplaceable all-weather imaging capabilities. However, this research field suffers from both limited public datasets (mostly comprising <2K images with only mono-category objects) and inaccessible source code. To tackle these challenges, we establish a new benchmark dataset and an open-source method for large-scale SAR object detection. Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets, providing a large-scale and diverse dataset for research purposes. To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created. With this high-quality dataset, we conducted comprehensive experiments and uncovered a crucial challenge in SAR object detection: the substantial disparities between the pretraining on RGB datasets and finetuning on SAR datasets in terms of both data domain and model structure. To bridge these gaps, we propose a novel Multi-Stage with Filter Augmentation (MSFA) pretraining framework that tackles the problems from the perspective of data input, domain transition, and model migration. The proposed MSFA method significantly enhances the performance of SAR object detection models while demonstrating exceptional generalizability and flexibility across diverse models. This work aims to pave the way for further advancements in SAR object detection.
This repository is the official site for "SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection" at: https://arxiv.org/pdf/2403.06534.pdf
DATASET DOWNLOAD at: (Train, Val, Test)
Model Weights DOWNLOAD at:
(Only Train and Val sets are released so far.)
| Dataset | Images (Train) | Images (Val) | Images (Test) | Images (All) | Instances (Train) | Instances (Val) | Instances (Test) | Instances (All) | Ins/Img |
|---|---|---|---|---|---|---|---|---|---|
| AIR_SARShip 1* | 438 | 23 | 40 | 501 | 816 | 33 | 209 | 1,058 | 2.11 |
| AIR_SARShip 2 | 270 | 15 | 15 | 300 | 1,819 | 127 | 94 | 2,040 | 6.80 |
| HRSID | 3,642 | 981 | 981 | 5,604 | 11,047 | 2,975 | 2,947 | 16,969 | 3.03 |
| MSAR* | 27,159 | 1,479 | 1,520 | 30,158 | 58,988 | 3,091 | 3,123 | 65,202 | 2.16 |
| SADD | 795 | 44 | 44 | 883 | 6,891 | 448 | 496 | 7,835 | 8.87 |
| SAR-AIRcraft* | 13,976 | 1,923 | 2,989 | 18,888 | 27,848 | 4,631 | 5,996 | 38,475 | 2.04 |
| ShipDataset | 31,784 | 3,973 | 3,972 | 39,729 | 40,761 | 5,080 | 5,044 | 50,885 | 1.28 |
| SSDD | 928 | 116 | 116 | 1,160 | 2,041 | 252 | 294 | 2,587 | 2.23 |
| OGSOD | 14,664 | 1,834 | 1,833 | 18,331 | 38,975 | 4,844 | 4,770 | 48,589 | 2.65 |
| SIVED | 837 | 104 | 103 | 1,044 | 9,561 | 1,222 | 1,230 | 12,013 | 11.51 |
| **SARDet-100K** | 94,493 | 10,492 | 11,613 | 116,598 | 198,747 | 22,703 | 24,023 | 245,653 | 2.11 |
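The released annotations are COCO-style JSON files, so standard COCO tooling can be used to inspect the splits summarized above. Below is a minimal sketch using `pycocotools`; the annotation path is an illustrative assumption, not a path fixed by this repository.

```python
# Minimal sketch: inspect a SARDet-100K split with pycocotools.
# Assumes COCO-style JSON annotations; the path below is illustrative.
from pycocotools.coco import COCO

ann_file = "SARDet-100K/Annotations/train.json"  # hypothetical path
coco = COCO(ann_file)

# Category names (ship, aircraft, bridge, ...)
cats = coco.loadCats(coco.getCatIds())
print("categories:", [c["name"] for c in cats])

# Count images and instances, mirroring the statistics table above
img_ids = coco.getImgIds()
ann_ids = coco.getAnnIds()
print(f"{len(img_ids)} images, {len(ann_ids)} instances "
      f"({len(ann_ids) / len(img_ids):.2f} instances/image)")

# Boxes of the first image, in COCO [x, y, w, h] format
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_ids[:1]))
for a in anns:
    print(coco.loadCats(a["category_id"])[0]["name"], a["bbox"])
```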
| Dataset | Target | Res. (m) | Band | Polarization | Satellites |
|---|---|---|---|---|---|
| AIR_SARShip | S | 1, 3 | C | VV | GF-3 |
| HRSID | S | 0.5~3 | C/X | HH, HV, VH, VV | S-1B, TerraSAR-X, TanDEM-X |
| MSAR | A, T, B, S | < 1 | C | HH, HV, VH, VV | HISEA-1 |
| SADD | A | 0.5~3 | X | HH | TerraSAR-X |
| SAR-AIRcraft | A | 1 | C | Uni-polar | GF-3 |
| ShipDataset | S | 3~25 | C | HH, VV, VH, HV | S-1, GF-3 |
| SSDD | S | 1~15 | C/X | HH, VV, VH, HV | S-1, RadarSat-2, TerraSAR-X |
| OGSOD | B, H, T | 3 | C | VV/VH | GF-3 |
| SIVED | C | 0.1, 0.3 | Ka, Ku, X | VV/HH | Airborne SAR synthetic slice |

Target categories: A = aircraft, S = ship, B = bridge, H = harbor, T = tank, C = car.
This repository is the official implementation of the Multi-Stage with Filter Augmentation (MSFA) pretraining framework in "SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection".

- Filter Augmentation code is placed under `MSFA/msfa/models/backbones/MSFA.py`.
- The SARDet-100K dataset code is placed under `MSFA/msfa/datasets/SAR_Det.py`.
- The train/test config files used in the main paper are placed under `local_configs`; a minimal launch sketch follows below.
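Since MSFA builds on MMDetection 3.x, a config under `local_configs` can also be launched programmatically through MMEngine's `Runner`, which is the same path the standard `tools/train.py` entry point takes. A minimal sketch, assuming the package is importable as `msfa` and with an illustrative config filename:

```python
# Minimal MMDetection 3.x training sketch via MMEngine's Runner.
# The config filename below is an illustrative placeholder.
from mmengine.config import Config
from mmengine.runner import Runner

import msfa  # noqa: F401  -- assumed package name; registers MSFA modules

cfg = Config.fromfile("local_configs/my_msfa_config.py")  # hypothetical file
cfg.work_dir = "./work_dirs/msfa_example"

runner = Runner.from_cfg(cfg)
runner.train()  # or runner.test() with a trained checkpoint loaded
```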
| Model Input | Multi-stage | Pretrain Dataset | Pretrain Component | mAP | Config | Weight |
|---|---|---|---|---|---|---|
| SAR (raw pixels) | ✕ | ImageNet | Backbone | 49.0 | config | weight |
| SAR (raw pixels) | ✓ | ImageNet + DIOR | Framework | 49.5 | config | weight |
| SAR (raw pixels) | ✓ | ImageNet + DOTA | Backbone | 49.3 | config | weight |
| SAR (raw pixels) | ✓ | ImageNet + DOTA | Framework | 50.2 | config | weight |
| SAR+WST (filter augmented) | ✕ | ImageNet | Backbone | 49.2 | config | weight |
| SAR+WST (filter augmented) | ✓ | ImageNet + DIOR | Framework | 50.1 | config | weight |
| SAR+WST (filter augmented) | ✓ | ImageNet + DOTA | Backbone | 49.6 | config | weight |
| SAR+WST (filter augmented) | ✓ | ImageNet + DOTA | Framework | 51.1 | config | weight |
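The "SAR+WST" rows augment the raw single-channel SAR pixels with handcrafted filter responses before the backbone sees them, which narrows the data-domain gap to the RGB images used for pretraining. The repository's actual Filter Augmentation lives in `MSFA/msfa/models/backbones/MSFA.py` and uses the wavelet scattering transform (WST); the sketch below only illustrates the general channel-stacking idea, with simple Sobel responses standing in for WST, so the function name and filter choice are illustrative assumptions.

```python
# Illustrative sketch of filter augmentation (NOT the repository's WST code):
# stack a single-channel SAR image with handcrafted filter responses so that
# a 3-channel ImageNet-pretrained backbone can consume it.
import torch
import torch.nn.functional as F

def filter_augment(sar: torch.Tensor) -> torch.Tensor:
    """sar: (B, 1, H, W) -> (B, 3, H, W) as [raw, |grad_x|, |grad_y|]."""
    sobel_x = torch.tensor([[-1., 0., 1.],
                            [-2., 0., 2.],
                            [-1., 0., 1.]]).view(1, 1, 3, 3)
    sobel_y = sobel_x.transpose(2, 3)
    gx = F.conv2d(sar, sobel_x.to(sar), padding=1).abs()
    gy = F.conv2d(sar, sobel_y.to(sar), padding=1).abs()
    return torch.cat([sar, gx, gy], dim=1)

x = torch.rand(2, 1, 512, 512)   # fake single-channel SAR batch
print(filter_augment(x).shape)   # torch.Size([2, 3, 512, 512])
```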
| Type | Framework | Pretrain / Model | mAP | mAP@50 | mAP@75 | mAP@s | mAP@m | mAP@l | Config | Weight |
|---|---|---|---|---|---|---|---|---|---|---|
| Two Stage | Faster RCNN | IMP | 49.0 | 82.2 | 52.9 | 43.5 | 60.6 | 55.0 | config | weight |
| Two Stage | Faster RCNN | MSFA | 51.1 (+2.1) | 83.9 | 54.7 | 45.2 | 62.3 | 57.5 | config | weight |
| Two Stage | Cascade RCNN | IMP | 51.1 | 81.9 | 55.8 | 44.9 | 62.9 | 60.3 | config | weight |
| Two Stage | Cascade RCNN | MSFA | 53.9 (+2.8) | 83.4 | 59.8 | 47.2 | 66.1 | 63.2 | config | weight |
| Two Stage | Grid RCNN | IMP | 48.8 | 79.1 | 52.9 | 42.4 | 61.9 | 55.5 | config | weight |
| Two Stage | Grid RCNN | MSFA | 51.5 (+2.7) | 81.7 | 56.3 | 45.1 | 64.1 | 60.0 | config | weight |
| Single Stage | RetinaNet | IMP | 47.4 | 79.3 | 49.7 | 40.0 | 59.2 | 57.5 | config | weight |
| Single Stage | RetinaNet | MSFA | 49.0 (+1.6) | 80.1 | 52.6 | 41.3 | 61.1 | 59.4 | config | weight |
| Single Stage | GFL | IMP | 49.8 | 80.9 | 53.3 | 42.3 | 62.4 | 58.1 | config | weight |
| Single Stage | GFL | MSFA | 53.7 (+3.9) | 84.2 | 57.8 | 47.8 | 66.2 | 59.5 | config | weight |
| Single Stage | GFL | DenoDet | 55.4 (+5.6) | 84.7 | 58.3 | 49.5 | 67.6 | 63.2 | config | weight |
| Single Stage | FCOS | IMP | 46.5 | 80.9 | 49.0 | 41.1 | 59.2 | 50.4 | config | weight |
| Single Stage | FCOS | MSFA | 48.5 (+2.0) | 82.1 | 51.4 | 42.9 | 60.4 | 56.0 | config | weight |
| End-to-End | DETR | IMP | 31.8 | 62.3 | 30.0 | 22.2 | 44.9 | 41.1 | config | weight |
| End-to-End | DETR | MSFA | 47.2 (+15.4) | 77.5 | 49.8 | 37.9 | 62.9 | 58.2 | config | weight |
| End-to-End | Deformable DETR | IMP | 50.0 | 85.1 | 51.7 | 44.0 | 65.1 | 61.2 | config | weight |
| End-to-End | Deformable DETR | MSFA | 51.3 (+1.3) | 85.3 | 54.0 | 44.9 | 65.6 | 61.7 | config | weight |
| End-to-End | Sparse RCNN | IMP | 38.1 | 68.8 | 38.8 | 29.0 | 51.3 | 48.7 | config | weight |
| End-to-End | Sparse RCNN | MSFA | 41.4 (+3.3) | 74.1 | 41.8 | 33.6 | 53.9 | 53.4 | config | weight |
| End-to-End | DAB-DETR | IMP | 45.9 | 79.0 | 47.9 | 38.0 | 61.1 | 55.0 | config | weight |
| End-to-End | DAB-DETR | MSFA | 48.2 (+2.3) | 81.1 | 51.0 | 41.2 | 63.1 | 55.4 | config | weight |

IMP denotes conventional ImageNet pretraining.
| Backbone | #P (M) | Pretrain | mAP | mAP@50 | mAP@75 | mAP@s | mAP@m | mAP@l | Config | Weight |
|---|---|---|---|---|---|---|---|---|---|---|
| R50 | 25.6 | IMP | 49.0 | 82.2 | 52.9 | 43.5 | 60.6 | 55.0 | config | weight |
| R50 | 25.6 | MSFA | 51.1 (+2.1) | 83.9 | 54.7 | 45.2 | 62.3 | 57.5 | config | weight |
| R101 | 44.7 | IMP | 51.2 | 84.1 | 55.6 | 45.9 | 61.9 | 56.3 | config | weight |
| R101 | 44.7 | MSFA | 52.0 (+0.8) | 84.6 | 56.6 | 46.6 | 63.4 | 57.7 | config | weight |
| R152 | 60.2 | IMP | 51.9 | 85.2 | 55.9 | 46.4 | 62.5 | 57.9 | config | weight |
| R152 | 60.2 | MSFA | 52.4 (+0.5) | 85.4 | 57.2 | 47.4 | 63.3 | 58.7 | config | weight |
| ConvNext-T | 28.6 | IMP | 53.2 | 86.3 | 58.1 | 47.2 | 65.2 | 59.6 | config | weight |
| ConvNext-T | 28.6 | MSFA | 54.8 (+1.6) | 87.1 | 59.8 | 48.8 | 66.7 | 62.1 | config | weight |
| ConvNext-S | 50.1 | IMP | 54.2 | 87.8 | 59.2 | 49.2 | 65.8 | 59.8 | config | weight |
| ConvNext-S | 50.1 | MSFA | 55.4 (+1.2) | 87.6 | 60.7 | 50.1 | 67.1 | 61.3 | config | weight |
| ConvNext-B | 88.6 | IMP | 55.1 | 87.8 | 59.5 | 48.9 | 66.9 | 61.1 | config | weight |
| ConvNext-B | 88.6 | MSFA | 56.4 (+1.3) | 88.2 | 61.5 | 51.1 | 68.3 | 62.4 | config | weight |
| VAN-T | 4.1 | IMP | 45.8 | 79.8 | 48.0 | 38.6 | 57.9 | 53.3 | config | weight |
| VAN-T | 4.1 | MSFA | 47.6 (+1.8) | 81.4 | 50.6 | 40.5 | 59.4 | 56.7 | config | weight |
| VAN-S | 13.9 | IMP | 49.5 | 83.8 | 52.8 | 43.2 | 61.6 | 56.4 | config | weight |
| VAN-S | 13.9 | MSFA | 51.5 (+2.0) | 85.0 | 55.6 | 44.8 | 63.4 | 60.4 | config | weight |
| VAN-B | 26.6 | IMP | 53.5 | 86.8 | 58.0 | 47.3 | 65.5 | 60.6 | config | weight |
| VAN-B | 26.6 | MSFA | 55.1 (+1.6) | 87.7 | 60.2 | 48.8 | 67.3 | 62.2 | config | weight |
| Swin-T | 28.3 | IMP | 48.4 | 83.5 | 50.8 | 42.8 | 59.7 | 55.7 | config | weight |
| Swin-T | 28.3 | MSFA | 50.2 (+1.8) | 84.1 | 53.9 | 44.1 | 61.3 | 58.8 | config | weight |
| Swin-S | 49.6 | IMP | 53.1 | 87.3 | 57.8 | 47.4 | 63.9 | 60.6 | config | weight |
| Swin-S | 49.6 | MSFA | 54.0 (+0.9) | 87.0 | 59.2 | 48.2 | 64.5 | 61.9 | config | weight |
| Swin-B | 87.8 | IMP | 53.8 | 87.8 | 59.0 | 49.1 | 64.6 | 60.0 | config | weight |
| Swin-B | 87.8 | MSFA | 55.7 (+1.9) | 87.8 | 61.4 | 50.5 | 66.5 | 62.5 | config | weight |

#P (M): number of parameters in millions.
Our code depends on PyTorch, MMCV and MMDetection. Below are quick steps for installation. Please refer to the Install Guide for more detailed instructions.

```shell
# change directory into the project main code
cd MSFA

# create and activate the conda environment
conda create -y -n MSFA python=3.8
conda activate MSFA

# install pytorch (conda)
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia
# or (pip)
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

# install openmmlab dependencies
pip install -U openmim
mim install "mmengine==0.8.4"
mim install "mmcv==2.0.1"
mim install "mmdet==3.1.0"

# install other dependencies
pip install -r requirements.txt

# install MSFA in editable mode
pip install -v -e .
```
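After installation, a quick sanity check (a suggested snippet, not part of the repository) confirms that the pinned versions were picked up and that CUDA is visible:

```python
# Quick environment sanity check (suggested, not part of the repository).
import torch
import mmengine
import mmcv
import mmdet

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("mmengine:", mmengine.__version__)   # expect 0.8.4
print("mmcv:", mmcv.__version__)           # expect 2.0.1
print("mmdet:", mmdet.__version__)         # expect 3.1.0
```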
Please see get_started.md for the basic usage of MMDetection.
We extend our deepest gratitude to Bo Zhang, Chenglong Li, Tian Tian, Tianwen Zhang, Xiaoling Zhang (ordered alphabetically by first name) and numerous other researchers for permitting us to integrate their datasets. Their contributions have significantly advanced and promoted research in this field.
If you use this toolbox or benchmark in your research, please cite this project.
```bibtex
@inproceedings{li2024sardet100k,
  title={SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection},
  author={Yuxuan Li and Xiang Li and Weijie Li and Qibin Hou and Li Liu and Ming-Ming Cheng and Jian Yang},
  year={2024},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS)},
}

@article{dai2024denodet,
  title={DenoDet: Attention as Deformable Multi-Subspace Feature Denoising for Target Detection in SAR Images},
  author={Dai, Yimian and Zou, Minrui and Li, Yuxuan and Li, Xiang and Ni, Kang and Yang, Jian},
  journal={arXiv preprint arXiv:2406.02833},
  year={2024}
}
```
This project is released under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.