The 2nd place solution of 2021 google landmark retrieval on kaggle.
We use cuda 11.1/python 3.7/torch 1.9.1/torchvision 0.8.1 for training and testing.
Download imagenet pretrained model ResNeXt101ibn and SEResNet101ibn from IBN-Net. ResNest101 and ResNeSt269 can be found in ResNest.
-
Download GLDv2 full version from the official site.
-
Run
python tools/generate_gld_list.py
. This will generateclean
,c2x
,trainfull
andall
data for different stage of training. -
Validation annotation comes from all 1129 images in GLDv2. We expand the competition index set to index_expand. Each query could find all its GTs in the expanded index set and the validation could be more accurate.
We use 8 GPU (32GB/16GB) for training. The evaluation metric in landmark retrieval is different from person re-identification. Due to the validation scale, we skip the validation stage during training and just use the model from last epoch for evaluation.
To make quick experiments, we provide scripts for R50_256
trained for clean
subset. This setting trains very fast and is helpful for debug.
python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node=8 --master_port 55555 --max_restarts 0 train.py --config_file configs/GLDv2/R50_256.yml
The whole training pipeline for SER101ibn backbone is listed below. Other backbones and input size can be modified accordingly.
python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node=8 --master_port 55555 --max_restarts 0 train.py --config_file configs/GLDv2/SER101ibn_384.yml
python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node=8 --master_port 55555 --max_restarts 0 train.py --config_file configs/GLDv2/SER101ibn_384_finetune.yml
python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node=8 --master_port 55555 --max_restarts 0 train.py --config_file configs/GLDv2/SER101ibn_512_finetune.yml
python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node=8 --master_port 55555 --max_restarts 0 train.py --config_file configs/GLDv2/SER101ibn_512_all.yml
-
With four models trained, cd to
submission/code/
and modifysettings
inlandmark_retrieval.py
properly. -
Then run
eval_retrieval.sh
to get submission file and evaluate on validation set offline.
REID_EXTRACT_FLAG: Skip feature extraction when using offline code.
FEAT_DIR: Save cached features.
IMAGE_DIR: competition image dir. We make a soft link for competition data at submission/input/landmark-retrieval-2021/
RAW_IMAGE_DIR: origin GLDv2 dir
MODEL_DIR: the latest models for submission
META_DIR: saves meta files for rerank purpose
LOCAL_MATCHING and KR_FLAG disabled for our submission.
Use R50_256
model trained from clean
subset correspongding to the fast train script. Set CATEGORY_RERANK
and REF_SET_EXTRACT
to False. You will get about mAP=32.84% for the validation set.
-
Copy
cache_all_list.pkl
,cache_index_train_list.pkl
andcache_full_list.pkl
fromcache
tosubmission/input/meta-data-final
-
Set
REF_SET_EXTRACT
toTrue
to extract features forall
images of GLDv2. This will save about 4.9 million 512 dim features for each model insubmission/input/meta-data-final
. -
Set
REF_SET_EXTRACT
toFalse
andCATEGORY_RERANK
tobefore_merge
. This will load the precomputed features and run the proposed Landmark-Country aware rerank. -
The notebooks on kaggle is exactly the same file as in
base_landmark.py
andlandmark_retrieval.py
. We also upload the same notebooks as in kaggle inkaggle.ipynb
.
-
The challenge is held on kaggle and the leaderboard can be found here. We rank 2nd(2/263) in this challenge.
-
Kaggle Discussion post link here
The code is motivated by AICITY2021_Track2_DMT, 2020_1st_recognition_solution, 2020_2nd_recognition_solution, 2020_1st_retrieval_solution.
If you find our work useful in your research, please consider citing:
@inproceedings{zhang2021landmark,
title={2nd Place Solution to Google Landmark Retrieval 2021},
author={Zhang, Yuqi and Xu, Xianzhe and Chen, Weihua and Wang, Yaohua and Zhang, Fangyi and Wang Fan and Li Hao},
journal={arXiv preprint arXiv:2110.04294},
year={2021}
}