This is the implementation of the paper "Hypercorrelation Squeeze for Few-Shot Segmentation" by Juhong Min, Dahyun Kang, and Minsu Cho. The code is implemented in Python 3.7 and PyTorch 1.5.1.
For more information, check out the project [website] and the paper on [arXiv].
- Python 3.7
- PyTorch 1.5.1
- CUDA 10.1
- tensorboard 1.14
Conda environment settings:
conda create -n hsnet python=3.7
conda activate hsnet
conda install pytorch=1.5.1 torchvision cudatoolkit=10.1 -c pytorch
conda install -c conda-forge tensorflow
pip install tensorboardX
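After creating the environment, it can help to confirm that the installed versions match the ones listed above. The following is a minimal sketch, not part of the repository: the helper names `version_matches` and `report` are hypothetical, and only the version numbers come from the requirements list.

```python
# Sketch: compare installed versions against the requirements listed above.
# REQUIRED mirrors the list in this README; the helper names are hypothetical.
import platform

REQUIRED = {"python": "3.7", "torch": "1.5.1", "tensorboard": "1.14"}


def version_matches(installed, required):
    """True if `installed` begins with the dotted components of `required`."""
    want = required.split(".")
    # Drop local build suffixes such as "+cu101" before comparing.
    have = installed.split("+")[0].split(".")
    return have[:len(want)] == want


def report(installed):
    """Map each required package to True/False; missing packages are False."""
    return {
        name: name in installed and version_matches(installed[name], req)
        for name, req in REQUIRED.items()
    }


if __name__ == "__main__":
    found = {"python": platform.python_version()}
    try:
        import torch  # only present inside the hsnet environment
        found["torch"] = torch.__version__
    except ImportError:
        pass
    for name, ok in report(found).items():
        print(name, "OK" if ok else "missing or different version")
```

A package that is absent from the `installed` dictionary is reported as not matching, so the script only checks what it can actually import.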
Download the following datasets:
Download PASCAL VOC2012 devkit (train/val data):
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
Download PASCAL VOC2012 SDS extended mask annotations from our [Google Drive].
Download COCO2014 train/val images and annotations:
wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip
Download COCO2014 train/val annotations from our Google Drive: [train2014.zip], [val2014.zip] (and place both train2014/ and val2014/ under the annotations/ directory).
Download FSS-1000 images and annotations from our [Google Drive].
Create a directory '../Datasets_HSN' for the above three few-shot segmentation datasets and place each dataset so that you have the following directory structure:
../                         # parent directory
├── ./                      # current (project) directory
│   ├── common/             # (dir.) helper functions
│   ├── data/               # (dir.) dataloaders and splits for each FSSS dataset
│   ├── model/              # (dir.) implementation of Hypercorrelation Squeeze Network model
│   ├── README.md           # instructions for reproduction
│   ├── train.py            # code for training HSNet
│   └── test.py             # code for testing HSNet
└── Datasets_HSN/
    ├── VOC2012/            # PASCAL VOC2012 devkit
    │   ├── Annotations/
    │   ├── ImageSets/
    │   ├── ...
    │   └── SegmentationClassAug/
    ├── COCO2014/
    │   ├── annotations/
    │   │   ├── train2014/  # (dir.) training masks (from Google Drive)
    │   │   ├── val2014/    # (dir.) validation masks (from Google Drive)
    │   │   └── ..some json files..
    │   ├── train2014/
    │   └── val2014/
    └── FSS-1000/           # (dir.) contains 1000 object classes
        ├── abacus/
        ├── ...
        └── zucchini/
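Before training, the layout above can be checked with a short script. This is a sketch under the assumption that the datasets live in '../Datasets_HSN' exactly as shown; `EXPECTED_DIRS` and `missing_dirs` are hypothetical names, and only directories named explicitly in the tree are checked.

```python
# Sketch: report which of the directories named in the tree above are absent.
from pathlib import Path

# Sub-directories copied from the tree; entries elided with "..." are skipped.
EXPECTED_DIRS = [
    "VOC2012/Annotations",
    "VOC2012/ImageSets",
    "VOC2012/SegmentationClassAug",
    "COCO2014/annotations/train2014",
    "COCO2014/annotations/val2014",
    "COCO2014/train2014",
    "COCO2014/val2014",
    "FSS-1000",
]


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("../Datasets_HSN")
    if missing:
        for d in missing:
            print("missing:", d)
    else:
        print("dataset layout looks complete")
```

The check only looks at directory names, so it does not confirm that the archives were fully extracted.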
python train.py --backbone {vgg16, resnet50, resnet101} --fold {0, 1, 2, 3} --benchmark pascal --lr 1e-3 --bsz 20 --logpath "your_experiment_name"
- Training takes approx. 2 days until convergence (trained with four 2080 Ti GPUs).
python train.py --backbone {resnet50, resnet101} --fold {0, 1, 2, 3} --benchmark coco --lr 1e-3 --bsz 40 --logpath "your_experiment_name"
- Training takes approx. 1 week until convergence (trained with four Titan RTX GPUs).
python train.py --backbone {vgg16, resnet50, resnet101} --benchmark fss --lr 1e-3 --bsz 20 --logpath "your_experiment_name"
- Training takes approx. 3 days until convergence (trained with four 2080 Ti GPUs).
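The braces in the training commands above stand for a choice of one value per run. As an illustration, the sketch below expands them into one command per backbone and fold. The flags and values are copied from the commands above; the `--logpath` naming scheme and the helper `train_commands` are assumptions made for this example only.

```python
# Sketch: expand the {..} choices in the training commands into concrete runs.
from itertools import product

# Values copied from the commands above; FSS-1000 takes no --fold argument.
CONFIGS = {
    "pascal": {"backbones": ["vgg16", "resnet50", "resnet101"],
               "folds": [0, 1, 2, 3], "bsz": 20},
    "coco": {"backbones": ["resnet50", "resnet101"],
             "folds": [0, 1, 2, 3], "bsz": 40},
    "fss": {"backbones": ["vgg16", "resnet50", "resnet101"],
            "folds": [None], "bsz": 20},
}


def train_commands(benchmark, prefix="exp"):
    """Return one train.py command line per (backbone, fold) combination."""
    cfg = CONFIGS[benchmark]
    cmds = []
    for backbone, fold in product(cfg["backbones"], cfg["folds"]):
        # Hypothetical experiment name; any string works for --logpath.
        name = f"{prefix}_{benchmark}_{backbone}"
        parts = ["python", "train.py", "--backbone", backbone]
        if fold is not None:
            parts += ["--fold", str(fold)]
            name += f"_fold{fold}"
        parts += ["--benchmark", benchmark, "--lr", "1e-3",
                  "--bsz", str(cfg["bsz"]), "--logpath", name]
        cmds.append(" ".join(parts))
    return cmds


if __name__ == "__main__":
    for cmd in train_commands("pascal"):
        print(cmd)
```

Printing the commands rather than launching them leaves scheduling across GPUs to whatever job runner is in use.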
Use tensorboard to babysit training progress:
- For each experiment, a directory that logs training progress will be automatically generated under logs/ directory.
- From terminal, run 'tensorboard --logdir logs/' to monitor the training progress.
- Choose the best model when the validation (mIoU) curve starts to saturate.
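The advice to pick a model once the validation mIoU curve saturates can be made concrete with a simple patience rule. The sketch below is one possible reading, not the authors' procedure: `saturation_epoch`, `patience`, and `min_delta` are hypothetical, and the mIoU values would have to be read from the TensorBoard logs separately.

```python
# Sketch: a patience-based rule for "the validation curve has saturated".
def saturation_epoch(miou, patience=5, min_delta=0.1):
    """Return the index of the last epoch that improved validation mIoU by
    more than `min_delta`, once `patience` epochs have passed without such
    an improvement. Return None if the curve is still improving."""
    best_idx = 0
    for i in range(1, len(miou)):
        if miou[i] > miou[best_idx] + min_delta:
            best_idx = i  # meaningful improvement: reset the reference epoch
        elif i - best_idx >= patience:
            return best_idx  # no meaningful gain for `patience` epochs
    return None


if __name__ == "__main__":
    # Made-up validation curve, for illustration only.
    curve = [30.0, 40.0, 50.0, 55.0, 58.0, 59.0,
             59.05, 59.0, 58.9, 59.02, 59.03, 59.01]
    print(saturation_epoch(curve))
```

Larger `patience` values wait longer before declaring saturation, which suits noisy validation curves.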
Pretrained models with tensorboard logs are available on our [Google Drive].
python test.py --backbone {vgg16, resnet50, resnet101} --fold {0, 1, 2, 3} --benchmark pascal --nshot {1, 5} --load "path_to_trained_model/best_model.pt"
Pretrained models with tensorboard logs are available on our [Google Drive].
python test.py --backbone {resnet50, resnet101} --fold {0, 1, 2, 3} --benchmark coco --nshot {1, 5} --load "path_to_trained_model/best_model.pt"
Pretrained models with tensorboard logs are available on our [Google Drive].
python test.py --backbone {vgg16, resnet50, resnet101} --benchmark fss --nshot {1, 5} --load "path_to_trained_model/best_model.pt"
- To reproduce the results in Tab.1 of our main paper, COMMENT OUT line 51 in hsnet.py: support_feats = self.mask_feature(support_feats, support_mask.clone())
Pretrained models with tensorboard logs are available on our [Google Drive].
python test.py --backbone resnet101 --fold {0, 1, 2, 3} --benchmark pascal --nshot {1, 5} --load "path_to_trained_model/best_model.pt"
- To visualize mask predictions, add command line argument --visualize: (prediction results will be saved under vis/ directory)
python test.py '...other arguments...' --visualize
If you use this code for your research, please consider citing:
@InProceedings{min2021hypercorrelation,
title={Hypercorrelation Squeeze for Few-Shot Segmentation},
author={Juhong Min and Dahyun Kang and Minsu Cho},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year={2021}
}