DIRV: Dense Interaction Region Voting for End-to-End Human-Object Interaction Detection

Official code implementation for the paper "DIRV: Dense Interaction Region Voting for End-to-End Human-Object Interaction Detection" (AAAI 2021).

The code is developed based on the architecture of zylo117/Yet-Another-EfficientDet-Pytorch. We also follow some data pre-processing and model evaluation methods in BigRedT/no_frills_hoi_det and vt-vl-lab/iCAN. We sincerely thank the authors for the excellent work.

Checklist

Training and Test for V-COCO dataset
Training and Test for HICO-DET dataset
Demonstration on images
Demonstration on videos
More efficient voting strategy for inference using GPU

Prerequisites

The code was tested with Python 3.6, PyTorch 1.5.1, torchvision 0.6.1, CUDA 10.2, and Ubuntu 18.04.

Installation

Clone this repository:

git clone https://github.com/MVIG-SJTU/DIRV.git

Install pytorch and torchvision:

pip install torch==1.5.1 torchvision==0.6.1

Install other necessary packages:

pip install pycocotools numpy opencv-python tqdm tensorboard tensorboardX pyyaml webcolors
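
To confirm that the installed packages match the tested versions listed under Prerequisites, a small check like the following may help. It is only a sketch: the helpers `version_matches` and `check_environment` are hypothetical names, not part of this repository, and the check assumes that a PyTorch build may append a local suffix such as `+cu102` to its version string.

```python
import importlib

# Versions this README reports as tested.
EXPECTED = {"torch": "1.5.1", "torchvision": "0.6.1"}


def version_matches(installed, expected):
    """Compare versions, ignoring a local build suffix such as '+cu102'."""
    return installed.split("+", 1)[0] == expected


def check_environment(expected=EXPECTED):
    """Return a {package: status} report for each expected package."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "not installed"
            continue
        found = getattr(module, "__version__", "unknown")
        report[name] = "ok" if version_matches(found, wanted) else "found " + found
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(name, status)
```

Other versions may also work; a mismatch reported here only means the setup differs from the one the code was tested with.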

Data Preparation

V-COCO Dataset:

Download V-COCO dataset following the official instructions.

You can find the file new_prior_mask.pkl here. Each element in it is the prior probability that a verb (e.g. eat) is associated with an object category (e.g. apple). You should also download the combined training and validation set annotations, instances_trainval2014.json, here, and put the file in datasets/vcoco/coco/annotations.
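
To illustrate what such a prior can be used for, the sketch below stores a toy verb-by-object prior table with pickle, loads it back, and uses it to suppress verb scores that are implausible for a detected object. The table layout (a nested dict keyed by verb, then by object), the toy values, and the helper `apply_prior` are assumptions made for illustration. The actual layout of new_prior_mask.pkl may differ and should be inspected after loading.

```python
import pickle

# Toy prior: probability that a verb is associated with an object category.
# Values are made up for illustration.
toy_prior = {
    "eat": {"apple": 0.9, "skateboard": 0.0},
    "ride": {"apple": 0.0, "skateboard": 0.8},
}

# Round-trip through pickle; a .pkl file on disk would be read with pickle.load.
loaded = pickle.loads(pickle.dumps(toy_prior))


def apply_prior(verb_scores, obj_category, prior):
    """Multiply each verb score by the prior for the given object category.

    Verbs or objects missing from the prior table are treated as prior 0.
    """
    return {
        verb: score * prior.get(verb, {}).get(obj_category, 0.0)
        for verb, score in verb_scores.items()
    }


scores = {"eat": 0.5, "ride": 0.5}
masked = apply_prior(scores, "apple", loaded)
print(masked)
```

With this toy table, the score for "ride" on an apple is driven to zero while the score for "eat" is only slightly reduced.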

HICO-DET Dataset:

Download HICO-DET dataset from the official website.

We transform the annotations of the HICO-DET dataset to JSON format following BigRedT/no_frills_hoi_det. You can directly download the processed annotations from here.

The number of training samples for each category is stored in hico_processed/hico-det_verb_count.json. These counts serve as weights when calculating the loss.
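
One common way to turn per-category sample counts into loss weights is inverse-frequency weighting, sketched below. This README does not state the exact formula DIRV uses, so the formula, the toy counts, and the function name `inverse_frequency_weights` are all assumptions. The JSON layout (a mapping from category name to count) is likewise assumed.

```python
import json

# Toy stand-in for the contents of a count file (values are made up).
toy_json = '{"ride": 900, "hold": 90, "wash": 10}'


def inverse_frequency_weights(counts):
    """Weight each category by total / (num_categories * count).

    Rare categories get weights above 1, frequent ones below 1, and a
    category with exactly the average count gets a weight of 1.
    Categories with a zero count are skipped to avoid division by zero.
    """
    total = sum(counts.values())
    n = len(counts)
    return {name: total / (n * count) for name, count in counts.items() if count > 0}


counts = json.loads(toy_json)
weights = inverse_frequency_weights(counts)
for name in sorted(weights, key=weights.get):
    print(name, round(weights[name], 3))
```

Such weights could then be multiplied into a per-category classification loss so that rare categories contribute more to the gradient.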

Dataset Structure:

Make sure to put the files in the following structure:

|-- datasets
|   |-- vcoco
|   |   |-- data
|   |   |   |-- splits
|   |   |   |-- vcoco
|   |   |
|   |   |-- coco
|   |   |   |-- images
|   |   |   |-- annotations
|   |   |-- new_prior_mask.pkl
|   |-- hico_20160224_det
|   |   |-- images
|   |   |-- hico_processed
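
Before training, it can be useful to verify that the layout above is in place. The following sketch checks for the listed paths relative to a root directory. The function name `missing_paths` is hypothetical, and the script is not part of this repository.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the repository root.
EXPECTED_PATHS = [
    "datasets/vcoco/data/splits",
    "datasets/vcoco/data/vcoco",
    "datasets/vcoco/coco/images",
    "datasets/vcoco/coco/annotations",
    "datasets/vcoco/new_prior_mask.pkl",
    "datasets/hico_20160224_det/images",
    "datasets/hico_20160224_det/hico_processed",
]


def missing_paths(root=".", expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:")
        for p in missing:
            print("  " + p)
    else:
        print("Dataset layout looks complete.")
```

The check only tests for existence; it does not validate the contents of any directory or file.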

Demonstration

Demonstration on Images

CUDA_VISIBLE_DEVICES=0 python demo.py --image_path /path/to/a/single/image

Demonstration on Videos

Coming soon.

Pre-trained Weights

You can download the pre-trained weights for V-COCO dataset (vcoco_best.pth) and HICO-DET dataset (hico-det_best.pth) here.

Training

Download the pre-trained weights of our backbone (efficientdet-d3_vcoco.pth and efficientdet-d3_hico-det.pth) here, and save them in the weights/ directory.

Training on V-COCO Dataset

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py -p vcoco --batch_size 32 --load_weights weights/efficientdet-d3_vcoco.pth

Training on HICO-DET Dataset

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train.py -p hico-det --batch_size 48 --load_weights weights/efficientdet-d3_hico-det.pth

You may also adjust the saving directory and GPU number in projects/vcoco.yaml and projects/hico-det.yaml or create your own projects in projects/.

Test

Test on V-COCO Dataset

CUDA_VISIBLE_DEVICES=0 python test_vcoco.py -w /path/to/the/checkpoint

Test on HICO-DET Dataset

CUDA_VISIBLE_DEVICES=0 python test_hico-det.py -w /path/to/the/checkpoint

Then follow the same procedure as in vt-vl-lab/iCAN to evaluate the results on the HICO-DET dataset.

Citation

If you find our paper or code useful for your research, please cite the following paper:

@inproceedings{fang2020dirv,
      title={DIRV: Dense Interaction Region Voting for End-to-End Human-Object Interaction Detection}, 
      author={Fang, Hao-Shu and Xie, Yichen and Shao, Dian and Lu, Cewu},
      year={2021},
      booktitle = {The AAAI Conference on Artificial Intelligence (AAAI)}
}
