Code for ANetQA baselines

This repository contains the code for our baselines (HCRN, ClipBERT, and All-in-one), migrated from their original implementations to fit our data structure.

Dataset Preparation

See dataset for details.

Model Setup, Training, and Testing

See the HCRN, ClipBERT, and All-in-one folders for details.

Results

The baseline models above are trained on the train set and evaluated on the val, test-tiny, test-dev, and test sets.

| model | val set | test-tiny | test-dev | test | weights |
|---|---|---|---|---|---|
| HCRN | 41.69 | 41.57 | 41.18 | 41.13 | ckpt |
| ClipBERT | 44.34 | 43.90 | 44.00 | 43.91 | ckpt |
| All-in-one | 45.44 | 44.27 | 44.57 | 44.53 | ckpt |
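
For readers who want to compare the reported numbers programmatically, here is a minimal Python sketch. It only restates the scores from the table above; the helper names `val_test_gap` and `best_model` are hypothetical and are not part of this repository, and the scores are treated as plain numbers because the metric and units are not stated here.

```python
# Scores copied from the results table above. The metric name and units are
# not stated in this README, so the values are treated as plain numbers.
RESULTS = {
    "HCRN": {"val": 41.69, "test-tiny": 41.57, "test-dev": 41.18, "test": 41.13},
    "ClipBERT": {"val": 44.34, "test-tiny": 43.90, "test-dev": 44.00, "test": 43.91},
    "All-in-one": {"val": 45.44, "test-tiny": 44.27, "test-dev": 44.57, "test": 44.53},
}


def val_test_gap(scores):
    """Return the val score minus the test score, rounded to two decimals."""
    return round(scores["val"] - scores["test"], 2)


def best_model(results, split):
    """Return the name of the model with the highest score on `split`."""
    return max(results, key=lambda name: results[name][split])


if __name__ == "__main__":
    for name, scores in RESULTS.items():
        print(f"{name}: val-test gap = {val_test_gap(scores):.2f}")
    print("best on test:", best_model(RESULTS, "test"))
```

The gap between val and test gives a rough sense of how well each baseline's validation score carries over to the held-out split.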

License

This project is licensed under the Apache License 2.0.

Citation

If you use ANetQA in your research, please cite our paper:

@inproceedings{yu2023anetqa,
   title={ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos},
   author={Yu, Zhou and Zheng, Lixiang and Zhao, Zhou and Wu, Fei and Fan, Jianping and Ren, Kui and Yu, Jun},
   booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
   year={2023}
}