oncescuandreea/QuerYD_downloader

QuerYD_download

This tool allows easy download of the videos that form the QuerYD dataset.

Installing necessary libraries:

  • argparse
  • pytube: python -m pip install git+https://github.com/nficano/pytube
  • pathlib
  • logging
  • multiprocessing
  • requests
  • tqdm
  • json
  • zsvision
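
Before running the downloader, it can help to check which of the listed modules are already importable. The following is a minimal sketch, not part of the repository; the helper name `missing_modules` is hypothetical. Note that argparse, pathlib, logging, multiprocessing and json ship with the Python standard library, so only the remaining entries normally need installing.

```python
import importlib.util

# Module names copied from the list above. The standard-library ones
# (argparse, pathlib, logging, multiprocessing, json) come with Python.
REQUIRED_MODULES = [
    "argparse", "pytube", "pathlib", "logging", "multiprocessing",
    "requests", "tqdm", "json", "zsvision",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```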

Version 2 of the dataset was added on 8 April.

Downloading videos
To test the download videos script for the QuerYD dataset simply run:

python download_queryd.py --txt_file relevant-video-links-test.txt --task download_videos

This creates a folder called videos in the current directory and saves the videos there. To run the full download, use:

python download_queryd.py --txt_file relevant-video-links-{version either v1 or v2}.txt --task download_videos

To download only the videos with non-English descriptions, run:

python download_queryd.py --txt_file relevant-non-en-links.txt --task download_videos

To attempt each download more than once, set the --tries flag to the desired number of attempts. The default is 2. For example:

python download_queryd.py --txt_file relevant-video-links-{version either v1 or v2}.txt --tries 3 --task download_videos
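
To illustrate what a bounded number of attempts means, here is a generic retry loop. It is a sketch only and is not taken from download_queryd.py; the function name `with_retries` and its behaviour on failure (re-raising the last error) are assumptions made for the example.

```python
def with_retries(action, tries=2):
    """Call action() up to `tries` times and return its first successful result.

    Hypothetical helper: if every attempt fails, the last exception is re-raised.
    """
    if tries < 1:
        raise ValueError("tries must be at least 1")
    last_error = None
    for attempt in range(1, tries + 1):
        try:
            return action()
        except Exception as error:  # broad on purpose for a download sketch
            last_error = error
            print(f"attempt {attempt}/{tries} failed: {error}")
    raise last_error
```

With `tries=2`, an action that fails once and then succeeds returns normally; one that fails twice raises its second error.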

To re-download all files, use the --refresh flag. For example:

python download_queryd.py --txt_file relevant-video-links-{version either v1 or v2}.txt --refresh --task download_videos
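
The {version either v1 or v2} placeholder in the commands above has to be replaced by hand. As a convenience, the command line can be assembled in Python. This is a sketch under the assumption that only the flags shown above (--txt_file, --task, --tries, --refresh) are used; `build_command` is a hypothetical helper, not part of the repository.

```python
import sys


def build_command(version, task, tries=None, refresh=False):
    """Assemble the download_queryd.py command line for one dataset version."""
    if version not in ("v1", "v2"):
        raise ValueError("version must be 'v1' or 'v2'")
    command = [
        sys.executable, "download_queryd.py",
        "--txt_file", f"relevant-video-links-{version}.txt",
        "--task", task,
    ]
    if tries is not None:
        command += ["--tries", str(tries)]
    if refresh:
        command.append("--refresh")
    return command
```

The resulting list can be passed to `subprocess.run(build_command("v2", "download_videos", tries=3), check=True)` from the directory that contains download_queryd.py.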

Downloading json metadata
To download the .json file containing information about the described videos run:

wget http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/QuerYD/json_metadata-{version either v1 or v2}.zip
mv json_metadata-{version either v1 or v2}.zip json_metadata.zip
unzip json_metadata.zip
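
On systems without wget or unzip, the same three steps can be done with the Python standard library. The sketch below assumes the URL pattern shown above; the names `metadata_url` and `fetch_metadata` are hypothetical and the layout of the extracted files is not checked.

```python
import urllib.request
import zipfile
from pathlib import Path

# Base URL copied from the wget command above.
BASE_URL = "http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/QuerYD"


def metadata_url(version):
    """Return the metadata archive URL for 'v1' or 'v2'."""
    if version not in ("v1", "v2"):
        raise ValueError("version must be 'v1' or 'v2'")
    return f"{BASE_URL}/json_metadata-{version}.zip"


def fetch_metadata(version, dest_dir="."):
    """Download the archive as json_metadata.zip and unpack it into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / "json_metadata.zip"
    urllib.request.urlretrieve(metadata_url(version), str(archive))
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return zf.namelist()  # names of the files that were unpacked
```

Calling `fetch_metadata("v2")` performs the download and extraction in the current directory.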

Downloading audio description files
Audio files can be downloaded only after downloading the .json metadata using the previous step.
To download the audio description files corresponding to each video, run:

python download_queryd.py --txt_file relevant-video-links-{version either v1 or v2}.txt --task download_wavs

To download only the non-English audio descriptions, run:

python download_queryd.py --txt_file relevant-non-en-links.txt --task download_wavs

To use more processes, add the --processes flag with the number of CPUs available. For example:

python download_queryd.py --txt_file relevant-video-links-{version either v1 or v2}.txt --task download_wavs --processes 2

Downloading transcribed descriptions and corresponding time-stamps
The transcribed version of the audio descriptions can be downloaded as a pickle file by accessing the following link:

http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/QuerYD/raw_captions_combined_filtered-{version either v1 or v2}.pkl
mv raw_captions_combined_filtered-{version either v1 or v2}.pkl raw_captions_combined_filtered.pkl

The corresponding time-stamps in the same order are provided in this pickle file:

http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/QuerYD/times_captions_combined_filtered-{version either v1 or v2}.pkl
mv times_captions_combined_filtered-{version either v1 or v2}.pkl times_captions_combined_filtered.pkl

The confidence scores of the transcriptions, in the same order as the transcriptions, are provided in this pickle file:

http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/QuerYD/confidence_captions_combined_filtered-{version either v1 or v2}.pkl
mv confidence_captions_combined_filtered-{version either v1 or v2}.pkl confidence_captions_combined_filtered.pkl
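
Once renamed, the three pickle files can be opened with Python's pickle module. Their internal structure is not described here, so the sketch below only loads each file and reports its type and length; `load_pickle`, `summarise` and `summarise_available` are hypothetical helper names. As with any pickle file, only load data from a source you trust.

```python
import pickle
from pathlib import Path

# File names after the mv commands above.
FILES = [
    "raw_captions_combined_filtered.pkl",
    "times_captions_combined_filtered.pkl",
    "confidence_captions_combined_filtered.pkl",
]


def load_pickle(path):
    """Load one pickle file from disk."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def summarise(obj):
    """Return (type name, length or None) without assuming any structure."""
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size


def summarise_available(directory="."):
    """Summarise whichever of the three files exist in `directory`."""
    summary = {}
    for name in FILES:
        path = Path(directory) / name
        if path.exists():
            summary[name] = summarise(load_pickle(path))
    return summary
```

If the three summaries report the same length, that is consistent with the statement that the files share one ordering.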

Downloading video features, descriptions and train/val/test splits
To download QuerYD data:

wget http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/features-v2/QuerYD-experts-{version either v1 or v2}.tar.gz
mv QuerYD-experts-{version either v1 or v2}.tar.gz QuerYD-experts.tar.gz

To download QuerYDSegments data (localised clips and their descriptions):

wget http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/features-v2/QuerYDSegments-experts-{version either v1 or v2}.tar.gz
mv QuerYDSegments-experts-{version either v1 or v2}.tar.gz QuerYDSegments-experts.tar.gz
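
The downloaded .tar.gz archives still need unpacking before use. A minimal sketch with the standard tarfile module follows; `extract_archive` is a hypothetical helper, and the destination directory is an assumption since the expected location is not specified here.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Unpack a gzip-compressed tar archive and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        archive.extractall(dest)
    return names
```

For example, `extract_archive("QuerYD-experts.tar.gz", "data")` would unpack the features into a data folder.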

More information and the scripts used can be found at https://github.com/albanie/collaborative-experts#queryd . Training and testing follow the steps at https://github.com/albanie/collaborative-experts#evaluating-a-pretrained-model , with MSVD replaced by QuerYD or QuerYDSegments. Model names should be taken from the retrieval results tables at https://github.com/albanie/collaborative-experts#queryd or https://github.com/albanie/collaborative-experts#querydsegments .

References

[1] If you find this code useful, please consider citing:

@misc{oncescu2021queryd,
      title={QuerYD: A video dataset with high-quality text and audio narrations}, 
      author={Andreea-Maria Oncescu and João F. Henriques and Yang Liu and Andrew Zisserman and Samuel Albanie},
      year={2021},
      eprint={2011.11071},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

[2] If you find this code useful or use the extracted features, please consider citing:

@inproceedings{Liu2019a,
  author    = {Liu, Y. and Albanie, S. and Nagrani, A. and Zisserman, A.},
  booktitle = {arXiv preprint arxiv:1907.13487},
  title     = {Use What You Have: Video retrieval using representations from collaborative experts},
  date      = {2019},
}

Acknowledgements

This work is supported by the EPSRC (VisualAI EP/T028572/1 and DTA Studentship), and the Royal Academy of Engineering (DFR05420). We are grateful to Sophia Koepke for her helpful comments and suggestions.

2nd Version of QuerYD retrieval results:

QuerYD

MODEL study on QUERYD

Importance of the model:

| Model | Task | R@1 | R@5 | R@10 | R@50 | MdR | MnR | Geom | params | Links |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HowTo100m S3D | t2v | 10.2(0.0) | 24.5(0.0) | 32.7(0.0) | 54.3(0.0) | 38.0(0.0) | 82.1(0.0) | 20.2(0.0) | 1 | config, model, log |
| CE - P,CG | t2v | 29.8(0.3) | 63.8(0.5) | 74.9(0.3) | 93.0(0.1) | 3.0(0.0) | 15.1(0.4) | 52.3(0.3) | 57.75M | config, model, log |
| CE | t2v | 31.9(1.5) | 64.5(1.4) | 76.1(0.8) | 93.8(0.9) | 3.0(0.0) | 13.1(0.8) | 53.9(0.7) | 30.82M | config, model, log |
| HowTo100m S3D | v2t | 10.0(0.0) | 25.7(0.0) | 32.3(0.0) | 53.2(0.0) | 42.0(0.0) | 81.7(0.0) | 20.2(0.0) | 1 | config, model, log |
| CE - P,CG | v2t | 28.6(1.1) | 62.4(0.5) | 73.6(0.8) | 92.9(0.1) | 3.0(0.0) | 14.7(0.4) | 50.8(0.7) | 57.75M | config, model, log |
| CE | v2t | 32.9(1.7) | 64.9(1.1) | 76.7(1.1) | 93.6(0.6) | 3.0(0.0) | 12.8(0.6) | 54.7(1.1) | 30.82M | config, model, log |
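
The Geom column appears to be the geometric mean of R@1, R@5 and R@10. This is an inference from the numbers in the table rather than something stated here, and because the reported values are aggregated over runs, recomputing from the rounded entries may differ slightly. A small sketch of the calculation, using the CE t2v row as input:

```python
def geometric_mean(values):
    """Geometric mean of a non-empty sequence of positive numbers."""
    product = 1.0
    for value in values:
        product *= value
    return product ** (1.0 / len(values))


# R@1, R@5 and R@10 for the CE model on the t2v task, copied from the table.
ce_t2v = geometric_mean([31.9, 64.5, 76.1])
print(round(ce_t2v, 1))  # close to the 53.9 reported in the Geom column
```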

The influence of different pretrained experts on the performance of the CE model trained on QuerYD is studied. The value and cumulative effect of different experts for scene classification (SCENE), ambient sound classification (AUDIO), image classification (OBJECT), and action recognition (ACTION) are presented. PREV. denotes the experts used in the previous row.

| Experts | Task | R@1 | R@5 | R@10 | R@50 | MdR | MnR | Geom | params | Links |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Scene | t2v | 17.0(0.7) | 47.0(2.4) | 60.8(1.1) | 85.4(1.6) | 6.3(0.6) | 27.2(1.1) | 36.5(1.0) | 7.51M | config, model, log |
| Prev. + Audio | t2v | 21.4(0.2) | 53.0(1.3) | 63.9(0.4) | 88.6(0.3) | 5.0(0.0) | 22.2(0.7) | 41.7(0.4) | 17.25M | config, model, log |
| Prev. + Inst | t2v | 32.3(1.6) | 65.5(1.0) | 76.7(0.9) | 93.6(0.2) | 3.0(0.0) | 13.0(0.3) | 54.5(0.3) | 24.63M | config, model, log |
| Prev. + R2P1D | t2v | 31.9(1.5) | 64.2(1.4) | 76.1(0.7) | 93.8(0.9) | 3.0(0.0) | 13.1(0.8) | 53.8(0.7) | 30.82M | config, model, log |
| Scene | v2t | 20.3(0.5) | 47.4(0.8) | 60.0(0.4) | 85.5(1.6) | 6.0(0.0) | 27.0(0.7) | 38.7(0.3) | 7.51M | config, model, log |
| Prev. + Audio | v2t | 23.6(0.9) | 52.2(1.1) | 63.9(1.3) | 89.2(0.3) | 5.0(0.0) | 21.6(0.8) | 42.8(0.5) | 17.25M | config, model, log |
| Prev. + Inst | v2t | 32.6(1.3) | 65.6(0.3) | 77.2(0.3) | 93.7(0.9) | 3.0(0.0) | 12.5(0.1) | 54.8(0.6) | 24.63M | config, model, log |
| Prev. + R2P1D | v2t | 32.9(1.7) | 65.0(1.0) | 76.7(1.0) | 93.6(0.6) | 3.0(0.0) | 12.8(0.6) | 54.7(1.1) | 30.82M | config, model, log |

Updated results for QuerYDSegments:

MODEL study on QUERYDSEGMENTS

Importance of the model:

| Model | Task | R@1 | R@5 | R@10 | R@50 | MdR | MnR | Geom | params | Links |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HowTo100m S3D | t2v | 6.4(0.0) | 13.8(0.0) | 19.9(0.0) | 36.3(0.0) | 131.0(0.0) | 340.0(0.0) | 12.1(0.0) | 1 | config, model, log |
| CE - P,CG | t2v | 21.9(0.5) | 44.5(0.9) | 53.5(0.0) | 72.0(0.8) | 8.3(0.6) | 107.7(2.6) | 37.4(0.4) | 57.75M | config, model, log |
| CE | t2v | 19.2(0.1) | 40.8(1.6) | 49.4(1.0) | 68.7(0.5) | 11.0(1.0) | 125.0(4.7) | 33.8(0.6) | 30.82M | config, model, log |
| HowTo100m S3D | v2t | 7.2(0.0) | 15.1(0.0) | 19.5(0.0) | 34.3(0.0) | 160.0(0.0) | 361.4(0.0) | 12.9(0.0) | 1 | config, model, log |
| CE - P,CG | v2t | 20.8(0.7) | 43.8(1.2) | 53.2(0.8) | 72.6(1.1) | 8.3(0.6) | 102.6(2.9) | 36.5(0.7) | 57.75M | config, model, log |
| CE | v2t | 18.5(0.5) | 40.1(0.6) | 49.5(0.2) | 69.0(0.6) | 11.0(0.0) | 112.1(4.3) | 33.2(0.4) | 30.82M | config, model, log |
