T2 Guiding contains 1000 images. Each image is annotated with three Visual Genome objects obtained from a FRCNN and three image labels obtained from the Google Cloud Vision API. More information about this dataset can be found in the following paper: Edwin G. Ng, Bo Pang, Piyush Sharma and Radu Soricut. 2020. Understanding Guided Image Captionin…

google-research-datasets/T2-Guiding

T2-Guiding

T2 Guiding is a dataset of 1000 images, each annotated with six machine-generated labels. The images come from the Open Images Dataset (OID), and we provide two sets of labels for each image:

  1. Object labels: Three random object labels generated by a FRCNN model trained on Visual Genome.
  2. Image labels: Three random image labels obtained from Google Cloud Vision API.

This dataset is used as the test set in the paper: "Understanding Guided Image Captioning Performance across Domains".

More details are available in this paper (please cite the paper if you use or discuss this dataset in your work):

@article{ng2020understanding,
  title={Understanding Guided Image Captioning Performance across Domains},
  author={Edwin G. Ng and Bo Pang and Piyush Sharma and Radu Soricut},
  journal={arXiv preprint arXiv:2012.02339},
  year={2020}
}

Data Format

The released data is provided as a TSV (tab-separated values) text file with the following columns:

Table 1: Columns in TSV files.

| Column | Description |
|--------|-------------|
| 1 | Image key. The unique identifier of the image in the Open Images Dataset (a hexadecimal number, e.g., 0000d67245642c5f). |
| 2 | Visual Genome objects. Comma-separated list of object labels generated by a FRCNN trained on Visual Genome. |
| 3 | Image labels. Comma-separated list of image labels obtained from Google Cloud Vision API. |
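
The column layout above can be read with a few lines of Python. This is a minimal sketch, not an official loader: it assumes the file has no header row, that every record has exactly three tab-separated fields, and that label lists are split on commas. The names `GuidingRow`, `split_labels`, and `read_rows` are hypothetical, and the sample row uses made-up labels rather than real dataset values.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class GuidingRow(NamedTuple):
    """One record of the TSV file (hypothetical container for this sketch)."""
    image_key: str      # column 1: Open Images key, a hexadecimal string
    objects: List[str]  # column 2: Visual Genome object labels
    labels: List[str]   # column 3: Cloud Vision image labels


def split_labels(field: str) -> List[str]:
    """Split a comma-separated field and trim whitespace around each label."""
    return [item.strip() for item in field.split(",") if item.strip()]


def read_rows(lines: Iterable[str]) -> List[GuidingRow]:
    """Parse tab-separated lines into GuidingRow records.

    Assumes no header row; records without exactly three fields are skipped.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for record in reader:
        if len(record) != 3:
            continue  # blank or malformed line
        key, objects, labels = record
        rows.append(GuidingRow(key, split_labels(objects), split_labels(labels)))
    return rows


# Made-up sample row for illustration only; the labels are not from the dataset.
SAMPLE = "0000d67245642c5f\tdog, grass, tree\tpet,lawn,plant\n"
rows = read_rows(io.StringIO(SAMPLE))
print(rows[0].image_key, rows[0].objects, rows[0].labels)
```

To read the real file, pass an open file handle instead of the in-memory sample, for example `read_rows(open(path, newline="", encoding="utf-8"))`, where `path` points to the downloaded TSV.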

Downloads

The dataset is available for download here. The mapping from each image key to its image URL can be found in the cvpr2019.tsv.meta file included in the original T2 dataset download.
