Skip to content
/ maaf Public

Modality-Agnostic Attention Fusion for visual search with text feedback

License

Notifications You must be signed in to change notification settings

yahoo/maaf

Repository files navigation

(Residual) Modality Agnostic Attention Fusion for visual search with text feedback

The methods in this repository were used for the experiments described in two papers from Yahoo's Visual Intelligence Research team. The more recent paper's main results can be reproduced using the scripts in the experiment_scripts directory. If you find this code useful please cite

@article{dodds2022training,
  title = {Training and challenging models for text-guided fashion image retrieval},
  author = {Dodds, Eric and Culpepper, Jack and Srivastava, Gaurav},
  journal={arXiv preprint arXiv:2204.11004}
  year = {2022},
  doi = {10.48550/ARXIV.2204.11004},
}

We also recommend using the latest version of the code if you wish to build upon our general methods. However if you are interested specifically in reproducing the results in our earlier paper or using datasets discussed there, it will likely be easier to start from commit 49a0df9. The earlier paper can be cited as:

@article{dodds2020modality,
  title={Modality-Agnostic Attention Fusion for visual search with text feedback},
  author={Dodds, Eric and Culpepper, Jack and Herdade, Simao and Zhang, Yang and Boakye, Kofi},
  journal={arXiv preprint arXiv:2007.00145},
  year={2020}
}

This codebase was originally adapted from TIRG code written by the authors of Composing Text and Image for Image Retrieval - An Empirical Odyssey. The core model and training code is based on. Transformer code is adapted from The Annotated Transformer. Further modifications are our own. We use YACS for configurations.

Setup

The code is tested on Python 3.6 with PyTorch 1.5 and should also work on newer versions. Installing with pip should install the requirements.

Datasets

Challenging Fashion Queries (CFQ)

The Challenging Fashion Queries dataset described in our paper can be found here and used for research purposes.

We do not own any of other datasets used in our experiments here. Below we link to the datasets where we acquired them.

Fashion IQ

Download the dataset from here.

About

Modality-Agnostic Attention Fusion for visual search with text feedback

Topics

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published