Evaluating Visual Conversational Agents via Cooperative Human-AI Games
Prithvijit Chattopadhyay*, Deshraj Yadav*, Viraj Prabhu, Arjun Chandrasekaran, Abhishek Das, Stefan Lee, Dhruv Batra, Devi Parikh
HCOMP 2017
This repository contains code for setting up the GuessWhich game, along with Amazon Mechanical Turk (AMT) integration for real-time data collection. The data collection settings can be changed easily by modifying certain configurations defined here.
As AI continues to advance, human-AI teams are inevitable. However, progress in AI is routinely measured in isolation, without a human in the loop. It is important to measure how progress in AI translates to humans being able to accomplish tasks better; i.e., the performance of human-AI teams. In this work, we design a cooperative game, GuessWhich, to measure human-AI team performance in the specific context of the AI being a visual conversational agent. The AI, which we call ALICE, is provided an image which is unseen by the human. The human then asks ALICE questions about this secret image to identify it from a fixed pool of images.
We measure performance of the human-ALICE team by the number of guesses it takes the human to correctly identify the secret image after a fixed number of dialog rounds with ALICE. We compare performance of the human-ALICE teams for two versions of ALICE. While AI literature shows that one version outperforms the other when paired with another AI, we find that this improvement in AI-AI performance does not translate to improved human-AI performance.
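To make the metric described above concrete, here is a minimal Python 3 sketch of counting how many guesses a player needs before picking the secret image. It is an illustration only and is not taken from this repository: the function names and image IDs are hypothetical, and the scoring used in the paper may differ in detail.

```python
from typing import Hashable, Optional, Sequence, Tuple


def guesses_to_identify(guesses: Sequence[Hashable], secret_id: Hashable) -> Optional[int]:
    """Return the number of guesses made up to and including the correct one.

    Returns None if the secret image never appears in the guess sequence.
    """
    for count, guess in enumerate(guesses, start=1):
        if guess == secret_id:
            return count
    return None


def mean_guesses(games: Sequence[Tuple[Sequence[Hashable], Hashable]]) -> Optional[float]:
    """Average the guess count over games in which the secret image was found.

    Each game is a (guesses, secret_id) pair. Unsolved games are skipped here,
    which is an assumption of this sketch, not a rule stated in the paper.
    """
    counts = [guesses_to_identify(guesses, secret) for guesses, secret in games]
    solved = [c for c in counts if c is not None]
    return sum(solved) / len(solved) if solved else None
```

A lower value means the human-ALICE team found the secret image faster, so two versions of ALICE could be compared by calling `mean_guesses` on the games played with each.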
sudo apt-get install -y git python-pip python-dev
sudo apt-get install -y autoconf automake libtool curl make g++ unzip
sudo apt-get install -y libgflags-dev libgoogle-glog-dev liblmdb-dev
sudo apt-get install -y libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps;
./install.sh
source ~/.bashrc
git clone https://github.com/hughperkins/pytorch.git
cd pytorch
source ~/torch/install/bin/torch-activate
./build.sh
sudo apt-get install -y redis-server rabbitmq-server
sudo rabbitmq-plugins enable rabbitmq_management
sudo service rabbitmq-server restart
sudo service redis-server restart
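After restarting the two services, it can help to confirm that they accept connections before moving on. The sketch below is a hedged Python 3 example, not part of this repository. It assumes a stock local install with the default ports (6379 for Redis, 5672 for RabbitMQ AMQP, 15672 for the RabbitMQ management plugin); adjust the mapping if your configuration differs.

```python
import socket

# Default ports for a stock local install. These are assumptions about your
# setup, not values read from this repository's configuration.
DEFAULT_SERVICES = {
    "redis": 6379,
    "rabbitmq (AMQP)": 5672,
    "rabbitmq (management UI)": 15672,
}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="127.0.0.1", services=None):
    """Map each service name to whether its port accepts connections."""
    services = DEFAULT_SERVICES if services is None else services
    return {name: port_open(host, port) for name, port in services.items()}
```

Calling `check_services()` returns a dictionary of service names to booleans; any `False` entry suggests the corresponding service did not start or listens on a different port.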
luarocks install loadcaffe
The following two dependencies are required only if you plan to use a GPU:
luarocks install cudnn
luarocks install cunn
Note: CUDA and cuDNN are required only if you plan to use a GPU.
Download and install CUDA and cuDNN from the NVIDIA website.
git clone https://github.com/Cloud-CV/GuessWhich.git
cd GuessWhich
sh download_models.sh
pip install -r requirements.txt
python manage.py makemigrations amt
python manage.py migrate
Open three terminal sessions in the GuessWhich directory and run one of the following commands in each:
cd chatbot && python sl_worker.py
cd chatbot && python rl_worker.py
python manage.py runserver
You are all set. Visit http://127.0.0.1:8000 to see the demo running.
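As an alternative to separate terminals, the two workers and the development server can be started from one script. The following is a rough Python 3 sketch under stated assumptions, not a script shipped with this repository: `launch_all` and `stop_all` are hypothetical helpers, and the script is assumed to run from the GuessWhich directory in an environment where the three commands above already work.

```python
import subprocess

# (working directory, command) pairs mirroring the three commands in this
# README. The paths are relative to the GuessWhich directory.
COMMANDS = [
    ("chatbot", ["python", "sl_worker.py"]),
    ("chatbot", ["python", "rl_worker.py"]),
    (".", ["python", "manage.py", "runserver"]),
]


def launch_all(commands=COMMANDS):
    """Start each command in its working directory and return the Popen handles."""
    return [subprocess.Popen(argv, cwd=cwd) for cwd, argv in commands]


def stop_all(processes):
    """Ask every process to terminate, then wait for each one to exit."""
    for proc in processes:
        proc.terminate()
    for proc in processes:
        proc.wait()
```

A typical use would be `procs = launch_all()`, followed by `stop_all(procs)` when you are done. Output from all three processes is interleaved in one terminal, so separate sessions remain easier to read when debugging.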
If you find this code useful, consider citing our work:
@inproceedings{visdial_eval,
title={Evaluating Visual Conversational Agents via Cooperative Human-AI Games},
author={Prithvijit Chattopadhyay and Deshraj Yadav and Viraj Prabhu and Arjun Chandrasekaran and Abhishek Das and Stefan Lee and Dhruv Batra and Devi Parikh},
booktitle={Proceedings of the Fifth AAAI Conference on Human Computation and Crowdsourcing (HCOMP)},
year={2017}
}
BSD
- Vicki Image: "Robot-clip-art-book-covers-feJCV3-clipart" by Wikimedia Commons is licensed under CC BY-SA 4.0