Multimodal Image Retrieval

🚦⚠️👷‍♂️🏗️🚦⚠️👷‍♂️🏗️🚦⚠️👷‍♂️🏗️ Repo Under Construction 🚦⚠️👷‍♂️🏗️🚦⚠️👷‍♂️🏗️🚦⚠️👷‍♂️🏗️

Note: Our model hasn't been trained sufficiently and the results are nowhere close to our expectations. We'll be improving the model as we find time and more GPU resources. Until then, play around with this (not so great) model.
Things we're looking to try:

Improve preprocessing

Replace special characters with space

Play around with embedding dimensions

Use the entire InstaNY100K Dataset

Train Word2Vec again

Use different CNNs for regressing Word2Vec embeddings from images.

Try different post-processing strategies for embeddings.

Train with MSELoss

Experiment with other distance functions
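As an illustration of the "Replace special characters with space" item, here is a minimal sketch of a caption-cleaning step. The function name `clean_caption` and the definition of "special character" (anything that is not an ASCII letter, digit, or whitespace) are assumptions made for this example, not the preprocessing used in this repository.

```python
import re
from typing import List

# Assumption: a "special character" is anything other than an ASCII letter,
# digit, or whitespace. Note that this also drops emoji and non-Latin text.
_SPECIAL = re.compile(r"[^0-9a-zA-Z\s]")


def clean_caption(text: str) -> List[str]:
    """Lowercase a caption, replace special characters with spaces, and tokenize."""
    text = _SPECIAL.sub(" ", text.lower())
    # str.split() with no arguments collapses runs of whitespace.
    return text.split()


if __name__ == "__main__":
    print(clean_caption("Sunset @ #Brooklyn-Bridge!!"))
```

Replacing with a space rather than deleting keeps hashtags and hyphenated words from being glued together, so "#Brooklyn-Bridge" yields two tokens instead of one.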

A deep learning application to retrieve images by searching with text.
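The to-do list above mentions regressing Word2Vec embeddings from images and experimenting with distance functions. The sketch below shows one way a text query could be ranked against image embeddings once both live in the same vector space. It assumes the query has already been turned into a vector and that each image has a predicted embedding; `rank_images` and the toy vectors are hypothetical and are not taken from this repository's code.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; ignores vector length, compares direction only."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def euclidean_distance(a: Vector, b: Vector) -> float:
    """Straight-line distance; sensitive to vector length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_images(
    query: Vector,
    image_embeddings: Dict[str, Vector],
    distance: Callable[[Vector, Vector], float] = cosine_distance,
    top_k: int = 5,
) -> List[str]:
    """Return the names of the top_k images closest to the query vector."""
    ranked = sorted(image_embeddings, key=lambda name: distance(query, image_embeddings[name]))
    return ranked[:top_k]


if __name__ == "__main__":
    # Toy 2-D vectors, for illustration only.
    embeddings = {"a.jpg": [1.0, 0.0], "b.jpg": [0.0, 1.0], "c.jpg": [5.0, 0.5]}
    query = [1.0, 0.1]
    print(rank_images(query, embeddings, cosine_distance, top_k=3))
    print(rank_images(query, embeddings, euclidean_distance, top_k=3))
```

In the toy data, `c.jpg` points in the same direction as the query but is much longer, so cosine distance ranks it first while Euclidean distance ranks it last. That difference is the kind of behavior worth checking when swapping distance functions.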

Try out the application here: https://share.streamlit.io/koushikvikram/multimodal-image-retrieval/main/app.py

Project Workflow

Dataset

Download the InstaNY100K dataset from this Google Drive link

Extract the dataset to ./datasets/raw/. Your folder structure should look like the one below:

./datasets/raw/
|
|-- InstaNY100K
    |
    |-- captions
    |   |
    |   |-- newyork
    |      | 1487768220566960691.txt
    |      | 1490727714071958379.txt
    |      | ...
    |   
    |-- img_resized
        |
        |-- newyork
            | 1480879485913200243.jpg
            | 1480879539524935620.jpg
            | ...
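To check that the extraction worked, a short script like the one below can list which posts have both a caption and an image. This is a sketch under one assumption that is not stated above: a caption file and its image share the same numeric file name (for example, `123.txt` and `123.jpg`). The function name `pair_captions_with_images` is hypothetical.

```python
from pathlib import Path
from typing import Dict, Tuple, Union


def pair_captions_with_images(
    root: Union[str, Path] = "./datasets/raw/InstaNY100K",
    city: str = "newyork",
) -> Dict[str, Tuple[Path, Path]]:
    """Map each post ID to its (caption path, image path).

    Assumption: captions and images are matched by file name stem.
    Posts that have only a caption or only an image are skipped.
    """
    root = Path(root)
    captions = {p.stem: p for p in (root / "captions" / city).glob("*.txt")}
    images = {p.stem: p for p in (root / "img_resized" / city).glob("*.jpg")}
    shared = sorted(captions.keys() & images.keys())
    return {post_id: (captions[post_id], images[post_id]) for post_id in shared}


if __name__ == "__main__":
    pairs = pair_captions_with_images()
    print(f"Found {len(pairs)} caption/image pairs")
```

If the count is zero, the archive was most likely extracted one level too deep or too shallow relative to ./datasets/raw/.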

GitHub Actions for this Repository

Pylint - Code Quality Check

Pytest - Functionality and Behavioral Tests for Classes and Models

Exploring the Word2Vec Model

We recommend using the TensorFlow Embedding Projector to visualize our Word2Vec model.

Load the tensor and metadata tsv files provided in the model directory and visualize words that interest you!
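The same two files can also be inspected offline. The sketch below assumes the usual Embedding Projector layout: a tensor file with one tab-separated vector per row, and a single-column metadata file with one word per row and no header. The function names and the file paths in the usage lines are placeholders, so substitute the actual file names from the model directory.

```python
import csv
import math
from typing import Dict, List, Tuple


def load_projector_files(tensor_path: str, metadata_path: str) -> Dict[str, List[float]]:
    """Read a vectors TSV and a one-word-per-line metadata TSV into a dict."""
    with open(tensor_path, newline="", encoding="utf-8") as f:
        vectors = [[float(v) for v in row] for row in csv.reader(f, delimiter="\t") if row]
    with open(metadata_path, encoding="utf-8") as f:
        # Assumption: single-column metadata without a header row.
        words = [line.strip() for line in f if line.strip()]
    if len(vectors) != len(words):
        raise ValueError("tensor and metadata files have different row counts")
    return dict(zip(words, vectors))


def _cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_words(
    word: str, embeddings: Dict[str, List[float]], top_k: int = 5
) -> List[Tuple[str, float]]:
    """Return the top_k words most similar to `word` by cosine similarity."""
    target = embeddings[word]
    scored = [(w, _cosine(target, v)) for w, v in embeddings.items() if w != word]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


# Usage (placeholder paths):
# embeddings = load_projector_files("path/to/tensor.tsv", "path/to/metadata.tsv")
# print(nearest_words("brooklyn", embeddings))
```

`nearest_words` raises a `KeyError` for words that are not in the vocabulary, which is a quick way to find out whether a term survived preprocessing.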

Samples from TensorFlow Embedding Projector:

You can also use models/explore_word2vec.ipynb to explore words of interest.

Samples from the Jupyter Notebook:

Acknowledgment

Articles used as reference during development are documented in the references directory.

If you run into issues while using the repo, please create an issue on this GitHub repository at the following link and I'll be glad to fix it: https://github.com/koushikvikram/multimodal-image-retrieval/issues

If you'd like to collaborate with me or hire me, please feel free to send an email to [email protected]

Make sure to check out other repositories on my homepage.

