Image Captioning Bot, as the name suggests, predicts a caption for any image provided to it, based on the dataset it was trained on. It uses deep learning to generate the caption one word at a time.
- Numpy
- Pandas
- cv2
- Matplotlib
- Keras
- Re
- NLTK
- string
- json
- time
- pickle
- Tensorflow
- collections
- Resnet50
A pre-trained ResNet50 model is first applied to each image to produce a feature vector of size 2048, which is then fed to the captioning model.
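The feature extraction step could look roughly like the sketch below. It assumes the 2048 values come from ResNet50 with its classification head removed and global average pooling applied, and that images are already resized to 224x224 RGB; the function names `build_encoder` and `encode_image` are hypothetical and not taken from this repository.

```python
import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input


def build_encoder(weights="imagenet"):
    """ResNet50 without its classifier; average pooling yields 2048 values."""
    # Assumption: pooling="avg" is used to obtain the 2048-sized output.
    return ResNet50(weights=weights, include_top=False, pooling="avg",
                    input_shape=(224, 224, 3))


def encode_image(encoder, rgb_image):
    """Map one 224x224x3 RGB array (values 0-255) to a 2048-sized vector."""
    batch = np.expand_dims(rgb_image.astype("float32"), axis=0)
    batch = preprocess_input(batch)  # ResNet50's own input preprocessing
    return encoder.predict(batch, verbose=0)[0]
```

Passing `weights=None` builds the same network with random weights, which is handy for checking shapes without downloading the ImageNet weights.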

          (Image)                          (NLP)
        Input (2048)                     Input (30)
             |                               |
        Dropout (0.3)     Embedding (input_dim=2574, output_dim=50)
             |                               |
         Dense (256)                   Dropout (0.3)
             |                               |
             |                          LSTM (256)
             |______________   ______________|
                            | |
                       Add (256 x 2)
                             |
                        Dense (256)
                             |
                    Dense (2574) (output)

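As a rough illustration, the architecture above can be written with the Keras functional API as follows. Layer sizes follow the diagram; the activation functions, loss, and optimizer are assumptions, since the diagram does not state them, and `build_caption_model` is a hypothetical name.

```python
from tensorflow.keras.layers import Input, Dropout, Dense, Embedding, LSTM, Add
from tensorflow.keras.models import Model

VOCAB_SIZE = 2574  # size of the output layer and embedding input_dim
MAX_LEN = 30       # length of the padded caption input
EMBED_DIM = 50     # embedding output_dim


def build_caption_model():
    # Image branch: 2048 ResNet50 features -> Dropout -> Dense(256)
    img_in = Input(shape=(2048,))
    img = Dropout(0.3)(img_in)
    img = Dense(256, activation="relu")(img)  # activation is an assumption

    # Text branch: partial caption -> Embedding -> Dropout -> LSTM(256)
    txt_in = Input(shape=(MAX_LEN,))
    txt = Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_DIM)(txt_in)
    txt = Dropout(0.3)(txt)
    txt = LSTM(256)(txt)

    # Merge both 256-sized vectors by element-wise addition, then decode
    merged = Add()([img, txt])
    dec = Dense(256, activation="relu")(merged)           # assumption
    out = Dense(VOCAB_SIZE, activation="softmax")(dec)    # assumption

    model = Model(inputs=[img_in, txt_in], outputs=out)
    # Loss and optimizer are assumptions for a next-word classifier.
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    return model
```

The softmax output gives one probability per vocabulary word, which matches a setup where the model predicts the next word of the caption.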
Caption: two dogs are running on beach
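A caption like the one above is produced one word at a time. The sketch below shows a greedy version of that loop: at each step the most probable next word is appended until an end token appears. The `startseq`/`endseq` tokens, the index 0 used for padding, and the `predict_next` callable are all assumptions for illustration, not names from this repository.

```python
import numpy as np


def greedy_caption(predict_next, photo_vec, word_to_idx, idx_to_word,
                   max_len=30, start="startseq", end="endseq"):
    """Build a caption by repeatedly picking the most probable next word.

    predict_next(photo_vec, padded_ids) is a hypothetical callable that
    returns one probability per vocabulary index.
    """
    words = [start]
    for _ in range(max_len - 1):
        ids = [word_to_idx[w] for w in words if w in word_to_idx]
        padded = np.zeros(max_len, dtype="int32")  # 0 is assumed to be padding
        padded[:len(ids)] = ids
        probs = predict_next(photo_vec, padded)
        next_word = idx_to_word[int(np.argmax(probs))]
        if next_word == end:
            break
        words.append(next_word)
    return " ".join(words[1:])  # drop the start token
```

With a trained Keras model, `predict_next` could be something like `lambda photo, seq: model.predict([photo[None], seq[None]], verbose=0)[0]`, assuming the model takes the image features and the padded sequence as its two inputs.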
Refer to this Link for the dataset and trained models data!