On Windows, I open the Anaconda Prompt and create a new environment named chatbot.
conda create -n chatbot python=3.6 anaconda
After that I activate the virtual environment. In the Windows Anaconda Prompt, type
activate chatbot
Now I am in the chatbot environment. Here I install TensorFlow with
pip install tensorflow
OK, then I close the prompt, go to Anaconda Navigator, and switch to the chatbot environment. I use Spyder there for all the coding.
I found a dataset called the Cornell Movie Dialogs Corpus. This corpus contains a large, metadata-rich collection of fictional conversations extracted from raw movie scripts:
DESCRIPTION:
- 220,579 conversational exchanges between 10,292 pairs of movie characters
- involves 9,035 characters from 617 movies
- in total 304,713 utterances (a line said by a character)
- movie metadata included:
  - genres
  - release year
  - IMDB rating
  - number of IMDB votes
- character metadata included:
  - gender (for 3,774 characters)
  - position on movie credits (3,321 characters)
The corpus comes with data and metadata files; of these I only need movie_conversations.txt and movie_lines.txt. Next I created a dictionary that maps each line ID to the text of that line.
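Here is a minimal sketch of how that dictionary could be built. It assumes the layout of movie_lines.txt in the corpus, where each row has five fields (line ID, character ID, movie ID, character name, text) separated by " +++$+++ ". The function name build_id_to_line and the two sample rows are only for illustration, and the file path and encoding settings in the comment are assumptions that may need adjusting on your machine.

```python
# Separator used between fields in movie_lines.txt (note the surrounding spaces).
FIELD_SEPARATOR = " +++$+++ "


def build_id_to_line(raw_lines):
    """Map each line ID (e.g. 'L1045') to the text of that line.

    raw_lines is any iterable of strings, such as an open file object.
    Rows that do not have exactly five fields are skipped.
    """
    id_to_line = {}
    for raw in raw_lines:
        # Strip only the line ending, so an empty text field still splits cleanly.
        fields = raw.rstrip("\r\n").split(FIELD_SEPARATOR)
        if len(fields) == 5:
            line_id, text = fields[0], fields[4]
            id_to_line[line_id] = text
    return id_to_line


# Two illustrative rows in the same layout as the corpus file.
sample = [
    "L1045 +++$+++ u0 +++$+++ m0 +++$+++ BIANCA +++$+++ They do not!",
    "L1044 +++$+++ u2 +++$+++ m0 +++$+++ CAMERON +++$+++ They do to!",
]
print(build_id_to_line(sample))

# With the real file (path and encoding handling are assumptions):
# with open("movie_lines.txt", encoding="utf-8", errors="ignore") as f:
#     id_to_line = build_id_to_line(f)
```

Passing an iterable instead of a file name keeps the function easy to try in Spyder on a few rows before running it on the full file.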