For this project, we will take the book, Arsène Lupin by Maurice Leblanc, and extract the most frequently used words.
The book used can be found at: https://www.gutenberg.org/files/6133/6133-0.txt
We will filter out stop words (common words that aren't very interesting, such as 'and', 'the', 'I', 'a', ...) so that they don't appear in our final output.
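As a rough illustration of the idea (not necessarily how this project's code does it), here is a minimal sketch that counts words while skipping stop words. The function name `most_common_words`, the tiny `STOP_WORDS` set, and the sample sentence are all made up for this example; a real run would read the downloaded book text and use a fuller stop-word list.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list used only for this example.
STOP_WORDS = {"and", "the", "i", "a", "of", "to", "in"}

# Runs of Unicode letters, optionally joined by apostrophes (so "aren't" and
# "arsène" each stay in one piece).
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def most_common_words(text, stop_words=STOP_WORDS, n=10):
    """Return the n most frequent non-stop words as (word, count) pairs."""
    # Lowercase first so 'The' and 'the' are treated as the same word.
    words = WORD_RE.findall(text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)


sample = "The Duke and the detective watched the Duke."
print(most_common_words(sample, n=2))  # [('duke', 2), ('detective', 1)]
```

Lowercasing before counting is a design choice: it merges 'The' and 'the', at the cost of also lowercasing names like 'Lupin' in the output.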
Notes:
- This was a small day project inspired by the show I'm watching called Lupin!
- I created many new data structures as the code evolved, rather than modifying the original one in place. To be more efficient and save memory, I should change this.
- For the future, I'd like to explore the different graphs matplotlib offers and find better ways to represent my findings!
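
On the memory note above, one possible direction is to update a single counter while reading the text line by line, so no intermediate word lists are kept around. This is only a sketch under that assumption; `count_words_streaming` and the sample lines are hypothetical names and data, not part of the project's code.

```python
import re
from collections import Counter

# Runs of Unicode letters, optionally joined by apostrophes.
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def count_words_streaming(lines, stop_words):
    """Update one Counter line by line instead of building intermediate lists."""
    counts = Counter()
    for line in lines:
        # finditer yields matches lazily, so no per-line word list is stored.
        for match in WORD_RE.finditer(line.lower()):
            word = match.group()
            if word not in stop_words:
                counts[word] += 1
    return counts


lines = ["Lupin smiled.", "The Duke saw Lupin."]
counts = count_words_streaming(lines, stop_words={"the"})
print(counts.most_common(1))  # [('lupin', 2)]
```

Because an open text file is itself an iterable of lines, the same function could be called as `count_words_streaming(f, stop_words)` inside a `with open(path, encoding="utf-8") as f:` block, where `path` points at the downloaded book.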