Taking a book (Arsène Lupin by Maurice Leblanc), and extracting the top 10 most popular words.

David-Quan00/frequentlyUsedWords

Project: frequentlyUsedWords

For this project, we will take the book, Arsène Lupin by Maurice Leblanc, and extract the most frequently used words.

The book used can be found at: https://www.gutenberg.org/files/6133/6133-0.txt

We will make use of stop words (common words that aren't very interesting, such as 'and', 'the', 'I', 'a', ...) and exclude them from the final output.
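As a rough illustration of the stop-word idea, here is a minimal sketch of counting words while skipping stop words. The `STOP_WORDS` set and the `top_words` function are hypothetical names made up for this example, and the stop-word list is a tiny sample, not the list this project actually uses.

```python
import re
from collections import Counter

# Assumption: a small illustrative stop-word set, not the project's real list.
STOP_WORDS = {"and", "the", "i", "a", "of", "to", "in", "it", "he", "was"}


def top_words(text, n=10, stop_words=STOP_WORDS):
    """Return the n most common non-stop words in text as (word, count) pairs."""
    # Lowercase the text, then extract runs of letters (accented letters
    # included), allowing inner apostrophes as in "don't".
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text.lower())
    # Count only the words that are not in the stop-word set.
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)
```

Calling `top_words(book_text)` on the full text of the book would give the ten most frequent remaining words, where `book_text` stands for a string holding the downloaded file's contents.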

Notes:

  • This was a small one-day project inspired by a show I'm watching called Lupin!
  • I used a lot of data structures: as the code changed, I kept creating new ones rather than altering the original. To be more efficient and save memory, I should change this.
  • For the future, I'd like to explore the different graphs matplotlib offers and find better ways to represent my findings!
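One possible way to address the memory note above is to update a single counter while reading the text line by line, instead of building a new list at every step. This is only a sketch under assumptions: `count_words_streaming` and `WORD_RE` are hypothetical names, and the file name in the usage comment assumes a local copy of the Gutenberg text.

```python
import re
from collections import Counter

# Runs of letters (accented letters included), allowing inner apostrophes.
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def count_words_streaming(lines, stop_words):
    """Update one Counter line by line, without building a full word list."""
    counts = Counter()
    for line in lines:
        # The generator feeds words straight into the counter, so no
        # intermediate list of all the words is ever created.
        counts.update(
            w for w in WORD_RE.findall(line.lower()) if w not in stop_words
        )
    return counts


# Usage, assuming the book was saved locally as "6133-0.txt":
# with open("6133-0.txt", encoding="utf-8") as f:
#     counts = count_words_streaming(f, {"and", "the", "i", "a"})
# print(counts.most_common(10))
```

Because a file object yields one line at a time, only the counter itself grows with the size of the book's vocabulary.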
