This assignment lets you practice the techniques of this module: tokenization, normalization, and calculating descriptive statistics. You will use these skills in every subsequent assignment.
Instructions:
- Create a repository under your GitHub account from this template: https://github.com/37chandler/ads-tm-token-norm. Instructions can be found here. Make your repository public or add your instructor’s GitHub account as a collaborator.
- If you did not complete the assignment for Module 1, download “M1 Assignment Data.zip” from Canvas and extract it into your repository. This folder includes not only what you did in the last module, but also information about the followers of the singers on Twitter.
- The “Lyrics and Description EDA.ipynb” file within the repository holds the starting code and instructions for the assignment.
- Work through the notebook, performing the steps asked of you. Use and extend the code from the chapters of your textbook.
- Part of the data in the folder named Twitter was obtained using the Twitter API.
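As a rough orientation for the three techniques this assignment covers, the sketch below tokenizes a short string, normalizes the tokens, and computes a few descriptive statistics. It is not the notebook's starting code or the textbook's implementation: the function names, the tiny `STOPWORDS` set, and the sample sentence are all hypothetical, and whitespace splitting stands in for whatever tokenizer the notebook asks you to use.

```python
import string
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}


def tokenize(text):
    """Tokenization: split the text on whitespace."""
    return text.split()


def normalize(tokens):
    """Normalization: lowercase, strip surrounding punctuation,
    then drop empty strings and stopwords."""
    cleaned = (tok.lower().strip(string.punctuation) for tok in tokens)
    return [tok for tok in cleaned if tok and tok not in STOPWORDS]


def descriptive_stats(tokens, top_n=3):
    """Descriptive statistics over a list of tokens."""
    counts = Counter(tokens)
    num_tokens = len(tokens)
    return {
        "tokens": num_tokens,
        "unique_tokens": len(counts),
        "characters": sum(len(tok) for tok in tokens),
        # Share of tokens that are distinct; 0.0 for empty input.
        "lexical_diversity": len(counts) / num_tokens if num_tokens else 0.0,
        "most_common": counts.most_common(top_n),
    }


# Hypothetical sample text, not taken from the assignment data.
sample = "The cat sat on the mat, and the cat slept."
stats = descriptive_stats(normalize(tokenize(sample)))
print(stats)
```

The three steps are kept as separate functions so that each can be swapped out independently, for example replacing the whitespace tokenizer when working with Twitter descriptions that contain hashtags or emojis.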
Assignment Materials:
- Tokenization, Normalization, and Descriptive Statistics Repository
- M1 Assignment Data.zip
Deliverables:
- When you have finished your code, print both of your notebooks as PDFs and upload these documents to Canvas.
- Commit your code and push the changes to GitHub so your instructor has access to the ipynb notebook files and any other code you create.
This repository was originally created by https://github.com/37chandler.