Student: Henry Goodman
ID: 13032204
Institution: University of Technology Sydney (UTS)
This project explores the impact of Digital Rights Management (DRM) on user experiences within various online media platforms. Utilizing Natural Language Processing (NLP), the study sifts through user reviews to extract sentiments pertaining to DRM and its implementations.
The primary goal is to compare the effects of DRM across popular online media products, with a focus on identifying the negative experiences users may encounter because of DRM practices.
Reviews were collected from several kinds of digital platform known for implementing DRM: e-books, audio streaming services, video streaming platforms, and gaming applications. A custom-built web scraper automates the extraction of each review's date, user rating, and full text, so that the dataset spans multiple product categories.
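As a rough illustration of the extraction step, the sketch below pulls the date, rating, and text out of review markup using only the standard library. The HTML structure and the class names (`review`, `review-date`, `review-rating`, `review-text`) are hypothetical; each real platform uses its own markup, and the project's scraper module may work quite differently.

```python
from html.parser import HTMLParser


class ReviewParser(HTMLParser):
    """Collects date, rating, and text from hypothetical review markup."""

    # Hypothetical CSS class names mapped to the output field they fill.
    FIELDS = {
        "review-date": "date",
        "review-rating": "rating",
        "review-text": "text",
    }

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._field = None  # field currently being filled, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class", "")
        if css_class == "review":
            # A new review container starts: open an empty record.
            self.reviews.append({"date": "", "rating": "", "text": ""})
        elif css_class in self.FIELDS and self.reviews:
            self._field = self.FIELDS[css_class]

    def handle_data(self, data):
        if self._field:
            self.reviews[-1][self._field] += data.strip()

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current field,
        # so nested inline tags inside a field are not supported.
        self._field = None


def parse_reviews(html):
    """Return a list of dicts with string 'date', 'rating', and 'text' values."""
    parser = ReviewParser()
    parser.feed(html)
    parser.close()
    return parser.reviews


SAMPLE_HTML = """
<div class="review">
  <span class="review-date">2023-01-05</span>
  <span class="review-rating">2</span>
  <p class="review-text">The DRM kept locking me out.</p>
</div>
"""

reviews = parse_reviews(SAMPLE_HTML)
```

Keeping the parsing separate from the page download makes it possible to test the extraction logic on saved HTML without touching the network.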
The collected data underwent preprocessing, including tokenization, normalization, and lemmatization, to facilitate detailed sentiment analysis focused on DRM-related aspects. An extensive set of DRM-related keywords was identified to pinpoint relevant sentiments in the reviews.
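The preprocessing and keyword-matching steps can be sketched as follows. To stay runnable without downloading NLTK corpora, this version uses a regular expression for tokenization and a tiny lookup table as a stand-in for a real lemmatizer; in the project itself, NLTK presumably fills both roles. The keyword set and the lemma table are hypothetical examples, not the study's actual lists.

```python
import re

# Hypothetical keyword set; the project's real DRM keyword list is not shown here.
DRM_KEYWORDS = {"drm", "activation", "license", "offline"}

# Stand-in for a lemmatizer: maps a few inflected or variant forms to a base form.
LEMMA_OVERRIDES = {
    "licenses": "license",
    "licence": "license",
    "licences": "license",
    "activations": "activation",
}


def preprocess(text):
    """Normalize to lowercase, tokenize on alphanumeric runs, reduce to base forms."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [LEMMA_OVERRIDES.get(token, token) for token in tokens]


def drm_terms(text):
    """Return the sorted DRM keywords that occur in a review."""
    return sorted(set(preprocess(text)) & DRM_KEYWORDS)


def mentions_drm(text):
    """True if the review contains at least one DRM keyword."""
    return bool(drm_terms(text))
```

For example, `drm_terms("Always-online DRM broke my licenses")` yields `["drm", "license"]`, since "licenses" is reduced to "license" before matching.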
The analyzed data is then plotted to show how prevalent DRM discussions are in the reviews and how they relate to user ratings over time.
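Before plotting, the reviews need to be aggregated into something like a per-month table. The pandas sketch below shows one way to compute the share of reviews that mention DRM and the mean rating per month, plus the rating gap between the two groups. The column names (`date`, `rating`, `mentions_drm`) and the sample rows are made up for illustration and are not data from the study.

```python
import pandas as pd

# Made-up sample rows for illustration only.
reviews = pd.DataFrame(
    {
        "date": [
            "2023-01-05", "2023-01-20",
            "2023-02-03", "2023-02-10", "2023-02-17", "2023-02-24",
        ],
        "rating": [1, 5, 2, 1, 2, 5],
        "mentions_drm": [True, False, True, True, True, False],
    }
)
reviews["date"] = pd.to_datetime(reviews["date"])


def monthly_summary(df):
    """Per month: fraction of reviews mentioning DRM and the mean rating."""
    month = df["date"].dt.to_period("M")
    return df.groupby(month).agg(
        drm_share=("mentions_drm", "mean"),
        mean_rating=("rating", "mean"),
    )


def rating_gap(df):
    """Mean rating of DRM-mentioning reviews minus mean rating of the rest."""
    drm_mean = df.loc[df["mentions_drm"], "rating"].mean()
    other_mean = df.loc[~df["mentions_drm"], "rating"].mean()
    return drm_mean - other_mean


summary = monthly_summary(reviews)
gap = rating_gap(reviews)
```

The resulting `summary` table can be passed directly to matplotlib or seaborn, for instance as a line plot of `drm_share` against the month.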
- Programming Language: Python
- Libraries:
  - pandas: Used for data manipulation and analysis.
  - matplotlib and seaborn: Employed for generating static, interactive, and animated visualizations.
  - Natural Language Toolkit (NLTK): Applied for comprehensive NLP tasks.
- Custom Modules: Developed for specific tasks like web scraping, data cleaning, and sentiment analysis.
The project offers a command-line interface (CLI) for executing various functions. Ensure all dependencies are installed before running the commands:
# Install project requirements
pip install -r requirements.txt
# Generate reviews.json from specified URLs
python main.py --generate
# Analyze sentiments in the collected reviews
python main.py --analyze
# Generate plots based on the analyzed data
python main.py --plot
# Generate the rankings .csv file from the DRM complexity TOML file
python main.py --rank
# Generate correlation analyses from generated .csv files
python main.py --gencor
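How `main.py` parses these flags is not shown here; the sketch below is one plausible layout using `argparse`, with each flag as an independent switch. The flag names come from the commands above, while the help strings, the function names, and the choice to run steps in a fixed pipeline order are assumptions.

```python
import argparse

# Assumed pipeline order; flags given in any order are run in this sequence.
STEP_ORDER = ["generate", "analyze", "plot", "rank", "gencor"]


def build_parser():
    """Build a parser with one on/off switch per pipeline step."""
    parser = argparse.ArgumentParser(
        prog="main.py", description="DRM review analysis (illustrative sketch)"
    )
    parser.add_argument("--generate", action="store_true",
                        help="generate reviews.json from the specified URLs")
    parser.add_argument("--analyze", action="store_true",
                        help="analyze sentiments in the collected reviews")
    parser.add_argument("--plot", action="store_true",
                        help="generate plots from the analyzed data")
    parser.add_argument("--rank", action="store_true",
                        help="generate the rankings .csv file")
    parser.add_argument("--gencor", action="store_true",
                        help="generate correlation analyses from the .csv files")
    return parser


def selected_steps(argv):
    """Return the requested step names in pipeline order."""
    args = build_parser().parse_args(argv)
    return [name for name in STEP_ORDER if getattr(args, name)]


if __name__ == "__main__":
    import sys

    for step in selected_steps(sys.argv[1:]):
        # A real entry point would dispatch to the matching module here.
        print(f"would run step: {step}")
```

With this layout, several steps can be combined in one call, for example `python main.py --generate --analyze`.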
Visualizations provide insight into DRM discussions within user reviews and their impact on product ratings: