Spark ML Recommender System

Project Objective:

Develop a big data recommender system using Spark ML on an AWS EMR cluster. This repo consists of the scripts used to run Spark ML.

Steps and process:

The first notebook contains sandbox code tested on Google Colab.
The PY file was converted from the notebook and used on AWS EMR. The script was part of a data pipeline that automatically trains on 1.5M beer reviews and generates the top 10 recommendations for each user.
The data was to be pushed to AWS S3 for storage, which also triggers a Lambda function that automatically saves the information in DynamoDB.
DynamoDB was used for data retrieval directly from the web via API Gateway.
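
The S3-to-DynamoDB step above could be sketched roughly as follows. This is a minimal illustration, not the Lambda function actually deployed for this project. It assumes the Spark job writes its recommendations to S3 as JSON Lines with `user_id` and `beer_ids` fields; the table name, field names, and file layout are all hypothetical.

```python
import json
from urllib.parse import unquote_plus

TABLE_NAME = "beer-recommendations"  # hypothetical table name


def parse_s3_event(event):
    """Return (bucket, key) pairs for every object named in an S3 event."""
    return [
        # S3 URL-encodes object keys in event notifications, so decode them.
        (r["s3"]["bucket"]["name"], unquote_plus(r["s3"]["object"]["key"]))
        for r in event.get("Records", [])
    ]


def to_items(body, top_n=10):
    """Turn JSON Lines text into one DynamoDB item per user.

    Assumes each line looks like {"user_id": ..., "beer_ids": [...]};
    this layout is an assumption, not taken from the repo.
    """
    items = []
    for line in body.splitlines():
        if not line.strip():
            continue  # skip blank lines
        row = json.loads(line)
        items.append({
            "user_id": str(row["user_id"]),
            "beer_ids": [str(b) for b in row["beer_ids"][:top_n]],
        })
    return items


def lambda_handler(event, context):
    """Entry point invoked by the S3 trigger."""
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    s3 = boto3.client("s3")
    table = boto3.resource("dynamodb").Table(TABLE_NAME)
    written = 0
    for bucket, key in parse_s3_event(event):
        obj = s3.get_object(Bucket=bucket, Key=key)
        body = obj["Body"].read().decode("utf-8")
        # batch_writer buffers the puts and sends them in batches.
        with table.batch_writer() as batch:
            for item in to_items(body):
                batch.put_item(Item=item)
                written += 1
    return {"written": written}
```

Keeping the parsing in plain functions means the record format can be checked locally before wiring up the trigger. A lookup by `user_id` would then be a single key read against the table, assuming `user_id` is its partition key.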

