Modeling an Apache Cassandra Database

Introduction

This Udacity Data Engineering nanodegree project creates an Apache Cassandra database for a music app, Sparkify.

The Goal

The purpose of this project is to practice data modeling with Apache Cassandra. Because Cassandra tables are designed around the queries they serve, the data model includes one table for each of the following queries:

Give me the artist, song title and song's length in the music app history that was heard during sessionId = 338, and itemInSession = 4
Give me only the following: name of artist, song (sorted by itemInSession) and user (first and last name) for userId = 10, sessionId = 182
Give me every user name (first and last) in my music app history who listened to the song 'All Hands Against His Own'

Database Source

Source files are:

From these files a denormalized dataset has been created:

event_datafile_new.csv contains the following columns:

Data Pipeline

SELECT artist, song_title, song_length FROM song_list_by_sessionId WHERE sessionId = 338 AND itemInSession = 4;
SELECT artist, song_title, user_firstname, user_lastname FROM artist_by_userId_and_sessionId WHERE userId = 10 AND sessionId = 182;
SELECT user_firstname, user_lastname FROM user_by_song WHERE song_title = 'All Hands Against His Own';
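
Each of the three queries above filters on different columns, so each table needs its own primary key. The sketch below shows one way the table definitions could look. The table and column names come from the queries; the column types, the choice of partition and clustering columns, the userId column in user_by_song, and the TableSpec helper itself are assumptions made for illustration, not necessarily what the notebook does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableSpec:
    """Hypothetical helper describing one query-specific table."""
    name: str
    columns: dict            # column name -> CQL type (types are assumed)
    partition_key: tuple     # columns that decide where a row is stored
    clustering: tuple = ()   # columns that order rows inside a partition

    def create_statement(self) -> str:
        """Build a CREATE TABLE statement with an explicit primary key."""
        cols = ", ".join(f"{col} {cql_type}" for col, cql_type in self.columns.items())
        key = "(" + ", ".join(self.partition_key) + ")"
        if self.clustering:
            key += ", " + ", ".join(self.clustering)
        return f"CREATE TABLE IF NOT EXISTS {self.name} ({cols}, PRIMARY KEY ({key}))"


TABLES = [
    # Query 1: look up one song by session and item number.
    TableSpec(
        name="song_list_by_sessionId",
        columns={"sessionId": "int", "itemInSession": "int",
                 "artist": "text", "song_title": "text", "song_length": "float"},
        partition_key=("sessionId",),
        clustering=("itemInSession",),
    ),
    # Query 2: clustering on itemInSession keeps the songs sorted by it.
    TableSpec(
        name="artist_by_userId_and_sessionId",
        columns={"userId": "int", "sessionId": "int", "itemInSession": "int",
                 "artist": "text", "song_title": "text",
                 "user_firstname": "text", "user_lastname": "text"},
        partition_key=("userId", "sessionId"),
        clustering=("itemInSession",),
    ),
    # Query 3: userId (assumed column) keeps one row per listener of a song.
    TableSpec(
        name="user_by_song",
        columns={"song_title": "text", "userId": "int",
                 "user_firstname": "text", "user_lastname": "text"},
        partition_key=("song_title",),
        clustering=("userId",),
    ),
]

if __name__ == "__main__":
    for table in TABLES:
        print(table.create_statement())
```

Each generated statement could then be passed to a Cassandra session, for example through the execute method of a session object from the DataStax Python driver. Every WHERE clause above filters on the partition key of its table, which is what lets Cassandra answer the query without scanning other partitions.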

To run the program

Run each cell of Project_1B_Project_Template.ipynb in order, from top to bottom.

Hazal Ciplak
