SparkOnAzure

Code and slides from the Spark on Azure presentation at Silicon Valley Code Camp 2015. The first three Jupyter notebooks can be run sequentially (1-3) to generate progressively better submissions for the Titanic competition on kaggle.com.

The last two notebooks illustrate Spark SQL and DataFrames, as well as an (incomplete) Spark ML pipeline, which will be the topic of future presentations at the Bay Area Azure meetup (http://www.meetup.com/bayazure).

@EugeneChuvyrov

1 titanic_logistic_regression.ipynb
2 titanic_decision_tree.ipynb
3 titanic_randomforest_grid_search.ipynb
4 titanic_dataframes.ipynb
5 titanic_pipelines.ipynb
README.md
spark_on_azure.pptx
svcc_submission1.csv
svcc_submission2.csv
