Code and slides from the Spark on Azure presentation at Silicon Valley Code Camp 2015.

A set of Jupyter notebooks (1-3) can be run sequentially to generate progressively better submissions for the Titanic competition on kaggle.com.
The last two notebooks illustrate Spark SQL and DataFrames, as well as an (incomplete) Spark ML pipeline, which will be the topic of future presentations at the Bay Area Azure meetup (http://www.meetup.com/bayazure).
@EugeneChuvyrov