| Home | Lectures | Labs | Assignments | Project | Contact |

Course Description

Welcome to IFT6758 Data Science, a graduate-level introduction to data science. The course focuses on analyzing messy, real-life data to make predictions using statistical and machine learning methods.

The material of the course will integrate the five key facets of an investigation using data:

  • data collection – data wrangling, cleaning, and sampling
  • data management – accessing data quickly and reliably
  • exploratory data analysis – generating hypotheses and building intuition
  • prediction or statistical learning
  • communication – summarizing results through visualization, stories, and interpretable summaries

In this course, we focus on statistical methods and introduce techniques from different domains to familiarize you with various types of data. Our goal is for you to become not only a knowledgeable but also a responsible data scientist by the end of the course!

Feedback

Please use this form to provide feedback about the course.

Announcements

  • There is no lab/class on Tuesday October 29.
  • The mid-term exam covers the content presented in class from week 0 through the end of week 6 (for feature selection, only slides 1-17 and 30-59 were covered in class). The exam will not include any questions on the guest lectures of week 7. The exam is closed book: no cheat sheet or computer is allowed.
  • Mid-term project presentations will take place on Tuesday, November 5. All expectations for the project have been posted to the project page.
  • Expectations for homework submissions have been posted to the assignments page.
  • Check the weekly evaluations of the project on the scoreboard page.
  • If you have a personal conflict with the mid-term or final exam dates, send us an email.

Room & Time

Theory

  • Tuesday, 11:30AM-12:30PM, Z310 Pavillon Claire-McNicoll, Université de Montréal

  • Thursday, 4:30PM-6:30PM, G-415-511A Pavillon Roger-Gaudry, Université de Montréal

Labs

  • Tuesday, 12:30PM-2:30PM, X-115 Pavillon Roger-Gaudry, Université de Montréal

Prerequisites

Basic knowledge of statistics and Python programming is encouraged.

Grading

Your final score for the course will be computed using the following weights:

  • Project: 35%
  • Assignments: 25%
  • Mid-term: 15%
  • Final: 25%
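
As a rough illustration of how these weights combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale; the `final_score` helper and the example scores are hypothetical and not part of the official grading procedure.

```python
# Weights copied from the grading list above, as fractions of the final score.
WEIGHTS = {"project": 0.35, "assignments": 0.25, "midterm": 0.15, "final": 0.25}


def final_score(scores):
    """Return the weighted final score.

    Assumption: every component score is on the same 0-100 scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: 0.35*80 + 0.25*90 + 0.15*70 + 0.25*60
example = {"project": 80, "assignments": 90, "midterm": 70, "final": 60}
print(round(final_score(example), 2))
```

Because the weights sum to 1, the result stays on the same scale as the component scores.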

ATTENTION regarding fraud and plagiarism: Université de Montréal now has a strict policy on fraud and plagiarism. If an infraction is found, the professor is required to report it to the director of the department. An administrative procedure is then automatically triggered, with the following consequences: the offense is noted in your file, and a sanction is decided (which can be serious, up to dismissal for repeat offenses). It is important that you do the work yourself!

Reading

  • Jake VanderPlas, Python Data Science Handbook, O'Reilly Media; 1st edition (2016). Free book.

  • Russell Jurney, Agile Data Science 2.0: Building Full-Stack Data Analytics Applications with Spark, O'Reilly Media; 1st edition (2017).

  • Foster Provost and Tom Fawcett, Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking, O'Reilly Media; 1st edition (2013).

  • A. Rajaraman, J. Leskovec and J. Ullman, Mining of Massive Datasets, Cambridge University Press, 3rd edition.

  • James, Witten, Hastie and Tibshirani, An Introduction to Statistical Learning.