Python Programming for Data Science

General Info

Welcome to Python Programming for Data Science!

This is a first-year course in the MSc in Data Science at the University of Padova. It is one of the three modules that make up the course "Fundamentals of Information Systems".

This repository contains lecture materials (Jupyter notebooks and PDF slides) as well as exercises, with solutions, from the 2018-19 examination sessions.

Course Goal

The goal of this module is to teach the basics of the Python programming language, with a special focus on Data Science. In particular, students will become familiar with Python packages that are widely used by data scientists and machine learning practitioners, such as numpy, scipy, pandas, seaborn, and scikit-learn, to name a few.
By the end of this module, students are expected to be able to implement all the stages of a typical machine learning pipeline: from collecting data to building predictive models for solving either a regression or a classification problem.
A full, detailed description of the course is available here.

Course Syllabus

Python Programming for Data Science provides students with the foundational coding skills they need as data scientists.

We start our journey with an exhaustive tutorial on how to properly set up your environment, which is used throughout the class. Essentially, this consists of:

Installing Python 3.x (we will be using Python 3.6 installed via Anaconda in this class)
Installing and setting up Jupyter Notebook
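Once the setup is done, a quick check like the following can confirm that the interpreter and packages are in place. This is only a sketch: the function name check_environment and the package list are assumptions made for illustration (taken from the packages named in this README), not an official requirements file.

```python
import importlib.util
import sys

# Illustrative list based on the packages named in this README;
# "sklearn" is the import name of scikit-learn.
PACKAGES = ["numpy", "scipy", "pandas", "matplotlib", "seaborn", "sklearn", "notebook"]


def check_environment(min_version=(3, 6), packages=PACKAGES):
    """Return (python_ok, missing) for the current interpreter."""
    # Compare only (major, minor) against the required minimum.
    python_ok = sys.version_info[:2] >= min_version
    # find_spec returns None when a top-level package cannot be found.
    missing = [name for name in packages if importlib.util.find_spec(name) is None]
    return python_ok, missing


python_ok, missing = check_environment()
print("Python", sys.version.split()[0], "OK" if python_ok else "too old")
print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any package reported as missing can usually be installed through Anaconda before starting the lectures.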

We then move on to the basics of the Python programming language:

Python object model
built-in data types
functions
I/O

Finally, we dig into some of the most widely used Python packages for data science, such as:

numpy/scipy (for numerical/scientific computing)
pandas (for data manipulation)
matplotlib/seaborn (for data visualization)
scikit-learn (for machine learning tasks like regression and classification).
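To give a feel for how these packages fit together, here is a minimal sketch of a regression task using numpy and scikit-learn. The data is synthetic and the true relationship (y = 3x + 2 plus noise) is an assumption made purely for illustration; the course lectures work through their own examples.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (assumed for illustration): y = 3x + 2 plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1, size=200)

# Hold out 25% of the samples so the model is evaluated on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit an ordinary least-squares linear model on the training split.
model = LinearRegression()
model.fit(X_train, y_train)

# score() returns the coefficient of determination R^2 on the given data.
r2 = model.score(X_test, y_test)
print("slope:", model.coef_[0], "intercept:", model.intercept_, "R^2:", r2)
```

A classification task follows the same split, fit, and score pattern with a classifier in place of LinearRegression.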

Class Schedule

Lecture #	Topics	Class Material
Lecture 0	Preliminary computer science concepts	Notebook, Slides
Lecture 1	Introduction and environment setup	Notebook, Slides
Lecture 2	Python basics	Notebook, Slides
Lecture 3	Python's built-in data types (Part I)	Notebook, Slides
Lecture 4	Python's built-in data types (Part II)	Notebook, Slides
Lecture 5	Functions & I/O	Notebook, Slides
Lecture 6	`numpy` package	Notebook, Slides
Lecture 6b	Review of linear algebra basics	Notebook, Slides
Lecture 7	Introduction to `pandas` package	Notebook, Slides
Lecture 8	I/O with `pandas`	Notebook, Slides
Lecture 9	Data preparation with `pandas`	Notebook, Slides
Lecture 10	Data visualization with `matplotlib`	Notebook, Slides
Lecture 11	A Machine Learning Primer (seminar)	Notebook, Slides
Lecture 12	The Regression Problem: Example (Part I)	Notebook
Lecture 13	The Regression Problem: Example (Part II)	Notebook
Lecture 14	The Classification Problem: Example (Part I)	Notebook
Lecture 15	The Classification Problem: Example (Part II)	Notebook
Lecture 16	Logistic Regression Demystified (seminar)	Slides

gtolomei/python-for-datascience
