Welcome to this repository! Here you will find a collection of exercises I completed during various Kaggle courses, which are designed to help you develop the skills needed for independent data science projects.
Each exercise in this repository reflects my commitment to mastering the fundamental skills needed to tackle real-world challenges in data science. Courses on Kaggle have provided me with a solid foundation, and these exercises represent my learning journey from basic concepts to advanced techniques.
- Intro to Machine Learning
- Pandas
- Intermediate Machine Learning
- Data Visualization
- Feature Engineering
- Intro to SQL
- Advanced SQL
- Intro to Deep Learning
- Computer Vision
- Time Series
- Data Cleaning
- Intro to AI Ethics
- Geospatial Analysis
- Machine Learning Explainability
- Intro to Game AI and Reinforcement Learning
Each folder contains detailed exercises, solutions and annotations created during the learning journey. I hope this collection will be inspiring and helpful to anyone who wants to dive into the fascinating world of data science and machine learning.
Feel free to explore, clone the repository and use the exercises for your own learning. If you have suggestions, corrections or new exercises to share, they will be more than welcome! This space is also designed for collaboration and knowledge exchange.
Happy coding! 🚀📊✨
- How Models Work: The first step if you're new to machine learning.
- Basic Data Exploration: Load and understand your data.
- Your First Machine Learning Model: Building your first model. Hurray!
- Model Validation: Measure the performance of your model, so you can test and compare alternatives.
- Underfitting and Overfitting: Fine-tune your model for better performance.
- Random Forests: Using a more sophisticated machine learning algorithm.
- Machine Learning Competitions: Enter the world of machine learning competitions to keep improving and see your progress.

Intro to Machine Learning - Certificate
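
As a rough illustration of the Model Validation idea listed above, the sketch below scores two toy models on rows they never saw during fitting. It is a minimal, standard-library-only example and not the course's own code: the data, the `holdout_split` helper, and both "models" are hypothetical, and the every-fourth-row split is used only to keep the result reproducible (a random split is the more common choice in practice).

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def holdout_split(rows, every=4):
    """Hold out every `every`-th row for validation (hypothetical helper)."""
    train = [r for i, r in enumerate(rows) if i % every != 0]
    val = [r for i, r in enumerate(rows) if i % every == 0]
    return train, val


def fit_mean_model(train_rows):
    """Baseline: always predict the mean target seen in training."""
    mean_target = sum(y for _, y in train_rows) / len(train_rows)
    return lambda x: mean_target


def fit_nearest_neighbor(train_rows):
    """Predict the target of the training row with the closest feature value."""
    return lambda x: min(train_rows, key=lambda row: abs(row[0] - x))[1]


# Hypothetical data: (feature, target) pairs with target = 3 * feature + 10.
rows = [(x, 3 * x + 10) for x in range(40)]
train_rows, val_rows = holdout_split(rows)

val_y = [y for _, y in val_rows]
scores = {}
for name, fit in [("mean", fit_mean_model), ("nearest", fit_nearest_neighbor)]:
    model = fit(train_rows)
    # Score only on validation rows, which the model did not see while fitting.
    scores[name] = mean_absolute_error(val_y, [model(x) for x, _ in val_rows])

print(scores)
```

The model with the lower validation error is the one to prefer; scoring on the training rows instead would make the nearest-neighbour model look perfect, which is exactly what a held-out split guards against.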
- Creating, Reading and Writing: You can't work with data if you can't read it. Get started here.
- Indexing, Selecting & Assigning: Pro data scientists do this dozens of times a day. You can, too!
- Renaming and Combining: Data comes in from many sources. Help it all make sense together.
- Summary Functions and Maps: Extract insights from your data.
- Grouping and Sorting: Scale up your level of insight. The more complex the dataset, the more this matters.
- Data Types and Missing Values: Deal with the most common progress-blocking problems.

Pandas - Certificate
- Introduction: Review what you need for this Micro-Course.
- Missing Values: Missing values happen. Be prepared for this common challenge in real datasets.
- Categorical Variables: There's a lot of non-numeric data out there. Here's how to use it for machine learning.
- Pipelines: A critical skill for deploying (and even testing) complex models with pre-processing.
- Cross-Validation: A better way to test your models.
- XGBoost: The most accurate modeling technique for structured data.
- Data Leakage: Find and fix this problem that ruins your model in subtle ways.

Intermediate Machine Learning - Certificate
- Hello, Seaborn: Your first introduction to coding for data visualization.
- Line Charts: Visualize trends over time.
- Bar Charts and Heatmaps: Use color or length to compare categories in a dataset.
- Scatter Plots: Leverage the coordinate plane to explore relationships between variables.
- Distributions: Create histograms and density plots.
- Choosing Plot Types and Custom Styles: Customize your charts and make them look snazzy.
- Final Project: Practice for real-world application.

Data Visualization - Certificate
- What Is Feature Engineering: Learn the steps and principles of creating better features.
- Mutual Information: Locate features with the most potential.
- Creating Features: Transform features with Pandas to suit your model.
- Clustering With K-Means: Untangle complex spatial relationships with cluster labels.
- Principal Component Analysis: Discover new features by analyzing variation.
- Target Encoding: Boost any categorical feature with this powerful technique.

Feature Engineering - Certificate
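
To make the Target Encoding entry above more concrete, here is a minimal sketch of one common variant, a smoothed (m-estimate) mean encoding, written with the standard library only. The function name `target_encode`, the smoothing weight `m`, and the sample data are assumptions made for illustration; they are not taken from the exercises in this repository.

```python
from collections import defaultdict


def target_encode(categories, targets, m=1.0):
    """Map each category to a smoothed mean of the target.

    encoding = (category_sum + m * global_mean) / (category_count + m)
    A larger `m` pulls rare categories harder towards the global mean.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    return {
        cat: (sums[cat] + m * global_mean) / (counts[cat] + m)
        for cat in counts
    }


# Hypothetical data: category "b" appears only once, so its raw mean (50)
# is unreliable and gets shrunk towards the global mean (18).
categories = ["a", "a", "a", "a", "b"]
targets = [10, 10, 10, 10, 50]
encoding = target_encode(categories, targets, m=1.0)
print(encoding)
```

Because the encoding is computed from the target itself, it is usually fit on rows kept separate from the ones used to train the model, so that the target does not leak into the features.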
- Getting Started With SQL and BigQuery: Learn the workflow for handling big datasets with BigQuery and SQL.
- Select, From & Where: The foundational components for all SQL queries.
- Group By, Having & Count: Get more interesting insights directly from your SQL queries.
- Order By: Order your results to focus on the most important data for your use case.
- As & With: Organize your query for better readability. This becomes especially important for complex queries.
- Joining Data: Combine data sources. Critical for almost all real-world data problems.

Intro to SQL - Certificate
- JOINs and UNIONs: Combine information from multiple tables.
- Analytic Functions: Perform complex calculations on groups of rows.
- Nested and Repeated Data: Learn to query complex datatypes in BigQuery.
- Writing Efficient Queries: Write queries to run faster and use less data.

Advanced SQL - Certificate
- A Single Neuron: Learn about linear units, the building blocks of deep learning.
- Deep Neural Networks: Add hidden layers to your network to uncover complex relationships.
- Stochastic Gradient Descent: Use Keras and TensorFlow to train your first neural network.
- Overfitting and Underfitting: Improve performance with extra capacity or early stopping.
- Dropout and Batch Normalization: Add these special layers to prevent overfitting and stabilize training.
- Binary Classification: Apply deep learning to another common task.

Intro to Deep Learning - Certificate
- The Convolutional Classifier: Create your first computer vision model with Keras.
- Convolution and ReLU: Discover how convnets create features with convolutional layers.
- Maximum Pooling: Learn more about feature extraction with maximum pooling.
- The Sliding Window: Explore two important parameters: stride and padding.
- Custom Convnets: Design your own convnet.
- Data Augmentation: Boost performance by creating extra training data.

Computer Vision - Certificate
- Linear Regression With Time Series: Use two features unique to time series: lags and time steps.
- Trend: Model long-term changes with moving averages and the time dummy.
- Seasonality: Create indicators and Fourier features to capture periodic change.
- Time Series as Features: Predict the future from the past with a lag embedding.
- Hybrid Models: Combine the strengths of two forecasters with this powerful technique.
- Forecasting With Machine Learning: Apply ML to any forecasting task with these four strategies.

Time Series - Certificate
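
The two time-series features named above, time steps and lags, can be sketched in a few lines. This is a minimal standard-library illustration under assumed names (`time_dummy`, `make_lags`) and made-up data, not the code used in the exercises.

```python
def time_dummy(series):
    """Time-step feature: 0, 1, 2, ... counting observations in order."""
    return list(range(len(series)))


def make_lags(series, n_lags):
    """Build (lags, target) rows, where lags = [y[t-1], ..., y[t-n_lags]].

    The first `n_lags` observations are skipped because they do not have
    a full history yet.
    """
    rows = []
    for t in range(n_lags, len(series)):
        lags = [series[t - k] for k in range(1, n_lags + 1)]
        rows.append((lags, series[t]))
    return rows


# Hypothetical series of five observations.
series = [10, 12, 15, 19, 24]

steps = time_dummy(series)
lagged = make_lags(series, n_lags=2)

for lags, target in lagged:
    print(lags, "->", target)
```

Each row pairs the most recent past values with the value to predict, which turns forecasting into an ordinary regression problem on the lag columns.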
- Handling Missing Values: Drop missing values, or fill them in with an automated workflow.
- Scaling and Normalization: Transform numeric variables to have helpful properties.
- Parsing Dates: Help Python recognize dates as composed of day, month, and year.
- Character Encodings: Avoid UnicodeDecodeErrors when loading CSV files.
- Inconsistent Data Entry: Efficiently fix typos in your data.

Data Cleaning - Certificate
- Introduction to AI Ethics: Learn what to expect from the course.
- Human-Centered Design for AI: Design systems that serve people’s needs. Navigate issues in several real-world scenarios.
- Identifying Bias in AI: Bias can creep in at any stage in the pipeline. Investigate a simple model that identifies toxic text.
- AI Fairness: Learn about four different types of fairness. Assess a toy model trained to judge credit card applications.
- Model Cards: Increase transparency by communicating key information about machine learning models.

Intro to AI Ethics - Certificate
- Your First Map: Get started with plotting in GeoPandas.
- Coordinate Reference Systems: It's pretty amazing that we can represent the Earth's surface in 2 dimensions!
- Interactive Maps: Learn how to make interactive heatmaps, choropleth maps, and more!
- Manipulating Geospatial Data: Find locations with just the name of a place, and learn how to join data based on spatial relationships.
- Proximity Analysis: Measure distance, and explore neighboring points on a map.

Geospatial Analysis - Certificate
- Use Cases for Model Insights: Why and when do you need insights?
- Permutation Importance: What features does your model think are important?
- Partial Plots: How does each feature affect your predictions?
- SHAP Values: Understand individual predictions.
- Advanced Uses of SHAP Values: Aggregate SHAP values for even more detailed model insights.

Machine Learning Explainability - Certificate
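
The Permutation Importance question above ("What features does your model think are important?") can be approximated by shuffling one column at a time and measuring how much the error grows. The sketch below is a standard-library illustration under stated assumptions: the `permutation_importance` function defined here, the toy `predict` model, and the data are all hypothetical and are not a library API.

```python
import random


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in error after shuffling each column of X in turn."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(y, [predict(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            score = mean_absolute_error(y, [predict(row) for row in X_perm])
            increases.append(score - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical setup: the target depends on feature 0 only, and the toy
# "model" below also reads feature 0 only.
X = [[i, (i * 7) % 5] for i in range(20)]
y = [2 * i for i in range(20)]


def predict(row):
    return 2 * row[0]


importances = permutation_importance(predict, X, y)
print(importances)
```

Shuffling the second column leaves the predictions unchanged, so its importance is zero, while shuffling the first column breaks the link to the target and raises the error.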
- Play the Game: Write your first game-playing agent.
- One-Step Lookahead: Make your agent smarter with a few simple changes.
- N-Step Lookahead: Use the minimax algorithm to dramatically improve your agent.
- Deep Reinforcement Learning: Explore advanced techniques for creating intelligent agents.

Intro to Game AI and Reinforcement Learning - Certificate
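
As a small illustration of the N-Step Lookahead idea above, the sketch below applies depth-limited minimax to a deliberately tiny game chosen for brevity: a Nim variant in which players alternately take 1 to 3 stones and whoever takes the last stone wins. The game, the function names, and the neutral score of 0 at the search horizon are all assumptions made for this example.

```python
def minimax(stones, depth, maximizing):
    """Score a position for the maximizing player, looking `depth` moves ahead."""
    if stones == 0:
        # The player who just moved took the last stone and won.
        return -1 if maximizing else 1
    if depth == 0:
        return 0  # Search horizon reached: treat the position as neutral.
    scores = [
        minimax(stones - take, depth - 1, not maximizing)
        for take in (1, 2, 3)
        if take <= stones
    ]
    return max(scores) if maximizing else min(scores)


def best_move(stones, n_steps):
    """Choose how many stones to take using an n_steps-move lookahead."""
    moves = [take for take in (1, 2, 3) if take <= stones]
    # After our move it is the opponent's turn, so the next level minimizes.
    return max(moves, key=lambda take: minimax(stones - take, n_steps - 1, False))


print(best_move(3, n_steps=1))  # one-step lookahead: take all three and win
print(best_move(5, n_steps=3))  # three-step lookahead: leave four stones
```

With a one-step lookahead the agent only spots immediate wins; a deeper search also lets it steer towards positions the opponent cannot escape from.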