Machine learning (ML) is a core part of data science, used to make predictions, classify observations, and uncover patterns in data that would otherwise be difficult to detect. {tidymodels} is a collection of R packages covering the main stages of a machine learning pipeline, including sampling data, building and fitting models, and evaluating performance. It provides a consistent, user-friendly approach to fitting machine learning models in R.
This interactive workshop will introduce some common machine learning techniques, such as random forests and support vector machines, and demonstrate how to fit these models in R using {tidymodels}. We'll also cover common machine learning concepts such as cross-validation, hyperparameter tuning, and model evaluation. No previous knowledge of machine learning is required, though familiarity with statistical concepts such as correlation, variability, and simple linear regression may be helpful. Attendees will also benefit from being reasonably comfortable with data wrangling using {dplyr} and {tidyr}.
Please bring a laptop with internet access to the workshop.
Attendees should be:
- Comfortable with common statistical concepts, including regression models.
- Comfortable with simple data manipulation in R.
- Able to access R on your laptop either via local installation or through Posit Cloud.
R packages used in the workshop:
- tidymodels
- glmnet
- ranger
- openintro (for data)
- dplyr, tidyr, ggplot2, forcats (optional)
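
The packages listed above could be installed ahead of the session along these lines. This is a minimal sketch, assuming every package is installed from CRAN and that the "(optional)" note applies to the whole last line of the list; the variable names are illustrative only.

```r
# Package names are taken from the list above.
# Assumption: "(optional)" covers all four packages on the last line.
required <- c("tidymodels", "glmnet", "ranger", "openintro")
optional <- c("dplyr", "tidyr", "ggplot2", "forcats")

# Install only the packages that are not already in the local library.
pkgs <- c(required, optional)
to_install <- setdiff(pkgs, rownames(installed.packages()))
if (length(to_install) > 0) {
  install.packages(to_install)
}
```

Running `library(tidymodels)` afterwards is a quick way to check that the installation worked, whether locally or in Posit Cloud.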