- This material assumes a good understanding of the basics of data analysis: sorting, filtering, and grouping, whether in spreadsheets or in a database manager program.
- You need the latest versions of R and RStudio installed on your computer (if you're at a conference, this has already been done for you!).
For this class, we'll be learning the basics of the R programming language and interacting with R through RStudio, which is an efficient and user-friendly tool.
A bit of background: R is a language that was created for statistical analysis and graphics; it does these two things very well. Over time, additional functionality has been added to R (it is open source, so anyone can contribute packages or libraries), so that it now works very well for data analysis, web scraping, app building, and even natural language processing.
Learning a programming language requires some investment of time. We're going to cover the basics of what you need to know to get started, but I can't teach you everything in 3 hours (and frankly I don't know everything anyway). You will be able to do data analysis in R after this class, but you will also need to keep exploring, learning, and troubleshooting. I will provide resources to help with all that.
- Learn how to set things up to make working with R easy and efficient: we'll use R Project files (.Rproj) and talk about a standard folder structure for every project.
- Import a CSV, an Excel file, and some data from the web.
- Use R to sort, filter, group, summarize, and join your data tables.
- Leave with resources to learn more and troubleshoot problems.
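
To give a feel for the sort, filter, group, summarize, and join steps listed above, here is a minimal sketch using the dplyr package from the tidyverse. The `salaries` and `departments` tables and all of their values are made up for illustration; they are not the class data.

```r
library(dplyr)

# Hypothetical tables, invented for this sketch only
salaries <- tibble(
  name    = c("Ana", "Ben", "Cy", "Dee"),
  dept_id = c(1, 1, 2, 2),
  salary  = c(50000, 62000, 48000, 71000)
)
departments <- tibble(
  dept_id    = c(1, 2),
  department = c("Parks", "Police")
)

# Sort: highest salary first
sorted <- salaries %>% arrange(desc(salary))

# Filter: keep only rows above a threshold
high_earners <- salaries %>% filter(salary > 55000)

# Group and summarize: average salary and head count per department
by_dept <- salaries %>%
  group_by(dept_id) %>%
  summarize(avg_salary = mean(salary), n = n())

# Join: attach department names to the summary table
joined <- by_dept %>% left_join(departments, by = "dept_id")

print(joined)
```

Each step takes a table and returns a new table, which is why the steps chain together with the `%>%` pipe.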
- Introduction-to-R.html (explains set up and basic terminology)
- first-r-notebook.Rmd (start here to practice working in notebooks)
- analysis-in-tidyverse.Rmd (explains the main tidyverse functions we use for analysis)
- bloomington-salaries-analysis.Rmd (practice tidyverse functions, create new columns)
- tidyverse-joins.Rmd (learn how to join tables)
- wnba-exercise-answers.Rmd (code to answer the practice questions at the bottom of tidyverse-joins.Rmd)
All the data we'll use is in the `data` folder (where you should always keep data!).
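
As a rough sketch of why a dedicated data folder helps: every file can be read with a short path relative to the project. The example below builds a throwaway project folder in a temporary directory so it can run anywhere; the folder name `my-project` and the file `example.csv` are hypothetical and are not part of the class materials.

```r
library(readr)

# Stand-in for a project folder; in a real project this would be the
# folder that holds your .Rproj file
project_dir <- file.path(tempdir(), "my-project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Write a tiny made-up CSV into the data folder
csv_path <- file.path(project_dir, "data", "example.csv")
write_csv(
  data.frame(name = c("Ana", "Ben"), salary = c(50000, 62000)),
  csv_path
)

# Read it back; show_col_types = FALSE silences the column type message
example <- read_csv(csv_path, show_col_types = FALSE)
print(example)
```

Inside an open R Project, the working directory is the project folder, so the same read would typically shrink to something like `read_csv("data/example.csv")`.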