Week 3: Overview
Week 3 - Cleaning Data 
Sorry, but data cleaning comes with the territory. If you are heading towards a life where you work with data on a regular basis, this week of class may be kind of awful, but it may also be the most valuable week of your entire college career, and that's not an exaggeration. Thankfully, the tidyverse is about as painless as it gets for data cleaning, and it's what we're using. The tidyverse lets you subset variables (columns) and observations (rows) so that you can quickly examine the relationships between variables.
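To make "subsetting variables and observations" concrete, here is a minimal sketch using dplyr, one of the tidyverse packages. The `students` data and its column names are invented purely for illustration; they are not from the course materials or the assignment.

```r
library(dplyr)

# Hypothetical toy data, made up for this example only
students <- tibble(
  name  = c("Ana", "Ben", "Cy", "Dee"),
  year  = c(1, 2, 2, 3),
  hours = c(5, 12, 8, 15),
  score = c(71, 88, 80, 93)
)

# filter() keeps observations (rows); select() keeps variables (columns)
upper_years <- students %>%
  filter(year >= 2) %>%
  select(hours, score)

# With the data narrowed down, a relationship is quick to check
cor(upper_years$hours, upper_years$score)
```

The same two verbs scale to real data sets: narrow down to the rows and columns you care about first, then summarize or plot.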
Read & Watch this Week
- R4DS, Chapters 5 and 18
- Video material (a bit more than normal this week):
- Slides (that accompany the video): Data Cleaning in the Tidyverse. Even if you don't normally, you'll probably want to have the slides open as you review the videos this week, since you'll occasionally want to copy/paste code out of them.
- To bookmark for after class is over: the videos for this week of class are available on YouTube, so you can come back to re-watch them after the term is over. There are also slides and videos for earlier "workshop" versions of this material, done in R's speedy data.table package or Python's pandas instead of the tidyverse. Slides are linked in the video descriptions.
Turn In
- Week 3 Data Cleaning Assignment
- Week 3 Discussion (Ungraded)
Learning Objectives
- Learn to manipulate and clean data, both at the level of overall data structure and at the level of individual variable types like strings and dates
Guiding Questions
- How should we think about preparing data for analysis?