class: center, middle, inverse, title-slide

# Data wrangling
## Data transformation verbs
### Ben Baumer
### SDS 192
### Sep 28, 2020 (http://beanumber.github.io/sds192/lectures/mdsr_wrangling_01-dplyr.html)

---
class: center, middle, inverse

# 

---

## `dplyr` highlights

.footnote[https://r4ds.had.co.nz/transform.html]

.pull-left[
The Five Verbs:

- `select()`
- `filter()`
- `mutate()`
- `arrange()`
- `summarize()`
]

--

.pull-right[
Plus:

- `group_by()`
- `rename()`
- `inner_join()`
- `left_join()`
]

---

## Philosophy

- Each *verb* takes a data frame and returns a data frame
    - actually a `tbl_df` (more on that later)
    - allows chaining with `%>%` (more on that later)
- Idea:
    - master a few simple commands
    - use your creativity to combine them
- Cheat Sheet:
    - https://www.rstudio.com/resources/cheatsheets/

---
background-image: url("../gfx/dplyr_cheatsheet.png")
background-size: contain

---

## What is a tibble?

.footnote[https://r4ds.had.co.nz/tibbles.html]

.pull-left[
.center[]
]

.pull-right[
- object of class `tbl`
- a re-imagining of a `data.frame`
- it looks and acts like a `data.frame`
- but it's even better...
- `tidyverse` works with tibbles
]

---

## `select()`: take a subset of the **columns**

---

## `filter()`: take a subset of the **rows**

---

## `mutate()`: add or modify a **column**

---

## `arrange()`: sort the **rows**

---

## `summarize()`: collapse to **a single row**
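
---

## Sketch: the five verbs in one pipeline

A minimal sketch of how the verbs might be chained with `%>%`. It assumes the built-in `mtcars` data (not part of these slides); the object names `cars`, `result`, `overall`, and `by_cyl` and the derived column `wt_kg` are made up for illustration.

```r
library(dplyr)

# Assumption: mtcars (ships with R) stands in for any data frame.
# Move the row names into a column, then convert to a tibble.
cars <- mtcars %>%
  tibble::rownames_to_column("model") %>%
  as_tibble()

result <- cars %>%
  select(model, mpg, cyl, wt) %>%        # subset of the columns
  filter(cyl %in% c(4, 6)) %>%           # subset of the rows
  mutate(wt_kg = wt * 1000 * 0.4536) %>% # new column (wt is in 1000 lbs)
  arrange(desc(mpg))                     # sort the rows

# summarize() alone collapses to a single row
overall <- cars %>%
  summarize(n = n(), mean_mpg = mean(mpg))

# after group_by(), summarize() returns one row per group
by_cyl <- cars %>%
  group_by(cyl) %>%
  summarize(n = n(), mean_mpg = mean(mpg))
```

Each step takes a data frame and returns a data frame, which is what lets the calls be combined in any order that makes sense for the question.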