-
Notifications
You must be signed in to change notification settings - Fork 13
/
Copy pathtidyverse.Rmd
79 lines (42 loc) · 3.97 KB
/
tidyverse.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
# Data analysis with the tidyverse {#tidyverse}
```{r setup, include=FALSE}
source("knitr-options.R")
WORDS_TO_IGNORE <- c("dbplyr", "RStudio's")
# source("spelling-check.R")
```
The tidyverse is an opinionated [collection of R packages](https://www.tidyverse.org/packages/) designed for data science. All packages share an underlying design philosophy, grammar, and data structures.
For learning how to do data analysis from importing data and tidying it to analyzing it and reporting results, we will use [book R for Data Science](http://r4ds.had.co.nz/). You can find most of the exercise solutions [there](https://jrnold.github.io/r4ds-exercise-solutions/).
## Program
1. [my {R Markdown} presentation](https://privefl.github.io/R-presentation/rmarkdown.html) (also see https://r4ds.had.co.nz/r-markdown.html)
1. [my {ggplot2} presentation](https://privefl.github.io/R-presentation/ggplot2.html) + exercises from [data visualization with {ggplot2}](http://r4ds.had.co.nz/data-visualisation.html)
1. [tibbles](https://r4ds.had.co.nz/tibbles.html)
1. [data transformation with {dplyr}](http://r4ds.had.co.nz/transform.html)
1. [tidy data](http://r4ds.had.co.nz/tidy-data.html) will rationalize the concept of "tidy" data that is used in the tidyverse and that is easier to work with
1. [relational data](http://r4ds.had.co.nz/relational-data.html) will give you tools to *join* information from several datasets
1. more if time allows it (see below)
## Other chapters from this book
**The other chapters of R for Data Science book are very interesting and you should read them.**
Unfortunately, we won't have time to cover them in class. A brief introduction of what you could learn:
1. [data import](http://r4ds.had.co.nz/data-import.html) will give you tools to import data (e.g. as a replacement of `read.table`)
1. [strings](http://r4ds.had.co.nz/strings.html) will help you work with strings and regular expressions
1. [factors](http://r4ds.had.co.nz/factors.html) will help you work with factors
1. [dates and times](http://r4ds.had.co.nz/dates-and-times.html) will help you work with dates and times
1. [many models](http://r4ds.had.co.nz/many-models.html) will introduce the concept of *list-columns* that enable you to store complex objects in a structured way inside a data frame
1. databases: packages [{DBI}](https://github.com/r-dbi/DBI) and [{dbplyr}](https://dbplyr.tidyverse.org/articles/dbplyr.html) + [RStudio's webpage](https://solutions.rstudio.com/db/r-packages/dplyr/)
## Other resources
- [[IN FRENCH] introduction à R et au tidyverse](https://juba.github.io/tidyverse/index.html)
- [package {tidylog}](https://github.com/elbersb/tidylog) provides verbose feedback about {dplyr} and {tidyr} operations
- [comparing dplyr functions to their base R equivalents](https://dplyr.tidyverse.org/articles/base.html)
- [summarize and mutate multiple columns](http://dplyr.tidyverse.org/reference/summarise_all.html)
- [why use purrr::map instead of lapply?](https://stackoverflow.com/q/45101045/6103040)
- [reorder those bars](https://github.com/jtr13/codehelp/blob/master/R/reorder.md)
- [the lesser known stars of the tidyverse](https://www.rstudio.com/resources/rstudioconf-2018/the-lesser-known-stars-of-the-tidyverse/)
- [summary statistics of variables](https://github.com/ropenscilabs/skimr)
- [live data analysis by Hadley](https://youtu.be/go5Au01Jrvs)
## Other "tidy" packages
- analysis of text data: [package {tidytext}](https://github.com/juliasilge/tidytext) with [the associated book](https://www.tidytextmining.com/),
- analysis of financial data: [package {tidyquant}](https://business-science.github.io/tidyquant/index.html),
- analysis of time series data: [package {tidytime}](https://github.com/jayhesselberth/tidytime),
- a collection of packages for modeling and machine learning using tidyverse principles: [package {tidymodels}](https://www.tidymodels.org/),
- a tidy API for graph manipulation: [package {tidygraph}](https://github.com/thomasp85/tidygraph),
- many other packages..