---
output: github_document
editor_options:
  markdown:
    wrap: sentence
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>", dev = "CairoPNG",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
scholarUrl <- "https://scholar.google.com/scholar?oi=bibs&hl=en&cites=5439940108464463894"
scholarPage <- rvest::read_html(scholarUrl)
citeElem <- rvest::html_elements(scholarPage, "#gs_ab_md .gs_ab_mdw")
citations <- paste0("~", stringr::str_extract(rvest::html_text(citeElem), "[0-9]{1,3}"))
```
# microViz <a href='https://david-barnett.github.io/microViz/index.html'><img src="man/figures/logo.png" align="right" height="180" width="156"/></a>
<!-- badges: start -->
[![R-CMD-check](https://github.com/david-barnett/microViz/workflows/R-CMD-check/badge.svg)](https://github.com/david-barnett/microViz/actions)
[![codecov](https://codecov.io/gh/david-barnett/microViz/branch/main/graph/badge.svg?token=C1EoVkhnxA)](https://codecov.io/gh/david-barnett/microViz)
![GitHub R package version](https://img.shields.io/github/r-package/v/david-barnett/microViz?label=Latest)
![GitHub release (including pre-releases)](https://img.shields.io/github/v/release/david-barnett/microViz?include_prereleases&label=Release)
![Docker Image Version (latest by date)](https://img.shields.io/docker/v/barnettdavid/microviz-rocker-verse?color=blue&label=Docker)
[![microViz status badge](https://david-barnett.r-universe.dev/badges/microViz)](https://david-barnett.r-universe.dev/microViz)
[![JOSS article](https://joss.theoj.org/papers/4547b492f224a26d96938ada81fee3fa/status.svg)](https://joss.theoj.org/papers/4547b492f224a26d96938ada81fee3fa)
[![Citations](https://img.shields.io/badge/Citations-`r citations`-blueviolet)](https://scholar.google.com/scholar?hl=en&as_sdt=2005&sciodt=0,5&cites=5439940108464463894&scipsc=&q=&scisbd=1)
[![Zenodo DOI](https://zenodo.org/badge/307119750.svg)](https://zenodo.org/badge/latestdoi/307119750)
<!-- badges: end -->
## Overview
:package: `microViz` is an R package for analysis and visualization of microbiome sequencing data.
:hammer: `microViz` functions are intended to be beginner-friendly but flexible.
:microscope: `microViz` extends or complements popular microbial ecology packages, including `phyloseq`, `vegan`, & `microbiome`.
## Learn more
:paperclip: This website is the best place for documentation and examples: <https://david-barnett.github.io/microViz/>
- [**This ReadMe**](https://david-barnett.github.io/microViz/) shows a few example analyses
- **[The Getting Started guide](https://david-barnett.github.io/microViz/articles/shao19-analyses.html)** shows more example analyses and gives advice on using microViz with your own data
- **[The Reference page](https://david-barnett.github.io/microViz/reference/index.html)** lists all functions and links to help pages and examples
- **[The News page](https://david-barnett.github.io/microViz/news/index.html)** describes important changes in new microViz package versions
- **The Articles pages** give tutorials and further examples
    - [Working with phyloseq objects](https://david-barnett.github.io/microViz/articles/web-only/phyloseq.html)
    - [Fixing your taxa table with tax_fix](https://david-barnett.github.io/microViz/articles/web-only/tax-fixing.html)
    - [Creating ordination plots](https://david-barnett.github.io/microViz/articles/web-only/ordination.html) (e.g. PCA or PCoA)
    - [Interactive ordination plots with ord_explore](https://david-barnett.github.io/microViz/articles/web-only/ordination-interactive.html)
    - [Visualising taxonomic compositions with comp_barplot](https://david-barnett.github.io/microViz/articles/web-only/compositions.html)
    - [Heatmaps of microbiome composition and correlation](https://david-barnett.github.io/microViz/articles/web-only/heatmaps.html)
    - [Modelling and plotting individual taxon associations with taxatrees](https://david-barnett.github.io/microViz/articles/web-only/modelling-taxa.html)
- Post on [GitHub discussions](https://github.com/david-barnett/microViz/discussions) if you have questions/requests
## Installation
microViz is not (yet) available from CRAN.
You can install microViz from R Universe, or from GitHub.
I recommend you first install the Bioconductor dependencies using the code below.
``` r
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("phyloseq", "microbiome", "ComplexHeatmap"), update = FALSE)
```
### Installation of microViz from R Universe
``` r
install.packages(
  "microViz",
  repos = c(davidbarnett = "https://david-barnett.r-universe.dev", getOption("repos"))
)
```
I also recommend you install the following suggested CRAN packages.
``` r
install.packages("ggtext") # for rotated labels on ord_plot()
install.packages("ggraph") # for taxatree_plots()
install.packages("DT") # for tax_fix_interactive()
install.packages("corncob") # for beta binomial models in tax_model()
```
### Installation of microViz from GitHub
``` r
# Installing from GitHub requires the remotes package
install.packages("remotes")
# Windows users will also need to have RTools installed! http://jtleek.com/modules/01_DataScientistToolbox/02_10_rtools/
# To install the latest version:
remotes::install_github("david-barnett/microViz")
# To install a specific "release" version of this package, e.g. an old version
remotes::install_github("david-barnett/[email protected]")
```
### Installation notes
**:apple: macOS users** might need to install [XQuartz](https://www.xquartz.org/) to make the heatmaps work (to do this with Homebrew, run the following command in your Mac's Terminal: `brew install --cask xquartz`).
:package: I highly recommend using [renv](https://rstudio.github.io/renv/index.html) for managing your R package installations across multiple projects.
:whale: For Docker users an image with the main branch installed is available at: <https://hub.docker.com/r/barnettdavid/microviz-rocker-verse>
:date: microViz is tested to work with R version 4.\* on Windows, macOS, and Ubuntu 20.
R version 3.6.\* should probably work, but I don't fully test this.
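If you are unsure which of these cases applies to you, a minimal base R sketch like the one below compares your running R version against the series mentioned above.
The cut-offs and the `status` labels are only illustrative assumptions taken from this note, not something microViz itself checks.

``` r
# Compare the running R version against the version series mentioned in this README
r_version <- getRversion()

# The labels below are illustrative, not messages produced by microViz
status <- if (r_version >= "4.0.0" && r_version < "5.0.0") {
  "tested (R 4.*)"
} else if (r_version >= "3.6.0" && r_version < "4.0.0") {
  "probably fine, but not fully tested (R 3.6.*)"
} else {
  "outside the versions mentioned in this README"
}

message("R ", format(r_version), ": ", status)
```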
## Interactive ordination exploration
```{r}
library(microViz)
```
microViz provides a Shiny app for an easy way to start exploring your microbiome data: all you need is a phyloseq object.
```{r example ord_explore}
# example data from corncob package
pseq <- microViz::ibd %>%
  tax_fix() %>%
  phyloseq_validate()
```
``` r
ord_explore(pseq) # gif generated with microViz version 0.7.4 (plays at 1.75x speed)
```
![](vignettes/web-only/images/20210429_ord_explore_x175.gif)
## Example analyses (on HITChip data)
```{r packages, message=FALSE}
library(phyloseq)
library(dplyr)
library(ggplot2)
```
```{r data setup}
# get some example data
data("dietswap", package = "microbiome")
# create a couple of numerical variables to use as constraints or conditions
dietswap <- dietswap %>%
  ps_mutate(
    weight = recode(bmi_group, obese = 3, overweight = 2, lean = 1),
    female = if_else(sex == "female", true = 1, false = 0),
    african = if_else(nationality == "AFR", true = 1, false = 0)
  )
# add a couple of missing values to show how microViz handles missing data
sample_data(dietswap)$african[c(3, 4)] <- NA
```
### Looking at your data
You have quite a few samples in your phyloseq object, and would like to visualize their compositions.
Perhaps these example data differ by participant nationality?
```{r, bars, fig.height=5, fig.width=6, dpi=120}
dietswap %>%
  comp_barplot(
    tax_level = "Genus", n_taxa = 15, other_name = "Other",
    taxon_renamer = function(x) stringr::str_remove(x, " [ae]t rel."),
    palette = distinct_palette(n = 15, add = "grey90"),
    merge_other = FALSE, bar_outline_colour = "darkgrey"
  ) +
  coord_flip() +
  facet_wrap("nationality", nrow = 1, scales = "free") +
  labs(x = NULL, y = NULL) +
  theme(axis.text.y = element_blank(), axis.ticks.y = element_blank())
```
```{r, compheatmap, fig.width=5.5, fig.height=3.5, dpi=120}
htmp <- dietswap %>%
  ps_mutate(nationality = as.character(nationality)) %>%
  tax_transform("log2", add = 1, chain = TRUE) %>%
  comp_heatmap(
    taxa = tax_top(dietswap, n = 30), grid_col = NA, name = "Log2p",
    taxon_renamer = function(x) stringr::str_remove(x, " [ae]t rel."),
    colors = heat_palette(palette = viridis::turbo(11)),
    row_names_side = "left", row_dend_side = "right", sample_side = "bottom",
    sample_anno = sampleAnnotation(
      Nationality = anno_sample_cat(
        var = "nationality", col = c(AAM = "grey35", AFR = "grey85"),
        box_col = NA, legend_title = "Nationality", size = grid::unit(4, "mm")
      )
    )
  )
ComplexHeatmap::draw(
  object = htmp, annotation_legend_list = attr(htmp, "AnnoLegends"),
  merge_legends = TRUE
)
```
### Example ordination plot workflow
Ordination methods can also help you to visualize whether overall microbial ecosystem composition differs markedly between groups, e.g. BMI groups.
Here is one option as an example:
1. Aggregate the taxa into bacterial families (for example) - use `tax_agg()`
2. Transform the microbial data with the centered-log-ratio transformation - use `tax_transform()`
3. Perform PCA with the clr-transformed features (equivalent to Aitchison distance PCoA) - use `ord_calc()`
4. Plot the first 2 axes of this PCA ordination, colouring samples by group and adding taxon loading arrows to visualize which taxa generally differ across your samples - use `ord_plot()`
5. Customise the theme of the ggplot as you like and/or add features like ellipses or annotations
```{r ordination-plot, dpi=120}
# perform ordination
unconstrained_aitchison_pca <- dietswap %>%
  tax_agg("Family") %>%
  tax_transform("clr") %>%
  ord_calc()
# ord_calc will automatically infer you want a "PCA" here
# specify explicitly with method = "PCA", or you can pick another method
# create plot
pca_plot <- unconstrained_aitchison_pca %>%
  ord_plot(
    plot_taxa = 1:6, colour = "bmi_group", size = 1.5,
    tax_vec_length = 0.325,
    tax_lab_style = tax_lab_style(max_angle = 90, aspect_ratio = 1),
    auto_caption = 8
  )
# customise plot
customised_plot <- pca_plot +
  stat_ellipse(aes(linetype = bmi_group, colour = bmi_group), linewidth = 0.3) + # linewidth not size, since ggplot2 3.4.0
  scale_colour_brewer(palette = "Set1") +
  theme(legend.position = "bottom") +
  coord_fixed(ratio = 1, clip = "off") # makes rotated labels align correctly
# show plot
customised_plot
```
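Step 3 above states that PCA on clr-transformed features is equivalent to a PCoA on Aitchison distances.
The sketch below illustrates why, using base R only and a small invented count table with no zeros (real data usually contain zeros, which need handling before taking logs).
The `counts` matrix and the `clr()` helper are hypothetical illustrations, not microViz functions.

``` r
# Toy count table, invented for illustration: 4 samples (rows) x 3 taxa (columns)
counts <- matrix(
  c(10, 20, 70,
    30, 30, 40,
     5, 15, 80,
    25, 50, 25),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("sample", 1:4), paste0("taxon", 1:3))
)

# Centered-log-ratio: log of each value minus the mean log value of its sample
clr <- function(x) {
  logx <- log(x)
  logx - rowMeans(logx)
}
clr_counts <- clr(counts)

# Aitchison distance is the Euclidean distance between clr-transformed samples
aitchison <- dist(clr_counts, method = "euclidean")

# PCA (centred, unscaled) only rotates the clr values, so distances between
# samples in the full set of principal components match the Aitchison distances
pca <- prcomp(clr_counts, center = TRUE, scale. = FALSE)
all.equal(as.vector(dist(pca$x)), as.vector(aitchison))
```

Plotting only the first two axes, as in the ordination above, shows the best two-dimensional approximation of those distances.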
### PERMANOVA
You visualised your ordinated data in the plot above.
Below you can see how to perform a PERMANOVA to test the significance of BMI's association with overall microbial composition.
This example uses the Family-level Aitchison distance to correspond with the plot above.
```{r permanova}
# calculate distances
aitchison_dists <- dietswap %>%
  tax_transform("identity", rank = "Family") %>%
  dist_calc("aitchison")
# the more permutations you request, the longer it takes
# but also the more stable and precise your p-values become
aitchison_perm <- aitchison_dists %>%
  dist_permanova(
    seed = 1234, # for set.seed to ensure reproducibility of random process
    n_processes = 1, n_perms = 99, # you should use at least 999!
    variables = "bmi_group"
  )
# view the permanova results
perm_get(aitchison_perm) %>% as.data.frame()
# view the info stored about the distance calculation
info_get(aitchison_perm)
```
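To see why the comments above recommend at least 999 permutations, the sketch below runs a generic permutation test on simulated data in base R.
It is not PERMANOVA itself: the test statistic is a simple difference in group means, and the p-value uses the common `(b + 1) / (n_perms + 1)` convention, which I assume (but do not show here) is also what the PERMANOVA uses.
All names in the block are hypothetical.

``` r
set.seed(1234)

# Simulated data: two groups of 10 values with a shifted mean
group <- rep(c("a", "b"), each = 10)
value <- c(rnorm(10, mean = 0), rnorm(10, mean = 1))

# Test statistic: absolute difference in group means
mean_diff <- function(v, g) abs(mean(v[g == "a"]) - mean(v[g == "b"]))
observed <- mean_diff(value, group)

# Permutation p-value: shuffle the group labels and count how often the
# permuted statistic is at least as large as the observed one
perm_p <- function(n_perms) {
  permuted <- replicate(n_perms, mean_diff(value, sample(group)))
  (sum(permuted >= observed) + 1) / (n_perms + 1)
}

p_99 <- perm_p(99)
p_999 <- perm_p(999)

# The smallest p-value a permutation test can report is 1 / (n_perms + 1)
smallest_p <- 1 / (c(99, 999, 9999) + 1)
smallest_p
```

With 99 permutations the p-value can never be smaller than 0.01 and only moves in steps of 0.01, so more permutations give finer and more stable estimates at the cost of run time.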
### Constrained partial ordination
You could visualise the effect of the (numeric/logical) variables in your PERMANOVA directly, by setting constraints (and conditions) in `ord_calc()` and then plotting with `ord_plot()`.
```{r constrained-ord}
perm2 <- aitchison_dists %>%
  dist_permanova(variables = c("weight", "african", "sex"), seed = 321)
```
We'll visualise the effect of nationality and bodyweight on sample composition, after first removing the effect of sex.
```{r constrained-ord-plot, fig.height=6, fig.width=6, dpi=120}
perm2 %>%
  ord_calc(constraints = c("weight", "african"), conditions = "female") %>%
  ord_plot(
    colour = "nationality", size = 2.5, alpha = 0.35,
    auto_caption = 7,
    constraint_vec_length = 1,
    constraint_vec_style = vec_constraint(1.5, colour = "grey15"),
    constraint_lab_style = constraint_lab_style(
      max_angle = 90, size = 3, aspect_ratio = 0.8, colour = "black"
    )
  ) +
  stat_ellipse(aes(colour = nationality), linewidth = 0.2) +
  scale_color_brewer(palette = "Set1", guide = guide_legend(position = "inside")) +
  coord_fixed(ratio = 0.8, clip = "off", xlim = c(-4, 4)) +
  theme(legend.position.inside = c(0.9, 0.1), legend.background = element_rect())
```
### Correlation Heatmaps
microViz heatmaps are powered by `ComplexHeatmap` and annotated with taxa prevalence and/or abundance.
```{r, corheatmap, dpi=120, fig.width=7, fig.height=5.5}
# set up the data with numerical variables and filter to top taxa
psq <- dietswap %>%
  ps_mutate(
    weight = recode(bmi_group, obese = 3, overweight = 2, lean = 1),
    female = if_else(sex == "female", true = 1, false = 0),
    african = if_else(nationality == "AFR", true = 1, false = 0)
  ) %>%
  tax_filter(
    tax_level = "Genus", min_prevalence = 1 / 10, min_sample_abundance = 1 / 10
  ) %>%
  tax_transform("identity", rank = "Genus")
# randomly select 30 taxa from the 50 most abundant taxa (just for an example)
set.seed(123)
taxa <- sample(tax_top(psq, n = 50), size = 30)
# actually draw the heatmap
cor_heatmap(
  data = psq, taxa = taxa,
  taxon_renamer = function(x) stringr::str_remove(x, " [ae]t rel."),
  tax_anno = taxAnnotation(
    Prev. = anno_tax_prev(undetected = 50),
    Log2 = anno_tax_box(undetected = 50, trans = "log2", zero_replace = 1)
  )
)
```
## Citation
:innocent: If you find any part of microViz useful to your work, please consider citing the JOSS article:
Barnett et al., (2021).
microViz: an R package for microbiome data visualization and statistics.
Journal of Open Source Software, 6(63), 3201, <https://doi.org/10.21105/joss.03201>
## Contributing
Bug reports, questions, suggestions for new features, and other contributions are all welcome.
Feel free to create a [GitHub Issue](https://github.com/david-barnett/microViz/issues) or write on the [Discussions](https://github.com/david-barnett/microViz/discussions) page.
This project is released with a [Contributor Code of Conduct](https://david-barnett.github.io/microViz/CODE_OF_CONDUCT.html) and by participating in this project you agree to abide by its terms.
## Session info
```{r session}
sessionInfo()
```