The goal of canaper is to enable categorical analysis of neo- and
paleo-endemism (CANAPE) in R.
The stable version can be installed from CRAN:
install.packages("canaper")
The development version can be installed from r-universe or GitHub:
# r-universe
options(repos = c(
  ropensci = "https://ropensci.r-universe.dev/",
  CRAN = "https://cran.rstudio.com/"
))
install.packages("canaper", dep = TRUE)
# OR
# GitHub (requires `remotes` or `devtools`)
remotes::install_github("ropensci/canaper")
These examples use the dataset from Phylocom. The dataset includes a community (site x species) matrix and a phylogenetic tree.
library(canaper)
data(phylocom)
# Example community matrix including 4 "clumped" communities,
# one "even" community, and one "random" community
phylocom$comm
#>         sp1 sp10 sp11 sp12 sp13 sp14 sp15 sp17 sp18 sp19 sp2 sp20 sp21 sp22
#> clump1    1    0    0    0    0    0    0    0    0    0   1    0    0    0
#> clump2a   1    2    2    2    0    0    0    0    0    0   1    0    0    0
#> clump2b   1    0    0    0    0    0    0    2    2    2   1    2    0    0
#> clump4    1    1    0    0    0    0    0    2    2    0   1    0    0    0
#> even      1    0    0    0    1    0    0    1    0    0   0    0    1    0
#> random    0    0    0    1    0    4    2    3    0    0   1    0    0    1
#>         sp24 sp25 sp26 sp29 sp3 sp4 sp5 sp6 sp7 sp8 sp9
#> clump1     0    0    0    0   1   1   1   1   1   1   0
#> clump2a    0    0    0    0   1   1   0   0   0   0   2
#> clump2b    0    0    0    0   1   1   0   0   0   0   0
#> clump4     0    2    2    0   0   0   0   0   0   0   1
#> even       0    1    0    1   0   0   1   0   0   0   1
#> random     2    0    0    0   0   0   2   0   0   0   0
# Example phylogeny
phylocom$phy
#>
#> Phylogenetic tree with 32 tips and 31 internal nodes.
#>
#> Tip labels:
#> sp1, sp2, sp3, sp4, sp5, sp6, ...
#> Node labels:
#> A, B, C, D, E, F, ...
#>
#> Rooted; includes branch lengths.
The main “workhorse” function of canaper is cpr_rand_test(), which
conducts a randomization test to determine if observed values of
phylogenetic diversity (PD) and phylogenetic endemism (PE) are
significantly different from random. It also calculates the same values
on an alternative phylogeny where all branch lengths have been set equal
(alternative PD, alternative PE) as well as the ratio of the original
value to the alternative value (relative PD, relative PE).
set.seed(071421)
rand_test_results <- cpr_rand_test(
  phylocom$comm, phylocom$phy,
  null_model = "swap"
)
#> Warning: Abundance data detected. Results will be the same as if using
#> presence/absence data (no abundance weighting is used).
#> Warning: Dropping tips from the tree because they are not present in the community data:
#> sp16, sp23, sp27, sp28, sp30, sp31, sp32
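To make the alternative and relative values described above more concrete, here is a minimal base R sketch on a made-up three-tip tree. It is not how canaper computes these values internally: the pd() helper, the toy tree, and the rescaling of each tree to a total length of 1 are assumptions made only for illustration.

```r
# Toy rooted tree ((a:1,b:1):2,c:3), stored as one entry per branch.
# All names below are hypothetical and are not part of canaper.
branch_lengths <- c(1, 1, 2, 3)
# Tips that descend from each branch, in the same order as branch_lengths
branch_tips <- list("a", "b", c("a", "b"), "c")

# PD of a community: summed length of every branch leading to at least one
# species that is present, as a fraction of total tree length (assumption)
pd <- function(present, lengths, tips) {
  lengths <- lengths / sum(lengths)
  on_path <- vapply(tips, function(x) any(x %in% present), logical(1))
  sum(lengths[on_path])
}

present <- c("a", "b")

# Original tree
pd_obs <- pd(present, branch_lengths, branch_tips)
# Alternative tree: same topology, all branch lengths set equal
pd_alt <- pd(present, rep(1, length(branch_lengths)), branch_tips)
# Relative PD: ratio of the original value to the alternative value
rpd <- pd_obs / pd_alt
```

Here pd_obs is 4/7, pd_alt is 3/4, and rpd is 16/21. A ratio below 1 means the community sits on branches that are shorter than they would be if all branches were of equal length.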
cpr_rand_test() produces a lot of columns (nine per metric), so
let’s just look at a subset of them:
rand_test_results[, 1:9]
#>            pd_obs pd_rand_mean pd_rand_sd  pd_obs_z pd_obs_c_upper
#> clump1  0.3018868    0.4692453 0.03214267 -5.206739              0
#> clump2a 0.3207547    0.4762264 0.03263836 -4.763465              0
#> clump2b 0.3396226    0.4681132 0.03462444 -3.710978              0
#> clump4  0.4150943    0.4667925 0.03180131 -1.625660              3
#> even    0.5660377    0.4660377 0.03501739  2.855724            100
#> random  0.5094340    0.4733962 0.03070539  1.173662             79
#>         pd_obs_c_lower pd_obs_q pd_obs_p_upper pd_obs_p_lower
#> clump1             100      100           0.00           1.00
#> clump2a            100      100           0.00           1.00
#> clump2b            100      100           0.00           1.00
#> clump4              91      100           0.03           0.91
#> even                 0      100           1.00           0.00
#> random               6      100           0.79           0.06
This is a summary of the columns:
- *_obs: Observed value
- *_obs_c_lower: Count of times observed value was lower than random values
- *_obs_c_upper: Count of times observed value was higher than random values
- *_obs_p_lower: Proportion of times observed value was lower than random values (*_obs_c_lower divided by *_obs_q)
- *_obs_p_upper: Proportion of times observed value was higher than random values (*_obs_c_upper divided by *_obs_q)
- *_obs_q: Count of the non-NA random values used for comparison
- *_obs_z: Standardized effect size (z-score)
- *_rand_mean: Mean of the random values
- *_rand_sd: Standard deviation of the random values
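The relationships between these columns can be sketched in base R. The helper below, summarize_rand(), is hypothetical and is not part of canaper; the random values are made up, and the sketch only mirrors the column definitions listed above.

```r
# Hypothetical helper (not part of canaper): summarize one observed value
# against a vector of values obtained from randomized communities
summarize_rand <- function(obs, rand) {
  rand <- rand[!is.na(rand)]    # only non-NA random values are compared
  q <- length(rand)
  c_upper <- sum(obs > rand)    # times observed was higher than random
  c_lower <- sum(obs < rand)    # times observed was lower than random
  c(
    obs = obs,
    rand_mean = mean(rand),
    rand_sd = sd(rand),
    obs_z = (obs - mean(rand)) / sd(rand),
    obs_c_upper = c_upper,
    obs_c_lower = c_lower,
    obs_q = q,
    obs_p_upper = c_upper / q,
    obs_p_lower = c_lower / q
  )
}

set.seed(1)
rand_values <- rnorm(100, mean = 0.47, sd = 0.03) # made-up random values
res <- summarize_rand(0.415, rand_values)

# Recompute the z-score for clump4 from the (rounded) values printed above
z_clump4 <- (0.4150943 - 0.4667925) / 0.03180131
```

The recomputed z_clump4 agrees with the reported pd_obs_z of -1.625660 to about five decimal places; the small difference comes from rounding in the printed inputs. Likewise, pd_obs_p_upper for clump4 is 3 / 100 = 0.03.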
The next step in CANAPE is to classify endemism types according to the
significance of PE, alternative PE, and relative PE. This adds a column
called endem_type.
canape_results <- cpr_classify_endem(rand_test_results)
canape_results[, "endem_type", drop = FALSE]
#>              endem_type
#> clump1  not significant
#> clump2a not significant
#> clump2b not significant
#> clump4  not significant
#> even              mixed
#> random            mixed
This dataset is very small, so it doesn’t include all possible endemism types. The full set of types is:
- paleo: paleoendemic
- neo: neoendemic
- not significant (what it says)
- mixed: mixture of both paleo and neo
- super: mixed and highly significant (p < 0.01)
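The decision rules behind these types can be sketched in base R. This is an illustration based on the description of CANAPE in Mishler et al. (2014) as I understand it, not the code of cpr_classify_endem(). The function name, the argument names, and the exact cutoffs (one-tailed 0.05 for PE and alternative PE, two-tailed 0.05 for relative PE, 0.01 for super) are assumptions.

```r
# Hypothetical sketch (not canaper's implementation). Each argument is a
# vector of proportions in the sense of the *_obs_p_upper and *_obs_p_lower
# columns: the share of random values the observed value was higher
# (or lower) than.
classify_endem_sketch <- function(pe_p_upper, pe_alt_p_upper,
                                  rpe_p_upper, rpe_p_lower) {
  # Step 1: a site is considered only if PE or alternative PE is
  # significantly high (one-tailed test)
  signif <- pe_p_upper > 0.95 | pe_alt_p_upper > 0.95
  type <- rep("not significant", length(signif))
  type[signif] <- "mixed"
  # Both PE and alternative PE highly significant: "super"
  type[signif & pe_p_upper > 0.99 & pe_alt_p_upper > 0.99] <- "super"
  # Step 2: relative PE significantly high or low (two-tailed test)
  # overrides "mixed" and "super"
  type[signif & rpe_p_upper > 0.975] <- "paleo"
  type[signif & rpe_p_lower > 0.975] <- "neo"
  type
}

# Made-up inputs, one element per site
endem <- classify_endem_sketch(
  pe_p_upper     = c(0.50, 0.96, 1.00, 0.97, 0.97),
  pe_alt_p_upper = c(0.50, 0.50, 1.00, 0.50, 0.96),
  rpe_p_upper    = c(0.50, 0.50, 0.50, 0.99, 0.01),
  rpe_p_lower    = c(0.50, 0.50, 0.50, 0.01, 0.99)
)
```

With these made-up inputs, endem is "not significant", "mixed", "super", "paleo", "neo", in that order.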
For a more complete example, please see the vignette.
Several other R packages are available to calculate diversity metrics
for ecological communities. The non-exhaustive summary below focuses on
alpha diversity metrics in comparison with canaper, and is not a
comprehensive description of each package.
- PhyloMeasures: Calculates phylogenetic community diversity metrics including MPD, MNTD, PD, phylosor, and unifrac. Null models for matrix randomization include uniform, frequency.by.richness, and sequential.
- phyloregion: Calculates PD but not MPD or MNTD. Implements sparse matrix encoding to increase computing speed, which is used by canaper. Null models for matrix randomization include tipshuffle, rowwise, and colwise. Also performs regionalization based on taxonomic or phylogenetic beta diversity.
- picante: Calculates MPD, MNTD, PD, etc. Null models for community matrix randomization include frequency, richness, independentswap, and trialswap.
- vegan: Performs a large range of mostly non-phylogenetic diversity analyses. Includes the largest selection of null models (> 20), according to data type (binary vs. quantitative). canaper uses vegan to randomize community matrices.
- biodiverse: Not an R package, but software written in Perl with a GUI. Performs all of the calculations needed for CANAPE, and many other metrics (> 300). Includes the rand_structured null model as well as spatially structured null models. As far as we know, none of these null models are currently available in any R package, except for independentswap.
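Swap-type null models such as independentswap shuffle a presence/absence matrix while keeping every row and column total fixed. The base R sketch below illustrates that idea only; swap_once() is a hypothetical helper written for this example, and it is not the algorithm as implemented in vegan or picante.

```r
# Hypothetical helper: attempt one "checkerboard" swap on a 0/1 matrix.
# Two rows and two columns are drawn at random; if the 2 x 2 submatrix has
# 1s on one diagonal and 0s on the other, the diagonals are exchanged.
swap_once <- function(m) {
  rows <- sample(nrow(m), 2)
  cols <- sample(ncol(m), 2)
  sub <- m[rows, cols]
  is_checkerboard <- sub[1, 1] == sub[2, 2] &&
    sub[1, 2] == sub[2, 1] &&
    sub[1, 1] != sub[1, 2]
  if (is_checkerboard) {
    m[rows, cols] <- 1 - sub # flip 0s and 1s within the submatrix
  }
  m
}

set.seed(42)
# Made-up presence/absence community matrix (5 sites x 6 species)
comm <- matrix(rbinom(30, 1, 0.5), nrow = 5)

# Apply many swap attempts to obtain one randomized matrix
rand_comm <- comm
for (i in seq_len(1000)) {
  rand_comm <- swap_once(rand_comm)
}

# Site richness and species occurrence counts are unchanged
all(rowSums(rand_comm) == rowSums(comm))
all(colSums(rand_comm) == colSums(comm))
```

Because each accepted swap leaves one presence in each of the two rows and each of the two columns involved, the marginal totals never change, whatever the number of attempts.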
Poster at Botany 2021
If you use this package, please cite it! Here is an example:
- Nitta JH, Laffan SW, Mishler BD, Iwasaki W. (2021) canaper: Categorical analysis of neo- and paleo-endemism in R. doi: 10.5281/zenodo.5094032
The example DOI above is for the overall package.
Here is the latest DOI, which you should use if you are using the latest version of the package:
You can find DOIs for older versions by viewing the “Releases” menu on the right.
- van Galen et al. 2023. “Correlated evolution in an ectomycorrhizal host-symbiont system”. New Phytologist https://doi.org/10.1111/nph.18802
- Naranjo et al. 2023. “Ancestral area analyses reveal Pleistocene-influenced evolution in a clade of coastal plain endemic plants”. Journal of Biogeography 50, 393-405 https://doi.org/10.1111/jbi.14541
- Ellepola et al. 2022. “The role of climate and islands in species diversification and reproductive-mode evolution of Old World tree frogs”. Communications Biology 5, 347 https://doi.org/10.1038/s42003-022-03292-1
- Lu et al. 2022. “A comprehensive evaluation of flowering plant diversity and conservation priority for national park planning in China”. Fundamental Research https://doi.org/10.1016/j.fmre.2022.08.008
- Nitta et al. 2022. “Spatial phylogenetics of Japanese ferns: Patterns, processes, and implications for conservation”. American Journal of Botany 109, 727-745 https://doi.org/10.1002/ajb2.1848
Contributions to canaper are welcome! For more information, please see
CONTRIBUTING.md.
Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
roxyglobals is used to maintain R/globals.R, but is not available on CRAN. You
will need to install this package from GitHub and use the @autoglobal or
@global roxygen tags to develop functions with globals.
- Code: MIT
- Example datasets
  - acacia, biod_example: GNU General Public License v3.0
  - phylocom: BSD-3-Clause
Mishler, B., Knerr, N., González-Orozco, C. et al. Phylogenetic measures of biodiversity and neo- and paleo-endemism in Australian Acacia. Nat Commun 5, 4473 (2014). https://doi.org/10.1038/ncomms5473