# Packages used {#sec-annex-a}
```{r}
#| label: setup
#| results: hold
#| include: false
base::source(file = paste0(here::here(), "/R/helper.R"))
```
## autoplotly {#sec-autoplotly}
::::: my-package
::: my-package-header
Package Profile: autoplotly
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Automatic Generation of Interactive Visualizations for Statistical Results](https://github.com/terrytangyuan/autoplotly) [@autoplotly; @ggfortify]</center>
------------------------------------------------------------------------
Functionalities to automatically generate interactive visualizations for statistical results supported by {**ggfortify**}, such as time series, PCA, clustering and survival analysis, with [plotly.js](https://plotly.com/) and {**ggplot2**} style. The generated visualizations can also be easily extended using {**ggplot2**} and {**plotly**} syntax while staying interactive.
:::
:::::
## BSDA {#sec-BSDA}
::::: my-package
::: my-package-header
Package Profile: BSDA
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Basic Statistics and Data Analysis](https://alanarnholt.github.io/BSDA) [@BSDA]</center>
------------------------------------------------------------------------
Data sets for the book "Basic Statistics and Data Analysis" (BSDA) by Larry J. Kitchens [@kitchens2002].
:::
:::::
## broom {#sec-broom}
:::::: my-package
::: my-package-header
Package Profile: broom
:::
:::: my-package-container
<center>[Convert Statistical Objects into Tidy Tibbles](https://broom.tidymodels.org/) [@broom]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap09/logoi/logo-broom-min.png){width="176"}
Summarizes key information about statistical objects in tidy tibbles. This makes it easy to report results, create plots and consistently work with large numbers of models at once.
:::
------------------------------------------------------------------------
{**broom**} provides three verbs to make it convenient to interact with model objects:
- `tidy()` summarizes information about model components
- `glance()` reports information about the entire model
- `augment()` adds information about observations to a dataset
For a detailed introduction, please see [Introduction to broom](https://broom.tidymodels.org/articles/broom.html).
{**broom**} tidies 100+ models from popular modelling packages and almost all of the model objects in the stats package that comes with base R.
The vignette [Available methods](https://broom.tidymodels.org/articles/available-methods.html) lists the available methods.
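The three verbs can be tried on a small linear model. This is only a minimal sketch: the model formula and the use of the built-in `mtcars` data are my own choices for illustration, not taken from the package documentation.

```r
# Fit a simple linear model on a built-in dataset (illustrative choice)
fit <- stats::lm(mpg ~ wt + hp, data = mtcars)

# tidy(): one row per model term (estimate, std.error, statistic, p.value)
coef_tbl <- broom::tidy(fit)

# glance(): a single row of model-level statistics (r.squared, AIC, ...)
model_tbl <- broom::glance(fit)

# augment(): the model data plus per-observation columns (.fitted, .resid, ...)
obs_tbl <- broom::augment(fit)
```

All three results are tibbles, so they can be passed on directly to {**dplyr**} or {**ggplot2**}.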
::::
::::::
## car {#sec-car}
:::::: my-package
::: my-package-header
Package Profile: car
:::
:::: my-package-container
<center>[Companion to Applied Regression](https://www.john-fox.ca/Companion/index.html) [@car]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap06/logoi/logo-car-min.png){width="176"}
Functions to Accompany J. Fox and S. Weisberg, An R Companion to Applied Regression, Third Edition, Sage, 2019. [@fox2018]
:::
------------------------------------------------------------------------
An R Companion to Applied Regression is a broad introduction to the R statistical computing environment in the context of applied regression analysis. The book provides a step-by-step guide to using the free statistical software R, and emphasizes integrating statistical computing in R with the practice of data analysis. The R packages car and effects, written to facilitate the application and interpretation of regression analysis, are extensively covered in the book.
::::
::::::
## colorblindcheck {#sec-colorblindcheck}
::::: my-package
::: my-package-header
Package Profile: colorblindcheck
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Check Color Palettes for Problems with Color Vision Deficiency (CVD)](https://jakubnowosad.com/colorblindcheck/) [@colorblindcheck]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**colorblindcheck**}.*)
Compare color palettes with simulations of color vision deficiencies - deuteranopia, protanopia, and tritanopia. It includes calculation of distances between colors, and creating summaries of differences between a color palette and simulations of color vision deficiencies.
Deciding whether a color palette is colorblind friendly is a hard task. It cannot be done in an entirely automatic fashion, as the decision needs to be confirmed by visual judgment. The goal of {**colorblindcheck**} is to provide tools to decide if the selected color palette is colorblind friendly, including:
- `palette_dist()` - Calculation of the distances between the colors in the input palette and between the colors in simulations of the color vision deficiencies: deuteranopia, protanopia, and tritanopia.
- `palette_plot()` - Plotting of the original input palette and simulations of color vision deficiencies: deuteranopia, protanopia, and tritanopia.
- `palette_check()` - Creating summary statistics comparing the original input palette and simulations of color vision deficiencies: deuteranopia, protanopia, and tritanopia.
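A minimal sketch of how the three functions might be called. The palette (seven colors from `grDevices::rainbow()`) is an arbitrary example, and I assume that `cvd = "deu"` selects deuteranopia, as I remember it from the package README.

```r
# Example palette: seven colors from base R's rainbow() (arbitrary choice)
pal <- grDevices::rainbow(n = 7)

# Pairwise color distances as perceived with deuteranopia
dist_deu <- colorblindcheck::palette_dist(pal, cvd = "deu")

# Summary table comparing normal vision with the simulated deficiencies
check <- colorblindcheck::palette_check(pal)

# Visual comparison of the original palette and the simulations
colorblindcheck::palette_plot(pal)
```

The summary table only supports the decision; as stated above, the plot still has to be judged visually.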
:::
:::::
## colorblindr {#sec-colorblindr}
::::: my-package
::: my-package-header
Package Profile: colorblindr
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Simulate colorblindness in R figures](https://github.com/clauswilke/colorblindr) [@colorblindr]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**colorblindr**}.*)
Provides a variety of functions that are helpful to simulate the effects of colorblindness in R figures. Complete figures can be modified to simulate the effects of various types of colorblindness. The resulting figures are standard grid objects and can be further manipulated or outputted as usual.
:::
:::::
## colorspace {#sec-colorspace}
:::::::: my-package
::: my-package-header
Package Profile: colorspace
:::
:::::: my-package-container
------------------------------------------------------------------------
<center>[A Toolbox for Manipulating and Assessing Colors and Palettes](https://main_package_URL) [colorspace](#sec-colorspace)</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**colorspace**}.*)
The colorspace package provides a broad toolbox for selecting individual colors or color palettes, manipulating these colors, and employing them in various kinds of visualizations.
At the core of the package there are various utilities for computing with color spaces (as the name of the package conveys). Thus, the package helps to map various three-dimensional representations of color to each other. A particularly important mapping is the one from the perceptually-based and device-independent color model HCL (Hue-Chroma-Luminance) to standard Red-Green-Blue (sRGB), which is the basis for color specifications in many systems based on the corresponding hex codes (e.g., in HTML but also in R). For completeness, further standard color models are included in the package as well: `polarLUV()` (= HCL), `LUV()`, `polarLAB()`, `LAB()`, `XYZ()`, `RGB()`, `sRGB()`, `HLS()`, `HSV()`.
The HCL space (= polar coordinates in CIELUV) is particularly useful for specifying individual colors and color palettes as its three axes match those of the human visual system very well: Hue (= type of color, dominant wavelength), chroma (= colorfulness), luminance (= brightness).
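The mapping from HCL to sRGB hex codes can be sketched as follows. The coordinate values and the palette name "Blues 3" are arbitrary examples chosen by me.

```r
# One color given in polar CIELUV (= HCL) coordinates (illustrative values)
green_hcl <- colorspace::polarLUV(L = 60, C = 40, H = 120)

# Coerce to sRGB and inspect the numeric coordinates
green_srgb <- methods::as(green_hcl, "sRGB")
rgb_coords <- colorspace::coords(green_srgb)

# Hex code as used in R and HTML; fixup = TRUE corrects out-of-gamut values
green_hex <- colorspace::hex(green_hcl, fixup = TRUE)

# A ready-made sequential palette constructed in HCL space
blues <- colorspace::sequential_hcl(n = 5, palette = "Blues 3")
```

`colorspace::swatchplot(blues)` should display the resulting palette, if a graphics device is available.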
There is extensive documentation available. See also the website on [HCL Color Space](https://hclwizard.org/):
> The hclwizard provides tools for manipulating and assessing colors and palettes based on the underlying colorspace software (available in R and Python). It leverages the HCL color space: a color model that is based on human color perception and thus makes it easy to choose good color palettes by varying three color properties: Hue (= type of color, dominant wavelength) - Chroma (= colorfulness) - Luminance (= brightness). As shown in the color swatches below each property can be varied while keeping the other two properties fixed.
::::: my-remark
::: my-remark-header
{colorspace}: My personal evaluation
:::
::: my-remark-container
This toolbox package is very important: all the other color palette packages use {**colorspace**} as the basis for their functionality.
:::
:::::
::::::
::::::::
## cowplot {#sec-cowplot}
:::::: my-package
::: my-package-header
Package Profile: cowplot
:::
:::: my-package-container
<center>[Streamlined Plot Theme and Plot Annotations for {**ggplot2**}](https://wilkelab.org/cowplot/) [@cowplot]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap03/logoi/logo-cowplot-min.png){width="176"}
The {**cowplot**} package provides various features that help with creating publication-quality figures, such as a set of themes, functions to align plots and arrange them into complex compound figures, and functions that make it easy to annotate plots and/or mix plots with images. The package was originally written for internal use in the Wilke lab, hence the name (Claus O. Wilke’s plot package). It has also been used extensively in the book [Fundamentals of Data Visualization](https://www.amazon.com/gp/product/1492031089).
:::
------------------------------------------------------------------------
There are several packages that can be used to align plots. The most widely used ones besides {**cowplot**} are {**egg**} and {**patchwork**} (see @sec-patchwork). All these packages use slightly different approaches to plot alignment, and the respective approaches have different strengths and weaknesses. If you cannot achieve your desired result with one of these packages, try another one.
Most importantly, while {**egg**} and {**patchwork**} align and arrange plots at the same time, {**cowplot**} aligns plots independently of how they are arranged. This makes it possible to align plots and then reproduce them separately, or even overlay them on top of each other.
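The separation of aligning and arranging can be sketched like this. The two plots are made-up examples with axis labels of different widths, and {**ggplot2**} is attached only to keep the plotting code short.

```r
library(ggplot2)

# Two example plots whose y-axis labels differ in width
p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(x = hp, y = qsec * 1000)) + geom_point()

# Step 1: align the plots vertically along their left axis.
# The result is a list of aligned plot objects that are not yet arranged.
aligned <- cowplot::align_plots(p1, p2, align = "v", axis = "l")

# Step 2, option A: draw one of the aligned plots on its own
first_only <- cowplot::ggdraw(aligned[[1]])

# Step 2, option B: arrange the aligned plots in one column
stacked <- cowplot::plot_grid(aligned[[1]], aligned[[2]], ncol = 1)
```

Because alignment happens in step 1, both options in step 2 use identical panel positions.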
The {**cowplot**} package now provides a set of complementary themes with different features. I now believe that there isn’t one single theme that works for all figures, and therefore I recommend that you always explicitly set a theme for every plot you make.
::::
::::::
## cranlogs {#sec-cranlogs}
:::::: my-package
::: my-package-header
Package Profile: cranlogs
:::
:::: my-package-container
<center>[Download Logs from RStudio CRAN Mirror](https://r-hub.github.io/cranlogs) [@cranlogs]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap05/logoi/logo-cranlogs-min.png){width="176"}
`r glossary("APIx", "API")` to the database of `r glossary("CRAN")` package downloads from the RStudio CRAN mirror. The database itself is at <http://cranlogs.r-pkg.org>, see <https://github.com/r-hub/cranlogs.app> for the raw API.
:::
------------------------------------------------------------------------
RStudio publishes the download logs from their CRAN package mirror daily at <http://cran-logs.rstudio.com>.
This R package queries a web API maintained by R-hub that contains the daily download numbers for each package.
The RStudio CRAN mirror is not the only CRAN mirror, but it’s a popular one: it’s the default choice for RStudio users. The actual number of downloads over all CRAN mirrors is unknown.
::::
::::::
## crosstable {#sec-crosstable}
::::::::: my-package
::: my-package-header
Package Profile: crosstable
:::
::::::: my-package-container
<center>[Crosstables for Descriptive Analysis](https://danchaltiel.github.io/crosstable) [@crosstable]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap05/logoi/logo-crosstable-min.png){width="176"}
{**crosstable**} is a package centered on a single function, `crosstable()`, which easily computes descriptive statistics on datasets. It can use the {**tidyverse**} syntax and is interfaced with the package {**officer**} to create automated reports.
:::
------------------------------------------------------------------------
Create descriptive tables for continuous and categorical variables. Apply summary statistics and counting functions, with or without a grouping variable, and create beautiful reports using {**rmarkdown**} or {**officer**}. You can also compute effect sizes and statistical tests if needed.
::::: my-remark
::: my-remark-header
{crosstable}: Personal evaluation
:::
::: my-remark-container
I believe that the main use of this package is to prepare ready-to-print tables. Similar to {**gtsummary**} (see @sec-gtsummary), it provides some descriptive statistics with many display options. But I got the impression that data analysis is not the main purpose of these packages.
For instance, `crosstable::display_test(chisq.test(x, y))` returns a string such as "p value: \<0.0001 \n(Pearson's Chi-squared test)". This is nice to include in a table, but for the analysis one would also need the values of the different cells.
:::
:::::
:::::::
:::::::::
## curl {#sec-curl}
::::: my-package
::: my-package-header
Package Profile: curl
:::
::: my-package-container
------------------------------------------------------------------------
<center>[A Modern and Flexible Web Client for R](https://cran.r-project.org/web/packages/curl/vignettes/intro.html) [@curl]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**curl**}.*)
The `curl()` and `curl_download()` functions provide highly configurable drop-in replacements for base `url()` and `download.file()` with better performance, support for encryption (https, ftps), gzip compression, authentication, and other 'libcurl' goodies.
The core of the package implements a framework for performing fully customized requests where data can be processed either in memory, on disk, or streaming via the callback or connection interfaces. Some knowledge of 'libcurl' is recommended; for a more user-friendly web client see the {**httr**} package, which builds on this package with HTTP-specific tools and logic.
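The three ways of processing data (on disk, via a connection, in memory) can be sketched as follows. To stay independent of network access, the sketch reads a temporary local file through a `file://` URL; this assumes a Unix-like path and a 'libcurl' build with the file protocol enabled. With a real `https://` URL the calls would look the same.

```r
# Create a small local file and address it with a file:// URL
# (assumption: Unix-like absolute path, file protocol enabled in libcurl)
src <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), src)
src_url <- paste0("file://", src)

# On disk: replacement for download.file()
dest <- tempfile(fileext = ".txt")
curl::curl_download(url = src_url, destfile = dest)

# Connection interface: replacement for url(), consumed by readLines()
con <- curl::curl(src_url)
lines_streamed <- readLines(con)
close(con)

# In memory: the response body arrives as a raw vector
res <- curl::curl_fetch_memory(src_url)
body_text <- rawToChar(res$content)
```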
:::
:::::
## data.table {#sec-data-table}
::::::::: my-package
::: my-package-header
Package Profile: data.table
:::
::::::: my-package-container
<center>[Extension of `data.frame`](https://rdatatable.gitlab.io/data.table/) [@datatable]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap01/logoi/logo-data.table-min.png){width="176"}
{**data.table**} provides a high-performance version of base R’s data.frame with syntax and feature enhancements for ease of use, convenience and programming speed.\
\
Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.
:::
------------------------------------------------------------------------
**Features**
- fast and friendly delimited file reader: `data.table::fread()`, see also convenience features for small data
- fast and feature rich delimited file writer: `data.table::fwrite()`
- low-level parallelism: many common operations are internally parallelized to use multiple CPU threads
- fast and scalable aggregations; e.g. 100GB in RAM (see benchmarks on up to two billion rows)
- fast and feature rich joins: ordered joins (e.g. rolling forwards, backwards, nearest and limited staleness), overlapping range joins (similar to IRanges::findOverlaps), non-equi joins (i.e. joins using operators \>, \>=, \<, \<=), aggregate on join (by=.EACHI), update on join
- fast add/update/delete columns by reference by group using no copies at all
- fast and feature rich reshaping data: `data.table::dcast()` (pivot/wider/spread) and `data.table::melt()` (unpivot/longer/gather)
- any R function from any R package can be used in queries not just the subset of functions made available by a database backend, also columns of type list are supported
- has no dependencies at all other than base R itself, for simpler production/maintenance
- the R dependency is as old as possible for as long as possible, dated April 2014, and we continuously test against that version; e.g. v1.11.0 released on 5 May 2018 bumped the dependency up from 5 year old R 3.0.0 to 4 year old R 3.1.0
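A few of these features in a minimal sketch. The data are invented, and I attach the package with `library()` here (instead of the `package::function()` style used elsewhere in this book) so that `:=`, `.()` and `.N` are reliably available.

```r
library(data.table)

# fread() also parses literal text, which keeps this sketch self-contained
dt <- fread(text = "group,value\na,1\na,3\nb,10\nb,20\n")

# Aggregation by group, using the general form dt[i, j, by]
agg <- dt[, .(mean_value = mean(value), n = .N), by = group]

# Add columns by reference, by group (dt itself is modified, no copy is made)
dt[, centered := value - mean(value), by = group]
dt[, id := seq_len(.N), by = group]

# Reshape: long to wide with dcast(), wide to long with melt()
wide <- dcast(dt, id ~ group, value.var = "value")
long <- melt(wide, id.vars = "id", variable.name = "group", value.name = "value")
```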
::::: my-remark
::: my-remark-header
{**data.table**}: Personal evaluation
:::
::: my-remark-container
I believe the most important application of {**data.table**} is working with huge amounts of data (several GB). The book SwR uses it in the first chapter with the `data.table::fread()` function. There I used `readr::read_csv()` from the {**tidyverse**} collection instead, because the dataset is very small (29 kB).
{**DT**} is a similarly named package that seems important. It is a wrapper of the JavaScript library 'DataTables' (see @sec-DT). I was already using {**DT**} to display interactive tables on websites, but at the time I didn't completely understand the difference between {**data.table**} and {**DT**}. As far as I understand it now, the differences are:
- {**data.table**}: A package for efficient data manipulation and analysis, focusing on speed, memory efficiency, and flexibility. It provides a powerful data structure for handling large datasets.
- {**DT**} (datatable): A package for rendering R data frames as interactive HTML tables, focusing on visualization and user interaction. It provides a simple way to create web-based tables with filtering, sorting, and editing capabilities.
:::
:::::
:::::::
:::::::::
## datawizard {#sec-datawizard}
:::::: my-package
::: my-package-header
Package Profile: datawizard
:::
:::: my-package-container
<center>[Easy Data Wrangling and Statistical Transformations](https://easystats.github.io/datawizard/) [@datawizard]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap06/logoi/logo-datawizard-min.png){width="176"}
{**datawizard**} is a lightweight package to easily manipulate, clean, transform, and prepare your data for analysis. It is part of the {**easystats**} ecosystem, a suite of R packages to deal with your entire statistical analysis, from cleaning the data to reporting the results.
:::
------------------------------------------------------------------------
{**datawizard**} covers two aspects of data preparation:
- **Data manipulation**: {**datawizard**} offers a very similar set of functions to that of the tidyverse packages, such as {**dplyr**} and {**tidyr**}, to select, filter and reshape data, with a few key differences.
1) All data manipulation functions start with the prefix `data_*` (which makes them easy to identify).
2) Although most functions can be used exactly as their tidyverse equivalents, they are also string-friendly (which makes them easy to program with and use inside functions).
3) Finally, datawizard is super lightweight (no dependencies, similar to {**poorman**}), which makes it awesome for developers to use in their packages.
- **Statistical transformations**: {**datawizard**} also has powerful functions to easily apply common data transformations, including standardization, normalization, rescaling, rank-transformation, scale reversing, recoding, binning, etc.
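Both aspects in a minimal sketch on the built-in `mtcars` data. The selected columns and the target range are arbitrary examples.

```r
# Data manipulation: data_*() functions, here with column names given as strings
small_cars <- datawizard::data_filter(mtcars, cyl == 4)
two_cols <- datawizard::data_select(mtcars, select = c("mpg", "wt"))

# Statistical transformations on a numeric vector
mpg_z <- datawizard::standardize(mtcars$mpg)                # mean 0, SD 1
mpg_01 <- datawizard::normalize(mtcars$mpg)                 # scaled to [0, 1]
mpg_pct <- datawizard::rescale(mtcars$mpg, to = c(0, 100))  # scaled to [0, 100]
```

Because `select` accepts a character vector, the column names can come from a variable, which is what makes these functions easy to use inside other functions.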
::::
::::::
## descr {#sec-descr}
::::: my-package
::: my-package-header
Package Profile: descr
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Descriptive Statistics](https://github.com/jalvesaq/descr) [@descr]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**descr**}.*)
Weighted frequency and contingency tables of categorical variables and of the comparison of the mean value of a numerical variable by the levels of a factor, and methods to produce xtable objects of the tables and to plot them. There are also functions to facilitate the character encoding conversion of objects, to quickly convert fixed width files into csv ones, and to export a data.frame to a text file with the necessary R and SPSS codes to reread the data. [@descr]
:::
:::::
## DescTools {#sec-DescTools}
::::: my-package
::: my-package-header
Package Profile: DescTools
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Tools for Descriptive Statistics](https://andrisignorell.github.io/DescTools/) [@DescTools]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**DescTools**}.*)
A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox, which facilitates the (notoriously time consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results.
Furthermore, the package contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.
:::
:::::
## dfidx {#sec-dfidx}
::::: my-package
::: my-package-header
Package Profile: dfidx
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Indexed Data Frames](https://cran.r-project.org/package=dfidx) [@dfidx]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**dfidx**}.*)
Provides extended data frames, with a special data frame column which contains two indexes, with potentially a nesting structure.
:::
:::::
## dichromat {#sec-dichromat}
::::: my-package
::: my-package-header
Package Profile: dichromat
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Color Schemes for Dichromats](https://cran.r-project.org/package=dichromat) [@dichromat]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**dichromat**}.*)
Collapse red-green or green-blue distinctions to simulate the effects of different types of color-blindness.
:::
:::::
## dplyr {#sec-dplyr}
:::::: my-package
::: my-package-header
Package Profile: dplyr
:::
:::: my-package-container
<center>[A Grammar of Data Manipulation](https://dplyr.tidyverse.org/) [@dplyr]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap01/logoi/logo-dplyr-min.png){width="176"}
{**dplyr**} is a grammar of data manipulation, providing a consistent set of verbs that help you solve the most common data manipulation challenges: `mutate()` adds new variables that are functions of existing variables; `select()` picks variables based on their names; `filter()` picks cases based on their values; `summarise()` reduces multiple values down to a single summary; `arrange()` changes the ordering of the rows.
:::
------------------------------------------------------------------------
These all combine naturally with `group_by()` which allows you to perform any operation “by group”. You can learn more about them in [vignette("dplyr")](https://dplyr.tidyverse.org/articles/dplyr.html). As well as these single-table verbs, dplyr also provides a variety of two-table verbs, which you can learn about in [vignette("two-table")](https://dplyr.tidyverse.org/articles/two-table.html). [@dplyr]
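How the single-table verbs combine with `group_by()` can be sketched with the built-in `mtcars` data. The chosen filter and the derived variable are arbitrary examples, and the native pipe `|>` requires R 4.1 or later.

```r
summary_tbl <- mtcars |>
  dplyr::filter(cyl != 6) |>                     # keep 4- and 8-cylinder cars
  dplyr::select(cyl, mpg, wt) |>                 # pick variables by name
  dplyr::mutate(wt_kg = wt * 1000 * 0.4536) |>   # wt is given in 1000 lbs
  dplyr::group_by(cyl) |>                        # following verbs work by group
  dplyr::summarise(
    n = dplyr::n(),
    mean_mpg = mean(mpg),
    mean_wt_kg = mean(wt_kg)
  ) |>
  dplyr::arrange(dplyr::desc(mean_mpg))          # best mileage first
```

The result has one row per cylinder group, with the 4-cylinder cars first.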
::::
::::::
## DT {#sec-DT}
::::: my-package
::: my-package-header
Package Profile: DT
:::
::: my-package-container
------------------------------------------------------------------------
<center>[A Wrapper of the JavaScript Library 'DataTables'](https://rstudio.github.io/DT/) [@DT]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**DT**}.*)
Data objects in R can be rendered as HTML tables using the JavaScript library [DataTables](https://datatables.net/) (typically via {**R Markdown**} or {**Shiny**}). The 'DataTables' library has been included in this R package. The package name {**DT**} is an abbreviation of 'DataTables'.
:::
:::::
## dunn.test {#sec-dunn.test}
::::: my-package
::: my-package-header
Package Profile: dunn.test
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Dunn's Test of Multiple Comparisons Using Rank Sums](https://cran.r-project.org/package=dunn.test) [@dunn.test]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**dunn.test**}.*)
Computes Dunn's test [@dunn1964] for stochastic dominance and reports the results among multiple pairwise comparisons after a Kruskal-Wallis test for stochastic dominance among k groups [@kruskal1952]. The interpretation of stochastic dominance requires an assumption that the CDF of one group does not cross the CDF of the other.
{**dunn.test**} makes k(k-1)/2 multiple pairwise comparisons based on Dunn's z-test-statistic approximations to the actual rank statistics. The null hypothesis for each pairwise comparison is that the probability of observing a randomly selected value from the first group that is larger than a randomly selected value from the second group equals one half; this null hypothesis corresponds to that of the `r glossary("Mann-Whitney", "Wilcoxon-Mann-Whitney rank-sum test")`. Like the rank-sum test, if the data can be assumed to be continuous, and the distributions are assumed identical except for a difference in location, Dunn's test may be understood as a test for median difference. {**dunn.test**} accounts for tied ranks.
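The number of pairwise comparisons can be checked with a small sketch. It uses the built-in `airquality` data (ozone by month, k = 5 groups) and a Bonferroni adjustment; both are my choices for illustration.

```r
# k groups lead to k * (k - 1) / 2 pairwise comparisons
months <- sort(unique(airquality$Month))   # 5 months in the built-in dataset
k <- length(months)
n_pairs <- k * (k - 1) / 2
pairs <- utils::combn(months, 2)           # one column per pair of groups

# Dunn's test on ozone by month, with Bonferroni-adjusted p-values
res <- dunn.test::dunn.test(
  x = airquality$Ozone,
  g = airquality$Month,
  method = "bonferroni"
)
```

The returned list should contain one z statistic and one adjusted p-value per comparison, i.e. `n_pairs` values each.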
:::
:::::
## e1071 {#sec-e1071}
::::: my-package
::: my-package-header
Package Profile: e1071
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Misc. functions](https://cran.r-project.org/web/packages/e1071/index.html) [@e1071]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**e1071**}.*)
Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, generalized k-nearest neighbor ...
:::
:::::
## effectsize {#sec-effectsize}
:::::: my-package
::: my-package-header
Package Profile: effectsize
:::
:::: my-package-container
<center>[Indices of Effect Size](https://easystats.github.io/effectsize/) [@effectsize]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap06/logoi/logo-effectsize-min.png){width="176"}
The goal of this package is to provide utilities to work with indices of effect size and standardized parameters, allowing computation and conversion of indices such as Cohen’s d, r, odds-ratios, etc.
:::
------------------------------------------------------------------------
Provide utilities to work with indices of effect size for a wide variety of models and hypothesis tests (see list of supported models using the function 'insight::supported_models()'), allowing computation of and conversion between indices such as Cohen's d, r, odds, etc.
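
As a minimal sketch of "computation and conversion", the following computes Cohen's d and converts it to a correlation-type index. The built-in `mtcars` data and the grouping by `am` are assumptions chosen only for illustration.

```r
library(effectsize)

# Cohen's d for fuel efficiency by transmission type (built-in mtcars data)
d <- cohens_d(mpg ~ am, data = mtcars)
d

# Convert the standardized difference to a correlation-type index
r <- d_to_r(d$Cohens_d)
r
```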
::::
::::::
## fmsb {#sec-fmsb}
::::: my-package
::: my-package-header
Package Profile: fmsb
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Functions for Medical Statistics Book with some Demographic Data](https://cran.r-project.org/package=fmsb) [@fmsb]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**fmsb**}.*)
Several utility functions for the book entitled "Practices of Medical and Health Data Analysis using R" (Pearson Education Japan, 2007) with Japanese demographic data and some demographic analysis related functions.
:::
:::::
## forcats {#sec-forcats}
:::::: my-package
::: my-package-header
Package Profile: forcats
:::
:::: my-package-container
<center>[Tools for Working with Categorical Variables (Factors)](https://forcats.tidyverse.org/) [@forcats]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap01/logoi/logo-forcats-min.png){width="176"}
{**forcats**} provides a suite of useful tools that solve common problems with factors. "Forcats" is an anagram of "factors", and the package is part of the {**tidyverse**} suite of packages.
:::
(1) reordering factor levels
- moving specified levels to front,
- ordering by first appearance,
- reversing, and
- randomly shuffling
(2) tools for modifying factor levels
- collapsing rare levels into other,
- 'anonymizing', and
- manually 'recoding'
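
The two groups of tools listed above can be sketched as follows. The small factor `f` is a made-up example, and only a handful of the available {**forcats**} functions are shown.

```r
library(forcats)

f <- factor(c("b", "a", "c", "a", "a", "b", "d"))
levels(f)                      # alphabetical by default

# (1) Reordering factor levels
levels(fct_relevel(f, "c"))    # move "c" to the front
levels(fct_inorder(f))         # order by first appearance
levels(fct_rev(f))             # reverse the level order
levels(fct_shuffle(f))         # random order

# (2) Modifying factor levels
fct_lump_n(f, n = 2)           # keep the 2 most frequent levels, lump the rest
fct_anon(f)                    # replace levels with arbitrary identifiers
fct_recode(f, alpha = "a", beta = "b")   # manual recoding: new = "old"
```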
::::
::::::
## GGally {#sec-GGally}
::::: my-package
::: my-package-header
Package Profile: GGally
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Extension to {**ggplot2**}](https://ggobi.github.io/ggally/) [@GGally]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**GGally**}.*)
The R package {**ggplot2**} is a plotting system based on the grammar of graphics. {**GGally**} extends {**ggplot2**} by adding several functions to reduce the complexity of combining geometric objects with transformed data. Some of these functions include
- a pairwise plot matrix,
- a two group pairwise plot matrix,
- a parallel coordinates plot,
- a survival plot,
- and several functions to plot networks.
:::
:::::
## ggfortify {#sec-ggfortify}
::::: my-package
::: my-package-header
Package Profile: ggfortify
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Data Visualization Tools for Statistical Analysis Results](https://github.com/sinhrks/ggfortify) [@ggfortify]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**ggfortify**}.*)
Unified plotting tools for commonly used statistics, such as GLM, time series, PCA families, clustering and survival analysis. The package offers a single plotting interface for these analysis results and plots in a unified style using {**ggplot2**}.
This package offers `fortify()` and `autoplot()` functions that let {**ggplot2**} automatically visualize the statistical results of popular R packages. The authors' R Journal paper gives more details on the overall architecture design and a gallery of visualizations created with this package. The related {**autoplotly**} package can automatically generate interactive visualizations in plotly.js style based on {**ggfortify**}; these visualizations can be extended with {**ggplot2**} syntax while staying interactive.
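
A minimal sketch of the `autoplot()` interface, assuming a principal component analysis of the built-in `iris` data as the statistical result to be plotted:

```r
library(ggplot2)
library(ggfortify)

# Principal component analysis on the four numeric iris columns
pca <- prcomp(iris[, 1:4], scale. = TRUE)

# autoplot() turns the prcomp result into a ggplot2 object;
# `data` supplies the original data so points can be coloured by species
p <- autoplot(pca, data = iris, colour = "Species")
p
```

Because the result is an ordinary {**ggplot2**} object, further layers or themes can be added with `+`.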
:::
:::::
## ggmosaic {#sec-ggmosaic}
:::::: my-package
::: my-package-header
Package Profile: ggmosaic
:::
:::: my-package-container
<center>[Mosaic Plots in the {**ggplot2**} Framework](https://haleyjeppson.github.io/ggmosaic/) [@ggmosaic]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap03/logoi/logo-ggmosaic-min.png){width="176"}
{**ggmosaic**} is designed to create visualizations of categorical data and is capable of producing bar charts, stacked bar charts, mosaic plots, and double decker plots and therefore offers a wide range of potential plots.
:::
------------------------------------------------------------------------
Furthermore, {**ggmosaic**} allows various features to be customized:
- the order of the variables,
- the formula setup of the plot,
- faceting,
- the type of partition, and
- the space between the categories.
::::
::::::
## ggokabeito {#sec-ggokabeito}
::::: my-package
::: my-package-header
Package Profile: ggokabeito
:::
::: my-package-container
------------------------------------------------------------------------
<center>['Okabe-Ito' Scales for {**ggplot2**} and {**ggraph**}](https://malcolmbarrett.github.io/ggokabeito/index.html) [@ggokabeito]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**ggokabeito**}.*)
Discrete scales for the colorblind-friendly `Okabe-Ito` palette, including 'color', 'fill', and 'edge_colour'. {**ggokabeito**} provides {**ggplot2**} and {**ggraph**} scales to easily use the discrete, colorblind-friendly ‘Okabe-Ito’ palette in your data visualizations.
Currently, {**ggokabeito**} provides the following scales:
- `scale_color_okabe_ito()`/`scale_colour_okabe_ito()`
- `scale_fill_okabe_ito()`
- `scale_edge_color_okabe_ito()`/`scale_edge_colour_okabe_ito()`
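
A minimal usage sketch for the color scale, assuming the `mpg` data that ships with {**ggplot2**} as example data:

```r
library(ggplot2)
library(ggokabeito)

# Map a discrete variable to colour and apply the Okabe-Ito palette
p <- ggplot(mpg, aes(displ, hwy, color = class)) +
  geom_point() +
  scale_color_okabe_ito()
p
```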
:::
:::::
## ggplot2 {#sec-ggplot2}
:::::: my-package
::: my-package-header
Package Profile: ggplot2
:::
:::: my-package-container
<center>[Create Elegant Data Visualisations Using the Grammar of Graphics](https://ggplot2.tidyverse.org/) [@ggplot2]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap01/logoi/logo-ggplot2-min.png){width="176"}
{**ggplot2**} is a system for declaratively creating graphics, based on [The Grammar of Graphics](https://link.springer.com/book/10.1007/0-387-28695-0). You provide the data, tell {**ggplot2**} how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details. [@ggplot2]
:::
------------------------------------------------------------------------
It’s hard to succinctly describe how {**ggplot2**} works because it embodies a deep philosophy of visualization. However, in most cases you start with `ggplot()`, supply a dataset and aesthetic mapping (with `aes()`). You then add on layers (like `geom_point()` or `geom_histogram()`), scales (like `scale_colour_brewer()`), faceting specifications (like `facet_wrap()`) and coordinate systems (like `coord_flip()`).
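
The building blocks named above can be combined as in the following sketch. The choice of the `mpg` data and of the mapped variables is an assumption made purely for illustration.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = class, y = hwy, colour = drv)) +  # data and aesthetic mapping
  geom_point(position = "jitter") +                        # layer
  scale_colour_brewer(palette = "Dark2") +                 # scale
  facet_wrap(vars(year)) +                                 # faceting specification
  coord_flip()                                             # coordinate system
p
```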
::::
::::::
## gplots {#sec-gplots}
::::: my-package
::: my-package-header
Package Profile: gplots
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Various R Programming Tools for Plotting Data](https://github.com/talgalili/gplots) [@gplots]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**gplots**}.*)
Various R programming tools for plotting data, including:
- calculating and plotting locally smoothed summary functions ('bandplot', 'wapply'),
- enhanced versions of standard plots ('barplot2', 'boxplot2', 'heatmap.2', 'smartlegend'),
- manipulating colors ('col2hex', 'colorpanel', 'redgreen', 'greenred', 'bluered', 'redblue', 'rich.colors'),
- calculating and plotting two-dimensional data summaries ('ci2d', 'hist2d'),
- enhanced regression diagnostic plots ('lmplot2', 'residplot'),
- formula-enabled interface to 'stats::lowess' function ('lowess'),
- displaying textual data in plots ('textplot', 'sinkplot'),
- plotting a matrix where each cell contains a dot whose size reflects the relative magnitude of the elements ('balloonplot'),
- plotting "Venn" diagrams ('venn'),
- displaying Open-Office style plots ('ooplot'),
- plotting multiple data on same region, with separate axes ('overplot'),
- plotting means and confidence intervals ('plotCI', 'plotmeans'),
- spacing points in an x-y plot so they don't overlap ('space').
:::
:::::
## gridExtra {#sec-gridExtra}
::::: my-package
::: my-package-header
Package Profile: gridExtra
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Miscellaneous Functions for "Grid" Graphics](https://cran.r-project.org/package=gridExtra) [@gridExtra]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**gridExtra**}.*)
Provides a number of user-level functions to work with "grid" graphics, notably to arrange multiple grid-based plots on a page, and draw tables.
The {**grid**} package (= part of the R system library) provides low-level functions to create graphical objects (`grobs`), and position them on a page in specific viewports. The {**gtable**} package introduced a higher-level layout scheme, arguably more amenable to user-level interaction. With the `gridExtra::arrangeGrob()` / `gridExtra::grid.arrange()` pair of functions, {**gridExtra**} builds upon {**gtable**} to arrange multiple `grobs` on a page.
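
A minimal sketch of the two functions, using two simple placeholder grobs that are assumptions for illustration only:

```r
library(grid)
library(gridExtra)

# Two simple grobs (graphical objects) built with low-level grid functions
g1 <- rectGrob(gp = gpar(fill = "grey90"))
g2 <- textGrob("A text grob")

# arrangeGrob() builds the layout as a gtable without drawing it
layout <- arrangeGrob(g1, g2, ncol = 2)

# grid.arrange() builds the same layout and draws it on the current device
grid.arrange(g1, g2, ncol = 2)
```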
:::
:::::
## ggrepel {#sec-ggrepel}
:::::: my-package
::: my-package-header
Package Profile: ggrepel
:::
:::: my-package-container
<center>[Automatically Position Non-Overlapping Text Labels with 'ggplot2'](https://ggrepel.slowkow.com/) [@ggrepel]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap03/logoi/logo-ggrepel-min.png){width="176"}
Provides text and label geoms for 'ggplot2' that help to avoid overlapping text labels. Labels repel away from each other and away from the data points.
:::
------------------------------------------------------------------------
{**ggrepel**} provides two geoms for {**ggplot2**} to repel overlapping text labels:
- `ggrepel::geom_text_repel()`
- `ggrepel::geom_label_repel()`
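
A minimal sketch of the first geom, assuming the built-in `mtcars` data with the car names as labels:

```r
library(ggplot2)
library(ggrepel)

# Use the row names of mtcars as text labels
cars <- mtcars
cars$model <- rownames(cars)

p <- ggplot(cars, aes(wt, mpg, label = model)) +
  geom_point() +
  geom_text_repel()   # labels repel each other and the data points
p
```

Replacing `geom_text_repel()` with `geom_label_repel()` draws each label inside a box.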
::::
::::::
## ggtext {#sec-ggtext}
::::: my-package
::: my-package-header
Package Profile: ggtext
:::
::: my-package-container
------------------------------------------------------------------------
<center>[Improved Text Rendering Support for 'ggplot2'](https://wilkelab.org/ggtext/) [@ggtext]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**ggtext**}.*)
The {**ggtext**} package provides simple Markdown and HTML rendering for {**ggplot2**}. Under the hood, the package uses the {**gridtext**} package for the actual rendering, and consequently it is limited to the [feature set provided by gridtext](https://wilkelab.org/gridtext/).
Support is provided for Markdown both in theme elements (plot titles, subtitles, captions, axis labels, legends, etc.) and in geoms (similar to `ggplot2::geom_text()`). In both cases, there are two alternatives, one for creating simple text labels and one for creating text boxes with word wrapping.
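
A minimal sketch of Markdown in a theme element, assuming the built-in `mtcars` data and a made-up title:

```r
library(ggplot2)
library(ggtext)

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  labs(title = "Fuel efficiency by **weight** in *mtcars*") +
  theme(plot.title = element_markdown())   # render the title as Markdown
p
```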
:::
:::::
## glue {#sec-glue}
:::::: my-package
::: my-package-header
Package Profile: glue
:::
:::: my-package-container
<center>[Interpreted String Literals](https://glue.tidyverse.org/) [@glue]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap02/logoi/logo-glue-min.png){width="176"}
An implementation of interpreted string literals, inspired by Python's Literal String Interpolation.
:::
------------------------------------------------------------------------
Glue offers interpreted string literals that are small, fast, and dependency-free. Glue does this by embedding R expressions in curly braces which are then evaluated and inserted into the argument string.
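
A minimal sketch of the curly-brace interpolation, with the variable names `pkg` and `n` made up for illustration:

```r
library(glue)

pkg <- "glue"
n <- 3

# Expressions inside {} are evaluated and inserted into the string
msg <- glue("The {pkg} package evaluated {n * 2} as an R expression.")
msg
```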
::::
::::::
## gssr {#sec-gssr}
::::: my-package
::: my-package-header
Package Profile: gssr
:::
::: my-package-container
------------------------------------------------------------------------
<center>[US General Social Survey (GSS) Data for R](https://kjhealy.github.io/gssr/) [@gssr]</center>
------------------------------------------------------------------------
(*There is no hexagon sticker available for {**gssr**}.*)
[GSSR Package](https://kjhealy.github.io/gssr/): The General Social Survey Cumulative Data (1972-2022) and Panel Data files packaged for easy use in R. {**gssr**} is a data package, developed and maintained by [Kieran Healy](https://kieranhealy.org/), the author of [Data Visualization](https://kieranhealy.org/publications/dataviz/). The package bundles several datasets into a convenient format. Because of its large size {**gssr**} is not hosted on CRAN but as a [GitHub repository](https://github.com/kjhealy/gssr/).
Instead of browsing and examining the complex dataset with the [GSS Data Explorer](https://gssdataexplorer.norc.org/) or [downloading datasets directly](https://gss.norc.org/Get-The-Data) from the National Opinion Research Center ([NORC](http://norc.org/)), you can now just work inside R. The current package version 0.4 (see: [gssr Update](https://kieranhealy.org/blog/archives/2023/12/02/gssr-update/)) provides the GSS Cumulative Data File (1972-2022), three GSS Three Wave Panel Data Files (for panels beginning in 2006, 2008, and 2010, respectively), and the 2020 panel file.
Version 0.4 also integrates survey code book information about variables directly into R’s help system, so that variables can be looked up via the help browser or from the console with `?`, as if they were functions or other documented objects.
:::
:::::
## gt {#sec-gt}
:::::: my-package
::: my-package-header
Package Profile: gt
:::
:::: my-package-container
<center>[Easily Create Presentation-Ready Display Tables](https://gt.rstudio.com) [@gt]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap02/logoi/logo-gt-min.png){width="176"}
With the {**gt**} package, anyone can make wonderful-looking tables using the R programming language. The gt philosophy: we can construct a wide variety of useful tables with a cohesive set of table parts. These include the table header, the stub, the column labels and spanner column labels, the table body, and the table footer.
:::
------------------------------------------------------------------------
It all begins with table data (be it a tibble or a data frame). You then decide how to compose your {**gt**} table with the elements and formatting you need for the task at hand. Finally, the table is rendered by printing it at the console, including it in an R Markdown document, or exporting to a file using `gtsave()`. Currently, {**gt**} supports the HTML, LaTeX, and RTF output formats.
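
The workflow described above might look like the following sketch. The table content (the first rows of the built-in `mtcars` data) and all labels are assumptions chosen for illustration.

```r
library(gt)

# Start with table data, then add table parts step by step
tab <- gt(head(mtcars[, 1:4]), rownames_to_stub = TRUE) |>
  tab_header(title = "Motor Trend cars", subtitle = "First six rows") |>
  tab_spanner(label = "Engine", columns = c(cyl, disp, hp)) |>
  tab_source_note("Source: built-in mtcars data")

tab   # printing renders the table

# gtsave(tab, "cars.html")   # export to a file (hypothetical file name)
```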
::::
::::::
## gtsummary {#sec-gtsummary}
:::::: my-package
::: my-package-header
Package Profile: gtsummary
:::
:::: my-package-container
<center>[Presentation-Ready Data Summary and Analytic Result Tables](https://www.danieldsjoberg.com/gtsummary/) [@gtsummary]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap01/logoi/logo-gtsummary-min.png){width="176"}
Creates presentation-ready tables summarizing data sets, regression models, and more. The code to create the tables is concise and highly customizable. Data frames can be summarized with any function, e.g. mean(), median(), even user-written functions. Regression models are summarized and include the reference rows for categorical variables. Common regression models, such as logistic regression and Cox proportional hazards regression, are automatically identified and the tables are pre-filled with appropriate column headers.
:::
------------------------------------------------------------------------
- Summarize data frames or tibbles easily in R. Perfect for creating a `r glossary("Table 1")`.
- Summarize regression models in R and include reference rows for categorical variables.
- Customize {**gtsummary**} tables using a growing list of formatting/styling functions.
- Report statistics inline from summary tables and regression summary tables in R markdown. Make your reports completely reproducible!
By leveraging {**broom**}, {**gt**}, and {**labelled**} packages, {**gtsummary**} creates beautifully formatted, ready-to-share summary and result tables in a single line of R code!
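
A minimal sketch of such a one-line summary table, assuming the `trial` example dataset that ships with {**gtsummary**}:

```r
library(gtsummary)

# Summarize selected variables, split by treatment group
tbl <- tbl_summary(trial, by = trt, include = c(age, grade, response))
tbl
```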
::::
::::::
## haven {#sec-haven}
::::::::: my-package
::: my-package-header
Package Profile: haven
:::
::::::: my-package-container
<center>[Import and Export 'SPSS', 'Stata' and 'SAS' Files](https://haven.tidyverse.org/index.html) [@haven]</center>
------------------------------------------------------------------------
::: {layout="[10, 30]" layout-valign="center"}
![](img/chap01/logoi/logo-haven-min.png){width="176"}
{**haven**} enables R to read and write various data formats used by other statistical packages. Currently it supports [SAS](https://www.sas.com/en_us/home.html), [SPSS](https://www.ibm.com/spss) and [Stata](https://www.stata.com/). The {**haven**} output objects have four important features:
:::
------------------------------------------------------------------------