exercise.qmd

---
title: "練習問題の答えと解説"
code-fold: false
---

```{r}
#| include: false
source("scripts/color_palette.R")
```


## 1章: テストの点数がクラスの平均点よりも低くても上位に含まれる？

```{r}
#| include: false
set.seed(12)
x <- 
  sort(round(rnorm(n = 40, mean = 48, sd = 16), digits = 0))
x[32:40] <- x[32:40] + rpois(n = 9, lambda = 10)
```


<details><summary>答えと解説を表示</summary>

**40人のクラスで平均点が`r round(mean(x), digits = 0)`点のとき、点数が45点であっても上位20人の中に含まれることがある。**


平均値はデータの真ん中を表わす数値ではないよ。
</details>


```{r}
library(ggplot2)

# クラス中の40人のテストの点数（点数順）
x <- c(16, 24, 27, 31, 32, 32, 33, 33, 36, 36, 37, 38, 39, 40, 40, 
42, 43, 43, 43, 44, 44, 45, 46, 46, 48, 50, 50, 52, 52, 53, 54, 
65, 62, 66, 70, 75, 73, 82, 88, 89)
x
mean(x) # クラスの平均点
median(x) # クラスの点数の中央値

p <- 
  tibble::tibble(x = x) |> 
  ggplot(aes(x)) +
  geom_density() +
  xlim(0, 100) +
  labs(title = "40名のテスト点数の分布")

p +
  geom_vline(xintercept = median(x), color = course_colors[1]) +
  geom_vline(xintercept = mean(x), color = course_colors[2]) +
  geom_label(aes(mean(x), 0.01), 
             label = "平均値", 
             color = course_colors[2],
             show.legend = FALSE) +
  geom_label(aes(median(x), 0.02), 
             label = "中央値", 
             color = course_colors[1],
             show.legend = FALSE)
```

```{r}
x[1:20]

x[21:40]
```

## 3章: 動物データの代表値を計算しよう

ここで紹介する関数は一部の例です。

```{r}
#| code-fold: true
df_zoo <- 
  readr::read_csv("data-raw/tokushima_zoo_animals22.csv", 
                col_types = "ccdd_")

summary(df_zoo)
# 中央値
quantile(df_zoo$body_length_cm, na.rm = TRUE)[3]
```

```{r}
#| eval: false
#| echo: true
skimr::skim(df_zoo)
psych::describe(df_zoo)
summarytools::descr(df_zoo)
```