Plotting distributions: boxplots

Author: Francine Goh

Published: April 12, 2023

  1. Using the penguins data, create a boxplot that shows penguin flipper length by island without outliers.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.0     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(palmerpenguins)
penguins %>%
  ggplot(aes(x = island, y = flipper_length_mm)) +
  # outlier.shape = NA hides the outlier points; the box statistics are unchanged
  geom_boxplot(outlier.shape = NA)
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_boxplot()`).
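
The warning above is caused by the two penguins with a missing flipper length. A minimal sketch, assuming you would rather drop those rows explicitly than let ggplot2 discard them; `penguins_complete`, `n_dropped`, and `p_no_outliers` are names introduced here for illustration only.

```r
library(tidyverse)
library(palmerpenguins)

# Drop rows with a missing flipper length before plotting
# (penguins_complete is a hypothetical name, not part of the exercise)
penguins_complete <- penguins %>%
  drop_na(flipper_length_mm)

# Number of rows removed; this should match the count in the warning
n_dropped <- nrow(penguins) - nrow(penguins_complete)

# Same boxplot as above, now built from complete rows only
p_no_outliers <- penguins_complete %>%
  ggplot(aes(x = island, y = flipper_length_mm)) +
  geom_boxplot(outlier.shape = NA)
```

Printing `p_no_outliers` should draw the same boxes without the "Removed 2 rows" warning.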

  2. Add the mean and standard error to each boxplot.
penguins %>%
  ggplot(aes(x = island, y = flipper_length_mm)) +
  geom_boxplot(outlier.shape = NA) +
  stat_summary()
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_boxplot()`).
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_summary()`).
No summary function supplied, defaulting to `mean_se()`
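
As the message says, `stat_summary()` falls back to `mean_se()`, which returns the mean plus and minus one standard error. A rough sketch of the same quantities computed by hand, assuming the standard error is the standard deviation divided by the square root of the number of non-missing values; the column names and `island_summary` are illustrative, not from the exercise.

```r
library(tidyverse)
library(palmerpenguins)

# Per-island mean and standard error of flipper length, computed by hand
# (island_summary and its column names are hypothetical)
island_summary <- penguins %>%
  filter(!is.na(flipper_length_mm)) %>%
  group_by(island) %>%
  summarise(
    n_obs        = n(),
    mean_flipper = mean(flipper_length_mm),
    se_flipper   = sd(flipper_length_mm) / sqrt(n_obs),
    ymin         = mean_flipper - se_flipper,  # lower end of the error bar
    ymax         = mean_flipper + se_flipper   # upper end of the error bar
  )

island_summary
```

The `mean_flipper`, `ymin`, and `ymax` columns correspond to the point and the ends of the line range that `stat_summary()` draws on each box.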

  3. Switch from standard errors to confidence intervals, increase the point size, and fill the boxes with the color chocolate.
penguins %>%
  ggplot(aes(x = island, y = flipper_length_mm)) +
  geom_boxplot(outlier.shape = NA, fill = "chocolate") +
  # mean_cl_normal() requires the Hmisc package to be installed
  stat_summary(fun.data = mean_cl_normal, size = 0.75)
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_boxplot()`).
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_summary()`).
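
`mean_cl_normal()` reports the mean with a t-based confidence interval (95% by default). The sketch below reproduces that interval by hand without Hmisc, assuming the usual formula of mean plus or minus the t critical value times the standard error; `ci_summary` and its column names are illustrative.

```r
library(tidyverse)
library(palmerpenguins)

# Per-island 95% confidence interval for mean flipper length
# (ci_summary and its column names are hypothetical)
ci_summary <- penguins %>%
  filter(!is.na(flipper_length_mm)) %>%
  group_by(island) %>%
  summarise(
    n_obs        = n(),
    mean_flipper = mean(flipper_length_mm),
    se_flipper   = sd(flipper_length_mm) / sqrt(n_obs),
    t_crit       = qt(0.975, df = n_obs - 1),  # two-sided 95% critical value
    ci_lower     = mean_flipper - t_crit * se_flipper,
    ci_upper     = mean_flipper + t_crit * se_flipper
  )

ci_summary
```

Because the critical value is larger than 1, these intervals are wider than the standard-error bars from the previous exercise.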

  4. Fill each island's boxplot with its own color and remove the legend.
penguins %>%
  ggplot(aes(x = island, y = flipper_length_mm, fill = island)) +
  geom_boxplot(outlier.shape = NA) +
  stat_summary(fun.data = mean_cl_normal, size = 0.75) +
  theme(legend.position = "none")
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_boxplot()`).
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_summary()`).
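
`theme(legend.position = "none")` hides every legend on the plot. A small sketch of an alternative, assuming you only want to suppress the legend for the `fill` aesthetic; `p_guides` is a name introduced here, and the summary layer is left out to keep the example short.

```r
library(tidyverse)
library(palmerpenguins)

# Hide only the fill legend with guides() rather than the theme
# (p_guides is a hypothetical name)
p_guides <- penguins %>%
  ggplot(aes(x = island, y = flipper_length_mm, fill = island)) +
  geom_boxplot(outlier.shape = NA) +
  guides(fill = "none")
```

This matters mainly when a plot has several legends and only one of them is redundant with an axis.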

  5. Create a boxplot that shows how flipper length differs by island for each species.
penguins %>%
  ggplot(aes(x = island, y = flipper_length_mm, fill = island)) +
  geom_boxplot(outlier.shape = NA) +
  stat_summary(fun.data = mean_cl_normal, size = 0.75) +
  facet_wrap(vars(species)) +
  theme(legend.position = "none")
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_boxplot()`).
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_summary()`).
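
Not every species lives on every island, so some facets contain empty island positions. A sketch of one way to handle this, assuming you prefer each panel to show only the islands that have data; `species_by_island` and `p_free_x` are illustrative names, and the summary layer is omitted for brevity.

```r
library(tidyverse)
library(palmerpenguins)

# Which species were observed on which island
# (species_by_island is a hypothetical name)
species_by_island <- penguins %>%
  count(species, island)

species_by_island

# scales = "free_x" lets each facet drop islands with no observations
p_free_x <- penguins %>%
  ggplot(aes(x = island, y = flipper_length_mm, fill = island)) +
  geom_boxplot(outlier.shape = NA) +
  facet_wrap(vars(species), scales = "free_x") +
  theme(legend.position = "none")
```

The trade-off is that box widths and x positions no longer line up across panels, so the fixed-scale version above is easier to compare by eye.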

  6. Recreate boxplot #5 as a violin plot with a white background.
penguins %>%
  ggplot(aes(x = island, y = flipper_length_mm, fill = island)) +
  geom_violin() +
  stat_summary(fun.data = mean_cl_normal, size = 0.75) +
  facet_wrap(vars(species)) +
  theme(legend.position = "none",
        panel.background = element_rect(fill = "white"))
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_ydensity()`).
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_summary()`).
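
Setting `panel.background` by hand is one way to get a white background. A sketch of another option, assuming a complete built-in theme such as `theme_bw()` is acceptable; `p_violin` is a name introduced here, and the summary layer is omitted for brevity.

```r
library(tidyverse)
library(palmerpenguins)

# Violin plot with a white panel from a complete theme
# (p_violin is a hypothetical name)
p_violin <- penguins %>%
  drop_na(flipper_length_mm) %>%
  ggplot(aes(x = island, y = flipper_length_mm, fill = island)) +
  geom_violin() +
  facet_wrap(vars(species)) +
  theme_bw() +                      # complete theme goes first
  theme(legend.position = "none")   # tweaks after it, or they are overwritten
```

The order matters: a complete theme added after `theme()` replaces the earlier settings, which would bring the legend back.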