Annotating plots

Author

Jeffrey R. Stevens

Published

April 26, 2023

  1. Using the mpg data, create a scatterplot of highway and city fuel efficiencies. Add a title, subtitle, caption, and axis labels.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.0     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
mpg |> 
  ggplot(aes(x = cty, y = hwy)) +
  geom_point() +
  labs(title = "Fuel efficiency", subtitle = "Highway vs. city miles per gallon",
       caption = "Source: mpg data",
       x = "City miles per gallon", y = "Highway miles per gallon")

  2. Repeat #1, adding a linear regression line. Use cor() to calculate the correlation coefficient between highway and city fuel efficiency. Add it to the plot as a labeled annotation, rounded to two decimals.
mpg_corr <- cor(mpg$hwy, mpg$cty)
mpg |> 
  ggplot(aes(x = cty, y = hwy)) +
  geom_smooth(method = "lm") +
  geom_point() +
  labs(title = "Fuel efficiency", subtitle = "Highway vs. city miles per gallon",
       caption = "Source: mpg data",
       x = "City miles per gallon", y = "Highway miles per gallon") +
  annotate(geom = "text", label = paste0("r = ", round(mpg_corr, 2)), x = 15, y = 40)
`geom_smooth()` using formula = 'y ~ x'
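
The label position above is hard-coded at x = 15, y = 40. As a minimal sketch of one alternative, assuming you want the label pinned to the top-left corner of the data range, build the label string first and then place it at the smallest city value and largest highway value with hjust = 0. The names r_label and p are illustrative and not part of the original solution.

```r
library(ggplot2)

# Build the label once so it can be inspected before plotting
r_label <- paste0("r = ", round(cor(mpg$hwy, mpg$cty), 2))

p <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  geom_point() +
  # Anchor the text at the top-left corner of the data range;
  # hjust = 0 left-aligns it so the text extends to the right
  annotate(geom = "text", label = r_label,
           x = min(mpg$cty), y = max(mpg$hwy), hjust = 0)
p
```

Passing formula = y ~ x explicitly also silences the `geom_smooth()` message shown above.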

  3. Repeat #1. Find the manufacturer and model of the data point with the highest city fuel efficiency. Label this point by drawing a line from the point to the text label and include the manufacturer and model (broken across two lines).
mpg |> 
  ggplot(aes(x = cty, y = hwy)) +
  geom_point() +
  labs(title = "Fuel efficiency", subtitle = "Highway vs. city miles per gallon",
       caption = "Source: mpg data",
       x = "City miles per gallon", y = "Highway miles per gallon") +
  annotate(geom = "text", label = "Volkswagen\nNew Beetle", x = 33, y = 40) +
  annotate(geom = "segment", x = 33, xend = 34.8, y = 41, yend = 43.5)
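
The plot code above does not show the lookup step that the exercise asks for. A minimal sketch of one way to find the car with the highest city mileage, using dplyr's slice_max(); the names top_cty and top_label are illustrative and not part of the original solution.

```r
library(dplyr)
library(ggplot2)  # provides the mpg data set

# Keep the row(s) with the highest city mileage
top_cty <- mpg |>
  slice_max(cty, n = 1) |>
  select(manufacturer, model, cty, hwy)
top_cty

# Two-line label: manufacturer on the first line, model on the second
top_label <- paste(top_cty$manufacturer, top_cty$model, sep = "\n")
```

The manufacturer and model values in mpg are stored in lowercase, so the label text needs capitalizing before it goes on the plot. The cty and hwy values of this row give the coordinates for the end of the line segment.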

  4. Repeat #1, drawing grey horizontal and vertical lines at 20 mpg for both axes underneath the data points. Add a lightpink rectangle under the points filling the upper right quadrant (>20 for both axes).
mpg |> 
  ggplot(aes(x = cty, y = hwy)) +
  geom_hline(yintercept = 20, color = "grey60") +
  geom_vline(xintercept = 20, color = "grey60") +
  annotate(geom = "rect", xmin = 20, xmax = 50, ymin = 20, ymax = 50, fill = "lightpink", alpha = 0.25) +
  geom_point() +
  labs(title = "Fuel efficiency", subtitle = "Highway vs. city miles per gallon",
       caption = "Source: mpg data",
       x = "City miles per gallon", y = "Highway miles per gallon") +
  coord_cartesian(xlim = c(9, 35), ylim = c(10, 45))
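
The solution above draws the rectangle out to 50 and then trims the view with coord_cartesian(). As a sketch of an alternative, assuming you would rather not set axis limits by hand, ggplot2 accepts Inf for rectangle bounds, which extends the rectangle to the panel edge without stretching the axes. The name p is illustrative.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_hline(yintercept = 20, color = "grey60") +
  geom_vline(xintercept = 20, color = "grey60") +
  # Inf runs the rectangle to the panel edge; the axis limits
  # are still determined by the data alone
  annotate(geom = "rect", xmin = 20, xmax = Inf, ymin = 20, ymax = Inf,
           fill = "lightpink", alpha = 0.25) +
  # Points are added last so they sit on top of the lines and rectangle
  geom_point()
p
```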

  5. Create boxplots of highway fuel efficiency by class, but order the class levels by mean highway fuel efficiency. At y = 10, add the sample size for each box (e.g., N=5, N=47, etc.).
mpg |> 
  ggplot(aes(x = fct_reorder(class, hwy, .fun = mean), y = hwy)) +
  geom_boxplot() +
  geom_text(stat = "count", aes(label = paste0("N=", after_stat(count))), y = 10) +
  ylim(10, 44)
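
fct_reorder() uses the median as its default summary function, so ordering by the mean requires passing .fun = mean. A quick way to check both the ordering and the N labels is to compute the per-class summaries directly. This is a minimal sketch; the name class_summary is illustrative and not part of the original solution.

```r
library(dplyr)
library(ggplot2)  # provides the mpg data set

# One row per class: sample size plus mean and median highway mileage,
# sorted by the mean so the rows follow the left-to-right box order
class_summary <- mpg |>
  group_by(class) |>
  summarise(n = n(), mean_hwy = mean(hwy), median_hwy = median(hwy)) |>
  arrange(mean_hwy)
class_summary
```

Comparing the mean_hwy and median_hwy columns shows whether the two orderings would differ for this data.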