# Annotating plots

Author: Jeffrey R. Stevens

Published: April 26, 2023

1. Using the `mpg` data, create a scatterplot of highway and city fuel efficiencies. Add a title, subtitle, caption, and axis labels.
```r
library(tidyverse)
```

```
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.0     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
```
```r
mpg |>
  ggplot(aes(x = cty, y = hwy)) +
  geom_point() +
  labs(title = "Fuel efficiency", subtitle = "Highway vs. city miles per gallon", caption = "Source: mpg data", x = "City miles per gallon", y = "Highway miles per gallon")
```
2. Repeat #1, adding a linear regression line. Use `cor()` to calculate the correlation coefficient between the two variables. Add it to the plot as a labeled value rounded to two decimals.
```r
# Correlation between highway and city fuel efficiency
mpg_corr <- cor(mpg$hwy, mpg$cty)
mpg |>
  ggplot(aes(x = cty, y = hwy)) +
  geom_smooth(method = "lm") +
  geom_point() +
  labs(title = "Fuel efficiency", subtitle = "Highway vs. city miles per gallon", caption = "Source: mpg data", x = "City miles per gallon", y = "Highway miles per gallon") +
  annotate(geom = "text", label = paste0("r = ", round(mpg_corr, 2)), x = 15, y = 40)
```
```
`geom_smooth()` using formula = 'y ~ x'
```
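One detail about the label: `round()` returns a number, so pasting it drops trailing zeros (0.90 would print as 0.9). A minimal sketch of an alternative, assuming the same `mpg` data, formats the value with base R's `sprintf()` so that two decimals are always shown. The name `r_label` is introduced here only for illustration.

```r
library(ggplot2)  # provides ggplot() and the mpg data set

# Correlation between city and highway fuel efficiency
mpg_corr <- cor(mpg$cty, mpg$hwy)

# "%.2f" always prints exactly two decimals, including trailing zeros
r_label <- sprintf("r = %.2f", mpg_corr)

ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  geom_point() +
  annotate(geom = "text", label = r_label, x = 15, y = 40)
```

Passing `formula = y ~ x` explicitly also silences the `geom_smooth()` message shown above.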
3. Repeat #1. Find the manufacturer and model of the data point with the highest city fuel efficiency. Label this point with its manufacturer and model (broken across two lines), and draw a line from the point to the text label.
```r
mpg |>
  ggplot(aes(x = cty, y = hwy)) +
  geom_point() +
  labs(title = "Fuel efficiency", subtitle = "Highway vs. city miles per gallon", caption = "Source: mpg data", x = "City miles per gallon", y = "Highway miles per gallon") +
  annotate(geom = "text", label = "Volkswagen\nBeetle", x = 33, y = 40) +
  annotate(geom = "segment", x = 33, xend = 34.8, y = 41, yend = 43.5)
```
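The solution above hard-codes the label and its coordinates. As a rough sketch of how the point could be found programmatically, the following uses `dplyr::slice_max()` to pull the row with the highest `cty` and builds the two-line label from it. The names `top_cty` and `top_label` and the offsets used to place the text are illustrative assumptions, chosen to roughly match the positions used above.

```r
library(dplyr)
library(ggplot2)  # provides ggplot() and the mpg data set

# Row with the highest city fuel efficiency (one row, even if there are ties)
top_cty <- mpg |>
  slice_max(cty, n = 1, with_ties = FALSE) |>
  select(manufacturer, model, cty, hwy)

# Manufacturer and model on separate lines
top_label <- paste(top_cty$manufacturer, top_cty$model, sep = "\n")

ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point() +
  # Text sits below and to the left of the point (offsets are arbitrary)
  annotate(geom = "text", label = top_label,
           x = top_cty$cty - 2, y = top_cty$hwy - 4) +
  # Segment runs from just above the text to just short of the point
  annotate(geom = "segment",
           x = top_cty$cty - 2, xend = top_cty$cty - 0.2,
           y = top_cty$hwy - 3, yend = top_cty$hwy - 0.5)
```

Printing `top_cty` shows the manufacturer and model as they are stored in the data, which may differ in case or wording from a hand-typed label.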
4. Repeat #1, drawing grey horizontal and vertical lines at 20 mpg for both axes underneath the data points. Add a lightpink rectangle under the points filling the upper right quadrant (>20 for both axes).
```r
mpg |>
  ggplot(aes(x = cty, y = hwy)) +
  geom_hline(yintercept = 20, color = "grey60") +
  geom_vline(xintercept = 20, color = "grey60") +
  annotate(geom = "rect", xmin = 20, xmax = 50, ymin = 20, ymax = 50, fill = "lightpink", alpha = 0.25) +
  geom_point() +
  labs(title = "Fuel efficiency", subtitle = "Highway vs. city miles per gallon", caption = "Source: mpg data", x = "City miles per gallon", y = "Highway miles per gallon") +
  coord_cartesian(xlim = c(9, 35), ylim = c(10, 45))
```
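The rectangle above extends to 50 on both axes, which stretches the scales and is why `coord_cartesian()` is needed to zoom back in. One possible alternative, sketched here under the same assumptions about the data, is to use `Inf` for the upper bounds. ggplot2 treats infinite positions as the panel edge and ignores them when computing axis ranges, so no zooming step is required. The object name `p` is illustrative.

```r
library(ggplot2)  # provides ggplot() and the mpg data set

p <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_hline(yintercept = 20, color = "grey60") +
  geom_vline(xintercept = 20, color = "grey60") +
  # Inf extends the rectangle to the top and right panel edges
  annotate(geom = "rect", xmin = 20, xmax = Inf, ymin = 20, ymax = Inf,
           fill = "lightpink", alpha = 0.25) +
  # Points are added last so they are drawn on top of the lines and rectangle
  geom_point()

p
```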
5. Create boxplots of highway fuel efficiency by class, with the class levels ordered by mean highway fuel efficiency. At y = 10, add the sample size for each box (e.g., N=5, N=47).
```r
mpg |>
  # fct_reorder() orders by the median by default, so request the mean explicitly
  ggplot(aes(x = fct_reorder(class, hwy, .fun = mean), y = hwy)) +
  geom_boxplot() +
  geom_text(stat = "count", aes(label = paste0("N=", after_stat(count))), y = 10) +
  ylim(10, 44)
```
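To check the plot against the data, the sample sizes and ordering can be tabulated directly. The sketch below is an illustrative check rather than part of the exercise. It reports both the mean and the median per class, because `fct_reorder()` orders by the median unless `.fun` is supplied, so comparing the two columns shows whether that choice changes the order. The name `class_summary` is hypothetical.

```r
library(dplyr)
library(ggplot2)  # provides the mpg data set

class_summary <- mpg |>
  group_by(class) |>
  summarise(
    n = n(),                  # sample size shown as "N=" on the plot
    mean_hwy = mean(hwy),     # ordering requested by the exercise
    median_hwy = median(hwy)  # fct_reorder()'s default ordering
  ) |>
  arrange(mean_hwy)

class_summary
```

The `n` column should match the labels drawn at y = 10, and the row order should match the left-to-right order of the boxes when the classes are ordered by mean.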