Plotting x-y data: time series


Jeffrey R. Stevens


April 19, 2023

  1. Using the mpg data, calculate the mean highway fuel efficiency for each number of cylinders and plot a line graph of fuel efficiency by cylinder number.
library(tidyverse)
mpg |> 
  group_by(cyl) |> 
  summarise(mean_hwy = mean(hwy)) |> 
  ggplot(aes(x = cyl, y = mean_hwy)) +
  geom_line()
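
Before plotting, it can help to look at the summarised table itself. The following is a minimal sketch, assuming the tidyverse is installed; the object name `cyl_summary` and the extra `n` column are illustrative additions, not part of the exercise.

```r
library(tidyverse)

# Mean highway mileage per cylinder count, plus the number of cars behind each mean
cyl_summary <- mpg |>
  group_by(cyl) |>
  summarise(mean_hwy = mean(hwy), n = n())

cyl_summary
```

Cylinder counts represented by only a few cars give less reliable means, which is worth keeping in mind when reading the line.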

  2. Repeat the previous plot but also group by class and plot separately colored lines for different classes.
mpg |> 
  group_by(cyl, class) |> 
  summarise(mean_hwy = mean(hwy), .groups = "drop") |> 
  ggplot(aes(x = cyl, y = mean_hwy, color = class)) +
  geom_line()
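
When `summarise()` follows a `group_by()` on two variables, the result stays grouped by the first variable unless `.groups` says otherwise. In addition, a class that occurs with only one cylinder count has a single summary row, so a line cannot be drawn for it. The sketch below shows one possible way to handle both; the object names `class_summary` and `p_class` and the added `geom_point()` layer are assumptions for illustration, not part of the exercise.

```r
library(tidyverse)

class_summary <- mpg |>
  group_by(cyl, class) |>
  summarise(mean_hwy = mean(hwy), .groups = "drop")  # "drop" returns an ungrouped tibble

# Points keep classes with a single summary row visible alongside the lines
p_class <- class_summary |>
  ggplot(aes(x = cyl, y = mean_hwy, color = class)) +
  geom_line() +
  geom_point()

p_class
```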

  3. Create a new column called low_high that codes highway fuel efficiency greater than or equal to 25 as 1 and less than 25 as 0. Plot low_high as a function of displacement with a bubble chart (no legend) and include a logistic regression curve and confidence band.
mpg |> 
  mutate(low_high = ifelse(hwy >= 25, 1, 0)) |> 
  ggplot(aes(x = displ, y = low_high)) +
  geom_count(show.legend = FALSE) +
  geom_smooth(method = "glm", formula = y ~ x, method.args = list(family = "binomial"))
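
The curve drawn by `geom_smooth()` here comes from a logistic regression of the 0/1 outcome on displacement. As a rough sketch, the same kind of model can be fit directly with `glm()` to inspect its predictions; the names `mpg_coded`, `fit`, and `preds` and the three displacement values are illustrative assumptions.

```r
library(tidyverse)

# Code highway efficiency of 25 or more as 1, otherwise 0
mpg_coded <- mpg |>
  mutate(low_high = ifelse(hwy >= 25, 1, 0))

# Logistic regression of the 0/1 outcome on engine displacement
fit <- glm(low_high ~ displ, data = mpg_coded, family = "binomial")

# Predicted probability of high efficiency at a few illustrative displacements
preds <- predict(fit, newdata = tibble(displ = c(2, 4, 6)), type = "response")
preds
```

With `type = "response"`, the predictions are probabilities between 0 and 1, which is the scale shown on the y-axis of the plot.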

  4. Plot highway fuel efficiency for each class as points first, then add jitter, choosing an appropriate amount of jitter.
mpg |> 
  ggplot(aes(x = class, y = hwy)) +
  geom_point()

mpg |> 
  ggplot(aes(x = class, y = hwy)) +
  geom_jitter(width = 0.1, height = 2)
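
Jitter is random, so the plot changes slightly on every run, and vertical jitter moves points away from their true hwy values. One possible variation, sketched below, jitters only horizontally and fixes the random seed via `position_jitter()`; the width of 0.2, the seed of 1, and the name `p_jitter` are arbitrary choices for illustration.

```r
library(tidyverse)

# Horizontal-only jitter with a fixed seed, so the plot is identical on every run
p_jitter <- mpg |>
  ggplot(aes(x = class, y = hwy)) +
  geom_point(position = position_jitter(width = 0.2, height = 0, seed = 1))

p_jitter
```

Setting `height = 0` leaves each point at its recorded hwy value, so only the class axis, which is categorical, is perturbed.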

  5. Repeat plot #4 with a beeswarm plot.
mpg |> 
  ggplot(aes(x = class, y = hwy)) +
  ggbeeswarm::geom_beeswarm()