Plotting distributions: histograms

Author

Jeffrey R. Stevens

Published

April 10, 2023

  1. Using the mtcars data, create a histogram of the fuel efficiency values.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.0     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
mtcars |> 
  ggplot(aes(x = mpg)) +
  geom_histogram()
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

  2. Not a great histogram. Mess with the number of bins until you get a nice histogram.
mtcars |> 
  ggplot(aes(x = mpg)) +
  geom_histogram(bins = 8)

  3. Now change the bin width to generate the same plot as #2.
mtcars |> 
  ggplot(aes(x = mpg)) +
  geom_histogram(binwidth = 3.35)
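
One way to arrive at a bin width that roughly matches a given number of bins is to divide the range of the data by the number of bins minus one. This is a minimal sketch in base R, assuming ggplot2's default behavior of centering the outermost bins on the extremes of the data (so the range spans `bins - 1` widths); the variable names are illustrative only.

```r
# Range of the fuel efficiency values in mtcars
mpg_range <- range(mtcars$mpg)

# Number of bins used in the previous histogram
bins <- 8

# Assumed relationship: the data range covers (bins - 1) bin widths
binwidth <- diff(mpg_range) / (bins - 1)

# Rounded value, close to the 3.35 used above
round(binwidth, 2)
```

The result is close to, but not exactly, 3.35, so the two histograms can differ very slightly in where their bin edges fall.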

  4. Using the same binwidth from #3, plot a histogram with lightseagreen lines and aquamarine3 shaded areas. Then overlay a density plot with an aquamarine4 line of width 2.
mtcars |> 
  ggplot(aes(x = mpg)) +
  geom_histogram(
    aes(y = after_stat(density)),
    binwidth = 3.35,
    fill = "aquamarine3",
    color = "lightseagreen"
  ) +
  geom_density(bw = 3.35, color = "aquamarine4", linewidth = 2)
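
The histogram uses `after_stat(density)` so that its bars share a y-axis scale with the density curve. The sketch below uses base R to illustrate what that rescaling does: each bar height becomes count / (n × binwidth), so the bar areas sum to 1. The break positions here are an assumption chosen for illustration and do not reproduce ggplot2's exact bin edges.

```r
mpg <- mtcars$mpg
binwidth <- 3.35

# Illustrative, evenly spaced breaks that cover the full range of mpg
# (assumed positions; ggplot2 places its bin edges differently)
breaks <- seq(from = 10, by = binwidth, length.out = 10)

# Count observations per bin without drawing anything
h <- hist(mpg, breaks = breaks, plot = FALSE)

# Rescale counts to densities: count / (number of observations * bin width)
density_height <- h$counts / (length(mpg) * binwidth)

# Total area of the bars on the density scale
sum(density_height * binwidth)
```

Because a kernel density estimate also integrates to 1, the two layers end up on comparable scales.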

  5. What is the difference between a frequency polygon and a density plot?
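
As a hedged illustration of the distinction: a frequency polygon joins the binned counts of a histogram with lines, whereas a density plot draws a smoothed kernel density estimate whose area is 1. The sketch below builds both and inspects the computed layer data; the object names are illustrative, and the bin width and bandwidth are simply borrowed from the other exercises.

```r
library(ggplot2)

# Frequency polygon: histogram-style bin counts joined by lines
freq_plot <- mtcars |>
  ggplot(aes(x = mpg)) +
  geom_freqpoly(binwidth = 3.35)

# Density plot: smoothed kernel density estimate
dens_plot <- mtcars |>
  ggplot(aes(x = mpg)) +
  geom_density(bw = 3)

# Extract the values each layer computes before drawing
freq_data <- layer_data(freq_plot)
dens_data <- layer_data(dens_plot)

# The frequency polygon's counts add up to the number of cars
sum(freq_data$count)

# The density curve is on a density scale, not a count scale
range(dens_data$density)
```

Printing `freq_plot` and `dens_plot` shows the visual difference: angular line segments on a count axis versus a smooth curve on a density axis.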

  6. Make a density plot with bandwidth of 3 and separate line colors for different cylinder levels.

mtcars |> 
  mutate(cyl = as.factor(cyl)) |> 
  ggplot(aes(x = mpg, color = cyl)) +
  geom_density(bw = 3)

  7. Repeat #6 but also include separate colors for the shaded areas with a transparency of 0.5. Use viridis colors for both lines and shaded areas, and reverse the direction of the colors so that 4 is yellow, 6 is greenish, and 8 is purplish.
mtcars |> 
  mutate(cyl = as.factor(cyl)) |> 
  ggplot(aes(x = mpg, color = cyl, fill = cyl)) +
  geom_density(bw = 3, alpha = 0.5) +
  scale_color_viridis_d(direction = -1) +
  scale_fill_viridis_d(direction = -1)
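
To see what `direction = -1` does to the palette, the colors can be inspected directly. This is a small sketch that assumes the scales package is installed and that the discrete viridis scales draw from the same palette as `scales::viridis_pal()`; the object names are illustrative.

```r
library(scales)

# Three viridis colors in the default order (purplish first, yellow last)
forward <- viridis_pal(direction = 1)(3)

# The same three colors in reversed order (yellow first, purplish last)
reversed <- viridis_pal(direction = -1)(3)

# Label the reversed colors with the cylinder levels they would map to
names(reversed) <- c("4", "6", "8")
reversed
```

Passing the vector to `scales::show_col()` displays the swatches, which is a quick way to confirm the mapping before plotting.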