Plotting distributions: histograms


Jeffrey R. Stevens


April 10, 2023

  1. Using the mtcars data, create a histogram of the fuel efficiency values.
library(tidyverse)  # loads ggplot2 and dplyr, which the code below relies on
mtcars |> 
  ggplot(aes(x = mpg)) +
  geom_histogram()

  2. Not a great histogram. Mess with the number of bins until you get a nice histogram.
mtcars |> 
  ggplot(aes(x = mpg)) +
  geom_histogram(bins = 8)

  3. Now change the bin width to generate the same plot as #2.
mtcars |> 
  ggplot(aes(x = mpg)) +
  geom_histogram(binwidth = 3.35)
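
The bin width used in #3 can be approximated from the bin count used in #2. The sketch below assumes that ggplot2 derives the width from `bins` as the data range divided by `bins - 1`; that rule is an assumption about ggplot2's default binning rather than something stated in this handout, and `approx_width` is a name made up for illustration.

```r
# Range of the mpg values in the built-in mtcars data
mpg_range <- range(mtcars$mpg)

# Assumed rule: width = (max - min) / (bins - 1)
bins <- 8
approx_width <- diff(mpg_range) / (bins - 1)

# Print the approximate bin width, to compare with the value used in #3
approx_width
```

If the assumption holds, the result should land close to the `binwidth = 3.35` used above.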

  4. Using the same binwidth from #3, plot a histogram with lightseagreen lines and aquamarine3 shaded areas. Then overlay a density plot with an aquamarine4 line with width 2.
mtcars |> 
  ggplot(aes(x = mpg)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 3.35, fill = "aquamarine3", color = "lightseagreen") +
  geom_density(bw = 3.35, color = "aquamarine4", linewidth = 2)

  5. What is the difference between a frequency polygon and a density plot?
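
One way to see the difference is to draw both on the same axes. A frequency polygon joins the heights of the histogram bins with straight line segments, so its shape depends on the bin width. A density plot draws a kernel density estimate, a smoothed curve whose shape depends on the bandwidth. The sketch below is only an illustration: the dashed line type and the object name `p` are choices made here, and the frequency polygon is rescaled to densities so the two curves share a y-axis.

```r
library(ggplot2)

# Frequency polygon: straight lines joining the histogram bin heights
# Density plot: a smoothed kernel density estimate (dashed line)
p <- ggplot(mtcars, aes(x = mpg)) +
  geom_freqpoly(aes(y = after_stat(density)), binwidth = 3.35) +
  geom_density(bw = 3.35, linetype = "dashed")

p
```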

  6. Make a density plot with bandwidth of 3 and separate line colors for different cylinder levels.

mtcars |> 
  mutate(cyl = as.factor(cyl)) |> 
  ggplot(aes(x = mpg, color = cyl)) +
  geom_density(bw = 3)

  7. Repeat #6 but also include separate colors for the shaded areas with a transparency of 0.5. Use viridis colors for both lines and shaded areas, and reverse the direction of the colors so that 4 is yellow, 6 is greenish, and 8 is purplish.
mtcars |> 
  mutate(cyl = as.factor(cyl)) |> 
  ggplot(aes(x = mpg, color = cyl, fill = cyl)) +
  geom_density(bw = 3, alpha = 0.5) +
  scale_color_viridis_d(direction = -1) +
  scale_fill_viridis_d(direction = -1)
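
To check which color each cylinder level should receive, the reversed palette can be generated directly. This sketch assumes the scales package is available (it is installed alongside ggplot2) and that the discrete viridis scale requests one color per factor level; `cols` is a name made up for illustration.

```r
# Three colors from the viridis palette in reversed order,
# one for each cylinder level (4, 6, 8)
cols <- scales::viridis_pal(direction = -1)(3)

# Label each color with the cylinder level it would be mapped to
names(cols) <- c("4", "6", "8")
cols
```

The first hex code should be the yellow end of the palette and the last the purple end, matching the order requested in the exercise.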