For these exercises, we’ll use the dog breed traits data set.
- Load tidyverse, import dog_breed_traits_clean.csv to traits.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.0 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
traits <- read_csv(here::here("data/dog_breed_traits_clean.csv"), show_col_types = FALSE)
set.seed(12)
breeds <- sample(traits$breed)
- Convert both coat_type and coat_length into factors using across() and save as traits2.
traits2 <- traits |>
  mutate(across(contains("coat"), factor))
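A quick way to confirm which columns across() actually touched is to test each column with is.factor(). The following is a minimal sketch on a small made-up data frame (the toy object and its values are hypothetical stand-ins, not the real traits data):

```r
library(dplyr)

# Hypothetical toy data standing in for traits
toy <- data.frame(
  breed = c("A", "B", "C"),
  coat_type = c("Smooth", "Wiry", "Smooth"),
  coat_length = c("Short", "Long", "Medium"),
  stringsAsFactors = FALSE
)

# Same pattern as the exercise: convert every column whose name contains "coat"
toy2 <- toy |>
  mutate(across(contains("coat"), factor))

# One TRUE/FALSE per column: is it a factor now?
converted <- vapply(toy2, is.factor, logical(1))
print(converted)
```

Only the two coat columns should come back TRUE, since contains("coat") does not match breed.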
- Check the levels for both columns, one using a pipe and one without using a pipe.
levels(traits2$coat_type)
[1] "Corded" "Curly" "Double" "Hairless" "Rough" "Silky" "Smooth"
[8] "Wavy" "Wiry"
traits2 |>
  pull(coat_length) |>
  levels()
[1] "Long" "Medium" "Short"
- Reorder the levels for coat_length to be Short, Medium, Long (reassigned to traits2) and then check the levels.
traits2 <- traits2 |>
  mutate(coat_length = fct_relevel(coat_length, "Short", "Medium", "Long"))
levels(traits2$coat_length)
[1] "Short" "Medium" "Long"
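To see what fct_relevel() does in isolation, here is a small sketch on a made-up factor (the len vector is hypothetical, not taken from the data set). It also shows that naming only some levels moves those to the front and leaves the rest in their existing order:

```r
library(forcats)

# Hypothetical toy factor; default levels are alphabetical
len <- factor(c("Long", "Short", "Medium"))
levels(len)

# Name every level to set the full order explicitly
full_order <- fct_relevel(len, "Short", "Medium", "Long")
levels(full_order)

# Name only one level: it moves to the front, the others keep their order
front_only <- fct_relevel(len, "Short")
levels(front_only)
```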
- Reorder the levels for coat_type to be in the order of the most to least frequent coat type and then check the levels.
traits2 <- traits2 |>
  mutate(coat_type = fct_infreq(coat_type))
levels(traits2$coat_type)
[1] "Smooth" "Double" "Wiry" "Silky" "Curly" "Wavy" "Corded"
[8] "Rough" "Hairless"
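One way to double-check a frequency ordering is to count the levels afterwards with fct_count(). This is a minimal sketch on a made-up vector (coats is hypothetical and much smaller than the real coat_type column):

```r
library(forcats)

# Hypothetical toy factor: Smooth appears 3 times, Double 2, Wiry 1
coats <- factor(c("Wiry", "Smooth", "Smooth", "Double", "Smooth", "Double"))
levels(coats)  # alphabetical by default

# Reorder levels from most to least frequent
by_freq <- fct_infreq(coats)
levels(by_freq)

# Count per level, listed in the new level order
counts <- fct_count(by_freq)
print(counts)
```

The counts in the n column should be non-increasing from top to bottom if the reordering worked.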
- Relabel coat_length to be Stubby, Mid, and Lush rather than Short, Medium, and Long.
traits2 <- traits2 |>
  mutate(coat_length = fct_recode(coat_length, "Stubby" = "Short",
                                  "Mid" = "Medium",
                                  "Lush" = "Long"))
levels(traits2$coat_length)
[1] "Stubby" "Mid" "Lush"
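The argument direction in fct_recode() is easy to flip by accident: the new label goes on the left and the existing level on the right. A small sketch on a made-up factor (len is hypothetical) shows that the values change along with the level names while the level order stays the same:

```r
library(forcats)

# Hypothetical toy factor with an explicit level order
len <- factor(c("Short", "Long", "Medium", "Short"),
              levels = c("Short", "Medium", "Long"))

# new label = "existing level"
relabeled <- fct_recode(len, Stubby = "Short", Mid = "Medium", Lush = "Long")

levels(relabeled)        # same positions, new labels
as.character(relabeled)  # each observation carries its new label
```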
- The new AKC standard subsumes Rough coats under Wiry coats and Silky coats under Wavy coats. Please update the coat_type variable accordingly.
traits2 <- traits2 |>
  mutate(coat_type = fct_collapse(coat_type, Wiry = c("Rough", "Wiry"),
                                  Wavy = c("Silky", "Wavy")))
levels(traits2$coat_type)
[1] "Smooth" "Double" "Wiry" "Wavy" "Curly" "Corded" "Hairless"
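Collapsing levels should leave the total number of observations unchanged while pooling the counts of the merged levels. The sketch below checks this on a made-up factor (coats is hypothetical, not the real coat_type column):

```r
library(forcats)

# Hypothetical toy factor: one Rough, one Wiry, one Silky, two Wavy, one Smooth
coats <- factor(c("Rough", "Wiry", "Silky", "Wavy", "Wavy", "Smooth"))

# Same collapse as the exercise
merged <- fct_collapse(coats,
                       Wiry = c("Rough", "Wiry"),
                       Wavy = c("Silky", "Wavy"))

# Counts per level after merging
merged_counts <- table(merged)
print(merged_counts)

# The total is unchanged; only the grouping differs
length(merged) == length(coats)
```

Here Wiry should hold the pooled Rough and Wiry observations (2) and Wavy the pooled Silky and Wavy observations (3).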