Jeff Stevens

2023-02-06

Vectors

Vectors

vector = atomic vector

• a one-dimensional set of elements that all have the same data type

Create vectors with c()

Numeric vectors

(myvec1 <- c(1, 5, 3, 6))
[1] 1 5 3 6
(myvec2 <- c(11, 14, 18, 12))
[1] 11 14 18 12
c(myvec1, myvec2)
[1]  1  5  3  6 11 14 18 12

Create vectors with c()

Character vectors

(myvec3 <- c("a", "b", "c"))
[1] "a" "b" "c"

Create vectors with c()

What do you think will happen if you combine myvec2 and myvec3?

myvec2
[1] 11 14 18 12
myvec3
[1] "a" "b" "c"
c(myvec2, myvec3)
[1] "11" "14" "18" "12" "a"  "b"  "c" 
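
The combined vector is all character because an atomic vector can hold only one data type, so c() converts every element to the most flexible type present. A minimal sketch of that behavior, using hypothetical object names that are not part of the slides:

```r
# c() coerces mixed inputs to the most flexible type present
# (logical < integer < double < character).
mixed_num <- c(TRUE, 2L, 3.5)     # logical and integer become double
mixed_chr <- c(1.5, "a", FALSE)   # everything becomes character

typeof(mixed_num)   # "double"
typeof(mixed_chr)   # "character"
mixed_chr           # "1.5" "a" "FALSE"

# Converting back must be done explicitly; entries that are not
# numbers become NA, and R issues a warning.
back_to_num <- suppressWarnings(as.numeric(c("11", "14", "a")))
back_to_num         # 11 14 NA
```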

Create sequences with seq()

seq(from = 0, to = 20, by = 5)
[1]  0  5 10 15 20
seq(from = 20, to = 0, by = -5)
[1] 20 15 10  5  0
seq(0, 1, 0.2)
[1] 0.0 0.2 0.4 0.6 0.8 1.0
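
The calls above fix the step size with by. As a small sketch of an alternative, seq() can also be told how many values to return with length.out, and seq_len() counts from 1 up to a given number. The object names here are hypothetical:

```r
# Fix the number of values instead of the step size
quarters <- seq(from = 0, to = 1, length.out = 5)
quarters            # 0.00 0.25 0.50 0.75 1.00

# seq_len(n) counts from 1 to n
first_four <- seq_len(4)
first_four          # 1 2 3 4

# ... and returns an empty vector when n is 0
nothing <- seq_len(0)
length(nothing)     # 0
```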

Create sequences with :

Sequences with increments of 1

4:9
[1] 4 5 6 7 8 9
9:4
[1] 9 8 7 6 5 4

Try it!

Make a sequence from 0 to 100 in steps of 10.

Create repetitions with rep()

Repeat single numbers

rep(0, times = 10)
 [1] 0 0 0 0 0 0 0 0 0 0

Create repetitions with rep()

Repeat vectors

rep(myvec3, times = 3)
[1] "a" "b" "c" "a" "b" "c" "a" "b" "c"
rep(c("d", "e", "f"), times = 3)
[1] "d" "e" "f" "d" "e" "f" "d" "e" "f"

Create repetitions with rep()

Repeat sequences

rep(1:4, times = 3)
 [1] 1 2 3 4 1 2 3 4 1 2 3 4
rep(1:4, each = 3)
 [1] 1 1 1 2 2 2 3 3 3 4 4 4
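
The times and each arguments can also be combined, and times accepts one count per element. A short sketch with hypothetical object names:

```r
# each is applied first, then the result is repeated times times
both <- rep(1:2, each = 2, times = 2)
both                # 1 1 2 2 1 1 2 2

# A vector for times gives a separate count for each element
uneven <- rep(c("a", "b"), times = c(3, 1))
uneven              # "a" "a" "a" "b"
```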

Try it!

Create a repetition of “yes” and “no” with 10 instances of each, alternating between the two. Then make one with 10 “yes” followed by 10 “no”.

Working with vectors

Find vector length with length()

myvec3
[1] "a" "b" "c"
length(myvec3)
[1] 3

Try it!

How long is the combined vector of myvec1 and myvec2?

Checking typeof() and str()

myvec2
[1] 11 14 18 12
typeof(myvec2)
[1] "double"
str(myvec2)
 num [1:4] 11 14 18 12
myvec3
[1] "a" "b" "c"
typeof(myvec3)
[1] "character"
str(myvec3)
 chr [1:3] "a" "b" "c"
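
Numeric vectors come in two types, integer and double, and typeof() tells them apart. The sketch below uses hypothetical object names to show where each type comes from:

```r
int_vec <- 1:4             # the colon operator produces integers
dbl_vec <- c(1, 2, 3, 4)   # bare numbers in c() are doubles

typeof(int_vec)            # "integer"
typeof(dbl_vec)            # "double"
typeof(4L)                 # "integer": the L suffix marks an integer

# Both count as numeric and compare equal element by element ...
is.numeric(int_vec)        # TRUE
all(int_vec == dbl_vec)    # TRUE

# ... but they are not identical objects, because the types differ
identical(int_vec, dbl_vec)  # FALSE
```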

Index with []

Extracts the element at a given position (positions start at 1)

myvec2
[1] 11 14 18 12
myvec2[2]
[1] 14

Allows subsetting

myvec2[2:4]
[1] 14 18 12
myvec2[c(4, 1, 3)]
[1] 12 11 18

Allows reassignment

myvec2[2] <- NA
myvec2
[1] 11 NA 18 12
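
Besides positive positions, [] also accepts negative positions (to drop elements) and logical vectors (to keep elements that meet a condition). A brief sketch, using a hypothetical vector with the same values as myvec2 after the reassignment:

```r
scores <- c(11, NA, 18, 12)

# Negative positions drop elements
without_first <- scores[-1]
without_first             # NA 18 12

# A logical condition keeps matching elements; comparing NA gives NA,
# so the missing value is carried along
over_11 <- scores[scores > 11]
over_11                   # NA 18 12

# which() returns only the positions where the condition is TRUE
over_11_clean <- scores[which(scores > 11)]
over_11_clean             # 18 12

# is.na() finds missing values; ! negates it
complete <- scores[!is.na(scores)]
complete                  # 11 18 12
```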

Lists, data frames, and tibbles

Lists

Recursive vectors (vectors of vectors) whose elements can have different data types

(mylist <- list(a = 1:4, b = c(4, 3, 8, 5), c = LETTERS[10:15], d = c("yes", "yes")))
$a
[1] 1 2 3 4

$b
[1] 4 3 8 5

$c
[1] "J" "K" "L" "M" "N" "O"

$d
[1] "yes" "yes"
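
List elements can be pulled out in several ways, and the result differs. The sketch below uses a smaller, hypothetical list to contrast single brackets, double brackets, and $:

```r
demo_list <- list(a = 1:4, d = c("yes", "yes"))

# Single brackets return a shorter list
sub_list <- demo_list["a"]
typeof(sub_list)          # "list"

# Double brackets return the element itself
a_vec <- demo_list[["a"]]
typeof(a_vec)             # "integer"

# $ is shorthand for double brackets with a name
identical(demo_list$a, demo_list[["a"]])   # TRUE

# Once extracted, the element can be indexed like any vector
demo_list[["a"]][2]       # 2
```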

Working with lists

typeof(mylist)
[1] "list"
typeof(mylist$b)
[1] "double"
str(mylist)
List of 4
 $ a: int [1:4] 1 2 3 4
 $ b: num [1:4] 4 3 8 5
 $ c: chr [1:6] "J" "K" "L" "M" ...
 $ d: chr [1:2] "yes" "yes"

Data frames

List of named vectors of the same length (rectangular)

mydf <- data.frame(
  datetime = as.Date(c("2021-04-21 11:56:12", "2021-04-21 14:57:44",
                       "2021-04-22 03:09:56", "2021-04-22 12:39:22")),
  session_complete = as.logical(c("TRUE", "TRUE", "TRUE", "FALSE")),
  condition = as.factor(c("control", "control", "experimental", "experimental")),
  mean_response = c(17.53, 24.45, 19.82, NA),
  age = c(19, 20, 19, NA),
  comments = c("none", "Great study", "toooo long", NA)
)

Data frames

List of named vectors of the same length (rectangular)

mydf
    datetime session_complete    condition mean_response age    comments
1 2021-04-21             TRUE      control         17.53  19        none
2 2021-04-21             TRUE      control         24.45  20 Great study
3 2021-04-22             TRUE experimental         19.82  19  toooo long
4 2021-04-22            FALSE experimental            NA  NA        <NA>
typeof(mydf)
[1] "list"
str(mydf)
'data.frame':	4 obs. of  6 variables:
 $ datetime        : Date, format: "2021-04-21" "2021-04-21" ...
 $ session_complete: logi  TRUE TRUE TRUE FALSE
 $ condition       : Factor w/ 2 levels "control","experimental": 1 1 2 2
 $ mean_response   : num  17.5 24.4 19.8 NA
 $ age             : num  19 20 19 NA
 $ comments        : chr  "none" "Great study" "toooo long" NA

Creating data frames

Create new vectors

(mydf1 <- data.frame(subject = 1:3, response = 8:6))
  subject response
1       1        8
2       2        7
3       3        6

Combine existing vectors

var1 <- c(1:6)
var2 <- c(6:1)
var3 <- c(21:26)
mydf2 <- data.frame(var1, var2, resp = var3)
mydf2
  var1 var2 resp
1    1    6   21
2    2    5   22
3    3    4   23
4    4    3   24
5    5    2   25
6    6    1   26

Index with [row, column]

mydf1
  subject response
1       1        8
2       2        7
3       3        6
mydf1[2, 1]
[1] 2
mydf1[2, 1] <- 6
mydf1
  subject response
1       1        8
2       6        7
3       3        6

Index with [row, column]

Extract whole rows/columns

mydf1[2, ]
  subject response
2       6        7
mydf1[, 2]
[1] 8 7 6

Extract subsets

mydf1[2:3, 2]
[1] 7 6
mydf1[2:3, 1:2]
  subject response
2       6        7
3       3        6

Working with data frames

But extract columns by name with $

mydf1$response
[1] 8 7 6
mydf1$response[2]
[1] 7
mydf1$response[2:3]
[1] 7 6

Why should you use column names rather than numbers?
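
One way to explore this question is to reorder the columns and see what each kind of index returns. The sketch below uses a hypothetical data frame with the same structure as the original mydf1:

```r
scores_df <- data.frame(subject = 1:3, response = 8:6)

# Same data, columns in a different order
reordered <- scores_df[, c("response", "subject")]

# A column number now points at different data ...
scores_df[, 2]       # 8 7 6 (response)
reordered[, 2]       # 1 2 3 (subject)

# ... but a column name still finds the same data
scores_df$response   # 8 7 6
reordered$response   # 8 7 6
```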

Working with data frames

View first rows with head()

head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Note

Add the argument n = 10 to head(mtcars). What does this do?

Working with data frames

View dimensions

dim(mtcars)
[1] 32 11
nrow(mtcars)
[1] 32
ncol(mtcars)
[1] 11
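
These functions are related: dim() returns the row count followed by the column count. Because a data frame is a list of columns, length() returns the number of columns, not rows. A short sketch using the built-in mtcars data, with hypothetical object names:

```r
cars_dim <- dim(mtcars)         # c(rows, columns)
cars_dim[1] == nrow(mtcars)     # TRUE
cars_dim[2] == ncol(mtcars)     # TRUE

# length() counts list elements, which for a data frame are columns
length(mtcars)                  # 11

# names() returns the column names
col_names <- names(mtcars)
head(col_names, n = 3)          # "mpg" "cyl" "disp"
```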

Tibbles

Tibbles are just tidyverse versions of data frames

mydf2
  var1 var2 resp
1    1    6   21
2    2    5   22
3    3    4   23
4    4    3   24
5    5    2   25
6    6    1   26
(mytibble <- tibble::tibble(mydf2))
# A tibble: 6 × 3
   var1  var2  resp
  <int> <int> <int>
1     1     6    21
2     2     5    22
3     3     4    23
4     4     3    24
5     5     2    25
6     6     1    26
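
One practical difference shows up when extracting a single column with []: a data frame simplifies the result to a vector, while a tibble returns another tibble. The sketch below assumes the tibble package is installed and skips the comparison if it is not. It uses tibble::as_tibble() for the conversion, and the object names are hypothetical:

```r
plain_df <- data.frame(var1 = 1:6, var2 = 6:1)

if (requireNamespace("tibble", quietly = TRUE)) {
  tbl <- tibble::as_tibble(plain_df)

  # A data frame drops a single column to a vector
  print(class(plain_df[, 1]))   # "integer"

  # A tibble keeps its structure
  print(class(tbl[, 1]))        # "tbl_df" "tbl" "data.frame"

  # $ extracts the column as a vector in both cases
  print(identical(tbl$var1, plain_df$var1))   # TRUE
}
```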