Modes
All elements of a vector are of the same type. In the R language, we speak of ‘modes’. This tutorial covers four of them: “numeric”, “character”, “logical” and “complex”.
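A minimal sketch of what "same type" means in practice, assuming a throwaway vector named `mixed` (not part of the tutorial): when values of different modes are combined with `c()`, R converts them all to a single mode, which `mode()` reports.

```r
# Hypothetical example vector mixing a number, a string and a logical value
mixed <- c(1, "a", TRUE)
print(mixed)        # every element has been converted to a character string
print(mode(mixed))

# mode() applied to one value of each of the four modes
print(mode(1:10))
print(mode("text"))
print(mode(TRUE))
print(mode(2+3i))
```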
Numerical mode
We’ve already seen part of this. R lets you create numerical vectors. For example, below, we create a sequence of integers from 1 to 10, which is stored in a vector.
integers <- 1:10
print(integers)
- Create a vector z containing the following integers in this order: 3, 12, 17, 20 to 41 and 50.
z <- c(3, 12, 17, 20:41, 50)
Character mode
Below is an example of vectors containing character strings.
names <- c("Stephen", "Katia", "Fairouz")
print(names)
class(names) # see also is(names)
- Create this_names, a vector containing the following strings: “Riad”, “Robert” and “Nastassja”.
this_names <- c("Riad", "Robert", "Nastassja")
Logical mode
Below is an example of vectors containing Booleans (values with two states, ‘true’ and ‘false’, or 1 and 0). Note that true and false are represented by the keywords TRUE and FALSE (which are reserved words in the language).
logic <- c(TRUE, FALSE, TRUE)
print(logic)
class(logic) # is(logic)
TRUE and FALSE can be abbreviated to T and F respectively. However, this shorthand is not recommended: it is less readable, and T and F are ordinary variables that can be reassigned.
logic <- c(T, F, T)
print(logic)
class(logic) # is(logic)
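One reason to prefer the full keywords can be sketched as follows. TRUE and FALSE are reserved words and cannot be reassigned, whereas T and F are ordinary variables. The name `flags` is only an illustration.

```r
# T is an ordinary variable, so it can be overwritten (not recommended!)
T <- FALSE
flags <- c(T, TRUE)
print(flags)        # the first element is now FALSE

# Remove our own T so that the built-in value is visible again
rm(T)
print(T)
```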
- Create a true_false vector containing the Boolean values true, false, false, false, true.
true_false <- c(TRUE, FALSE, FALSE, FALSE, TRUE)
Comparison operators
Logical vectors are typically produced by comparison operators such as those in the table below (the last row, logical negation, inverts such a result):
Operation | Result |
---|---|
a == b | Equality operator. Returns true if a and b contain equal values. |
a != b | Returns true if a and b contain different values. |
a > b | Returns true if the value of a is greater than the value of b. |
a < b | Returns true if the value of a is less than the value of b. |
a >= b | Returns true if the value of a is greater than or equal to the value of b. |
a <= b | Returns true if the value of a is less than or equal to the value of b. |
!a | Logical negation (the opposite). |
A simple example is given below.
x <- c(1, 7, 10, 11, 3, -2, -4, 10)
x > 5
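The table also lists `==` and the negation operator `!`, which the example above does not use. A short sketch on the same vector, with `above_five` as an illustrative name:

```r
x <- c(1, 7, 10, 11, 3, -2, -4, 10)

# The single value 5 is compared with every position of x
above_five <- x > 5
print(above_five)

# Negation flips every TRUE to FALSE and vice versa
print(!above_five)

# Equality test against a single value
print(x == 10)
```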
- Take the following vector x, test whether the value at each position is less than 0, and put the result in a variable z. Print z.
x <- c(2, -3, -2, 0, 6, 5)
z <- x < 0
print(z)
You may also compare the paired positions of two vectors.
x <- c(1, 7, 10, 11, 3, -2)
y <- c(2, 3, 10, 11, 4, -5)
x > y
- Given the following vectors x and y, store in z the Boolean vector indicating whether each position of x is equal (==) to the corresponding position of y.
x <- c(2, -3, -2, 0, 6, 5)
y <- c(2, 3, 10, 11, 4, -5)
z <- x == y
- Given the following vectors x and y, store in z the Boolean vector indicating whether each position of x differs (!=) from the corresponding position of y.
x <- c(2, -3, -2, 0, 6, 5)
y <- c(2, 3, 10, 11, 4, -5)
z <- x != y
Logical operators
Logical vectors support logical operations, including the & operator (AND) and the | operator (OR). They can be used to test two vectors, position by position, to determine whether both positions are true (&) or whether at least one of them is true (|).
x <- c(TRUE, FALSE, TRUE, FALSE)
y <- c(TRUE, FALSE, FALSE, TRUE)
print(x & y)
print(x | y)
- Store in z the logical vector corresponding to the test: x > 1 and y < 7
x <- c(1, 7, 10, 11, 3, -2, -4, 10)
y <- c(2, 3, 10, 11, 4, -5, -4, 12)
z <- x > 1 & y < 7
print(z)
The any() and all() functions
With these functions we can test whether at least one (any()) or all (all()) of the positions of a Boolean vector are true.
x <- 1:10
x > 0
all(x > 0)
x <- -10:10
x > 0
all(x > 0)
any(x > 0)
The which() function
The which() function returns the positions of a vector for which a condition is TRUE.
x <- c(2, -3, 5, 7, 10, 2, -4, 11)
which(x < 0)
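To make the link with logical vectors explicit, here is a small sketch that splits the call above into two steps. The names `is_negative` and `positions` are illustrative.

```r
x <- c(2, -3, 5, 7, 10, 2, -4, 11)

# Step 1: the comparison yields one logical value per position
is_negative <- x < 0
print(is_negative)

# Step 2: which() keeps only the indices where the value is TRUE
positions <- which(is_negative)
print(positions)

# The number of positions found equals the number of TRUE values
print(length(positions) == sum(is_negative))
```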
- Store in z all positions for which the values of x and y are the same.
x <- c(1, 7, 10, 11, 3, -2, -4, 10)
y <- c(2, 3, 10, 11, 4, -5, -4, 12)
z <- which(x == y)
- Store in vector z all positions for which the values of x and y differ.
x <- c(1, 7, 10, 11, 3, -2, -4, 10)
y <- c(2, 3, 10, 11, 4, -5, -4, 12)
z <- which(x != y)
- Store in vector z all positions for which the y value is at least twice the corresponding x value.
NB: The multiplication operator in R is ‘*’.
x <- c(1, 7, 10, 11, 3, -2, -4, 10)
y <- c(2, 3, 10, 11, 4, -5, -4, 12)
z <- which(y >= 2 * x)
Examples of functions for creating vectors
The rep() function
There are numerous functions for creating vectors. For example, the rep() function repeats an existing vector or value.
rep(x=1:5, times=3)
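As a complement, `rep()` also accepts an `each` argument, which the tutorial text does not mention. The sketch below contrasts it with `times`. The variable names are illustrative.

```r
# times: the whole vector is repeated
by_times <- rep(x = 1:3, times = 2)
print(by_times)   # 1 2 3 1 2 3

# each: every element is repeated before moving on to the next one
by_each <- rep(x = 1:3, each = 2)
print(by_each)    # 1 1 2 2 3 3
```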
- Use the rep() (repeat) and c() functions to store the following vector in the dna variable:
[1] "A" "T" "T" "A" "T" "T" "A" "T" "T" "A" "T" "T" "A" "T" "T" "A" "T" "T" "A"
[20] "T" "T" "A" "T" "T" "A" "T" "T" "A" "T" "T"
x <- c("A", "T", "T")
dna <- rep(x, 10)
The seq() function
The seq() function is used to create sequences of numerical values at regular intervals. You define a starting value (from), an ending value (to), and either a step size (by) or a target number of values (length.out).
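A minimal sketch of the two ways of specifying a sequence, using the range 0 to 1 (chosen only for illustration) so that both calls produce the same five values:

```r
# Fixed step: 0, 0.25, 0.5, 0.75, 1
by_step <- seq(from = 0, to = 1, by = 0.25)
print(by_step)

# Fixed number of values: R works out the step itself
by_count <- seq(from = 0, to = 1, length.out = 5)
print(by_count)

# Both approaches describe the same sequence here
print(all.equal(by_step, by_count))
```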
- Look at the help for the seq() function and propose:
  - An instruction to create a vector a containing regularly spaced values (in steps of 0.1) between 1 and 10 (argument by).
  - An instruction to create a vector b containing 20 values regularly spaced between 1 and 10 (argument length.out).
a <- seq(from=1, to=10, by=0.1)
print(a)
b <- seq(from=1, to=10, length.out=20)
print(b)
Generating random values
Functions for generating random values
R allows you to generate random numbers from a wide variety of distributions (e.g. normal distribution, uniform distribution, Poisson distribution…).
The code below randomly draws 1000 values from a normal distribution with mean 0 and standard deviation 1.
NB: On a computer, the values appear random but are in fact calculated by algorithms that attempt to reproduce random processes. The “seed” passed to the set.seed() function (123 in the example below) forces the algorithm to return the same “random” values (as if you could somehow force randomness…).
set.seed(123)
x <- round(rnorm(n=1000, mean=0, sd=1), digits=2) # use print(x) to show x
length(x)
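The effect of the seed can be checked directly. In this sketch the seed 42 and the variable names are arbitrary choices, not part of the tutorial:

```r
set.seed(42)                  # arbitrary seed chosen for this sketch
first_draw <- rnorm(n = 3)

set.seed(42)                  # same seed again
second_draw <- rnorm(n = 3)

third_draw <- rnorm(n = 3)    # no reset: the generator simply moves on

# Resetting the seed reproduces exactly the same values
print(identical(first_draw, second_draw))

# Without resetting, the next values are different
print(identical(first_draw, third_draw))
```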
You can use head() to show the first 6 elements of x:
head(x)
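Other distributions follow the same pattern as `rnorm()`. As a hedged illustration, the base R functions `runif()` and `rpois()` draw from the uniform and Poisson distributions mentioned earlier. The parameter values here are arbitrary.

```r
set.seed(123)

# 5 values drawn uniformly between 0 and 1
uniform_values <- runif(n = 5, min = 0, max = 1)
print(uniform_values)

# 5 counts drawn from a Poisson distribution with mean (lambda) 3
poisson_values <- rpois(n = 5, lambda = 3)
print(poisson_values)
```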
- Store in
z
500 values drawn from a normal distribution with a mean of 2 and a standard deviation of 3. The values should be rounded (round()
) to 2 decimal places (useround()
).
set.seed(123) # Keep this to ensure everyone has the same results
set.seed(123)
z <- round(rnorm(500, mean = 2, sd=3), 2)
Visualizing the distributions
You can plot the histogram of a distribution with the hist() function. The breaks argument controls the number of intervals in the histogram (when given as a single number, R treats it as a suggestion and picks convenient break points).
x <- round(rnorm(n=1000, mean=0, sd=1), digits=2)
hist(x, breaks = 20, col="blue", border="white")
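To see what `hist()` computes without drawing anything, you can pass `plot = FALSE` and inspect the returned object. This is a sketch. The names `x_demo` and `h` are illustrative.

```r
set.seed(123)
x_demo <- round(rnorm(n = 1000, mean = 0, sd = 1), digits = 2)

# plot = FALSE returns the histogram data instead of drawing it
h <- hist(x_demo, breaks = 20, plot = FALSE)

print(h$breaks)            # boundaries of the intervals chosen by R
print(length(h$counts))    # number of intervals actually used
print(sum(h$counts))       # every value falls into exactly one interval
```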
- Plot the distribution of x with hist() using 5 and 100 for the breaks parameter. What do you observe?
hist(x, breaks = 5)
hist(x, breaks = 100)
Sorting operations
The sort() function
The sort() function sorts a vector. Its decreasing argument indicates whether to sort in ascending (FALSE, the default) or descending (TRUE) order.
x <- c(20:25, 1:5, 50)
sort(x, decreasing = TRUE)
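Note that `sort()` returns a new vector and leaves its input untouched, which is why the result has to be assigned if you want to keep it. A short sketch with illustrative names:

```r
x <- c(20:25, 1:5, 50)

ascending <- sort(x)                      # decreasing = FALSE by default
descending <- sort(x, decreasing = TRUE)

print(ascending)
print(descending)
print(x)                                  # x itself is still in its original order

# Reversing the ascending result gives the descending one
print(identical(rev(ascending), descending))
```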
- Store in x the values of x sorted in ascending order.
x <- sort(x, decreasing = FALSE)
Random sampling from a vector
The sample() function
The sample() function randomly selects size elements from a vector x. With replace = TRUE, each drawn element is put back into x before the next draw, so the same element can be drawn more than once.
sample(x=1:10, size = 5, replace = TRUE)
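The difference between the two settings of `replace` can be sketched as follows. The seed and variable names are arbitrary.

```r
set.seed(123)

# Without replacement, drawing all 10 elements simply shuffles the vector
shuffled <- sample(x = 1:10, size = 10, replace = FALSE)
print(shuffled)
print(anyDuplicated(shuffled))   # 0 means no value appears twice

# With replacement, you can draw more elements than the vector contains
oversized <- sample(x = 1:10, size = 20, replace = TRUE)
print(oversized)
print(length(oversized))
```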
Exercises around the sample() function
- Perform 2 random selections of 5 elements with replacement from a vector containing all integers from 1 to 10.
print(sample(x=1:10, size = 5, replace = TRUE))
print(sample(x=1:10, size = 5, replace = TRUE))
- Perform 2 random draws of 5 elements without replacement from a vector containing all integers from 0 to 10.
print(sample(x=0:10, size = 5, replace = FALSE))
print(sample(x=0:10, size = 5, replace = FALSE))
Mathematical operations
Mathematical operators
Mathematical operations on vectors use standard operators (+, -, *, /, ^ for raising to a power) and are vectorized, meaning they apply to all elements automatically, eliminating the need for explicit loops. This simplifies syntax and enables complex operations in just a few lines.
To add 10 to all elements of a numeric vector, you would thus write:
x <- 1:10
print(x + 10)
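For comparison, here is a sketch of the explicit loop that the vectorized expression replaces. The names `looped` and `vectorized` are illustrative.

```r
x <- 1:10

# Explicit loop: visit each position and add 10
looped <- numeric(length(x))
for (i in seq_along(x)) {
  looped[i] <- x[i] + 10
}

# Vectorized form: one expression, same result
vectorized <- x + 10

print(looped)
print(all(looped == vectorized))
```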
To sum the elements at each position p of two vectors x and y, you would write:
x <- 1:10
y <- 11:20
print(x + y)
The same principle will apply to other operations:
x ^ 2
x * 3
x / 4
x * y
- Compute the sum of each position of x and y and store the result in z.
x <- 1:5
y <- 5:1
z <- x + y
- Consider the following example:
x <- 1:3
y <- 1:8
x + y
- Given the following example:
x <- 1:4
y <- 1:8
x + y
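The two examples above add vectors of different lengths. A minimal sketch of the rule R applies in that case, known as recycling: the shorter vector is repeated until it matches the length of the longer one. When the longer length is not a multiple of the shorter one (as with 1:3 and 1:8), R still computes a result but emits a warning. The variable names are illustrative.

```r
long <- 1:8

# Length 4 fits exactly twice into length 8: silent recycling
recycled <- 1:4 + long
manual <- rep(1:4, times = 2) + long
print(recycled)
print(identical(recycled, manual))

# Length 3 does not fit evenly into length 8: same mechanism, plus a warning
# (suppressWarnings() is used here only to keep the output clean)
partial <- suppressWarnings(1:3 + long)
manual_partial <- rep(1:3, length.out = 8) + long
print(partial)
print(identical(partial, manual_partial))
```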
Missing values
A vector can contain missing values. In R, they are denoted as NA (Not Available).
x <- c(NA, 40, 2, NA, 7, 8, 30)
print(x)
Other special values exist, such as Inf (infinity), -Inf, or NaN (Not a Number). They are generally produced by calculations that have no finite or defined result.
1/0
log(0)
0/0
The functions is.na(), is.infinite(), and is.nan() test each position of a vector and return a Boolean value indicating whether that position is NA, Inf/-Inf, or NaN, respectively. Note that is.na() also returns TRUE for NaN.
is.na(c(1:3, NA, 1:3))
is.nan(c(1:3, NaN, 1:3))
is.infinite(c(1:3, Inf, 1:3))
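The three tests can be compared side by side on a single vector containing every kind of special value. This is a sketch, with `special` as an illustrative name.

```r
special <- c(1, NA, NaN, Inf, -Inf)

print(is.na(special))        # TRUE for NA and also for NaN
print(is.nan(special))       # TRUE for NaN only
print(is.infinite(special))  # TRUE for Inf and -Inf
```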
- Given the vector below, store in z the number of positions that are NA (use is.na()).
Tip: You can apply sum() to a Boolean vector to count the TRUE positions.
set.seed(123)
x <- sample(c(1:10, NA, -Inf, Inf), size = 1000, replace = TRUE)
z <- sum(is.na(x))
Quiz
Answer the following questions:
The end
Thank you for following this tutorial.