If you ever want to get rid of a variable or object, use the rm() function to delete it, e.g. rm(x). To remove everything, use rm(list=ls()). Your code will be safe as long as you’ve saved it, but all variables, data frames, and other objects in your environment will be removed. Libraries/packages that you’ve loaded are not affected by rm(): they stay loaded until you restart R, at which point they will still be installed, but you will have to use library([package name]) to load them again.
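
As a minimal sketch of how rm() behaves, the code below creates a few throwaway objects (the demo_ names are hypothetical and chosen only for this illustration), removes one by name, removes the rest by pattern, and then checks whether the stats package is still attached.

```r
library(stats)  # attach a package so we can check on it afterwards

# Hypothetical objects used only for this demonstration
demo_a <- c(1, 2, 3)
demo_b <- "text"
demo_c <- TRUE

# Remove a single object by name
rm(demo_a)
exists("demo_a")                     # FALSE

# Remove every object whose name starts with "demo_"
# (rm(list = ls()) works the same way, but removes everything)
rm(list = ls(pattern = "^demo_"))
ls(pattern = "^demo_")               # character(0)

# Check whether the package is still attached
"package:stats" %in% search()        # TRUE
```

Removing by pattern is a cautious alternative to rm(list = ls()) when you only want to clear a group of related objects.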

Small tricks:

* Wrap a command in () to output its result in the console, e.g. (x <- c(1,2,3)) will assign values to x and also print it.
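
A short sketch of the parentheses trick; the variable name total is an assumption made for this example.

```r
# Without parentheses, an assignment prints nothing
total <- sum(c(1, 2, 3))

# With parentheses, the value is assigned and also printed
(total <- sum(c(1, 2, 3)))
## [1] 6
```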

We will begin by assigning values to a vector.

# The "<-" and "=" operators are equivalent for assignment at the top level
# This code creates a numeric vector and prints it
x <- c(1,2,3)
print("This is my variable:")
## [1] "This is my variable:"
print(x)
## [1] 1 2 3
x = c(1,2,3)
print("This is my variable:")
## [1] "This is my variable:"
print(x)
## [1] 1 2 3
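
The two operators stop being interchangeable inside a function call, where "=" names an argument instead of creating a variable. The sketch below illustrates this with round(); the helper variables created_by_equals and created_by_arrow are hypothetical names used only to record what happened.

```r
# "=" inside a call names the argument; no variable is created
rounded <- round(3.14159, digits = 2)
created_by_equals <- "digits" %in% ls()   # FALSE

# "<-" inside a call assigns first, then passes the value by position
rounded_again <- round(3.14159, digits <- 2)
created_by_arrow <- "digits" %in% ls()    # TRUE

print(rounded)        # 3.14
print(rounded_again)  # 3.14
print(digits)         # 2
```

For this reason many R style guides suggest "<-" for assignment and "=" only for naming arguments.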

We will explore the main types of variables:

# Numeric
numeric_vector <- c(1,2,3,4,5)
print(numeric_vector)
## [1] 1 2 3 4 5
# Character
character_vector <- c("one","one","three","four","five")
print(character_vector)
## [1] "one"   "one"   "three" "four"  "five"
# Logical
logical_vector <- c(TRUE,FALSE,TRUE,TRUE,FALSE)
print(logical_vector)
## [1]  TRUE FALSE  TRUE  TRUE FALSE
# Factor
# Note the "levels" when you print them
factor_vector <- as.factor(c("Toronto","Toronto","Montreal","Montreal","Ottawa","Montreal","Toronto","Montreal"))
print(factor_vector)
## [1] Toronto  Toronto  Montreal Montreal Ottawa   Montreal Toronto  Montreal
## Levels: Montreal Ottawa Toronto
# Examine the levels of a factor
levels(factor_vector)
## [1] "Montreal" "Ottawa"   "Toronto"
as.numeric(factor_vector) # Factors store an underlying integer code for each level
## [1] 3 3 1 1 2 1 3 1
as.numeric(character_vector) # Characters do not
## Warning: NAs introduced by coercion
## [1] NA NA NA NA NA
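
If you are unsure what type a variable has, you can ask R directly. The following is a small sketch using class() and typeof(); the variable names are assumptions made for this example.

```r
numbers <- c(1, 2, 3)
words   <- c("one", "two")
flags   <- c(TRUE, FALSE)
cities  <- factor(c("Toronto", "Ottawa"))

class(numbers)   # "numeric"
class(words)     # "character"
class(flags)     # "logical"
class(cities)    # "factor"

# typeof() shows how the values are stored internally:
# a factor is stored as integer codes plus a set of labels
typeof(cities)   # "integer"

# is.*() functions test for a specific type
is.numeric(numbers)    # TRUE
is.character(numbers)  # FALSE
```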

We can also convert one type of variable to another, when the conversion makes sense.

character_of_numbers <- c("1","2","3","4")
converted_characters <- as.numeric(character_of_numbers)
print(character_of_numbers)
## [1] "1" "2" "3" "4"
print(converted_characters)
## [1] 1 2 3 4
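
One conversion deserves extra care: a factor whose labels look like numbers. Because as.numeric() returns the underlying level codes of a factor, a common workaround is to convert to character first. A minimal sketch, with number_factor as a hypothetical example variable:

```r
number_factor <- factor(c("10", "20", "10"))

# Converting directly returns the level codes, not the labels
codes <- as.numeric(number_factor)
print(codes)
## [1] 1 2 1

# Converting to character first recovers the original numbers
values <- as.numeric(as.character(number_factor))
print(values)
## [1] 10 20 10
```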

Lists can contain multiple variable types. Sometimes, you will want to add a row to a data frame. One way is to append a list to it.

# This list contains several types of variables
my_list <- list(1,"2",as.factor("ten"),FALSE)
# We can look at the structure of the list
str(my_list)
## List of 4
##  $ : num 1
##  $ : chr "2"
##  $ : Factor w/ 1 level "ten": 1
##  $ : logi FALSE
# Vectors cannot mix types: c() coerces every element to one common type (here, character)
my_vector <- c(1,"2",as.factor("ten"),FALSE)
str(my_vector)
##  chr [1:4] "1" "2" "1" "FALSE"
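
Once you have a list, you will usually want to pull individual elements back out. The sketch below assumes a small named list (example_list and its element names are hypothetical) and shows the usual access operators.

```r
example_list <- list(count = 1, label = "2", flag = FALSE)

# [[ ]] and $ return the element itself
label_value <- example_list[["label"]]   # "2"
flag_value  <- example_list$flag         # FALSE

# [ ] returns a shorter list that still wraps the element
sub_list <- example_list["count"]
class(sub_list)                          # "list"

# Each element keeps its own type
element_types <- sapply(example_list, class)
print(element_types)
```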

Finally, we look at a data frame.

# Create the vectors that will become the columns of our data frame
FirstCol <- c("Toronto","Toronto","Montreal","Halifax","Gatineau","Gatineau","Halifax","Toronto","Vancouver")
SecondCol <- c(21,24,25,33,20,18,19,20,20)
ThirdCol <- c("M","F","M","M","F","F","X","F","M")

# Create the data frame
my_data <- data.frame(FirstCol,SecondCol,ThirdCol)

# These column names are not very informative, so we should change them
colnames(my_data) # Look at the column names
## [1] "FirstCol"  "SecondCol" "ThirdCol"
# Re-assign the column names
colnames(my_data) <- c("City","Age","Sex")

# Let us select a specific column, in this case Age
age_vector <- my_data$Age
print(age_vector)
## [1] 21 24 25 33 20 18 19 20 20
# Let's add a row of data
new_row <- list(as.factor("Toronto"),20,"F")
data_with_new_row <- rbind.data.frame(my_data, new_row)
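
After adding a row, it is worth checking that the data frame looks the way you expect. The following sketch uses a smaller, hypothetical data frame called people (not the my_data object above) and appends the new row as a one-row data frame with matching column names, which is one alternative to appending a list.

```r
people <- data.frame(
  City = c("Toronto", "Montreal", "Halifax"),
  Age  = c(21, 25, 33),
  Sex  = c("M", "M", "F")
)

# Append one observation with the same column names
people <- rbind(people, data.frame(City = "Toronto", Age = 20, Sex = "F"))

nrow(people)                          # 4 rows after the addition
str(people)                           # column types at a glance

# Select rows by condition and cells by [row, column]
older <- people[people$Age > 20, ]    # rows where Age is above 20
second_city <- people[2, "City"]      # Montreal
average_age <- mean(people$Age)       # 24.75
```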

Exercises

Exercise 1: Create a data frame from the vectors provided. Name the columns “age”, “sex”, and “location”.

v1 <- c("20","27","26","24","27","18","22","23","24","29")
v2 <- c("M","M","F","F","M","M","F","F","M","F")
v3 <- c("Ottawa","Ottawa","Montreal","Vancouver","Toronto","Halifax","Toronto","Calgary","Montreal","Toronto")

Exercise 2: Convert the columns for age, sex, and location to numeric, factor, and factor, respectively.

Exercise 3: Half of the individuals have been assigned a treatment, and the other half a placebo. Add a column of logical values called “treatment” using cbind or cbind.data.frame.

v4 <- c(rep(TRUE,5), rep(FALSE,5))

Exercise 4: Add an observation to your data. (Optional)