Colorado State University
Module 1: The Basics
1. Installing R
Follow the instructions here to download R and RStudio. R is a statistical programming language, and RStudio is a user interface that allows you to interact with the R language easily. If you already have R and RStudio installed on your computer, I recommend uninstalling both and doing a fresh install of the newest versions.
2. Getting Started
To get started in R, you need to give it some commands. Visit this link for explicit instructions on how to interact with the RStudio environment. You can either type commands into the command line in the console or within a script. You should always use a script so you can save your work. To create a new script, go to File > New File > R Script. Make sure to save your work as you go.
To run your commands, press Enter if you use the command line, or Ctrl+Enter if you use a script (Command+Enter on Mac). To run all the lines in a script at once, select everything with Ctrl+A and then press Ctrl+Enter.
For the remainder of this document, code that can be copied and pasted into a script will appear in a light grey box. The output from running the code will appear with ## in front.
3. Giving R Basic Commands
Some simple commands you can give R are mathematical operations. The operators R uses for basic arithmetic are addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (^):
1 + 1
4 - 2
2 * 3
12 / 4
2^2
R follows the standard order of operations:
20 - 2*(3 + 1)^2
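To see where that result comes from, it can help to evaluate the expression one step at a time. The sketch below simply breaks the line above into its parts; the text after each # is a note for the reader and is ignored by R.

```r
# Parentheses are evaluated first
(3 + 1)            # 4
# Then exponents
4^2                # 16
# Then multiplication
2 * 16             # 32
# Then subtraction
20 - 32            # -12
# The full expression gives the same answer
20 - 2*(3 + 1)^2   # -12
```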
R will return an error message if you type a command it doesn’t recognize.
3 % 5
4. Creating Objects
In most instances, you will want to store some of the data you have created. R lets you save data by storing it as an R object. An object is simply a name you assign to some data so you can reference it later. To create an object, choose a name and use <- to assign some data to it.
my_first_object <- 2
my_first_object
It is worth noting that you can also use = to assign objects.
my_first_object = 2
my_first_object
However, the general preference in the R community is to use <- instead of = for assigning objects. The reason will become clearer later: the = sign is also used to define arguments within functions, so keeping <- for assigning objects and = for defining arguments makes code easier to follow.
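As a small illustration of that distinction, the sketch below uses the built-in round() function. Inside the parentheses, = sets the arguments x and digits; outside, <- stores the result. The object name rounded_value is just an example.

```r
# <- assigns the result to an object; = defines arguments inside the function call
rounded_value <- round(x = 3.14159, digits = 2)
rounded_value
## [1] 3.14
```

Here x and digits are argument names that belong to round(); they are not saved as objects.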
Once you create an object, it will appear in the Environment pane in RStudio, and you can begin referencing it in later commands.
20 - 2*(3 + 1)^my_first_object
my_first_object + my_first_object^(3)
Be careful: objects can be overwritten.
my_first_object <- 2
my_first_object <- 20 - 2*(3 + 1)^my_first_object
my_first_object
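If you want to keep the original value, one option is to store the result under a new name instead. A minimal sketch, with object names chosen only for illustration:

```r
starting_value <- 2
# Store the result in a new object so starting_value is not overwritten
calculated_value <- 20 - 2*(3 + 1)^starting_value
starting_value
## [1] 2
calculated_value
## [1] -12
```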
You can use the ls() function to see the names of all the objects in the environment.
my_second_object <- my_first_object + my_first_object^(3)
my_second_object
ls()
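A related base R function is rm(), which removes an object from the environment. The short sketch below uses a throwaway object name and the %in% operator, which checks whether a value appears in a list of values, to look at ls() before and after removing the object.

```r
temporary_object <- 5
# The new object's name now appears in the output of ls()
"temporary_object" %in% ls()
## [1] TRUE

# rm() deletes the object from the environment
rm(temporary_object)
"temporary_object" %in% ls()
## [1] FALSE
```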
You can name an object in R almost anything you want, but there are a few rules. First, a name cannot start with a number. Second, a name cannot use some special symbols, like ^, !, $, @, +, -, /, or *. As best you can, try to make the names of your objects meaningful. The definition of meaningful is open to interpretation, but the overall goal is to make your code easier to follow. I prefer to name objects using the snake_case style (https://en.wikipedia.org/wiki/Snake_case), which is supposedly easier to read. What counts as a meaningful name will become clearer in later exercises.
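One way to check whether a name follows these rules is the base R function make.names(), which takes a piece of text and rewrites it into a valid name. If the text comes back unchanged, the name was already valid. The example names below are made up for illustration.

```r
# A valid name is returned unchanged
make.names("my_first_object")
## [1] "my_first_object"

# A name starting with a number gets an X added to the front
make.names("2nd_object")
## [1] "X2nd_object"

# A special symbol such as - is replaced with a period
make.names("my-object")
## [1] "my.object"
```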
Speaking of making code easy to follow, make sure you always…
5. Comment your work!
Commenting your code is important for two main reasons. First, as a scientist, you should strive to make your work reproducible. Annotating your code makes it easier for other people to understand your process. Second, you will inevitably write a lot of code, get busy, move on to other things, and reopen your code months later with little to no memory of what you did. Spending the extra 5 minutes to annotate your work will save you a headache down the line.
To comment your code, simply put a hash character, #, in front of your annotations. R treats # in a special way: it will not run anything that follows # on a line.
# First, I create an object
my_third_object <- 2
# Then, I double it
my_fourth_object <- my_third_object * 2 # comments can also go here
While this example is trivial, comments will become incredibly helpful once you start writing more complex code.
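Because R skips everything after #, comments are also a handy way to switch a line of code off temporarily without deleting it. In the sketch below (the object name is chosen only for illustration), the second line never runs, so the value stays the same.

```r
running_total <- 10
# running_total <- running_total + 5
running_total
## [1] 10
```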
6. Style Guide
Here is an example of two pieces of code that perform the same task, where one is written well and the other is written poorly:
# Good
roll <- function() {
  die <- 1:6
  dice <- sample(die, size = 2, replace = TRUE)
  return(sum(dice))
}
roll()
# Bad
roll<- function() {die<-1:6
dice<-sample(die, size = 2, replace = TRUE)
return(sum(dice))}
roll()
The good example follows conventional coding style; the bad example is how I used to code when I first started programming. The poorly written code may not seem overly burdensome to read in this short example, but there will be instances when you need to write or read hundreds of lines of code. It will be burdensome then.
I highly recommend reviewing https://style.tidyverse.org/syntax.html for a nice overview of common coding conventions and style. I cannot stress enough the importance of getting into the habit of writing clean code in the early stages of learning. It may feel onerous at first, but I guarantee it will pay off in the long run.
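As a quick illustration of a few of those conventions, the sketch below writes the same calculation twice. It is only an illustrative example with made-up object names, not one taken from the guide. Both versions run and give the same result, but the second uses snake_case names, spaces around operators, and a space after each comma.

```r
# Harder to read: no spaces around operators or after commas
dieRolls<-c(3,5,4)
avgRoll<-sum(dieRolls)/length(dieRolls)

# Easier to read: snake_case names and consistent spacing
die_rolls <- c(3, 5, 4)
average_roll <- sum(die_rolls) / length(die_rolls)

average_roll
## [1] 4
```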