\(\mathcal{R}\) is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the \(\mathcal{S}\) language and environment developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. \(\mathcal{R}\) can be considered a different implementation of \(\mathcal{S}\). There are some important differences, but much code written for \(\mathcal{S}\) runs unaltered under \(\mathcal{R}\).
\(\mathcal{R}\) provides a wide variety of statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The \(\mathcal{S}\) language is often the vehicle of choice for research in statistical methodology, and \(\mathcal{R}\) provides an Open Source route to participation in that activity. One of \(\mathcal{R}\)’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.
\(\mathcal{R}\)-Studio is a free and open source Integrated Development Environment (IDE) for \(\mathcal{R}\). It is available in two editions: \(\mathcal{R}\)-Studio Desktop, where the program is run locally as a regular desktop application; and \(\mathcal{R}\)-Studio Server, which allows accessing \(\mathcal{R}\)-Studio using a web browser while it is running on a remote Linux server. Prepackaged distributions of \(\mathcal{R}\)-Studio Desktop are available for Microsoft Windows, Mac OS X, and Linux. \(\mathcal{R}\)-Studio is written in the C++ programming language and uses the Qt framework for its graphical user interface.
Work on \(\mathcal{R}\)-Studio started around December 2010, and the first public beta version (v0.92) was officially announced in February 2011.
To download \(\mathcal{R}\), go to http://cran.stat.ucla.edu, where there are download links for Linux, Mac OS X, and Windows. To download \(\mathcal{R}\)-Studio, go to http://www.rstudio.com/products/RStudio, where both the Desktop (recommended) and Server versions are available.
Now, install \(\mathcal{R}\) and \(\mathcal{R}\)-Studio on your laptops! Please go to the links above and follow the instructions for downloading and installing \(\mathcal{R}\) and \(\mathcal{R}\)-Studio.
Now open \(\mathcal{R}\)! Once \(\mathcal{R}\) has started, a console is waiting for input. You can enter commands one at a time at the command prompt (>) or run a set of commands from a source file. Type the following simple calculations and commands, and see what the results are:
print('Hello, world')
2+2
print(2+2)
print('2+2')
Now, type the command below:
print(Hello, world!)
Did you get any error message? Why does that happen?
Now, in the File menu, choose New script. Then redo what you did in the Console window. In the script window, you can run commands either with Ctrl + R (Cmd + Return on a Mac) or by right-clicking and choosing Run line or selection.
Can you tell the difference between running a command in the \(\mathcal{R}\) console and in the \(\mathcal{R}\)-Editor (\(\mathcal{R}\) Script)?
Now open \(\mathcal{R}\)-Studio and repeat what you did in \(\mathcal{R}\). (You can use either \(\mathcal{R}\)-Studio or \(\mathcal{R}\) for the remaining part of this workshop, based on your preference. For now, though, we want to get familiar with the interface of both programs! To benefit from embedding \(\mathcal{R}\) in \(\LaTeX\) or HTML, \(\mathcal{R}\)-Studio is preferred.)
Results of calculations can be stored in objects using the assignment operators:
An arrow (<-) formed by a less-than character and a hyphen, with no space between them!
The equal character (=).
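These objects can then be used in other calculations. To print an object, just enter its name. There are some restrictions when giving an object a name: a name may contain letters, digits, dots, and underscores, but it cannot start with a digit or an underscore, and names are case sensitive (x and X are two different objects).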
x=1
print(x)
x<-1
print(x)
x<-2
print(x)
X=2
x+X
xX=x+X
print(xX)
y='a'
print(y)
w="a"
print(w)
z<-'Which one do you prefer? R or R-Studio'
print(z)
How can you drop/delete/remove an object? Try the \({\tt rm()}\) command!
rm(x)
print(x)
rm(z)
print(z)
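Rather than waiting for print() to fail, one way to check whether an object is still in the workspace is \({\tt exists()}\), and \({\tt ls()}\) lists what remains. A small sketch, using an object name (tmp_obj) made up for this illustration:

```r
tmp_obj <- 10                      # 'tmp_obj' is a made-up name for this illustration
was_there <- exists("tmp_obj")     # TRUE: the object is in the workspace
rm(tmp_obj)                        # remove it
still_there <- exists("tmp_obj")   # FALSE: rm() removed it
print(c(was_there, still_there))
ls()                               # names of the objects that remain
```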
Now, create two small vectors with data. The following applies the function \({\tt c()}\) to combine three values into a vector.
V1=c(1,2,3)
V2=c(1,2,'a')
print(V1)
print(V2)
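Notice the quotation marks when V2 is printed. A vector holds values of a single type, so mixing numbers with a character string turns every element into a character string. A short sketch with \({\tt class()}\) makes this visible:

```r
V1 <- c(1, 2, 3)
V2 <- c(1, 2, 'a')
class(V1)   # "numeric"
class(V2)   # "character": 1 and 2 were converted to the strings "1" and "2"
```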
Then, make one vector which includes \(1,2,3,4,\) and \(5\). Store this vector, and name it \(x1\).
c(1,2,3,4,5)
x1<-c(1,2,3,4,5)
Now create a second vector with the colon operator, which generates a sequence of consecutive integers, and name it \(x2\).
x2<-c(6:10)
print(x1)
print(x2)
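The colon operator already returns a vector, so wrapping it in \({\tt c()}\) is optional, and \({\tt seq()}\) produces the same sequence with an explicit step. A minimal sketch (x3 is a name chosen only for this example):

```r
x2 <- 6:10                             # same values as c(6:10)
x3 <- seq(from = 6, to = 10, by = 1)   # 'x3' is a name chosen for this sketch
length(x2)                             # 5
all(x2 == x3)                          # TRUE
```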
All text after the pound sign “#” within the same line is considered a comment.
\(\mathcal{R}\) provides extensive documentation. For example, entering \({\tt ?c}\) or \({\tt help(c)}\) at the prompt gives the documentation of the function \({\tt c}\) in \(\mathcal{R}\). Please give it a try.
?c
help(c)
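Vectors can be combined via the function \({\tt c}\). For example, the following two vectors \(n\) and \(s\) are combined into a new vector containing elements from both vectors. Because \(s\) holds character strings, the numbers in \(n\) are converted to character strings in the result.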
n = c(2, 3, 5)
s = c("aa", "bb", "cc", "dd", "ee")
c(n, s)
Arithmetic operations on vectors are performed member-by-member, i.e., member-wise.
For example, suppose we have two vectors a and b.
a = c(1, 3, 5, 7)
b = c(1, 2, 4, 8)
Then, if we multiply a by 5, we would get a vector with each of its members multiplied by 5.
5 * a
And if we add \(a\) and \(b\) together, the sum would be a vector whose members are the sum of the corresponding members from \(a\) and \(b\).
a + b
Similarly, for subtraction, multiplication, and division, we get new vectors via member-wise operations.
a - b
a * b
a / b
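For reference, the block below repeats the operations above with the values they should return written as comments, so you can compare them with your own console output:

```r
a <- c(1, 3, 5, 7)
b <- c(1, 2, 4, 8)
5 * a   # 5 15 25 35
a + b   # 2 5 9 15
a - b   # 0 1 1 -1
a * b   # 1 6 20 56
a / b   # 1.000 1.500 1.250 0.875
```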
Recycling Rule: If two vectors are of unequal length, the shorter one will be recycled in order to match the longer vector. For example, the following vectors \(u\) and \(v\) have different lengths, and their sum is computed by recycling values of the shorter vector \(u\).
u = c(10, 20, 30)
v = c(1, 2, 3, 4, 5, 6, 7, 8, 9)
u + v
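To see what recycling does, the sketch below repeats \(u\) by hand with \({\tt rep()}\) until it is as long as \(v\) and checks that the sum is the same (u_recycled is a name made up for this illustration):

```r
u <- c(10, 20, 30)
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
u_recycled <- rep(u, times = 3)   # 10 20 30 10 20 30 10 20 30 ('u_recycled' is a made-up name)
u + v                             # 11 22 33 14 25 36 17 28 39
all(u + v == u_recycled + v)      # TRUE
```

If the length of the longer vector is not a multiple of the length of the shorter one, \(\mathcal{R}\) still recycles but issues a warning.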
There are various ways to construct a matrix. When we construct a matrix directly with data elements, the matrix content is filled along the column orientation by default. For example, in the following code snippet, the content of \(B\) is filled along the columns consecutively.
B = matrix(
c(2, 4, 3, 1, 5, 7),
nrow=3,
ncol=2)
B # B has 3 rows and 2 columns
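If you would rather fill the matrix row by row, \({\tt matrix()}\) accepts the argument byrow=TRUE. The sketch below compares the two fill orders on the same data (B_byrow is a name chosen only for this example):

```r
B <- matrix(c(2, 4, 3, 1, 5, 7), nrow = 3, ncol = 2)                        # filled by column (default)
B_byrow <- matrix(c(2, 4, 3, 1, 5, 7), nrow = 3, ncol = 2, byrow = TRUE)    # filled by row
B[2, 1]         # 4: second element of the first column
B_byrow[2, 1]   # 3: first element of the second row
dim(B_byrow)    # 3 2
```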
Transpose: We construct the transpose of a matrix by interchanging its columns and rows with the function \({\tt t()}\).
t(B) # transpose of B
Bt<-t(B) # store the transpose under a new name, so that B keeps its 3 rows
Combining Matrices: The columns of two matrices having the same number of rows can be combined into a larger matrix. For example, suppose we have another matrix \(C\) also with 3 rows.
C = matrix(
c(7, 4, 2),
nrow=3,
ncol=1)
C # C has 3 rows
Then we can combine the columns of \(B\) and \(C\) with \({\tt cbind()}\).
cbind(B, C)
Similarly, we can combine the rows of two matrices if they have the same number of columns with the \({\tt rbind()}\) function.
D = matrix(
c(6, 2),
nrow=1,
ncol=2)
D # D has 2 columns
rbind(B, D)
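A quick way to confirm that the shapes line up is \({\tt dim()}\). The sketch below assumes \(B\) is the original 3-by-2 matrix (not its transpose) and also shows that combining matrices with mismatched shapes is an error:

```r
B <- matrix(c(2, 4, 3, 1, 5, 7), nrow = 3, ncol = 2)
C <- matrix(c(7, 4, 2), nrow = 3, ncol = 1)
D <- matrix(c(6, 2), nrow = 1, ncol = 2)
dim(cbind(B, C))   # 3 3
dim(rbind(B, D))   # 4 2
# t(B) has 2 rows but C has 3, so cbind() fails; try() lets the script continue
bad <- try(cbind(t(B), C), silent = TRUE)
inherits(bad, "try-error")   # TRUE
```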
Importing data into \(\mathcal{R}\) is fairly simple. For Stata, use the \({\tt foreign}\) package. For SPSS and SAS, I would recommend the Hmisc package for ease and functionality. See the Quick-\(\mathcal{R}\) section on these packages for information on obtaining and installing them. Before working with some examples of importing data, we need to learn about \(\mathcal{R}\)-packages and working directories.
\(\mathcal{R}\)-packages are collections of reproducible and reusable \(\mathcal{R}\) code written, tested, and confirmed by the \(\mathcal{R}\) community. To use an \(\mathcal{R}\)-package, you first need to download and install it. Suppose we want to install the foreign package: type the following command and run it to see the results:
install.packages("foreign")
You can also install multiple packages with one command. The following command installs both the foreign and ggplot2 packages.
install.packages(c("foreign", "ggplot2"))
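Running install.packages() again downloads a package even when it is already installed. One possible way to avoid that is to check first with \({\tt requireNamespace()}\). The helper below, install_if_missing, is a hypothetical function written for this sketch and is not part of \(\mathcal{R}\):

```r
# 'install_if_missing' is a hypothetical helper written for this sketch, not a built-in function.
install_if_missing <- function(pkgs) {
  for (pkg in pkgs) {
    # requireNamespace() returns TRUE when the package is installed
    # (it also loads the package's namespace as a side effect)
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg)
    }
  }
  invisible(NULL)
}

install_if_missing(c("stats", "utils"))   # both ship with R, so nothing is downloaded
```

Calling install_if_missing(c("foreign", "ggplot2")) would then download only the packages that are not yet in your library.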
Take-Home: You can also install \(\mathcal{R}\)-packages through the menu. How?
You need to install an \(\mathcal{R}\)-package only once, but you need to load it every time you open \(\mathcal{R}\)/\(\mathcal{R}\)-Studio. To load installed packages from your \(\mathcal{R}\)-library, you can use the following commands:
library(foreign)
library(ggplot2)
Take-Home Exercise: How can we load multiple packages at once? Hint: One way is using a for-loop.
There are two ways to set your working directory:
1. Through the menu
* In Windows: go to the File menu, select Change Working Directory, and select the appropriate folder/directory
* On a Mac: go to the Misc menu, select Change Working Directory, and select the appropriate folder/directory
2. Through the console, with the \({\tt setwd()}\) function:
setwd("...")
where “…” is the path to the folder, e.g., in Windows:
setwd("C:/Users/User Name/Documents/FOLDER")
in Macs:
setwd("/Users/User Name/Documents/FOLDER")
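To check where you currently are, \({\tt getwd()}\) prints the working directory and \({\tt list.files()}\) shows what it contains. The sketch below uses the temporary folder returned by tempdir() only because that folder is sure to exist on any machine; replace it with your own path:

```r
old_wd <- getwd()    # remember where we started
setwd(tempdir())     # tempdir() stands in for your own folder in this sketch
getwd()              # now prints the temporary folder
list.files()         # files in the current working directory
setwd(old_wd)        # go back to the original folder
```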
Download NAVCO 2.0 from my GitHub page: NAVCO 2.0
setwd("C:\\Users\\Babak-Lenovo2017\\Dropbox\\SabanciUniv\\POLS 537\\Handouts")
Set your working directory accordingly, and load the CSV file you downloaded as follows:
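A minimal sketch of loading a CSV file with \({\tt read.csv()}\). The name of the downloaded file is not given here, so the example writes a small stand-in file to a temporary folder and reads it back; the file name, the column names, and the object name mydata are all invented for this illustration:

```r
# Stand-in data: the file name and the columns are invented for this example.
csv_path <- file.path(tempdir(), "example_data.csv")
write.csv(data.frame(id = 1:3, year = c(2001, 2002, 2003)),
          file = csv_path, row.names = FALSE)

mydata <- read.csv(csv_path)   # for a file in the working directory: read.csv("your_file.csv")
head(mydata)                   # first rows of the data
str(mydata)                    # column names and types
nrow(mydata)                   # 3
```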
Copyright 2019. This is an in-progress project, please do not cite or reproduce without my permission.