R Programming
Data Types
Basic Data Types
R has five basic or “atomic” classes of objects:
* character
* numeric (real numbers)
  * Numbers in R are treated as numeric objects by default (double-precision real numbers).
  * Inf represents infinity.
  * NA represents a missing value and NaN represents an undefined numeric result ("not a number"). is.na() and is.nan() are used to test objects for them.
  * NA values have a class also, so there are integer NA, character NA, etc.
  * A NaN value is also NA, but the converse is not true.
  * Attributes of an object, such as names, dimensions, class, and other metadata, can be accessed using the attributes() function.
* integer
  * 1 is a numeric object. 1L is an integer.
* complex
* logical (TRUE/FALSE)
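The atomic classes and special values described above can be checked interactively. This is a minimal sketch using only base R; the literal values are arbitrary illustrations.

```r
# Classes of the five atomic types
class("a")       # "character"
class(1)         # "numeric"
class(1L)        # "integer"
class(1 + 2i)    # "complex"
class(TRUE)      # "logical"

# Special values
1 / 0            # Inf
0 / 0            # NaN
is.na(NaN)       # TRUE: a NaN also counts as NA
is.nan(NA)       # FALSE: the converse does not hold

# NA values carry a class too
class(NA_integer_)    # "integer"
class(NA_character_)  # "character"
```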
Complex Data Types
Vectors
The most basic object is a vector.
* A vector can only contain objects of the same class.
* BUT: The one exception is a list, which is represented as a vector but can contain objects of
different classes (indeed, that’s usually why we use them)
* Empty vectors can be created with the vector() function.
* Vector examples
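A few ways of building vectors are sketched here. The variable names and values are illustrative only and are not taken from the original example.

```r
x <- c(0.5, 0.6)         # numeric
y <- c(TRUE, FALSE)      # logical
z <- c("a", "b", "c")    # character
w <- 9:29                # integer sequence
v <- c(1 + 0i, 2 + 4i)   # complex

# vector() creates a vector of a given type, filled with that type's default value
e <- vector("numeric", length = 10)   # ten zeros
```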

 Lists are a special type of vector that can contain elements of different classes.
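As a small illustration (values chosen arbitrarily), a list can hold one element of each class, and double brackets extract a single element:

```r
x <- list(1, "a", TRUE, 1 + 4i)

sapply(x, class)   # "numeric" "character" "logical" "complex"
x[[2]]             # "a"
```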

 Mixing Objects
When different objects are mixed in a vector, coercion occurs so that every element in the vector is of the same class.
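Implicit coercion can be seen by mixing classes inside c(). In this sketch (illustrative values) each result ends up as the most general class involved:

```r
y1 <- c(1.7, "a")    # character: "1.7" "a"
y2 <- c(TRUE, 2)     # numeric: 1 2
y3 <- c("a", TRUE)   # character: "a" "TRUE"
```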

 Explicit Coercion
Objects can be explicitly coerced from one class to another using the as.*() functions, if available.
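A short sketch of explicit coercion on an integer vector (the vector itself is an arbitrary example):

```r
x <- 0:6
class(x)          # "integer"

as.numeric(x)     # 0 1 2 3 4 5 6
as.logical(x)     # FALSE TRUE TRUE TRUE TRUE TRUE TRUE (zero is FALSE, non-zero is TRUE)
as.character(x)   # "0" "1" "2" "3" "4" "5" "6"
```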

 Nonsensical coercion results in NAs
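For instance, letters have no numeric or logical interpretation, so coercing them yields NA. The sketch below wraps the numeric case in suppressWarnings() only to keep the output quiet; without it R warns that NAs were introduced by coercion.

```r
x <- c("a", "b", "c")

n <- suppressWarnings(as.numeric(x))   # NA NA NA
l <- as.logical(x)                     # NA NA NA
```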

Matrices
Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector of length 2 (nrow, ncol).
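The sketch below (illustrative values) builds a matrix with matrix() and also by attaching a dimension attribute to a plain vector:

```r
m <- matrix(1:6, nrow = 2, ncol = 3)   # filled column by column
dim(m)          # 2 3
attributes(m)   # $dim is 2 3
m[1, 2]         # 3

# Adding a dim attribute turns an ordinary vector into a matrix
v <- 1:10
dim(v) <- c(2, 5)
```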

 cbinding and rbinding
Matrices can be created by column-binding or row-binding with cbind() and rbind().
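A minimal sketch with two illustrative vectors: cbind() uses them as columns and rbind() uses them as rows.

```r
x <- 1:3
y <- 10:12

cb <- cbind(x, y)   # 3 rows, 2 columns named x and y
rb <- rbind(x, y)   # 2 rows named x and y, 3 columns
```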

Factors
* Factors are enumeration data types used to represent categorical data.
* They can be ordered or unordered.
* A factor can be thought of as an integer vector where each integer has a label.
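The "integer vector with labels" view can be made visible with unclass(). This sketch uses made-up yes/no data:

```r
f <- factor(c("yes", "yes", "no", "yes", "no"))
levels(f)    # "no" "yes" (alphabetical by default)
table(f)     # no: 2, yes: 3
unclass(f)   # integer codes 2 2 1 2 1, plus a "levels" attribute

# The levels argument sets the level order explicitly
g <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"))
levels(g)    # "yes" "no"
```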

Data Frames
* A special type of list in which every element (column) has the same length.
* All columns in a data frame must have names.
* Unlike matrices, data frames can store different classes of objects in each column.
* To create one from a file: read.table() or read.csv()
* To convert to a matrix: data.matrix()
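A small sketch of a data frame with columns of different classes and its conversion to a matrix. The column names foo and bar are placeholders.

```r
df <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))

nrow(df)            # 4
ncol(df)            # 2
sapply(df, class)   # foo is "integer", bar is "logical"

# data.matrix() forces every column to a number; TRUE/FALSE become 1/0
m <- data.matrix(df)
```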

* data.frame() # create a data frame or table, e.g., test.data.frame <- data.frame(id=c(1,2,3,4,5), name=c("a","b","c","d","e"))
* edit(framename) # edit the content of the table
* str(frame) # prints the structure and data types of the data frame
* names(framename) # prints the column names
* dataframe[index.ROW, index.COLUMN] # selects by row and column index
* dataframe[[1]] # returns the first column as a vector
* dataframe[1] # returns the first column wrapped in a data frame
* dataframe[c(1,3)] # returns the 1st and 3rd columns wrapped in a data frame
* dataframe[1:3,] # displays all columns but rows 1 to 3 only
* is.data.frame(framename) # checks if an object is a data frame
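The indexing forms listed above can be tried on the test.data.frame example. In this sketch the result variable names are illustrative.

```r
test.data.frame <- data.frame(id = c(1, 2, 3, 4, 5),
                              name = c("a", "b", "c", "d", "e"))

first_col_vec <- test.data.frame[[1]]      # plain numeric vector 1 2 3 4 5
first_col_df  <- test.data.frame[1]        # one-column data frame
top_rows      <- test.data.frame[1:3, ]    # rows 1 to 3, all columns
cell          <- test.data.frame[2, "name"]  # the name stored in row 2

names(test.data.frame)          # "id" "name"
is.data.frame(first_col_df)     # TRUE
```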
Basic Statistics Functions
* mean(x) # x is a vector
* median(x)
* sd(x)
* var(x)
* cor(x,y)
* cov(x,y)
* lapply(dataframe, function) # apply a function like mean/median over a list/data frame
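A worked sketch of these functions on small made-up vectors, where y is an exact linear function of x so the correlation is 1:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
y <- 2 * x + 1

mean(x)      # 5
median(x)    # 4.5
var(x)       # 32/7, about 4.57 (sample variance, divides by n - 1)
sd(x)        # square root of var(x)
cor(x, y)    # 1, up to floating-point rounding
cov(x, y)    # 2 * var(x)

# lapply() applies a function to each column and returns a list
col_means <- lapply(data.frame(x = x, y = y), mean)   # x: 5, y: 11
```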
Graphs
* plot(x,y) # Scatter plot. Only numeric vectors or data frames are allowed.
* barplot() # Bar chart
* boxplot() # Box plot. Provides a quick visual summary of a dataset. Thick line in middle is the median. Box identifies the 1st (bottom) and 3rd (top) quartiles.
* hist(x) # Histogram. Groups data into bins
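A minimal sketch of the four plot types on simulated data. The seed and sample sizes are arbitrary choices; when run non-interactively, R writes the plots to its default graphics device.

```r
set.seed(1)                      # arbitrary seed so the random data is repeatable
x <- rnorm(100)
y <- x + rnorm(100, sd = 0.5)

plot(x, y)                                  # scatter plot
barplot(table(c("a", "b", "b", "c")))       # bar chart of category counts
b <- boxplot(x)                             # box plot; summary statistics are returned invisibly
h <- hist(x)                                # histogram; bin breaks and counts are returned invisibly

sum(h$counts)    # 100: every observation falls into exactly one bin
```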
Appendix
Command Reference
* ls() or ls(all.names = TRUE) # lists all variables/objects defined in the session
* setwd("c:/xyz") # sets working directory
* getwd() # gets working directory
* runif(8) # generates 8 random numbers
* x <- 9 # assigns 9 to object x in workspace
* x # prints the value of x
* rm(x) # removes the object x
* rm(list=ls()) # removes all objects in workspace

Save & Load (Binary)
* save.image() # saves all objects to default file .RData. Objects still exist in memory
* save(obj1, obj2, file="filename") # saves the named objects (binary format)
* load("filename") # loads from file to memory

Save & Load (Text)
* write.table(obj1, file="filename") # only 1 obj at a time
* read.table("filename") # reads the table back from the text file
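A round trip through both formats can be sketched as follows. The object names are illustrative, and tempfile() is used here only so the example does not write into the working directory.

```r
# Binary: several objects in one file
obj1 <- c(1, 2, 3)
obj2 <- "hello"
bin_file <- tempfile(fileext = ".RData")
save(obj1, obj2, file = bin_file)
rm(obj1, obj2)
load(bin_file)        # obj1 and obj2 are restored under their original names

# Text: one table per file
tbl <- data.frame(id = 1:3, name = c("a", "b", "c"))
txt_file <- tempfile(fileext = ".txt")
write.table(tbl, file = txt_file)
tbl2 <- read.table(txt_file, header = TRUE)
```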
Packages
* install.packages(c("ggplot2", "devtools", "KernSmooth")) # install the collection of packages from CRAN
* library() # list all available packages
* library(package) # loads package into memory
* require(package) # loads package into memory. Used in scripts. Returns loading status as a logical value.
* detach(package:name) # unloads package from memory.
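Because require() returns a logical instead of stopping with an error, a script can branch on the result. A minimal sketch; the missing package name below is hypothetical and chosen so that it should not exist.

```r
ok <- require(stats)     # TRUE: stats ships with R

missing_pkg <- "notARealPackageXyz"   # hypothetical name, assumed not installed
found <- suppressWarnings(
  require(missing_pkg, character.only = TRUE, quietly = TRUE)
)

if (!found) {
  message("Package ", missing_pkg, " is not installed; skipping the code that needs it.")
}
```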
Help
* ?func # open help page on function 'func'
* help(func) # same as above
* apropos("foo") # list all functions containing string foo
* example(foo) # show an example of function foo
* vignette() # show available vignettes on installed packages
* vignette("foo") # show specific vignette
Bibliography
* R Inferno
* Software for Data Analysis: Programming with R (http://www.springer.com/statistics/computational+statistics/book/9780387759357)
* The R book (http://www.wiley.com/WileyCDA/WileyTitle/productCd0470973927.html)
* The Art of R Programming
* R in Action
* Ref Cards
  * http://cran.rproject.org/doc/contrib/Shortrefcard.pdf
  * http://www.statmethods.net/interface/help.html
* Tutorials
  * http://www.johndcook.com/R_language_for_programmers.html
  * http://cran.rproject.org/doc/manuals/Rintro.pdf
  * http://www.decisionsciencenews.com/?p=261
  * http://www.rtutor.com/rintroduction/dataframe
  * http://tryr.codeschool.com/levels/2/challenges/1