A Community Site for R – Sponsored by Revolution Analytics

The Workspace

The workspace is your current R working environment and includes any user-defined objects (vectors, matrices, data frames, lists, functions). At the end of an R session, the user can save an image of the current workspace that is automatically reloaded the next time R is started. Commands are entered interactively at the R user prompt. Up and down arrow keys scroll through your command history. You will probably want to keep different projects in different physical directories.

IMPORTANT NOTE FOR WINDOWS USERS: R misreads a path written in your code as c:\mydocuments\myfile.txt, because it treats "\" inside a string as an escape character. Write the path with forward slashes (or doubled backslashes) instead.

getwd() # print the current working directory - cwd
ls() # list the objects in the current workspace
setwd(mydirectory) # change to mydirectory
setwd("c:/docs/mydir") # note / instead of \ in windows
setwd("/usr/rob/mydir") # on linux
# save your command history
savehistory(file="myfile") # default is ".Rhistory"
q() # quit R
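The Windows path note can be checked with a minimal sketch like the one below. The directory names are placeholders taken from the example above, and nothing here touches the file system; it only shows that the forward-slash and doubled-backslash spellings describe the same path.

```r
# Inside an R string a single backslash starts an escape sequence,
# so a Windows path needs forward slashes or doubled backslashes.
p_forward <- "c:/docs/mydir"     # forward slashes work on Windows too
p_escaped <- "c:\\docs\\mydir"   # each \\ is stored as one backslash

# The doubled backslashes are not stored twice: both strings have 13 characters
nchar(p_forward)
nchar(p_escaped)

# Convert backslashes to forward slashes
p_converted <- chartr("\\", "/", p_escaped)

# file.path() joins components with "/" by default
p_built <- file.path("c:", "docs", "mydir")

identical(p_converted, p_forward)
identical(p_built, p_forward)
```

Building paths with file.path() avoids typing separators by hand, which is one way to keep the same script usable on Windows and Linux.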

Built-in Functions

Almost everything in R is done through functions. Here I'm only referring to numeric and character functions that are commonly used in creating or recoding variables.

Numeric Functions
Character Functions
Statistical Probability Functions
The following table describes functions related to probability distributions.
Other Statistical Functions
Other useful statistical functions are provided in the following table.
Other Useful Functions

Note that while the examples on this page apply functions to individual variables, many can be applied to vectors and matrices as well.
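As a minimal sketch of that last point, the block below applies a few base R functions to whole vectors and to a matrix. The particular functions and input values are chosen here for illustration only.

```r
# Numeric functions work element by element on a vector
x <- c(1, 4, 9)
roots <- sqrt(x)                        # 1 2 3
rounded <- round(c(1.234, 4.567), 1)    # 1.2 4.6

# Character functions are vectorized as well
upper <- toupper(c("a", "b"))                   # "A" "B"
parts <- substr(c("abcdef", "ghijkl"), 2, 4)    # "bcd" "hij"

# Applied to a matrix, the result keeps the matrix shape
m <- matrix(c(1, 4, 9, 16), nrow = 2)
m_roots <- sqrt(m)

# Probability functions accept a vector of quantiles
p <- pnorm(c(-1.96, 0, 1.96))   # the middle value is exactly 0.5

# Summary functions reduce a vector to a single number
avg <- mean(x)
```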

Cookbook for R

An Introduction to R

This is an introduction to R (“GNU S”), a language and environment for statistical computing and graphics. R is similar to the award-winning S system, which was developed at Bell Laboratories by John Chambers et al. This manual provides information on data types, programming elements, statistical modelling and graphics. This manual is for R, version 3.1.0 (2014-04-10). Copyright © 1990 W. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

Preface

This introduction to R is derived from an original set of notes describing the S and S-PLUS environments written in 1990–2 by Bill Venables and David M. We would like to extend warm thanks to Bill Venables (and David Smith) for granting permission to distribute this modified version of the notes in this way, and for being a supporter of R from way back. Comments and corrections are always welcome.

1.1 The R environment

"R" you ready? | My advances in R – a learner’s diary

R resources on the Web

Here are our suggestions for the best on-line resources for information about R.

The R Project homepage. Look here for official news from the R Project, plus links to documentation, mailing lists, the official R FAQs, and more.
StackOverflow.
R bloggers.
The Video Rchive.
#rstats on Twitter.
CRAN Task Views. On the other hand, if you're looking for a specific page, you can search by keyword at this interactive directory of all R packages.
Books about R.
See also: What is R?

R FAQ: How does R handle missing values?

Version info: Code for this page was tested in R Under development (unstable) (2012-02-22 r58461) On: 2012-03-28 With: knitr 0.4

Like other statistical software packages, R is capable of handling missing values. However, to those accustomed to working with missing values in other packages, the way in which R handles missing values may require a shift in thinking.

Very basics

Missing data in R appears as NA.

x1 <- c(1, 4, 3, NA, 7)
x2 <- c("a", "B", NA, "NA")

NA is one of the few non-numbers that we could include in x1 without generating an error (the other exceptions are letters representing numbers or numeric ideas like infinity). We can see that R distinguishes between the NA and "NA" in x2: NA is seen as a missing value, "NA" is not.

Differences from other packages

NA cannot be used in comparisons: In other packages, a "missing" value is assigned an extreme numeric value--either very high or very low.

x1 < 0
x1 == NA
mean(x1)
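What those three expressions return can be seen in a short sketch that reuses the x1 and x2 vectors defined above. The is.na() test and the na.rm argument are the standard base R tools for working around NA; the variable names that hold the results are introduced here for illustration.

```r
x1 <- c(1, 4, 3, NA, 7)
x2 <- c("a", "B", NA, "NA")

# Any comparison involving NA yields NA, not TRUE or FALSE
lt0 <- x1 < 0        # FALSE FALSE FALSE NA FALSE
eq_na <- x1 == NA    # NA NA NA NA NA, so == cannot find missing values

# is.na() is the way to test for missingness
missing_num <- is.na(x1)   # FALSE FALSE FALSE TRUE FALSE
missing_chr <- is.na(x2)   # FALSE FALSE TRUE FALSE: the string "NA" is not missing

# A single NA makes the mean NA unless it is dropped explicitly
m_raw <- mean(x1)                   # NA
m_clean <- mean(x1, na.rm = TRUE)   # 3.75
```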

Multiple Regression

R provides comprehensive support for multiple linear regression. The topics below are provided in order of increasing complexity.

Fitting the Model

# Multiple Linear Regression Example
fit <- lm(y ~ x1 + x2 + x3, data=mydata)
summary(fit) # show results

# Other useful functions
coefficients(fit) # model coefficients
confint(fit, level=0.95) # CIs for model parameters
fitted(fit) # predicted values
residuals(fit) # residuals
anova(fit) # anova table
vcov(fit) # covariance matrix for model parameters
influence(fit) # regression diagnostics

Diagnostic Plots

Diagnostic plots provide checks for heteroscedasticity, normality, and influential observations.

# diagnostic plots
layout(matrix(c(1,2,3,4),2,2)) # optional 4 graphs/page
plot(fit)

For a more comprehensive evaluation of model fit see regression diagnostics.

Comparing Models

You can compare nested models with the anova( ) function.

Cross Validation

You can assess R² shrinkage via K-fold cross-validation.

Variable Selection
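The model comparison and cross-validation steps can be sketched in base R as follows. This is an illustration under assumptions rather than the original page's code: it uses the built-in mtcars data in place of mydata, the predictors are arbitrary, and the K-fold loop is hand-rolled instead of taken from a package.

```r
# Two nested models on the built-in mtcars data (illustrative choice)
fit_full    <- lm(mpg ~ wt + hp + disp, data = mtcars)
fit_reduced <- lm(mpg ~ wt + hp, data = mtcars)

# F test: does disp add predictive value beyond wt and hp?
cmp <- anova(fit_reduced, fit_full)
print(cmp)

# Hand-rolled K-fold cross-validation of the reduced model
set.seed(1)                      # reproducible fold assignment
k <- 4
folds <- sample(rep(seq_len(k), length.out = nrow(mtcars)))
pred <- numeric(nrow(mtcars))
for (i in seq_len(k)) {
  held_out <- folds == i
  fit_i <- lm(mpg ~ wt + hp, data = mtcars[!held_out, ])
  # Predict the held-out rows from a model that never saw them
  pred[held_out] <- predict(fit_i, newdata = mtcars[held_out, ])
}

# R-squared on the fitted data versus on the held-out predictions
r2_fit <- summary(fit_reduced)$r.squared
r2_cv  <- cor(mtcars$mpg, pred)^2
print(c(fitted = r2_fit, cross_validated = r2_cv))
```

The gap between the two printed values gives a rough sense of shrinkage, meaning how much the in-sample R² overstates performance on new data.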

Graphical Parameters

You can customize many features of your graphs (fonts, colors, axes, titles) through graphic options. One way is to specify these options through the par( ) function. If you set parameter values here, the changes will be in effect for the rest of the session or until you change them again. The format is par(optionname=value, optionname=value, ...)

# Set a graphical parameter using par()
par() # view current settings
opar <- par(no.readonly=TRUE) # make a copy of current (settable) settings
par(col.lab="red") # red x and y labels
hist(mtcars$mpg) # create a plot with these new settings
par(opar) # restore original settings

A second way to specify graphical parameters is by providing the optionname=value pairs directly to a high level plotting function.

# Set a graphical parameter within the plotting function
hist(mtcars$mpg, col.lab="red")

See the help for a specific high level plotting function (e.g. plot, hist, boxplot) to determine which graphical parameters can be set this way.

Text and Symbol Size
Lines
Colors
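The save, change, and restore pattern with par( ) can be checked with a small sketch. Two details here are assumptions added for the illustration: pdf(NULL) opens a device that writes no file, so the code runs without a display, and no.readonly=TRUE leaves read-only parameters out of the saved copy so that restoring it does not raise warnings.

```r
pdf(NULL)   # null PDF device: plots are drawn but no file is written

# Copy only the settable parameters so they can be restored later
opar <- par(no.readonly = TRUE)

par(col.lab = "red", cex.lab = 1.5)   # red, larger axis labels
hist(mtcars$mpg)                       # drawn with the modified settings
during <- par("col.lab")               # "red"

par(opar)                              # restore the saved settings
after <- par("col.lab")                # back to the saved value

invisible(dev.off())                   # close the device
```

Setting the option inside the plotting call, as in hist(mtcars$mpg, col.lab="red"), affects only that one plot and needs no restore step.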