
Knitr: Elegant, flexible and fast dynamic report generation with R

Overview The knitr package was designed to be a transparent engine for dynamic report generation with R, solve some long-standing problems in Sweave, and combine features in other add-on packages into one package (knitr ≈ Sweave + cacheSweave + pgfSweave + weaver + animation::saveLatex + R2HTML::RweaveHTML + highlight::HighlightWeaveLatex + 0.2 * brew + 0.1 * SweaveListingUtils + more). This package is developed on GitHub; for installation instructions and FAQs, see README. Motivation One of the difficulties with extending Sweave is that we have to copy a large amount of code from the utils package (the file SweaveDrivers.R has more than 700 lines of R code), and this is what the two packages mentioned above have done. Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to humans what we want the computer to do. – Donald E.

plyr Look what I found: two amazing charts While doing some research for my statistics blog, I came across a beauty by Lane Kenworthy from almost a year ago (link) via this post by John Schmitt (link). How embarrassing is the cost effectiveness of U.S. health care spending? When a chart is executed well, no further words are necessary. I'd only add that the other countries depicted are "wealthy nations". Even more impressive is this next chart, which plots the evolution of cost effectiveness over time. Let's appreciate this beauty: Let the data speak for itself.

Short-refcard.pdf (application/pdf Object)

Say it in R with "by", "apply" and friends R is a language, as Luis Apiolaza pointed out in his recent post. This is absolutely true, and learning a programming language is not much different from learning a foreign language. It takes time and a lot of practice to become proficient in it. I started using R when I moved to the UK, and I wonder whether I have a better understanding of English or of R by now. Languages are full of surprises, in particular for non-native speakers. With languages you can get into the habit of using certain words and phrases, but sometimes you see or hear something that shakes you up again.

    f <- function(x) x^2
    sapply(1:10, f)
    [1] 1 4 9 16 25 36 49 64 81 100

It reminded me of the phrase that everything is a list in R. I remember how happy I felt when I finally understood the by function in R. Now let's find alternative ways of expressing ourselves, using other words/functions of the R language, such as aggregate, apply, sapply, tapply, data.table, ddply, sqldf, and summaryBy.
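As a rough illustration of saying the same thing with different words, the sketch below computes one grouped mean four ways in base R. The choice of the built-in iris data and the variable names are assumptions made for this example, not taken from the original post; data.table, ddply, sqldf and summaryBy need add-on packages and are left out.

```r
# Mean sepal length per species in the built-in iris data, four ways.
# (iris and the *_res names are illustrative assumptions.)

# by(): apply a function to subsets defined by a factor
by_res <- by(iris$Sepal.Length, iris$Species, mean)

# aggregate(): formula interface, returns a data frame
agg_res <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)

# tapply(): returns an array named by the grouping levels
tap_res <- tapply(iris$Sepal.Length, iris$Species, mean)

# split() + sapply(): split into a list, then apply over its elements
sap_res <- sapply(split(iris$Sepal.Length, iris$Species), mean)

print(agg_res)
```

All four calls give the same three numbers; they differ mainly in the shape of the returned object (a "by" object, a data frame, a named array, a named vector).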

R Time Series Tutorial The data sets used in this tutorial are available in astsa, the R package for the text. A detailed tutorial (and more!) is available in Appendix R of the text. You can copy-and-paste the R commands (multiple lines are ok) from this page into R. This quick fix is meant for people who are just starting to use R for time series analysis. If you're new to R/Splus, I suggest reading R for Beginners (a pdf file) first. ◊ Baby steps... your first R session. Ok, now you're an expert useR. We're going to get astsa now:

    install.packages("astsa") # install it ... you'll be asked to choose the closest CRAN mirror
    require(astsa) # then load it (has to be done at the start of each session)

Let's play with the Johnson & Johnson data set.

    data(jj) # load the data
    jj # print it to the screen
          Qtr1  Qtr2  Qtr3  Qtr4
    1960  0.71  0.63  0.85  0.44
    1961  0.61  0.69  0.92  0.55
     .     .     .     .     .
     .     .     .     .     .
    1979 14.04 12.96 14.85  9.99
    1980 16.20 14.67 16.02 11.61
    options(digits=2) # the default is 7, but it's more than I want now
    ?
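For readers without astsa installed, a minimal sketch of the same first steps can use base R's built-in JohnsonJohnson series, which is assumed here to hold the same quarterly earnings as astsa's jj (the printed values above match its first rows). The name annual is an illustrative choice.

```r
# Assumption: the built-in JohnsonJohnson series matches astsa's jj.
jj <- JohnsonJohnson

frequency(jj)   # observations per year (quarterly data)
start(jj)       # first time point as c(year, quarter)
end(jj)         # last time point as c(year, quarter)

# Extract a sub-series: the first two years
window(jj, start = c(1960, 1), end = c(1961, 4))

# Collapse quarters into annual totals
annual <- aggregate(jj, nfrequency = 1, FUN = sum)
head(annual)
```

Keeping the data as a ts object, rather than a plain vector, is what lets window() and aggregate() work in calendar terms.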

Omegahat Statistical Computing

R Tutorials--Data Frames Preamble There is plenty to say about data frames because they are the primary data structure in R. Some of what follows is essential knowledge. Some of it will be satisfactorily learned for now if you remember that "R can do that." I will try to point out which parts are which. Set aside some time. Definition and Examples (essential) A data frame is a table, or two-dimensional array-like structure, in which each column contains measurements on one variable, and each row contains one case. Let's say we've collected data on one response variable or DV from 15 subjects, who were divided into three experimental groups called control ("contr"), treatment one ("treat1"), and treatment two ("treat2").

    contr    treat1    treat2
    ---------------------------
     22        32        30
     18        35        28
     25        30        25
     25        42        22
     20        31        33
    ---------------------------

This is a proper data frame (and leave out the dashed lines, although in actual fact R could read this table just as you see it here). Here's the catch. It's not a disaster.
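A small sketch of how the table above could be entered in R, and reshaped so that it matches the definition given (one column per variable, one row per case). The object names wide and long and the column names score and group are assumptions for illustration; stack() is one base-R way to do the reshaping, not necessarily the tutorial's own.

```r
# The 15 scores entered group by group, as in the table above
wide <- data.frame(
  contr  = c(22, 18, 25, 25, 20),
  treat1 = c(32, 35, 30, 42, 31),
  treat2 = c(30, 28, 25, 22, 33)
)

# One row per subject: a response column and a grouping factor
long <- stack(wide)                 # columns are named "values" and "ind"
names(long) <- c("score", "group")  # hypothetical, more descriptive names

str(long)

# Group means computed from the long layout
group_means <- tapply(long$score, long$group, mean)
group_means
```

In the long layout each of the 15 subjects occupies one row, which is the shape most modelling functions in R expect.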

Missing Data In R, missing values are represented by the symbol NA (not available). Undefined numeric results (e.g., 0/0) are represented by the symbol NaN (not a number); note that dividing a nonzero number by zero gives Inf or -Inf rather than NaN. Unlike SAS, R uses the same symbol for character and numeric data.

Testing for Missing Values

    is.na(x) # returns TRUE if x is missing
    y <- c(1,2,3,NA)
    is.na(y) # returns a vector (F F F T)

Recoding Values to Missing

    # recode 99 to missing for variable v1
    # select rows where v1 is 99 and recode column v1
    mydata$v1[mydata$v1==99] <- NA

Excluding Missing Values from Analyses Arithmetic functions on missing values yield missing values.

    x <- c(1,2,NA,3)
    mean(x) # returns NA
    mean(x, na.rm=TRUE) # returns 2

The function complete.cases() returns a logical vector indicating which cases are complete.

    # list rows of data that have missing values
    mydata[!

The function na.omit() returns the object with listwise deletion of missing values.

    # create new dataset without missing data
    newdata <- na.omit(mydata)

Advanced Handling of Missing Data
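The pieces above can be tried end to end on a toy data frame. The sketch below is only an illustration: the contents of mydata and the name incomplete are made up for the example, and indexing with !complete.cases() is shown as one common idiom for listing incomplete rows.

```r
# Hypothetical data: 99 is used as a missing-value code in v1
mydata <- data.frame(
  v1 = c(5, 99, 7, 3),
  v2 = c("a", "b", NA, "d")
)

# Recode 99 to NA, then test for missingness
mydata$v1[mydata$v1 == 99] <- NA
is.na(mydata$v1)

# Arithmetic propagates NA unless told to drop it
mean(mydata$v1)                 # NA
mean(mydata$v1, na.rm = TRUE)   # mean of 5, 7 and 3

# Rows with at least one missing value, and the data without them
incomplete <- mydata[!complete.cases(mydata), ]
newdata    <- na.omit(mydata)

# NaN versus Inf
is.nan(0 / 0)        # TRUE
is.infinite(1 / 0)   # TRUE
```

Note that na.omit() drops a row if any column is missing, so it can discard more data than a per-variable na.rm = TRUE.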

Lexical scope and function closures in R | Darren Wilkinson's research blog Introduction R is different to many “easy to use” statistical software packages – it expects to be given commands at the R command prompt. This can be intimidating for new users, but is at the heart of its power. Most powerful software tools have an underlying scripting language. This is because scriptable tools are typically more flexible, and easier to automate, script, program, etc. Programming from the ground up It is natural to want to automate (repetitive) tasks on a computer, to automate a “work flow”. Next, one can add in simple control structures, to support looping, branching and conditional execution. Although scripting is a simple form of programming, it isn’t “real” programming, or software engineering. Functions and procedures Procedures (or subroutines) are re-usable pieces of code which can be called from other pieces of code when needed. Variable scope Dynamic scope Lexical scope No, really, try and figure it out before reading on for the answer! Function closures References
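To make the scoping ideas concrete, here is a minimal sketch (the function names f, g and make_counter are invented for this example, not taken from the post). The first part shows that a free variable is looked up where the function was defined, not where it is called; the second part uses that rule to build a closure that keeps private state.

```r
# Lexical scope: x is a free variable in f, resolved where f was defined
x <- 1
f <- function() x
g <- function() {
  x <- 2   # local to g; invisible to f under lexical scope
  f()
}
g_result <- g()   # 1 under lexical scope (dynamic scope would give 2)

# Function closure: the returned function remembers its defining environment
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # update count in the enclosing environment
    count
  }
}

counter_a <- make_counter()
counter_b <- make_counter()
counter_a()
a_second <- counter_a()   # second call on counter_a
b_first  <- counter_b()   # counter_b has its own, independent count
```

Each call to make_counter() creates a fresh environment, which is why the two counters do not interfere with each other.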

Climate Charts & Graphs | My R and Climate Change Learning Curve
