pbd: programming with big data in R
library(animation)
ani.options(convert = shQuote('C:\\Program Files (x86)\\ImageMagick-6.7.9-Q16\\convert.exe'))
setwd("C:\\R_Home\\Charts & Graphs Blog\\RClimateTools\\Arctic_sea-ice_extent")
png_yn <- "y"
pattern <- c(rep("dashed", 5), rep("solid", 12))
Climate Charts & Graphs | My R and Climate Change Learning Curve
Lexical scope and function closures in R | Darren Wilkinson's research blog Introduction R is different to many “easy to use” statistical software packages – it expects to be given commands at the R command prompt. This can be intimidating for new users, but is at the heart of its power. Most powerful software tools have an underlying scripting language. This is because scriptable tools are typically more flexible, and easier to automate, script, program, etc.
R - Books
In R, missing values are represented by the symbol NA (not available). Impossible values (e.g., dividing zero by zero) are represented by the symbol NaN (not a number); dividing a nonzero number by zero gives Inf or -Inf instead. Unlike SAS, R uses the same symbol for character and numeric data. Testing for Missing Values Missing Data
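As a small illustration of the NA/NaN distinction described above, here is a minimal sketch using only base R. The vector names `x` and `y` are made up for the example and do not come from the linked page.

```r
# Hypothetical example vectors: one numeric, one character.
x <- c(1, NA, 3, 0 / 0)   # 0 / 0 evaluates to NaN
y <- c("a", NA)           # the same NA symbol works for character data

na_flags  <- is.na(x)     # TRUE for NA and for NaN
nan_flags <- is.nan(x)    # TRUE only for NaN
chr_flags <- is.na(y)     # TRUE for the missing character value

# Dividing a nonzero number by zero gives Inf, not NaN.
inf_value <- 1 / 0

# Most summary functions return NA unless missing values are dropped.
mean_all  <- mean(x)                # NA
mean_kept <- mean(x, na.rm = TRUE)  # mean of 1 and 3

print(na_flags)
print(nan_flags)
print(mean_kept)
```

Note that `is.na()` treats NaN as missing as well, so `is.nan()` is the function to use when the two cases need to be told apart.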
Preamble There is plenty to say about data frames because they are the primary data structure in R. Some of what follows is essential knowledge. Some of it will be satisfactorily learned for now if you remember that "R can do that." I will try to point out which parts are which. Set aside some time. R Tutorials--Data Frames
Recently, a student of mine asked me about automating the collection of data from the HTML form http://www.wrh.noaa.gov/forecast/xml/xml.php. The intent is to collect the forecasts twice a day and process the XML into a data frame. Getting the content of the URL, parsing it and extracting the data is a fairly straightforward application of the RCurl and XML packages along with XPath and getNodeSet(). Automating the collection involves the cron facility on a Linux machine. Omegahat Statistical Computing
R Time Series Tutorial The data sets used in this tutorial are available in astsa, the R package for the text. A detailed tutorial (and more!) is available in Appendix R of the text.
For most of the classical distributions, base R provides probability distribution functions (p), density functions (d), quantile functions (q), and random number generation (r). Beyond this basic functionality, many CRAN packages provide additional useful distributions. In particular, multivariate distributions as well as copulas are available in contributed packages. Ultimate bibles on probability distributions are different volumes of N. CRAN Task View: Probability Distributions
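The d/p/q/r naming convention mentioned above can be sketched with the normal distribution from base R. The choice of distribution and the variable names are assumptions made for illustration only.

```r
# d: density function, here the standard normal density at 0.
dens <- dnorm(0)

# p: cumulative distribution function, P(X <= 1.96).
prob <- pnorm(1.96)

# q: quantile function, the inverse of the CDF.
quant <- qnorm(0.975)

# r: random number generation; set a seed for reproducibility.
set.seed(42)
draws <- rnorm(5, mean = 10, sd = 2)

print(dens)
print(prob)
print(quant)
print(length(draws))
```

The same prefixes apply to other base distributions, for example `dbinom`, `pbinom`, `qbinom` and `rbinom` for the binomial.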
R is a language, as Luis Apiolaza pointed out in his recent post. This is absolutely true, and learning a programming language is not much different from learning a foreign language. It takes time and a lot of practice to become proficient in it. I started using R when I moved to the UK, and I wonder whether I have a better understanding of English or R by now. Say it in R with "by", "apply" and friends
Resources to help you learn and use R
Look what I found: two amazing charts While doing some research for my statistics blog, I came across a beauty by Lane Kenworthy from almost a year ago (link) via this post by John Schmitt (link). How embarrassing is the cost effectiveness of U.S. health care spending? When a chart is executed well, no further words are necessary. I'd only add that the other countries depicted are "wealthy nations". Even more impressive is this next chart, which plots the evolution of cost effectiveness over time.
plyr is a set of tools for a common set of problems: you need to split up a big data structure into homogeneous pieces, apply a function to each piece and then combine all the results back together. For example, you might want to: fit the same model to subsets of a data frame; quickly calculate summary statistics for each group; perform group-wise transformations like scaling or standardising. It’s already possible to do this with base R functions (like split and the apply family of functions), but plyr makes it all a bit easier with: totally consistent names, arguments and outputs; convenient parallelisation through the foreach package; input from and output to data.frames, matrices and lists; progress bars to keep track of long running operations; built-in error recovery, and informative error messages; labels that are maintained across all transformations. plyr
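The split-apply-combine pattern described above can be sketched with the base R functions the text mentions (split and the apply family), so the example runs without plyr installed. It uses the built-in mtcars dataset; the grouping by cylinder count and the variable names are choices made for illustration.

```r
# Split: break mpg into pieces, one per number of cylinders.
pieces <- split(mtcars$mpg, mtcars$cyl)

# Apply: compute a summary statistic for each piece.
group_means <- sapply(pieces, mean)

# Combine: collect the results into a single data frame.
summary_df <- data.frame(
  cyl      = as.numeric(names(group_means)),
  mean_mpg = as.numeric(group_means)
)

# Group-wise transformation: centre mpg within each cylinder group.
centred <- ave(mtcars$mpg, mtcars$cyl, FUN = function(v) v - mean(v))

print(summary_df)
```

With plyr loaded, the summary step would typically be written as one call to a function such as ddply, which is the convenience the description refers to.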
Overview The knitr package was designed to be a transparent engine for dynamic report generation with R, solve some long-standing problems in Sweave, and combine features in other add-on packages into one package (knitr ≈ Sweave + cacheSweave + pgfSweave + weaver + animation::saveLatex + R2HTML::RweaveHTML + highlight::HighlightWeaveLatex + 0.2 * brew + 0.1 * SweaveListingUtils + more).
ComputingPresentation.R.conditionals.pdf
R tutorials and lessons