Cookbook for R » Cookbook for R

developers:projects:gsoc2012:ropensci
Summary: Dynamic access and visualization of scientific data repositories.
Description: rOpenSci is a collaborative effort to develop R-based tools for facilitating Open Science. Projects in rOpenSci fall into two categories: those for working with the scientific literature, and those for working directly with the databases. Visit the active development hub of each project on GitHub, where you can view and download the source code, see updates, and follow or join the developer discussions of issues. See a complete list of our R packages currently in development. The student could choose to work on a package for a particular data repository of interest, or develop tools for visualization and exploration that could function across the existing packages.
Skills required: Should be able to use R to perform data manipulation and aggregation.

Summer 2010 - R: ggplot2 Intro
Intro
When it comes to producing graphics in R, there are basically three options for your average user. One is base graphics: I've written up a pretty comprehensive description of the use of base graphics here, and don't intend to extend beyond that. The other two both make creating plots of multivariate data easier. The website for ggplot2 is here:
Basics
ggplot2 is meant to be an implementation of the Grammar of Graphics, hence gg-plot. Plots convey information through various aspects of their aesthetics: x position, y position, size of elements, shape of elements, and color of elements. The elements in a plot are geometric shapes, like points, lines, line segments, bars, and text. Some of these geometries have their own particular aesthetics: points (point shape, point size); lines (line type, line weight); bars (y minimum, y maximum, fill color, outline color); text (label value). The values represented in the plot are the product of various statistics.
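The vocabulary above (aesthetics, geometries, statistics) can be illustrated with a minimal sketch. It assumes the ggplot2 package is installed; the built-in mtcars data set and the particular variable mappings are choices made for this illustration, not part of the original write-up.

```r
library(ggplot2)

# Aesthetics: x position, y position, color and size are mapped to variables.
# Geometry: points (with a fixed point shape), plus a line layer.
# Statistic: the line layer is the product of a linear-model fit.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(colour = factor(cyl), size = hp), shape = 16) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE,
              linetype = "dashed")

print(p)  # draw the plot on the current graphics device
```

Each `+` adds one layer, so a geometry or a statistic can be swapped without touching the rest of the plot specification.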

Learning R
R Reference Card
Model visualisation. had.co.nz
This page lists my published software for model visualisation. This work forms the basis for the third chapter of my thesis.
classifly: Explore classification boundaries in high dimensions. Given p-dimensional training data containing d groups (the design space), a classification algorithm (classifier) predicts which group new data belongs to. Generally the input to these algorithms is high dimensional, and the boundaries between groups will be high dimensional and perhaps curvilinear or multi-faceted. This R package provides methods for visualising the division of space between the groups.
clusterfly: Explore clustering results in high dimensions. Typically, there is somewhat of a divide between statistics and visualisation software. There are also some custom methods for certain types of clustering, mostly inspired by the work of Dr Dianne Cook: self organising maps (aka Kohonen neural networks).
meifly: Models explored interactively.

Quick-R: Home Page
R Programming
Welcome to the R programming Wikibook
This book is designed to be a practical guide to the R programming language[1]. R is free software designed for statistical computing. There is already great documentation for the standard R packages on the Comprehensive R Archive Network (CRAN)[2] and many resources in specialized books, forums such as Stackoverflow[3] and personal blogs[4], but all of these resources are scattered and therefore difficult to find and to compare. The aim of this Wikibook is to be the place where anyone can share his or her knowledge and tricks on R.
How can you share your R experience?
Explain the syntax of a command.
Compare the different ways of performing each task using R.
Try to make unique examples based on fake data (i.e. simulated data sets).
As with any Wikibook, please feel free to make corrections, expand explanations, and make additions where necessary.
Prerequisites
We assume that readers have a background in statistics.

Highland Statistics Ltd
Outline
This book presents Generalized Linear Models (GLM) and Generalized Linear Mixed Models (GLMM) based on both frequency-based and Bayesian concepts. The book uses the functions glm, lmer, glmer, glmmADMB, and also JAGS from within R. R code to construct, fit, interpret, and comparatively evaluate models is provided at every stage. Readers of this book have free access to: Chapter 1 of Zero Inflated Models and Generalized Linear Mixed Models with R (2012a), Zuur, Saveliev, Ieno. See the Preface (and the text below) for how to access the PDFs of these chapters.
Price and Order the book
The paperback is priced at 49 GBP.
Copyright statement
This book is copyright material from Highland Statistics Ltd.
Data sets and R code used in the book.

RStudio Server Amazon Machine Image (AMI) - Louis Aslett
Current AMI Quick Reference (27th Jun 2015)
What’s new recently? Easy Dropbox setup to make syncing files on/off server easy, including selective folder sync.
Amazon’s EC2 platform provides a convenient environment for rapidly procuring computational resources in the cloud. To get started with the Amazon cloud, you must first sign up for an AWS account if you don’t already have one. Click here for a simple video guide to using the AMIs listed here, or for more detailed information read on.
What is this?
If you want to run a server in the Amazon cloud, you have to select what system you are going to boot up. In particular, many common tools and dependencies are built-in.
Why an RStudio AMI?
The RStudio team have done a phenomenal job of making it simplicity itself to install, but there are still several motivating factors which led me to create this AMI.

Building an R Hadoop System - RDataMining.com: R and Data Mining
This page shows how to build an R Hadoop system, and presents the steps to set up my first R Hadoop system in single-node mode on Mac OS X. After reading documents and tutorials on MapReduce and Hadoop and playing with RHadoop for about 2 weeks, I have finally built my first R Hadoop system and successfully run some R examples on it. Here I’d like to share my experience and the steps to achieve that. Hopefully it will make it easier for R users who are new to Hadoop to try RHadoop. Note that I tried this on Mac only and some steps might be different for Windows. Before going through the complex steps below, let’s have a look at what you can get, to give you some motivation to continue. Now let’s start.
1.
1.1 Download Hadoop
Download Hadoop (hadoop-1.1.2-bin.tar.gz) at and then unpack it.
1.2 Set JAVA_HOME
In conf/hadoop-env.sh, add the line below:
    export JAVA_HOME=/Library/Java/Home
1.3 Set up Remote Desktop and Enabling Self-Login
    ssh-keygen -t rsa -P ""

10 tips for making your R graphics look their best
So you've spent hours slaving over the code for a beautiful statistical graphic in R, and now you're ready to show it to the world. You might be printing it, embedding it in a document, or displaying it on the web. Don't do your graph a disservice by causing it to look anything less than perfect in its final venue.
1. It's tempting to just draw graphics on the on-screen device (such as X11 on Linux or Quartz on MacOS) and then use "Save As..." from the menu. The best practice is to create a script file that begins with a call to the device driver (usually pdf or png), runs the graphics commands, and then finishes with a call to dev.off().
    png(file="mygraphic.png",width=400,height=350)
    plot(x=rnorm(10),y=rnorm(10),main="example")
    dev.off()
Not only will you often get better-looking results, but you'll have the means to recreate the graphic file six months down the line, when you've long forgotten how you did it manually.
2. 3. 4. 5. For PNG graphs, it's a bit trickier. dev.off() 6. 7.
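The script pattern from tip 1 works the same way with the pdf driver. The following is a minimal sketch rather than code from the original post: the output path (a file in tempdir()), the 6 x 5 inch page size, and the fixed random seed are assumptions chosen for illustration.

```r
# Hypothetical output location; tempdir() keeps the sketch from littering
# the working directory.
out_file <- file.path(tempdir(), "mygraphic.pdf")

set.seed(1)  # assumption: fixed seed so the same picture is recreated each run

pdf(file = out_file, width = 6, height = 5)  # pdf() sizes are in inches
plot(x = rnorm(10), y = rnorm(10), main = "example")
invisible(dev.off())  # close the device so the file is flushed to disk
```

Note that pdf() takes its width and height in inches, whereas png() defaults to pixels, so the numbers are not interchangeable between the two drivers.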