
Graphing


Ggplot2: Cheatsheet for Scatterplots. Data and Code Download: All data and code for this blog can be downloaded here: NB: It's been pointed out to me that some images don't show up on IE, so you'll need to switch to Chrome or Firefox if you are using IE. Thanks! Why R for public health? I created this blog to help public health researchers who are used to Stata or SAS begin using R. I find that public health data is unique, and this blog is meant to address the specific data management and analysis needs of the world of public health. R is a very powerful tool for programming but can have a steep learning curve. Please email me with posts you would like to see or R questions, and I'll try my best to answer them.

Ggplot2: Cheatsheet for Barplots.

How We Created Color Scales on Datavisualization.ch. Ggtern: ternary diagrams in R - an extension to ggplot2. Pocket: Introducing ggvis. Nbviewer.ipython.

GgPlot2: Histogram with jittered stripchart. Here is an example of a histogram plot with a stripchart (vertically jittered) along the x side of the plot.
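A plot of this kind might be sketched as follows. This is only an illustration under stated assumptions, not the original post's code: the data frame `d`, the seed, the bin width, and the jitter height are all made up, and the strip offset of y = -2 has to be tuned by hand for other data.

```r
library(ggplot2)

set.seed(1)                      # assumed seed, only for reproducibility
d <- data.frame(x = rnorm(200))  # made-up data

p <- ggplot(d, aes(x = x)) +
  # histogram of the values
  geom_histogram(binwidth = 0.25, fill = "grey70", colour = "white") +
  # raw observations as a strip below the bars: fixed y, jittered vertically only
  geom_jitter(aes(y = -2), width = 0, height = 1, alpha = 0.5)

print(p)
```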

Alternatively, the geom_rug function can be used. Of course, this simplistic method needs adjusting in the vertical position of the stripchart or rug (y = -2 here) and in the relative amount of jitter applied to the points.

Ggplot2.

Ggplot2 Quick Reference: Themes | Software and Programmer Efficiency Research Group. A plot can be themed by adding a theme. ggplot2 provides two built-in themes: theme_grey(), the default theme with a grey background, and theme_bw(), a theme with a white background. To be more precise, ggplot2 provides functions that create a theme. These functions can be used to add a specific theme to a plot: ggplot() + ... + theme_bw(). The theme produced by such a function is simply a structure containing a list of options.
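As a rough sketch of the above, assuming a current ggplot2 release (where individual options are overridden with theme() and element_rect()) and using the built-in mtcars data purely as a stand-in:

```r
library(ggplot2)

# Stand-in plot on a built-in dataset
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()

# Adding a theme to a plot
p_bw <- p + theme_bw()

# A theme is a named collection of options
th <- theme_bw()
head(names(th))

# Override a single option; the complete theme is added first,
# the override second
p_custom <- p +
  theme_grey() +
  theme(legend.background = element_rect(fill = "grey95", colour = NA))

print(p_custom)
```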

These options describe the visual properties of the axes, legends, panels, strips, and the overall plot. When adding a theme to a plot, you can override some of the theme's options with opts(...): ggplot() + ... + theme_grey() + opts(legend.background = theme_rect(fill="grey95", colour=NA)). When using opts() and adding a theme (as in the command right above), make sure that you add the theme before overriding the options; otherwise the opts(...) has no effect. (opts() and theme_rect() belong to the old ggplot2 API described here; from ggplot2 0.9.2 on they are superseded by theme() and element_rect().) The Currently Selected Theme. theme_grey(): The theme_grey() function creates the default theme of ggplot2.

Setting Axis Limits on ggplot Charts. I’ve been doodling some charts in R/ggplot using geom_text() to generate a labelled scatterplot. The chart actually builds up several layers using different datasets, so it’s not obvious how to set the ranges cleanly: I know the lower bound I want for the y-axis (y=0), but I want to let the upper bound float. There’s also an issue with the labels overflowing the edges left and right.
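One way to handle both issues might look like the sketch below. It is not necessarily what the post itself used: the data frame `d` and its labels are invented, and the 15% padding is an arbitrary choice. An NA in `limits` leaves that bound to be computed from the data, and `expansion()` (available in ggplot2 3.3.0 and later) pads the x-axis so labels have room.

```r
library(ggplot2)

# Made-up labelled points
d <- data.frame(
  x = c(1, 5, 9),
  y = c(3, 12, 7),
  label = c("first label", "second label", "third label")
)

g <- ggplot(d, aes(x = x, y = y, label = label)) +
  geom_text()

g2 <- g +
  # lower bound fixed at 0, upper bound (NA) taken from the data
  scale_y_continuous(limits = c(0, NA)) +
  # extra room left and right so the labels are not clipped
  scale_x_continuous(expand = expansion(mult = 0.15))

print(g2)
```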

The post then gives a couple of lines to make everything better (the chart is in g).

Ggplot2 - axis formatting.

Ggplot2: Quick Heatmap Plotting. A post on the FlowingData blog demonstrated how to quickly make a heatmap using R base graphics.

This post shows how to achieve a very similar result using ggplot2. Data Import: FlowingData used last season’s NBA basketball statistics provided by databasebasketball.com, and the csv file with the data can be downloaded directly from its website. The players are ordered by points scored, and the Name variable is converted to a factor that ensures proper sorting of the plot. Whilst FlowingData uses the heatmap function in the stats package, which requires the plotted values to be in matrix format, ggplot2 operates with dataframes. For ease of processing, the dataframe is converted from wide format to long format.
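The data preparation can be sketched roughly as follows, together with a tile plot built from it. Everything here is an assumption for illustration: the three-player table is invented (it is not the NBA file), the reshape is done in base R rather than with whatever helper the post used, and each statistic is rescaled to the range 0 to 1 so the columns are comparable.

```r
library(ggplot2)

# Made-up stand-in for the player statistics (hypothetical values)
nba <- data.frame(
  Name = c("Player A", "Player B", "Player C"),
  PTS  = c(30.2, 28.4, 26.8),
  AST  = c(7.5, 7.2, 2.3),
  REB  = c(5.0, 7.6, 5.2)
)

# Order players by points scored; the factor levels fix the plot order
nba$Name <- factor(nba$Name, levels = nba$Name[order(nba$PTS)])

# Wide to long: one row per player and statistic
stats <- c("PTS", "AST", "REB")
nba_long <- data.frame(
  Name     = rep(nba$Name, times = length(stats)),
  variable = factor(rep(stats, each = nrow(nba)), levels = stats),
  value    = unlist(nba[stats], use.names = FALSE)
)

# Rescale every statistic to [0, 1] within its own column
nba_long$rescaled <- ave(
  nba_long$value, nba_long$variable,
  FUN = function(v) (v - min(v)) / (max(v) - min(v))
)

# Heatmap: tiles filled along a smooth gradient
p <- ggplot(nba_long, aes(x = variable, y = Name, fill = rescaled)) +
  geom_tile(colour = "white") +
  scale_fill_gradient(low = "white", high = "steelblue")

print(p)
```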

The game statistics have very different ranges, so to make them comparable all the individual statistics are rescaled. Plotting: There is no specific heatmap plotting function in ggplot2, but combining geom_tile with a smooth gradient fill does the job very well.

Add a background png image to ggplot2 | julianhi's Blog. Hey everybody, this is just a short post but I found it very useful. I want to show you how to add images as a background to your ggplot2 plots. To do so we need the packages png and grid. Btw, this is just a cool and fast way to import different packages at once. As an example for a background image plot I used the Sochi Olympic Medals plot by TRinker, which looks really good. The tutorial shows you how to create a plot based on the current medal scores. First of all we need to load the picture.
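The whole procedure might be sketched like this. Since the post's image is not available here, the sketch writes a small greyscale gradient to a temporary file as a stand-in picture; that file, the mtcars plot, and the variable names are assumptions for illustration only.

```r
library(ggplot2)
library(png)
library(grid)

# Stand-in picture: a 10 x 10 greyscale gradient written to a temp file
img_path <- tempfile(fileext = ".png")
writePNG(matrix(seq(0.6, 1, length.out = 100), nrow = 10), img_path)

# Load the picture
img <- readPNG(img_path)

# Turn it into a raster grob that fills the whole plotting area
bg <- rasterGrob(img,
                 width = unit(1, "npc"), height = unit(1, "npc"),
                 interpolate = TRUE)

# Add the raster first so the data layers are drawn on top of it
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  annotation_custom(bg, xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf) +
  geom_point()

print(p)
```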

And add a raster. And that's nearly all! Now we just have to add it to our plot with the annotation_custom() function.

Ggplot2: Cheatsheet for Visualizing Distributions. In the third and last of the ggplot series, this post will go over interesting ways to visualize the distribution of your data. I will make up some data, and make sure to set the seed.

    library(ggplot2)
    library(gridExtra)
    set.seed(10005)
    xvar <- c(rnorm(1500, mean = -1), rnorm(1500, mean = 1.5))
    yvar <- c(rnorm(1500, mean = 1), rnorm(1500, mean = 1.5))
    zvar <- as.factor(c(rep(1, 1500), rep(2, 1500)))
    xy <- data.frame(xvar, yvar, zvar)

>> Histograms: I’ve already done a post on histograms using base R, so I won’t spend too much time on them.
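For simulated data of this shape, the basic distribution plots might be sketched as below. The bin width, the fill mapping, and the transparency are assumptions chosen for the illustration, not settings taken from the post.

```r
library(ggplot2)

# Same made-up data as described in the post
set.seed(10005)
xvar <- c(rnorm(1500, mean = -1), rnorm(1500, mean = 1.5))
yvar <- c(rnorm(1500, mean = 1), rnorm(1500, mean = 1.5))
zvar <- as.factor(c(rep(1, 1500), rep(2, 1500)))
xy <- data.frame(xvar, yvar, zvar)

# Histogram with an explicit bin width (avoids the default-binwidth message)
p_hist <- ggplot(xy, aes(x = xvar)) +
  geom_histogram(binwidth = 0.2)

# Overlapping density curves, one per group
p_dens <- ggplot(xy, aes(x = xvar, fill = zvar)) +
  geom_density(alpha = 0.5)

# Boxplot of the same variable by group
p_box <- ggplot(xy, aes(x = zvar, y = xvar)) +
  geom_boxplot()

print(p_hist)
print(p_dens)
print(p_box)
```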

Here are the basics of doing them in ggplot. More on all options for histograms here. Also, I found this really great aggregation of all of the possible geom layers and options you can add to a plot. Notice the warning about the default binwidth, which is always reported unless you specify the binwidth yourself. >> Density plots: We can do basic density plots as well. >> Boxplots and more: We can also look at other ways to visualize our distributions.

Ggplot2 Quick Reference | Software and Programmer Efficiency Research Group. Our research often involves quantitative studies producing large amounts of data. To analyze and visualize that data we use various tools (and we sometimes develop our own, such as Trevis or LagAlyzer).

One of the most effective general information visualization tools we know is Hadley Wickham's ggplot2 package for R. Our pages here provide a quick reference, mostly for our own use. We made them public because we think others might benefit from them, too. This quick reference is based on ggplot2 version 0.8.8 running on R version 2.11.1. The Anatomy of a Plot: In ggplot2, you create a plot using the ggplot() function. Besides a list of layers, a plot also has a coordinate system, scales, and a faceting specification. Each layer uses a specific kind of statistic to summarize data, draws a specific kind of geometric object (geom) for each of the (statistically aggregated) data items, and uses a specific kind of position adjustment to deal with geoms that might visually obstruct each other.

A short tutorial for decent heat maps in R. I received many questions from people who want to visualize their data via heat maps as quickly as possible. This is the major issue of exploratory data analysis, since we often don’t have the time to digest whole books about the particular techniques in different software packages just to get the job done.

But once we are happy with our initial results, it might be worthwhile to dig deeper into the topic in order to further customize our plots and maybe even polish them for publication. In this post, my aim is to briefly introduce one of R’s several heat map libraries for a simple data analysis. I chose R because it is one of the most popular free statistical software packages around. Of course there are many more tools out there to produce similar results (and even in R there are many different packages for heat maps), but I will leave this as an open topic for another time. (download the script) source("path/to/the/script/heatmaps_in_R.R") if (! Data <- read.csv("..

Plotting y and log(y) in one figure. Sometimes I want to plot on both the linear and the log scale. To save space, simply making two separate figures is not my solution: I want to reuse the x-axis, legend, and title.

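This post examines possibilities to do so with standard plot tools, lattice and ggplot2.

Data: The data is completely artificial.

    library(ggplot2)
    library(lattice)
    datastart <- data.frame(x=rep(1:5,2),
                            y=c(1,2,10,50,1, .1,9,8,20,19),
                            type=rep(c('a','b'),each=5))
    datastart

        x    y type
    1   1  1.0    a
    2   2  2.0    a
    3   3 10.0    a
    4   4 50.0    a
    5   5  1.0    a
    6   1  0.1    b
    7   2  9.0    b
    8   3  8.0    b
    9   4 20.0    b
    10  5 19.0    b

Standard plot tools: The trick here is to make two plots.

    # upper panel: linear scale, no bottom margin
    par(mfrow=c(2,1), mar=c(0,4.1,4,2))
    # factor() so the colour lookup also works when type is a character
    # column (the data.frame default from R 4.0 on)
    plot(y~x, data=datastart, axes=FALSE, frame.plot=TRUE,
         xlab='', main='bla bla',
         col=c('red','green')[factor(datastart$type)])
    legend(x='topleft', legend=c('a','b'), col=c('red','green'), pch=1)
    axis(side=2, las=1)
    # lower panel: log scale, no top margin, carries the shared x-axis
    par(mar=c(4,4.1,0,2))
    # the start of this call is truncated in the source; the first four
    # arguments and the y-axis call are reconstructed from the upper panel
    plot(y~x, data=datastart, axes=FALSE, frame.plot=TRUE,
         xlab='x', log='y', ylab='log(y)',
         col=c('red','green')[factor(datastart$type)])
    axis(side=2, las=1)
    axis(side=1)

lattice:

    data1 <- datastart
    data2 <- datastart
    data1$lab <- 'linear'
    data2$lab <- 'log'

ggplot2.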
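The ggplot2 code is not preserved here, so the following is only one possible approach and not the post's own solution. It reuses the idea of two labelled copies of the data, and it assumes that transforming the values with log10 (rather than transforming the axis) is acceptable, so the lower panel's y-axis shows log10 values. Faceting with a free y scale gives a shared x-axis, one legend, and one title.

```r
library(ggplot2)

datastart <- data.frame(x = rep(1:5, 2),
                        y = c(1, 2, 10, 50, 1, .1, 9, 8, 20, 19),
                        type = rep(c('a', 'b'), each = 5))

# Two labelled copies of the data, one per panel
data1 <- datastart
data2 <- datastart
data1$lab <- 'linear'
data2$lab <- 'log'
data2$y <- log10(data2$y)  # assumption: transform the values, not the axis
both <- rbind(data1, data2)

# Stacked panels with independent y scales and a shared x-axis
p <- ggplot(both, aes(x = x, y = y, colour = type)) +
  geom_point() +
  facet_grid(lab ~ ., scales = 'free_y') +
  ggtitle('bla bla')

print(p)
```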

Plot matrix with the R package GGally | Thiago G. Martins. I am glad to have found the R package GGally. GGally is a convenient package built upon ggplot2 that contains templates for different plots to be combined into a plot matrix through the function ggpairs. It is a nice alternative to the more limited pairs function. The package also has functions to deal with parallel coordinate and network plots, none of which I have tried yet.
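A plot matrix of this kind might be sketched as follows. The choice of the built-in iris data, the colour mapping by class label, and the particular panel types are assumptions for illustration; the available options are listed in the ggpairs help file of the installed GGally version.

```r
library(ggplot2)
library(GGally)

# Plot matrix of the four numeric predictors, coloured by class label
pm <- ggpairs(iris, columns = 1:4, mapping = aes(colour = Species))

# Choosing the plot type used above and below the diagonal
pm2 <- ggpairs(
  iris, columns = 1:5,
  upper = list(continuous = "density", combo = "box"),
  lower = list(continuous = "points", combo = "dot")
)

print(pm)
print(pm2)
```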

The post shows how easy it is to create very informative plots such as its Figure 1. Plots like that one are very helpful, among other things, in the pre-processing stage of a classification problem, where you want to analyze your predictors given the class labels. Controlling plot types: We have some control over which type of plots to use; the post's Figure 2 gives an example. The details section of the help file of the ggpairs function describes which plots are available for each scenario.