ggExtra: R package for adding marginal histograms to ggplot2. My first CRAN package, ggExtra, contains several functions to enhance ggplot2, the most important being ggExtra::ggMarginal(), a function that finally makes it easy to add marginal density plots or histograms to scatterplots.
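A minimal sketch of how ggMarginal() can be called, assuming ggplot2 and ggExtra are installed. The data frame and its column names are made up for illustration and are not taken from the package README.

```r
library(ggplot2)
library(ggExtra)

# Made-up data for illustration only (an assumption, not from the README)
set.seed(30)
df <- data.frame(x = rnorm(500, mean = 50, sd = 10),
                 y = runif(500, min = 0, max = 50))

# Build an ordinary scatterplot first
p <- ggplot(df, aes(x = x, y = y)) + geom_point()

# Wrap it with marginal histograms; type = "density" is another option
p_marg <- ggMarginal(p, type = "histogram")
```

Printing p_marg draws the scatterplot with a histogram along the top and right edges.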

ggExtra: R package for adding marginal histograms to ggplot2

Availability: You can read the full README describing the functionality in detail or browse the source code on GitHub. The package is available through both CRAN (install.packages("ggExtra")) and GitHub (devtools::install_github("daattali/ggExtra")). Spoiler alert - final result.

Data Analysis in R, the data.table Way.

Lomb-Scargle periodogram for unevenly sampled time series. In the natural sciences, it is common to have incomplete or unevenly sampled time series for a given variable.

Lomb-Scargle periodogram for unevenly sampled time series

Determining cycles in such series is not directly possible with methods such as the Fast Fourier Transform (FFT) and may require some degree of interpolation to fill in gaps. An alternative is the Lomb-Scargle method (or least-squares spectral analysis, LSSA), which estimates a frequency spectrum based on a least-squares fit of sinusoids. The above figure shows a Lomb-Scargle periodogram of a time series of sunspot activity (1749-1997) with 50% of monthly values missing.
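The least-squares idea can be sketched in base R. This is not the function used in the post: it is a plain least-squares fit of one sine and one cosine term at each trial frequency, run on a synthetic series (an assumption for illustration) with an 11-year cycle and half of the monthly values dropped at random.

```r
set.seed(1)

# Synthetic monthly series with an 11-year cycle plus noise (hypothetical data)
t_all <- seq(0, 100, by = 1 / 12)                  # time in years
y_all <- sin(2 * pi * t_all / 11) + rnorm(length(t_all), sd = 0.5)

# Keep a random 50% of the observations so the sampling is uneven
keep <- sort(sample(length(t_all), length(t_all) %/% 2))
t <- t_all[keep]
y <- y_all[keep] - mean(y_all[keep])               # remove the mean before fitting

# For each trial frequency, fit y ~ a*cos + b*sin by least squares and
# record the fraction of variance explained by that sinusoid
freqs <- seq(0.01, 0.5, by = 0.001)                # cycles per year
power <- sapply(freqs, function(f) {
  X <- cbind(cos(2 * pi * f * t), sin(2 * pi * f * t))
  fit <- lm.fit(X, y)
  1 - sum(fit$residuals^2) / sum(y^2)
})

# Convert the best-fitting frequency to a period in years
peak_period <- 1 / freqs[which.max(power)]
```

Plotting power against 1 / freqs gives a periodogram-like curve. No interpolation of the missing months is needed because the fit uses only the observed time points.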

As expected (link1, link2), the periodogram displays a highly significant maximum peak at a period of ~11 years. The function comes from a nice set of functions that I found here. An accompanying paper focusing on its application to time series of gene expression can be found here. Below is a comparison to an FFT of the full time series. To reproduce the example:

R: Estimate Spectral Density of an Irregularly Sampled Time... Description: The most commonly used method of computing the spectrum on unevenly spaced time series is periodogram analysis, see Lomb (1975) and Scargle (1982).

R: Estimate Spectral Density of an Irregularly Sampled Time...

The Lomb-Scargle method for unevenly spaced data is known to be a powerful tool to find, and test the significance of, weak periodic signals. The Lomb-Scargle periodogram possesses the same statistical properties as standard power spectra. Usage, y=NULL, spans = NULL, kernel = NULL, taper = 0.1, pad = 0, fast = TRUE, type = "lomb",demean = FALSE, detrend = TRUE, = TRUE, na.action =, ...) Arguments Details The raw Lomb-Scargle periodogram for irregularly sampled time series is not a consistent estimator of the spectral density, but adjacent values are asymptotically independent. The series will be automatically padded with zeros until the series length is a highly composite number in order to help the Fast Fourier Transform. Value.

knitr: Elegant, flexible and fast dynamic report generation with R.

Overview The knitr package was designed to be a transparent engine for dynamic report generation with R, solve some long-standing problems in Sweave, and combine features in other add-on packages into one package (knitr ≈ Sweave + cacheSweave + pgfSweave + weaver + animation::saveLatex + R2HTML::RweaveHTML + highlight::HighlightWeaveLatex + 0.2 * brew + 0.1 * SweaveListingUtils + more).

knitr: Elegant, flexible and fast dynamic report generation with R

This package is developed on GitHub; for installation instructions and FAQs, see README. This website serves as the full documentation of knitr, and you can find the main manual, the graphics manual and other demos / examples here. For a more organized reference, see the knitr book. Motivation: One of the difficulties with extending Sweave is that we have to copy a large amount of code from the utils package (the file SweaveDrivers.R has more than 700 lines of R code), and this is what the two packages mentioned above have done.

Features Acknowledgements Misc.

Programming with R.

Introduction to R.

Computing and visualizing PCA in R. Following my introduction to PCA, I will demonstrate how to apply and visualize PCA in R.

Computing and visualizing PCA in R

There are many packages and functions that can apply PCA in R. In this post I will use the function prcomp from the stats package.

ColorBrewer: Color Advice for Maps.

Index. ggplot2 Geoms: Geoms, short for geometric objects, describe the type of plot you will produce. geom_abline (geom_hline, geom_vline): Lines: horizontal, vertical, and specified by slope and intercept.
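For the PCA post excerpted above, which uses prcomp from the stats package, a minimal sketch might look like the following. The choice of the built-in iris data and the variable names are assumptions made for illustration, not a reproduction of the post's own code.

```r
# Numeric columns of the built-in iris data (an illustrative choice)
iris_num <- iris[, 1:4]

# Center and scale each variable so that all contribute on the same footing
iris_pca <- prcomp(iris_num, center = TRUE, scale. = TRUE)

# Proportion of total variance captured by each principal component
var_explained <- iris_pca$sdev^2 / sum(iris_pca$sdev^2)

# Scores of the observations on the first two components, ready for plotting
scores <- iris_pca$x[, 1:2]
```

Calling summary(iris_pca) prints the same variance proportions, and plot(scores, col = iris$Species) gives a quick base-graphics view of the first two components.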

Index. ggplot2

Technical Tidbits From Spatial Analysis & Data Science. Even the most experienced R users need help creating elegant graphics.

Technical Tidbits From Spatial Analysis & Data Science

The ggplot2 library is a phenomenal tool for creating graphics in R but even after many years of near-daily use we still need to refer to our Cheat Sheet. Up until now, we’ve kept these key tidbits on a local PDF. But for our own benefit (and hopefully yours) we decided to post the most useful bits of code. Updated September 5, 2014 Updated October 8, 2014 We start with the quick setup and a default plot followed by a range of adjustments below. Create interactive, online versions of your plots (easier than you think) We're using data from the National Morbidity and Mortality Air Pollution Study (NMMAPS).

Plotting distributions (ggplot2) Problem.

Plotting distributions (ggplot2)

geom_histogram. ggplot2 set.seed(5689) movies <- movies[sample(nrow(movies), 1000), ] # Simple examples qplot(rating, data=movies, geom="histogram") stat_bin: binwidth defaulted to range/30.

geom_histogram. ggplot2

Use 'binwidth = x' to adjust this. Warning message: position_stack requires constant width: output may be incorrect qplot(rating, data=movies, weight=votes, geom="histogram")

ggplot2: Cheatsheet for Visualizing Distributions. In the third and last of the ggplot series, this post will go over interesting ways to visualize the distribution of your data.

ggplot2: Cheatsheet for Visualizing Distributions

I will make up some data, and make sure to set the seed.
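A minimal sketch of that setup, assuming ggplot2 is installed. The seed value, group labels, and column names are hypothetical and are not the ones used in the post.

```r
library(ggplot2)

# Fix the seed so the made-up data are reproducible (seed value is arbitrary)
set.seed(1234)
df <- data.frame(
  group = rep(c("a", "b"), each = 200),
  value = c(rnorm(200, mean = 0), rnorm(200, mean = 1))
)

# Overlaid histograms, one per group; an explicit binwidth avoids the
# default-bin message that stat_bin otherwise prints
p <- ggplot(df, aes(x = value, fill = group)) +
  geom_histogram(binwidth = 0.5, alpha = 0.5, position = "identity")
```

Printing p draws the two overlaid histograms. Swapping geom_histogram() for geom_density(alpha = 0.5) gives a smoothed view of the same data.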