Statistics with R, and open source stuff (software, data, community)

R 3.1.0 (codename “Spring Dance”) was released today! You can get the source code from or wait for it to be mirrored at a CRAN site nearer to you. Binaries for various platforms will appear in due course. The full list of new features and bug fixes is provided below. Upgrading to R 3.1.0

Related:  R

INTRODUCTION TO R WITH EXERCISES - Training Material. Title: Introduction to R with exercises. Author: Yolande Tra for NobleProg Ltd.

150+ R Abbreviations The R programming language includes many abbreviations. Abbreviations exist in function names, argument names, and allowed values for arguments. This post expands on over 150 R abbreviations with the aim of making it easier for users new to R who are trying to memorise R commands. Context: Abbreviations save time when typing and can make for less cumbersome code. However, abbreviations often make it more difficult to remember a command.

Tools on Datavisualization A Carefully Selected List of Recommended Tools 07 May 2012 Tools Flash, JavaScript, Processing, R When I meet with people and talk about our work, I get asked a lot what technology we use to create interactive and dynamic data visualizations. To help you get started, we have put together a selection of the tools we use the most and that we enjoy working with.

Data Analysis (data sheet)
1) Score the individual genotypes
2) Calculate genotype frequencies
3) Calculate allele frequencies
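The last two steps of the data sheet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a single locus with two alleles labelled "A" and "a", genotypes already scored as two-letter strings, and a made-up sample of ten individuals. The function names are hypothetical and do not come from the data sheet.

```python
from collections import Counter

def genotype_frequencies(genotypes):
    """Return the relative frequency of each genotype in the sample."""
    counts = Counter(genotypes)
    total = len(genotypes)
    return {g: n / total for g, n in counts.items()}

def allele_frequencies(genotypes):
    """Return the relative frequency of each allele.

    Each individual contributes two alleles, one per character of its genotype string.
    """
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total = sum(allele_counts.values())
    return {a: n / total for a, n in allele_counts.items()}

# Hypothetical scored genotypes for ten individuals (illustrative data only).
sample = ["AA", "AA", "AA", "AA", "Aa", "Aa", "Aa", "Aa", "aa", "aa"]

geno_freq = genotype_frequencies(sample)
allele_freq = allele_frequencies(sample)
print(geno_freq)
print(allele_freq)
```

Counting alleles directly from the genotype strings avoids the usual hand formula (homozygote frequency plus half the heterozygote frequency) while giving the same result.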

RStudio in the cloud, for dummies You can have your own cloud computing version of R, complete with RStudio. Why should you? It's cool! Plus, there's a lot more power out there than you can easily get on your own hardware. And, it's R in a web page.

R is Not So Hard! A Tutorial, Part 1: Syntax by David Lillis, Ph.D. Many of you have heard of R (the R statistics language and environment for scientific and statistical computing and graphics). Perhaps you know that it uses command line input rather than pull-down menus. Perhaps you feel that this makes R hard to use and somewhat intimidating! OK. Indeed, R has a longer learning curve than other systems, but don’t let that put you off!

Principal Component Analysis step by step In this article I want to explain how a Principal Component Analysis (PCA) works by implementing it in Python step by step. At the end we will compare the results to the more convenient Python PCA() classes that are available through the popular matplotlib and scipy libraries and discuss how they differ. The main purposes of a principal component analysis are the analysis of data to identify patterns and finding patterns to reduce the dimensions of the dataset with minimal loss of information. Here, our desired outcome of the principal component analysis is to project a feature space (our dataset consisting of n x d-dimensional samples) onto a smaller subspace that represents our data "well". A possible application would be a pattern classification task, where we want to reduce the computational costs and the error of parameter estimation by reducing the number of dimensions of our feature space by extracting a subspace that describes our data "best".
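The projection described above can be sketched with NumPy. This is a minimal sketch of the common covariance-matrix route (center, compute the covariance, eigendecompose, keep the top k eigenvectors), not necessarily the code used in the article. The function name `pca_project` and the random data are assumptions made for illustration.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X (n samples x d features) onto the top-k principal components.

    Returns the projected data (n x k) and all d eigenvalues in descending order.
    """
    # Center each feature so the covariance is computed around the mean.
    X_centered = X - X.mean(axis=0)
    # d x d sample covariance matrix (features are columns).
    cov = np.cov(X_centered, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so the directions with the largest variance come first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # The first k eigenvectors span the subspace we project onto.
    W = eigvecs[:, :k]
    return X_centered @ W, eigvals

# Illustrative data only: 40 samples in 3 dimensions with very unequal spread per axis.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3)) * np.array([3.0, 1.0, 0.1])

Z, eigvals = pca_project(X, 2)
print(Z.shape)
print(eigvals)
```

The variance of each projected column equals the corresponding eigenvalue, which is why dropping the directions with the smallest eigenvalues loses the least information.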

kgtests KGTESTS is a Microsoft Excel program that I prepared, which implements the k and g tests to check for signatures of population expansion as described by Reich and Goldstein (1998) and Reich et al. (1999).
1) Reich, D.E., Feldman, M.W. and Goldstein, D.B. (1999) Statistical Properties of Two Tests that Use Multilocus Data Sets to Detect Population Expansions, Mol Biol Evol, 16, 453-466.
2) Reich, D.E. and Goldstein, D.B. (1998) Genetic evidence for a Paleolithic human population expansion in Africa, Proc Natl Acad Sci U S A, 95, 8119-8123.
The associated downloads are listed below and are available free of charge under a general open-license agreement.
1) The program file:
2) The read_me file:

R: Retrieving information from google using the RCurl package « "R" you ready? 01Jan09 Lately I read the article Automatic Meaning Discovery Using Google by Cilibrasi and Vitanyi, which introduces the normalized google distance (NGD) as a measure of semantic relatedness of two search terms.
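For reference, the NGD of two search terms x and y is usually written as (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f counts the pages containing the term(s) and N is the total number of indexed pages. The sketch below only evaluates that formula. It does not retrieve anything from a search engine, and the function name and hit counts are assumptions for illustration.

```python
import math

def normalized_google_distance(fx, fy, fxy, n):
    """Evaluate the NGD formula from page counts.

    fx, fy: number of pages containing each term on its own
    fxy:    number of pages containing both terms
    n:      total number of pages indexed (a rough estimate in practice)
    """
    if min(fx, fy, fxy) <= 0:
        # Terms that never occur, or never occur together, are treated as infinitely distant.
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))

# Hypothetical hit counts (not real search results).
same = normalized_google_distance(1000, 1000, 1000, 10**6)
apart = normalized_google_distance(1000, 100, 10, 10**6)
print(same, apart)
```

Terms that always appear together get a distance of 0, and the distance grows as the joint count shrinks relative to the individual counts.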

Output to a file Problem: You want to save your graph(s) to a file. Solution

Introduction is a web interface for Hadley Wickham's R package ggplot2. It is used as a tool for rapid prototyping, exploratory graphical analysis and education of statistics and R. The interface is written completely in JavaScript, therefore there is no need to install anything on the client side: a standard browser will do.

VC blog Posted: November 26th, 2014 | Author: Manuel Lima | Filed under: Uncategorized | No Comments » As some attentive users of Visual Complexity might have noticed, the number of projects featured on the website has slowly come to a halt, with the perpetual grand total of 777 being a grieving reminder of inactivity for well over a year. Today, if you go to the main page and look at the top right corner, you will see an invigorating new message: “Indexing 782 projects”. Of course I didn’t want to write this blog post to announce that five new projects have been added to the database. This recent addition is part of a larger plan I’ve been wanting to share with you for some time.

Related:  StatR