Must-Have Tools for Making Your Own Data Visualizations Data visualizations, the more intelligent sibling of infographics, can be a wildly productive way to make sense of massive amounts of data. The problem is, this power typically comes with a steep learning curve. Now, thanks to a set of tools gathered together by the awesome folks behind Datavisualization.ch, creating data visualizations is a bit more approachable. If you’re looking to get your hands dirty, visit the site and pick any of the tools at random. The list of tools includes Web apps like Colorbrewer, a tool for selecting colors for maps; DataWrangler, an interactive web application for data cleaning and transformation; GeoCommons, a public community and set of tools to access, visualize and analyze data with compelling map visualizations; and Impure/Quadrigram, a visual programming language designed to gather, process and visualize information. From the Datavisualization.ch team: ➤ Datavisualization.ch Selected Tools
Guide to Speeding Up R Code This is an overview of tools for speeding up your R code that I wrote for the Davis R Users’ Group. First, Ask “Why?” It’s customary to quote Donald Knuth at this point, but instead I’ll quote my twitter buddy Ted Hart to illustrate a point: I’m just going to say it.I like for loops in #Rstats, makes my code readable.All you [a-z]*ply snobs can shove it! — Ted Hart (@DistribEcology) March 12, 2013 Code optimization is a matter of personal taste and priorities. (1) Do you want your code to be readable? If you need to explain your code to yourself or others, or you will need to return to it in a few months’ time and understand what you wrote, it’s important that you write it in a way that is easy to understand. Some optimal code can be hard to read. (2) Do you want your code to be sharable? Most of the considerations of (1) apply here, but they have to be balanced with the fact that, if your code is painfully slow, others are not going to want or have time to use it. No? compare
ProjectTemplate
Datavisualization.ch Selected Tools
The Guerilla Guide to R Update: Okay. I’ve uploaded a new template and things seem to be fine now. Update: I am aware the table of contents is not being displayed in bullet form as I intended. The web template I’m using seems to be buggy. It also seems to think this page is in Indonesian… Working on it! Table of Contents: About: Stack Overflow is awesome. This is why I’ve collated The Guerilla Cookbook for R. The cool thing is, this “book” essentially writes itself, since most of the experts (and peer-reviewers) are answering the questions. How Was The Content Selected? I personally searched through Stack Overflow to find my favorite questions and shared them here.
Introducing R The purpose of these notes, an update of my 1992 handout Introducing S-Plus, is to provide a quick introduction to R, particularly as a tool for fitting linear and generalized linear models. Additional examples may be found in the R Logs section of my GLM course. R is a powerful environment for statistical computing which runs on several platforms. These notes are written specially for users running the Windows version, but most of the material applies to the Mac and Linux versions as well. 1.1 The R Language and Environment R was first written as a research project by Ross Ihaka and Robert Gentleman, and is now under active development by a group of statisticians called 'the R core team', with a home page at www.r-project.org. R was designed to be 'not unlike' the S language developed by John Chambers and others at Bell Labs. R is available free of charge and is distributed under the terms of the Free Software Foundation's GNU General Public License. 1.2 Bibliographic Remarks
What Facebook Knows Photographs by Leah Fasten If Facebook were a country, a conceit that founder Mark Zuckerberg has entertained in public, its 900 million members would make it the third largest in the world. It would far outstrip any regime past or present in how intimately it records the lives of its citizens. And yet, even as Facebook has embedded itself into modern life, it hasn’t actually done that much with what it knows about us. Few Privacy Regulations Inhibit Facebook Laws haven't kept up with the company's ability to mine its users' data. Heading Facebook’s effort to figure out what can be learned from all our data is Cameron Marlow, a tall 35-year-old who until recently sat a few feet away from Zuckerberg. Facebook has all this information because it has found ingenious ways to collect data as people socialize. Contagious Information Social Engineering This is just the beginning.
Google Dev R Video Lectures I got these Google Developers R Programming Video Lectures from Stephen's blog, Getting Genetics Done. A very useful R tutorial for beginners! Short and efficient. Here is what I learned after watching the lectures: stop() and warning() functions: I was asked this question during a job interview. stop('message') prints the error message and stops the function. warning('message') prints the warning message but lets the function continue. Passing additional arguments using an ellipsis (...): for example, myFunc <- function(a, ...) return(colMeans(a, ...)) will allow us to call the function like myFunc(a, na.rm=T). You can also record the ellipsis arguments with args=list(...). return() vs. invisible(): return() returns the value, which is printed to the screen when the function is called at the top level. invisible() returns the value but does not print it to the screen. Use Recall() to call a function recursively in R.
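To make these points concrete, here is a minimal sketch in base R. The function names col_means, double_quietly and fact are made up for illustration and do not come from the lectures.

```r
# Hypothetical helper: forwards extra arguments to colMeans() through '...'
col_means <- function(a, ...) {
  args <- list(...)  # record the ellipsis arguments so we can inspect them
  if (!is.matrix(a) && !is.data.frame(a)) {
    stop("'a' must be a matrix or data frame")  # signals an error and aborts
  }
  if (anyNA(a) && !isTRUE(args$na.rm)) {
    warning("'a' contains NA values")  # signals a warning, then carries on
  }
  colMeans(a, ...)
}

# invisible() returns the value without auto-printing it at the top level
double_quietly <- function(x) invisible(x * 2)

# Recall() calls the function it is used in, so the recursion keeps
# working even if the function is later assigned to another name
fact <- function(n) if (n <= 1) 1 else n * Recall(n - 1)

m <- matrix(c(1, NA, 3, 4), nrow = 2)
col_means(m, na.rm = TRUE)  # na.rm travels through '...' to colMeans()
y <- double_quietly(21)     # nothing is printed, but y holds the result
fact(5)
```

Calling col_means(m) without na.rm = TRUE should raise the warning and still return a result, while col_means("x") should stop with an error.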
In-depth introduction to machine learning in 15 hours of expert videos In January 2014, Stanford University professors Trevor Hastie and Rob Tibshirani (authors of the legendary Elements of Statistical Learning textbook) taught an online course based on their newest textbook, An Introduction to Statistical Learning with Applications in R (ISLR). I found it to be an excellent course in statistical learning (also known as “machine learning”), largely due to the high quality of both the textbook and the video lectures. And as an R user, it was extremely helpful that they included R code to demonstrate most of the techniques described in the book. If you are new to machine learning (and even if you are not an R user), I highly recommend reading ISLR from cover-to-cover to gain both a theoretical and practical understanding of many important methods for regression and classification. It is available as a free PDF download from the authors’ website. Chapter 1: Introduction (slides, playlist) Chapter 2: Statistical Learning (slides, playlist) Interviews (playlist)
Episode #5 – How To Learn Data Visualization (with Andy Kirk) | Data Stories Hi Folks! We love Andy so much that we decided to keep him with us for another episode (well, actually we hope somebody will eventually pay the ransom). This time we talk about “learning visualization”, which is the perfect topic for him given his experience with his visualization training courses. In the past, we received many requests from people who wanted to know how to learn visualization. So, here we are with a more than one hour long podcast with the three of us talking about it. Breakdown of the episode Introductory thoughts 00:00:00 Intro, Andy Kirk is again our guest 00:01:15 Topic: How to learn visualization 00:01:56 Multidisciplinarity 00:06:31 Reports from teaching practice 00:09:21 Theory and practice – rules vs. free exploration 00:12:24 Do you need to start with a question? Basic skills 00:15:43 What is the basic skill set to learn? Learning options and books 00:39:46 Everybody should have a datavis course! Resources and Links That’s all folks.
Creating your personal, portable R code library with GitHub As I discussed in a previous post, I have a few helper functions I’ve created that I commonly use in my work. Until recently, I manually included these functions at the start of my R scripts by either the tried and true copy-and-paste method, or by extracting them from a local file with the source() function. The former approach has the benefit of keeping the helper code inextricably attached to the main script, but it adds a good bit of code to wade through. The latter approach keeps the code cleaner, but requires that whoever is running the code always has access to the sourced file and that it is always in the same relative path – and that makes sharing or moving code more difficult. The resulting approach takes advantage of GitHub Gists and R’s ability to source via a web-based location to enable you to create a personal, portable library of R functions for private use or to share. The process is very straightforward.
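As a rough sketch of the sourcing step, the example below loads a helper from a file with source(). A temporary local file stands in for the Gist so the code runs offline; the helper se() and the environment name helpers are made up for illustration. Since source() also accepts a URL string as its file argument, the same call should work with the raw address of a Gist in place of the temporary path.

```r
# Stand-in for the shared library file; in practice this would be a Gist
helper_file <- tempfile(fileext = ".R")
writeLines("se <- function(x) sd(x) / sqrt(length(x))", helper_file)

# Load the helpers into their own environment instead of the global
# workspace, so sourced names cannot silently overwrite existing objects
helpers <- new.env()
source(helper_file, local = helpers)

# The sourced function is now available through the environment
helpers$se(c(1, 2, 3, 4))
```

Keeping the helpers in a separate environment is a design choice, not a requirement: a plain source(helper_file) would define se() directly in the calling environment, which is closer to the copy-and-paste behaviour described above.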
Quick-R: Home Page