
Data science


Visualizing Facebook Friends: Eye Candy in R. Earlier this week I published a data visualization on the Facebook Engineering blog which, to my surprise, has received a lot of media coverage. I’ve received a lot of comments about the image, many asking for more details on how I created it.

Visualizing Facebook Friends: Eye Candy in R

When I tell people I used R, the reaction I get is roughly what I would expect if I told them I made it with Microsoft Paint and a bottle of Jägermeister. Some people even questioned whether it was actually done in R. The truth is, aside from the addition of the logo and date text, the image was produced entirely with about 150 lines of R code with no external dependencies. In the process I learned a few things about creating nice-looking graphs in R.

Transparency and Faking It

My first attempt at plotting the data involved plotting very transparent lines. The solution was to manipulate the drawing order of the lines.

Great Circles Euclidean Distance.

VersionEye.

Package reference: A.
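The two ideas named in the Facebook friends excerpt above, controlling the drawing order of lines and following great circles, can be illustrated with a short sketch. This is written in Python rather than R and is not the author's code, which is not reproduced in the excerpt. The function names, the longest-first ordering rule, and the default point count are assumptions made for illustration only.

```python
import math


def central_angle(lon1, lat1, lon2, lat2):
    """Angle in radians between two lon/lat points given in degrees (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * math.asin(math.sqrt(min(1.0, h)))


def great_circle_points(lon1, lat1, lon2, lat2, n=50):
    """Return n (lon, lat) points in degrees along the great circle between two points.

    Antipodal endpoints are not handled: the great circle between them is not unique.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    d = central_angle(lon1, lat1, lon2, lat2)
    if d == 0.0:
        return [(lon1, lat1)] * n
    phi1, lam1 = math.radians(lat1), math.radians(lon1)
    phi2, lam2 = math.radians(lat2), math.radians(lon2)
    points = []
    for i in range(n):
        f = i / (n - 1)  # fraction of the way from start to end
        a = math.sin((1 - f) * d) / math.sin(d)
        b = math.sin(f * d) / math.sin(d)
        # Weighted sum of the two endpoint unit vectors, then back to lon/lat.
        x = a * math.cos(phi1) * math.cos(lam1) + b * math.cos(phi2) * math.cos(lam2)
        y = a * math.cos(phi1) * math.sin(lam1) + b * math.cos(phi2) * math.sin(lam2)
        z = a * math.sin(phi1) + b * math.sin(phi2)
        points.append((math.degrees(math.atan2(y, x)),
                       math.degrees(math.atan2(z, math.hypot(x, y)))))
    return points


def order_longest_first(pairs):
    """Sort ((lon, lat), (lon, lat)) pairs so the longest lines come first.

    Assumed rule: lines drawn first end up underneath, so short lines stay visible on top.
    """
    return sorted(pairs, key=lambda p: central_angle(*p[0], *p[1]), reverse=True)
```

Each list returned by `great_circle_points` can be handed to any line-drawing routine. Drawing the pairs in the order given by `order_longest_first` is one way to keep long connections from covering short ones.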

Pretty R syntax highlighter.

Maps with R (I). This is the first post of a short series showing some code I have learnt to produce maps with R.

Maps with R (I)

Some time ago I found this infographic from The New York Times (via this page) and I wondered how a multivariate choropleth map could be produced with R. Here is the code I have arranged to show the results of the last Spanish general elections in a similar fashion. Some packages are needed: Let’s start with the data, which is available here (thanks to Emilio Torres, who “massaged” the original dataset, available from here).

Each region of the map will represent the percentage of votes obtained by the predominant political option. The Spanish administrative boundaries are available as shapefiles at the INE webpage (~70 MB): (EDITED, following the question of Sandra). Then we shift the coordinates of the islands: and finally construct a new object binding the shifted islands with the peninsula: The last step before drawing the map is to link the data with the polygons: So let’s draw the map.

Datavisualization.ch Selected Tools.

The Growth of the Data Scientist.

Greg J. Smith.

37 Data-ish Blogs You Should Know About.
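The data-preparation steps described in the Maps with R excerpt above (finding the predominant option per region, shifting island coordinates, and linking results to polygons) can be sketched as follows. This is a minimal Python illustration, not the R code from the post, which did not survive in the excerpt. The function names, the dictionary layout, and the region codes are hypothetical.

```python
def predominant_option(votes):
    """Return (option, percentage) for the option with the most votes in one region.

    `votes` maps an option name to its vote count (hypothetical layout).
    """
    total = sum(votes.values())
    if total == 0:
        raise ValueError("region has no votes")
    winner = max(votes, key=votes.get)
    return winner, 100.0 * votes[winner] / total


def shift_polygon(coords, dx, dy):
    """Translate a list of (x, y) vertices, e.g. to draw islands nearer the mainland."""
    return [(x + dx, y + dy) for x, y in coords]


def link_data(polygons, results):
    """Join polygon outlines and vote results on a shared region code.

    `polygons` maps region code -> list of (x, y) vertices.
    `results` maps region code -> {option: votes}.
    Regions without results are skipped.
    """
    linked = {}
    for code, coords in polygons.items():
        if code in results:
            winner, pct = predominant_option(results[code])
            linked[code] = {"coords": coords, "winner": winner, "pct": pct}
    return linked
```

A choropleth can then colour each region by `winner` and set the shade from `pct`. Keeping the join keyed on an explicit region code, rather than on row order, avoids silently pairing a polygon with the wrong result.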

You might not know it, but there are actually a ton of data and visualization blogs out there.

37 Data-ish Blogs You Should Know About

I'm a bit of a feed addict, subscribing to just about anything with a chart or a mention of statistics on it (and naturally I have to do some feed-cleaning every now and then). In a follow-up to my short list last year, here are the data-ish blogs, some old and some new, that continue to post interesting stuff.

Data and Statistics

By the Numbers - Column from The New York Times visual Op-ed columnist, Charles Blow, who also used to be NYT's graphics director.
Data Mining - Matthew Hurst, scientist at Microsoft's MSN, also the co-creator of BlogPulse.
Statistical Modeling - We might disagree on certain things, but Andrew's blog is one of the few active pure statistics blogs.
The Numbers Guy - Data-minded reporting from Carl Bialik of the Wall Street Journal.
Basketball Geek - Like statistical analysis and basketball?

Statistical/Analytical Visualization

Maps

Design & Infographics

Others Worth Noting

Neoformix - Discovering and Illustrating Patterns in Data.