All statistical/computational scientists should use git and github, but it can be hard to get started. I hope these pages help. (More blather below.) There are many resources for git and github; my aim is to provide the minimal guide to get started. I love git and github.
Mining of Massive Datasets
The book has now been published by Cambridge University Press. The publisher is offering a 20% discount to anyone who buys the hard copy. By agreement with the publisher, you can still download it free from this page.
Building an R Hadoop System - RDataMining.com: R and Data Mining
This page shows how to build an R Hadoop system, and presents the steps to set up my first R Hadoop system in single-node mode on Mac OS X. After about two weeks of reading documents and tutorials on MapReduce and Hadoop and playing with RHadoop, I have finally built my first R Hadoop system and successfully run some R examples on it. Here I’d like to share my experience and the steps I took. Hopefully it will make RHadoop easier to try for R users who are new to Hadoop. Note that I tried this on Mac only, and some steps might be different on Windows. Before going through the complex steps below, let’s have a look at what you can get, to give you some motivation to continue.
A Tutorial on Loops in R - Usage and Alternatives
Introduction: In this easy-to-follow R tutorial on loops we will examine the constructs available in R for looping, and how to make use of R's vectorization feature to perform your looping tasks more efficiently. We will present a few looping examples, then criticize and deprecate these in favor of the most popular vectorized alternatives (amongst the very many) available in the rich set of libraries that R offers.
All about the position: Data scientist
Teradata Aster is seeking experienced individuals with demonstrated capability in the applied analytic and/or data science space. Proficiency in data manipulation, analytic algorithms, advanced math, and/or statistical modeling is required, and application development experience is a plus. We are looking for exceptional individuals to join our Professional Services team as Analytic Data Scientists. This client-facing role will be engaged in the design and deployment of solutions.
An R "meta" book
by Joseph Rickert. I am a book person. I collect books on all sorts of subjects that interest me and consequently I have a fairly extensive collection of R books, many of which I find to be of great value. Nevertheless, when I am asked to recommend an R book to someone new to R I am usually flummoxed. R is growing at a fantastic rate, and people coming to R for the first time span a wide range of sophistication. And besides, owning a book is kind of personal.
RStudio Server Amazon Machine Image (AMI) - Louis Aslett
Current AMI Quick Reference (27nd Jun 2015). Amazon instance type reference. Click to launch through AWS web interface. What’s new recently? Easy Dropbox setup to make syncing files on/off server easy, including selective folder sync. Preinstalled RStudioAMI R package for server control.
A Tutorial on Using Functions in R! (and their scoping)
Introduction: In a previous post, we covered part of the R language control flow, the cycles or loop structures. In a subsequent one, we showed how to avoid 'looping' by means of functions that act on compound data in repetitive ways (the apply family of functions). Here, we introduce the notion of a function from the R programmer's point of view and illustrate the range of action that functions have within the R code ('scope'). The post will highlight concepts such as:
Data Science Bootcamp - 12 week career prep
New York City in-person instruction + ongoing career coaching + job placement support. Winter Bootcamp: January 12, 2015 - April 3, 2015. Application period closed.
ONLINE OPEN-ACCESS TEXTBOOKS
Forecasting: principles and practice, by Rob J Hyndman and George Athanasopoulos. Statistical foundations of machine learning.
Highland Statistics Ltd
Page sections: Price and order the book; Outline; Keywords; Table of contents; Data sets and R code used.
The Analytics Edge
In the last decade, the amount of data available to organizations has reached unprecedented levels. Data is transforming business, social interactions, and the future of our society. In this course, you will learn how to use data and analytics to give an edge to your career and your life. We will examine real world examples of how analytics have been used to significantly improve a business or industry.
Thesis: practical tools for exploring data and models
This thesis describes three families of tools for exploring data and models. It is organised in roughly the same way that you perform a data analysis. First, you get the data in a form that you can work with. Chapter 2 describes the reshape framework for restructuring data. Second, you plot the data to get a feel for what is going on.