statistics by Terry M. Therneau, Ph.D., Faculty, Mayo Clinic

About a year ago there was a query about how to do "type 3" tests for a Cox model on the R help list, which someone wanted because SAS does it. The SAS addition looked suspicious to me, but as the author of the survival package I thought I should understand the issue more deeply.

In-depth introduction to machine learning in 15 hours of expert videos

In January 2014, Stanford University professors Trevor Hastie and Rob Tibshirani (authors of the legendary Elements of Statistical Learning textbook) taught an online course based on their newest textbook, An Introduction to Statistical Learning with Applications in R (ISLR). I found it to be an excellent course in statistical learning (also known as “machine learning”), largely due to the high quality of both the textbook and the video lectures. And as an R user, it was extremely helpful that they included R code to demonstrate most of the techniques described in the book. If you are new to machine learning (and even if you are not an R user), I highly recommend reading ISLR from cover to cover to gain both a theoretical and practical understanding of many important methods for regression and classification. It is available as a free PDF download from the authors’ website.

Chapter 1: Introduction (slides, playlist)
Data Gravity

The purpose of this site is to explore Data Gravity and Data Physics. By explore, we mean engage with the community in open discussion, with the goal that everyone in Software, Networking, Data, and Compute benefits in the long term. Data Gravity was a concept first described in this blog post.

Getting started with the 'boot' package in R for bootstrap inference

The package boot has elegant and powerful support for bootstrapping. In order to use it, you have to repackage your estimation function as follows. R has very elegant and abstract notation in array indexes.
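The repackaging mentioned above follows from boot's convention that the statistic function receives the data together with a vector of resampled indices. The original R listing is not reproduced in this excerpt, so what follows is only a minimal Python sketch of that idea, not the boot API. The names `mean_stat`, `bootstrap`, and `percentile_interval`, the sample values, and the seed are all hypothetical and chosen for illustration.

```python
import random
import statistics


def mean_stat(data, indices):
    """Estimator repackaged to take the data plus a list of resampled indices."""
    return statistics.fmean(data[i] for i in indices)


def bootstrap(data, statistic, n_resamples=1000, seed=0):
    """Nonparametric bootstrap: resample indices with replacement and
    evaluate the statistic on each resample (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    replicates = []
    for _ in range(n_resamples):
        indices = [rng.randrange(n) for _ in range(n)]
        replicates.append(statistic(data, indices))
    return replicates


def percentile_interval(replicates, level=0.95):
    """Simple percentile interval read off the sorted replicates."""
    ordered = sorted(replicates)
    alpha = (1.0 - level) / 2.0
    low = ordered[int(alpha * (len(ordered) - 1))]
    high = ordered[int((1.0 - alpha) * (len(ordered) - 1))]
    return low, high


# Made-up sample values, used only to exercise the functions.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]
reps = bootstrap(sample, mean_stat, n_resamples=2000, seed=42)
low, high = percentile_interval(reps)
print(f"observed mean = {statistics.fmean(sample):.3f}")
print(f"95% percentile interval = ({low:.3f}, {high:.3f})")
```

Passing indices rather than a resampled copy keeps the estimator unaware of how the resample was drawn, so the same `bootstrap` helper works for any statistic written against that two-argument signature.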
Some hints for the R beginner

Learning R

“Impatient R” provides a brief grounding in the basics of using R.
Beginner’s Guide to R from Computerworld.
Try R is a fun way of learning some R.
swirl (Statistics With Interactive R Learning) is another interactive environment for learning R.
Forrester's top emerging technologies to watch, now through 2020

Technology has given your customers choices and digital predators the edge.
Bootstrapping

Nonparametric Bootstrapping

The boot package provides extensive facilities for bootstrapping and related resampling methods. You can bootstrap a single statistic (e.g., a median) or a vector (e.g., regression weights). This section will get you started with basic nonparametric bootstrapping.

Impatient R

Translations
français: Translated by Kate Bondareva.
Serbo-Croatian: Translated by Jovana Milutinovich from Geeks Education.

Must read books for Analysts (or people interested in Analytics)

One of the ways I continue my learning is reading. I read for 30 minutes before going to bed every day. This not only ensures that I learn something daily, but also ends my day in a fulfilling manner.

Learn R for beginners with our PDF

With so much emphasis on getting insight from data these days, it's no wonder that R is rapidly rising in popularity. R was designed from day one to handle statistics and data visualization; it's highly extensible, with many new packages aimed at solving real-world problems; and it's open source (read "free"). If you're ready to learn, we have just the ticket: a free PDF of Computerworld's "Beginner's guide to R." Included in this 45-page guide:
40 Free Online Tools and Software to Improve Your Workflow

Jun 08 2011

Charts and graphs are the most effective ways to show the relationship between two different and interlinked entities. On a web page, a comprehensively designed flowchart, diagram or graph can be worth a thousand words.

Random forests - classification description

Contents: Introduction; Overview; Features of random forests; Remarks; How Random Forests work; The oob error estimate; Variable importance; Gini importance; Interactions; Proximities; Scaling; Prototypes; Missing values for the training set; Missing values for the test set; Mislabeled cases; Outliers; Unsupervised learning; Balancing prediction error; Detecting novelties; A case study - microarray data (Classification mode; Variable importance; Using important variables; Variable interactions; Scaling the data; Prototypes; Outliers); A case study - dna data (Missing values in the training set; Missing values in the test set; Mislabeled cases); Case Studies for unsupervised learning (Clustering microarray data; Clustering dna data; Clustering glass data; Clustering spectral data); References

Introduction

This section gives a brief overview of random forests and some comments about the features of the method.