# Data Analysis Examples

The pages below contain examples (often hypothetical) illustrating the application of different statistical analysis techniques using different statistical packages. Each page provides a handful of examples of when the analysis might be used, along with sample data, an example analysis, and an explanation of the output, followed by references for more information. These pages merely introduce the essence of the technique and do not provide a comprehensive description of how to use it. The combination of topics and packages reflects questions that are often asked in our statistical consulting. As such, it heavily reflects the demand from our clients at walk-in consulting, not the demand of readers from around the world. For grants and proposals, it is also useful to have power analyses corresponding to common data analyses. The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.

Finding the Best Subset of a GAM using Tabu Search and Visualizing It in R

The famous probabilist and statistician Persi Diaconis wrote an article not too long ago about the "Markov chain Monte Carlo (MCMC) Revolution." The paper describes how we are able to solve a diverse set of problems with MCMC. The first example he gives is a text decryption problem solved with a simple Metropolis-Hastings sampler. I was always stumped by those cryptograms in the newspaper and thought it would be pretty cool if I could crack them with statistics. So I decided to try it out on my own. The example Diaconis gives is fleshed out in more detail by its original authors in their own article.

The decryption I will be attempting is of a substitution cipher, where each letter of the alphabet corresponds to another letter (possibly the same one). The strategy is to use a reference text to create transition probabilities from each letter to the next. To create a transition matrix, I downloaded War and Peace from Project Gutenberg.
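The strategy described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the original post's code: a short hard-coded string stands in for War and Peace, the space character is kept fixed, and the names `transition_matrix`, `log_score`, and `decrypt` are hypothetical. With such a tiny reference text the sampler is unlikely to recover the message; the point is only to show the moving parts.

```r
# 26 letters plus space; the space is never permuted (an assumption of this sketch).
symbols <- c(letters, " ")

# Hypothetical stand-in for the reference text (the post used War and Peace).
reference <- paste("the quick brown fox jumps over the lazy dog and then",
                   "the dog sleeps in the sun while the fox runs through the forest")

# Estimate P(next symbol | current symbol) from the reference text.
transition_matrix <- function(text) {
  chars <- strsplit(tolower(text), "")[[1]]
  chars <- chars[chars %in% symbols]
  # Start every count at 1 so unseen transitions keep a nonzero probability.
  counts <- matrix(1, 27, 27, dimnames = list(symbols, symbols))
  from <- chars[-length(chars)]
  to <- chars[-1]
  for (i in seq_along(from)) {
    counts[from[i], to[i]] <- counts[from[i], to[i]] + 1
  }
  counts / rowSums(counts)  # each row sums to 1
}

# Log-likelihood of the text after decoding it with a candidate key.
# `key` is a named character vector: names are cipher symbols, values are plain symbols.
log_score <- function(text, key, P) {
  plain <- unname(key[strsplit(text, "")[[1]]])
  sum(log(P[cbind(plain[-length(plain)], plain[-1])]))
}

# Metropolis-Hastings over keys: propose swapping two letters, accept with
# probability min(1, exp(proposed - current)).
decrypt <- function(cipher, P, n_iter = 2000) {
  key <- setNames(symbols, symbols)  # start from the identity key
  current <- log_score(cipher, key, P)
  for (i in seq_len(n_iter)) {
    swap <- sample(26, 2)            # two distinct letter positions
    proposal <- key
    proposal[swap] <- key[rev(swap)]
    proposed <- log_score(cipher, proposal, P)
    if (log(runif(1)) < proposed - current) {
      key <- proposal
      current <- proposed
    }
  }
  decoded <- paste(key[strsplit(cipher, "")[[1]]], collapse = "")
  list(key = key, text = decoded, score = current)
}

set.seed(1)
P <- transition_matrix(reference)
plain <- "the fox runs through the forest"
cipher <- chartr(paste(letters, collapse = ""),
                 paste(sample(letters), collapse = ""), plain)
result <- decrypt(cipher, P)
print(result$text)
```

The swap proposal is symmetric, so the acceptance ratio reduces to the ratio of likelihoods. Working on the log scale avoids numerical underflow for longer texts.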

Brandon Foltz, M.Ed.

This is the first video in what will be, or is (depending on when you are watching this) a multipart video series about Simple Linear Regression. In the next few minutes we will cover the basics of Simple Linear Regression starting at square one. And for the record, from now on if I say "regression" I am referring to simple linear regression as opposed to multiple regression or models that are not linear. Regression allows us to model, mathematically, the relationship between two or more variables. For now, we will be working with just two variables: an independent variable and a dependent variable. So in this video, we are going to talk about that idea. So if you are new to Regression or are still trying to figure out exactly what it even IS…this video is for you. So sit back, relax, and let's go ahead and get to work. My complete video library, organized by playlist, is on my video page.
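To make the two-variable idea concrete, here is a small sketch in R. The data are invented purely for illustration (they do not come from the video), and the column names `hours` and `score` are hypothetical.

```r
# Hypothetical data: hours studied (independent) and exam score (dependent).
d <- data.frame(
  hours = c(1, 2, 3, 4, 5),
  score = c(52, 55, 61, 64, 68)
)

# Simple linear regression: model score as a straight-line function of hours.
fit <- lm(score ~ hours, data = d)

# Intercept and slope of the fitted line (47.7 and 4.1 for these made-up numbers).
print(coef(fit))
```

The slope is read as the expected change in the dependent variable for a one-unit change in the independent variable.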

Formulae in R: ANOVA and other models, mixed and fixed | Just the kind of thing you were expecting

R's formula interface is sweet but sometimes confusing. ANOVA is seldom sweet and almost always confusing. And random (a.k.a. mixed) versus fixed effects decisions seem to hurt people's heads too. In the following, assume that Y is a dependent variable and A, B, C, etc. are predictors, all contained in data frame `d`.

Formula Recap

If you use R then you probably already know this, but let's recap anyway. A model with main effects only is written `lm(Y ~ A + B, data=d)`.

Interactions are expressed succinctly with the asterisk, `lm(Y ~ A * B, data=d)`, or equivalently but more explicitly by specifying component parts using the colon notation, like `lm(Y ~ A + B + A:B, data=d)`.

This is useful for more complex interaction structures, e.g. `lm(Y ~ A * B * C, data=d)`, which contains all main effects, all two-way interactions, and a three-way interaction. `lm(Y ~ A + B + C + A:B + A:C + B:C, data=d)` is the same except for having no three-way interaction.

That last model can also be written `lm(Y ~ (A + B + C)**2, data=d)` or `lm(Y ~ (A + B + C)^2, data=d)`, because inside a formula `^` (for which `**` is a synonym) means crossing up to the given order, not arithmetic exponentiation. For the same reason, `lm(Y ~ A + B + B**2, data=d)` does not add a squared term: `B**2` reduces to `B`. A squared term has to be wrapped as `I(B^2)`.

Classical ANOVA
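The expansions in the formula recap above can be checked without fitting anything, by asking R which terms a formula expands to. This is a small sketch; `term_labels` is a hypothetical helper name, while `terms()` and its `"term.labels"` attribute are part of base R's `stats` package.

```r
# Hypothetical helper: list the model terms a formula expands to.
term_labels <- function(f) attr(terms(f), "term.labels")

# The asterisk expands to main effects plus the interaction.
print(term_labels(Y ~ A * B))            # "A" "B" "A:B"

# Three-way crossing: main effects, two-way and three-way interactions.
print(term_labels(Y ~ A * B * C))

# ^2 (or its synonym **2) keeps everything up to two-way interactions.
print(term_labels(Y ~ (A + B + C)^2))    # "A" "B" "C" "A:B" "A:C" "B:C"
print(term_labels(Y ~ (A + B + C)**2))   # same terms as the line above

# Inside a formula B^2 is "B crossed with B", which is just B...
print(term_labels(Y ~ A + B + B^2))      # "A" "B"

# ...so an arithmetic square has to be protected with I().
print(term_labels(Y ~ A + B + I(B^2)))   # "A" "B" "I(B^2)"
```

Inspecting `terms()` this way is a cheap sanity check before passing a complicated formula to `lm()`.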