
Statistics


www.stat.cmu.edu/~hseltman/309/Book/chapter9.pdf. Testing the assumptions of linear regression. Quantitative models always rest on assumptions about the way the world works, and regression models are no exception. There are four principal assumptions which justify the use of linear regression models for purposes of prediction: (i) linearity of the relationship between the dependent and independent variables; (ii) independence of the errors (no serial correlation); (iii) homoscedasticity (constant variance) of the errors, both versus time and versus the predictions (or versus any independent variable); and (iv) normality of the error distribution. If any of these assumptions is violated (i.e., if there is nonlinearity, serial correlation, heteroscedasticity, and/or non-normality), then the forecasts, confidence intervals, and economic insights yielded by a regression model may be (at best) inefficient or (at worst) seriously biased or misleading.
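A minimal base-R sketch of how these four checks are often run in practice; the model and the built-in mtcars data are illustrative stand-ins, not from the text:

```r
# Illustrative diagnostics for a fitted linear model (assumed model/data).
fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- residuals(fit)

# (i) linearity and (iii) homoscedasticity: residuals vs. fitted values
plot(fitted(fit), res, main = "Residuals vs Fitted")
abline(h = 0, lty = 2)

# (ii) independence: autocorrelation plot of the residuals
acf(res, main = "ACF of residuals")

# (iv) normality: normal Q-Q plot plus a Shapiro-Wilk test
qqnorm(res); qqline(res)
shapiro.test(res)
```

Patterned residuals, spikes in the ACF, or a bent Q-Q plot each point at one of the four violations listed above.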

How to detect: The best test for residual autocorrelation is to look at an autocorrelation plot of the residuals.

When Unequal Sample Sizes Are and Are NOT a Problem in ANOVA. In your statistics class, your professor made a big deal about unequal sample sizes in one-way Analysis of Variance (ANOVA) for two reasons. 1. Because she was making you calculate everything by hand. Sums of squares require a different formula if sample sizes are unequal, but SPSS (and other statistical software) will automatically use the right formula. 2.

Nice properties in ANOVA, such as the Grand Mean being the intercept in an effect-coded regression model, don't hold when data are unbalanced. Instead of the grand mean, you need to use a weighted mean. That's not a big deal if you're aware of it. The only practical issue in one-way ANOVA is that very unequal sample sizes can affect the homogeneity of variance assumption. Real issues with unequal sample sizes do occur in factorial ANOVA, if the sample sizes are confounded in the two (or more) factors. Tagged as: Analysis of Variance, ANOVA, SPSS, Unequal sample sizes.

R Tutorial Series: Two-Way ANOVA with Unequal Sample Sizes. When the sample sizes within the levels of our independent variables are not equal, we have to handle our ANOVA differently than in the typical two-way case. This tutorial will demonstrate how to conduct a two-way ANOVA in R when the sample sizes within each level of the independent variables are not the same.
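The weighted-mean point above can be sketched with made-up numbers: with unequal group sizes, the plain average of the group means no longer equals the grand mean, but the size-weighted average does.

```r
# Made-up groups of unequal size.
g1 <- c(4, 5, 6)   # n = 3
g2 <- c(10, 12)    # n = 2

group_means <- c(mean(g1), mean(g2))   # 5 and 11
unweighted  <- mean(group_means)       # (5 + 11) / 2 = 8
grand_mean  <- mean(c(g1, g2))         # 37 / 5 = 7.4

# The grand mean equals the sample-size-weighted average of the group means.
weighted <- weighted.mean(group_means, w = c(length(g1), length(g2)))
all.equal(grand_mean, weighted)        # TRUE
```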

Tutorial Files Before we begin, you may want to download the sample data (.csv) used in this tutorial. Be sure to right-click and save the file to your R working directory.

Beginning Steps To begin, we need to read our dataset into R and store its contents in a variable.

> #read the dataset into an R variable using the read.csv(file) function
> dataTwoWayUnequalSample <- read.csv("dataset_ANOVA_TwoWayUnequalSample.csv")
> #display the data
> dataTwoWayUnequalSample

[Figure: the first ten rows of our dataset]

Unequal Sample Sizes In our study, 16 students participated in the online environment, whereas only 14 participated in the offline environment.

Weighted Means ANOVA using Type I Sums of Squares.

Two-way anova. When to use it You use a two-way anova (also known as a factorial anova, with two factors) when you have one measurement variable and two nominal variables. The nominal variables (often called "factors" or "main effects") are found in all possible combinations. For example, let's say you are testing the null hypothesis that stressed and unstressed rats have the same glycogen content in their gastrocnemius muscle, and you are worried that there might be sex-related differences in glycogen content as well.
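The Type I (sequential) sums of squares mentioned above are what `anova()` reports on an `aov`/`lm` fit, and with unbalanced data they depend on the order in which the factors enter the model. A sketch with made-up unbalanced data:

```r
# Made-up unbalanced two-factor data (A: 6 vs 4 observations).
set.seed(1)
d <- data.frame(
  A = factor(c(rep("a1", 6), rep("a2", 4))),
  B = factor(c(rep(c("b1", "b2"), 3), rep("b1", 3), "b2")),
  y = rnorm(10)
)

a1 <- anova(lm(y ~ A + B, data = d))  # SS for A computed first
a2 <- anova(lm(y ~ B + A, data = d))  # SS for A computed after B
a1; a2  # with unbalanced cells, the sums of squares for A (and B) differ
```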

The two factors are stress level (stressed vs. unstressed) and sex (male vs. female). A two-way anova may be done with replication (more than one observation for each combination of the nominal variables) or without replication (only one observation for each combination of the nominal variables). Assumptions Two-way anova, like all anovas, assumes that the observations within each cell are normally distributed and have equal variances.
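A quick way to eyeball that within-cell equal-variance assumption in R, using made-up glycogen numbers for the rat example above:

```r
# Made-up data for the stressed/unstressed x male/female rat example.
d <- data.frame(
  stress   = factor(rep(c("stressed", "unstressed"), each = 6)),
  sex      = factor(rep(c("male", "female"), times = 6)),
  glycogen = c(98, 110, 105, 102, 99, 107, 120, 115, 118, 122, 117, 121)
)

# Variance of glycogen within each stress-by-sex cell; roughly similar
# values are consistent with the equal-variance assumption.
v <- tapply(d$glycogen, list(d$stress, d$sex), var)
v
```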

Two-way anova with replication Examples Similar tests.

www.stat.cmu.edu/~hseltman/309/Book/chapter11.pdf. www.angelfire.com/wv/bwhomedir/notes/anova2.pdf. Two-way ANOVA. www.calvin.edu/~rpruim/courses/m243/F03/handouts/anova2.pdf.

Stats: One-Way ANOVA. A One-Way Analysis of Variance is a way to test the equality of three or more means at one time by using variances. Assumptions The populations from which the samples were obtained must be normally or approximately normally distributed. The samples must be independent. The variances of the populations must be equal. Hypotheses The null hypothesis will be that all population means are equal; the alternative hypothesis is that at least one mean is different. In the following, lower case letters apply to the individual samples and capital letters apply to the entire set collectively.

Grand Mean The grand mean of a set of samples is the total of all the data values divided by the total sample size. Another way to find the grand mean is to find the weighted average of the sample means. Total Variation The total variation (not variance) comprises the sum of the squares of the differences of each data value from the grand mean. It is made up of the between group variation and the within group variation.
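These definitions are easy to verify numerically; a sketch with made-up samples:

```r
# Three made-up samples of unequal size.
g1 <- c(3, 5, 7)
g2 <- c(6, 8)
g3 <- c(9, 11, 13, 15)
all_y <- c(g1, g2, g3)

grand <- mean(all_y)  # total of all data values / total sample size
ns    <- c(length(g1), length(g2), length(g3))
means <- c(mean(g1), mean(g2), mean(g3))
all.equal(weighted.mean(means, ns), grand)   # the weighted-average route

# Total variation = between-group variation + within-group variation.
ss_total   <- sum((all_y - grand)^2)
ss_between <- sum(ns * (means - grand)^2)
ss_within  <- sum((g1 - mean(g1))^2) + sum((g2 - mean(g2))^2) +
              sum((g3 - mean(g3))^2)
all.equal(ss_total, ss_between + ss_within)  # TRUE
```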

Stats: Two-Way ANOVA. The two-way analysis of variance is an extension to the one-way analysis of variance. There are two independent variables (hence the name two-way). Assumptions The populations from which the samples were obtained must be normally or approximately normally distributed. The samples must be independent. The variances of the populations must be equal. Hypotheses There are three sets of hypotheses with the two-way ANOVA. The null hypotheses for each of the sets are: the population means of the first factor are equal; the population means of the second factor are equal; and there is no interaction between the two factors. Factors The two independent variables in a two-way ANOVA are called factors.
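The three tests can be read off a single `aov()` fit in R; the data here are simulated purely for illustration:

```r
# Illustrative two-way ANOVA: rows A, B, and A:B in the summary correspond
# to the three null hypotheses (first factor, second factor, interaction).
set.seed(2)
d <- expand.grid(A   = factor(c("a1", "a2")),
                 B   = factor(c("b1", "b2", "b3")),
                 rep = 1:4)
d$y <- rnorm(nrow(d))

fit <- aov(y ~ A * B, data = d)
summary(fit)
```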

Treatment Groups Treatment groups are formed by making all possible combinations of the two factors. As an example, let's assume we're planting corn. The data that actually appear in the table are samples. Main Effect The main effect involves the independent variables one at a time. Interaction Effect The interaction effect is the effect that one factor has on the other factor; that is, whether the effect of one factor depends on the level of the other. Within Variation.

Gapminder: Unveiling the beauty of statistics for a fact based world view. - Gapminder.org.

www.psych.uw.edu/writingcenter/writingguides/pdf/stats.pdf. www4.uwsp.edu/psych/cw/statistics/Wendorf-ReportingStatistics.pdf.