
Standard Deviation and Variance

Deviation just means how far from the normal.

Standard Deviation

The Standard Deviation is a measure of how spread out numbers are. Its symbol is σ (the Greek letter sigma). The formula is easy: it is the square root of the Variance.

Variance

The Variance is defined as the average of the squared differences from the Mean. To calculate the variance, follow these steps:

1. Work out the Mean (the simple average of the numbers).
2. Then for each number: subtract the Mean and square the result (the squared difference).
3. Then work out the average of those squared differences.

Example

You and your friends have just measured the heights of your dogs (in millimeters). The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and 300mm. Find out the Mean, the Variance, and the Standard Deviation.

Your first step is to find the Mean:

Answer: Mean = (600 + 470 + 170 + 430 + 300) / 5 = 1970 / 5 = 394

so the mean (average) height is 394 mm.

Now we calculate each dog's difference from the Mean: 206, 76, -224, 36 and -94. To get the Variance, square each difference and average the results:

Variance = (206² + 76² + (-224)² + 36² + (-94)²) / 5 = 108,520 / 5 = 21,704

So the Variance is 21,704.

Formulas
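The three steps above can be sketched in a few lines of Python. This is only an illustration: the function name `population_variance` is made up for this sketch, and it divides by the number of values (as the dog example does) rather than by one less.

```python
from math import sqrt

def population_variance(values):
    """Average of the squared differences from the mean."""
    mean = sum(values) / len(values)                    # step 1: the mean
    squared_diffs = [(v - mean) ** 2 for v in values]   # step 2: squared differences
    return sum(squared_diffs) / len(values)             # step 3: their average

heights_mm = [600, 470, 170, 430, 300]  # dog heights from the example
variance = population_variance(heights_mm)
std_dev = sqrt(variance)                # standard deviation = square root of variance

print(variance)        # 21704.0
print(round(std_dev))  # 147
```

The Standard Deviation for the dogs therefore comes out at about 147 mm.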

Quartiles

Quartiles are the values that divide a list of numbers into quarters. First put the list of numbers in order, then cut the list into four equal parts. The Quartiles are at the "cuts".

Like this:

Example: 5, 8, 4, 4, 6, 3, 8

Put them in order: 3, 4, 4, 5, 6, 8, 8

Cut the list into quarters, and the result is:

Quartile 1 (Q1) = 4
Quartile 2 (Q2), which is also the Median, = 5
Quartile 3 (Q3) = 8

Sometimes a "cut" is between two numbers ... the Quartile is the average of the two numbers.

Example: 1, 3, 3, 4, 5, 6, 6, 7, 8, 8

The numbers are already in order. In this case Quartile 2 is half way between 5 and 6:

Quartile 1 (Q1) = 3
Quartile 2 (Q2) = 5.5
Quartile 3 (Q3) = 7

Interquartile Range

The "Interquartile Range" is from Q1 to Q3. To calculate it just subtract Quartile 1 from Quartile 3, like this:

Example: The Interquartile Range is:

Box and Whisker Plot

You can show all the important values in a "Box and Whisker Plot", like this:

A final example covering everything:

Put them in order:

Cut it into quarters:
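The "cut the ordered list" idea can be sketched in Python as below. This is a minimal illustration that assumes one particular convention (Q1 and Q3 are the medians of the lower and upper halves, leaving out the middle value when the count is odd); statistical software uses several different conventions, so other tools may report slightly different quartiles. The function names are made up for this sketch.

```python
def median(sorted_values):
    """Middle value of an already sorted list (average of the two middle values if even)."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # With an odd count the middle value belongs to neither half
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    return median(lower), median(data), median(upper)

def interquartile_range(values):
    """Q3 minus Q1."""
    q1, _, q3 = quartiles(values)
    return q3 - q1

print(quartiles([5, 8, 4, 4, 6, 3, 8]))              # (4, 5, 8)
print(quartiles([1, 3, 3, 4, 5, 6, 6, 7, 8, 8]))     # (3, 5.5, 7)
print(interquartile_range([5, 8, 4, 4, 6, 3, 8]))    # 4
```

Both results match the quartiles worked out by hand in the two examples above.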

Standard Deviation

The standard deviation $\sigma$ of a probability distribution is defined as the square root of the variance $\sigma^2$,

$$\sigma = \sqrt{\langle x^2 \rangle - \langle x \rangle^2} = \sqrt{\mu_2' - \mu^2},$$

where $\mu = \langle x \rangle$ is the mean, $\mu_2' = \langle x^2 \rangle$ is the second raw moment, and $\langle x \rangle$ denotes the expectation value of $x$. The variance $\sigma^2$ is therefore equal to the second central moment (i.e., moment about the mean), $\sigma^2 = \mu_2$.

The square root of the sample variance of a set of $N$ values is the sample standard deviation

$$s_N = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2}.$$

The sample standard deviation distribution is a slightly complicated, though well-studied and well-understood, function.

However, consistent with widespread inconsistent and ambiguous terminology, the square root of the bias-corrected variance is sometimes also known as the standard deviation,

$$s_{N-1} = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2}.$$

The standard deviation $s_{N-1}$ of a list of data is implemented as StandardDeviation[list].

Physical scientists often use the term root-mean-square as a synonym for standard deviation when they refer to the square root of the mean squared deviation of a quantity from a given baseline.

To find the standard deviation range corresponding to a given confidence interval, solve (5) for , giving
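The difference between the two sample conventions (dividing by N or by N - 1) can be seen in a short Python sketch. The function name `standard_deviation` and the data values are made up for illustration; the comparison uses `statistics.pstdev` and `statistics.stdev` from the Python standard library, which divide by N and N - 1 respectively.

```python
from math import isclose, sqrt
import statistics

def standard_deviation(values, bias_corrected=False):
    """Square root of the sample variance, optionally bias-corrected."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((x - mean) ** 2 for x in values)
    divisor = n - 1 if bias_corrected else n
    return sqrt(sum_sq / divisor)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative values only

s_n = standard_deviation(data)                                # divides by N
s_n_minus_1 = standard_deviation(data, bias_corrected=True)   # divides by N - 1

# The standard library offers both conventions under different names
assert isclose(s_n, statistics.pstdev(data))
assert isclose(s_n_minus_1, statistics.stdev(data))

print(s_n, s_n_minus_1)
```

The bias-corrected value is always the larger of the two, and the gap shrinks as the number of values grows.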

Normal Distribution

Data can be "distributed" (spread out) in different ways. But there are many cases where the data tends to be around a central value with no bias left or right, and it gets close to a "Normal Distribution" like this:

A Normal Distribution

The "Bell Curve" is a Normal Distribution. Many things closely follow a Normal Distribution: heights of people, size of things produced by machines, errors in measurements, blood pressure, marks on a test. We say the data is "normally distributed".

Standard Deviations

The Standard Deviation is a measure of how spread out numbers are (read that page for details on how to calculate it). When you calculate the standard deviation of your data, you will find that (generally) about 68% of values are within 1 standard deviation of the mean, about 95% are within 2 standard deviations, and about 99.7% are within 3 standard deviations.

Example: 95% of students at school are between 1.1m and 1.7m tall. Assuming this data is normally distributed, can you calculate the mean and standard deviation?

The mean is halfway between 1.1m and 1.7m:

Mean = (1.1m + 1.7m) / 2 = 1.4m

95% is 2 standard deviations either side of the mean (a total of 4 standard deviations), so:

1 standard deviation = (1.7m - 1.1m) / 4 = 0.15m

Standard Scores

How far is 1.85 from the mean?
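The student-height example and the standard-score question can be sketched in Python as follows. This is a rough illustration, assuming normally distributed data and treating 95% as 2 standard deviations either side of the mean (the exact figure is nearer 1.96); the function names are made up for this sketch.

```python
def mean_and_sd_from_95_range(low, high):
    """Estimate mean and standard deviation from a range holding about 95% of values.

    Assumes a normal distribution, so the range spans roughly
    4 standard deviations (2 either side of the mean).
    """
    mean = (low + high) / 2
    sd = (high - low) / 4
    return mean, sd

def standard_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

mean, sd = mean_and_sd_from_95_range(1.1, 1.7)
z = standard_score(1.85, mean, sd)

# Rounded for display, since floating-point results are not exact
print(round(mean, 2), round(sd, 2), round(z, 2))
```

Under these assumptions a height of 1.85m is about 3 standard deviations above the mean.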

Correlation

When two sets of data are strongly linked together we say they have a High Correlation. The word Correlation is made of Co- (meaning "together"), and Relation.

Correlation is Positive when the values increase together, and Correlation is Negative when one value decreases as the other increases.

Correlation can have a value:

1 is a perfect positive correlation
0 is no correlation (the values don't seem linked at all)
-1 is a perfect negative correlation

The value shows how good the correlation is (not how steep the line is), and if it is positive or negative.

Example: Ice Cream Sales

The local ice cream shop keeps track of how much ice cream they sell versus the temperature on that day. Here are their figures for the last 12 days:

And here is the same data as a Scatter Plot:

We can easily see that warmer weather leads to more sales; the relationship is good but not perfect. In fact the correlation is 0.9575 ... see at the end how I calculated it.

Correlation Is Not Good at Curves

Where:
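A correlation value can be computed with a short Python sketch like the one below. It assumes the usual Pearson correlation formula, which is not spelled out in the text above, and the function name `pearson_r` and both data sets are made up for illustration (they are not the ice cream shop's figures).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]   # differences from the mean of x
    dy = [y - mean_y for y in ys]   # differences from the mean of y
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    return sum_xy / sqrt(sum_x2 * sum_y2)

# Hypothetical straight-line data: values increase together
temps = [10, 15, 20, 25]
sales = [200, 300, 400, 500]
print(pearson_r(temps, sales))   # 1.0

# Hypothetical curved data: y depends entirely on x, yet the value is 0
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]
print(pearson_r(xs, ys))         # 0.0
```

The second data set shows why correlation is not good at curves: the points follow a clear pattern, but the value only measures how well a straight line fits.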

Calculating the mean from a frequency table

It is easy to calculate the Mean: add up all the numbers, then divide by how many numbers there are.

Example 1: What is the Mean of these numbers? 6, 11, 7

Add the numbers: 6 + 11 + 7 = 24
Divide by how many numbers (there are 3 numbers): 24 ÷ 3 = 8

The Mean is 8.

But sometimes you won't have a simple list of numbers; you might have a frequency table (the "frequency" says how often each value occurs). For example, a table might say that score 1 occurred 2 times, score 2 occurred 5 times, etc.

You could list all the numbers out. But rather than do lots of adds (like 3+3+3+3) it is often easier to use multiplication, and rather than count how many numbers there are, we can add up the frequencies.

And that is how to calculate the mean from a frequency table!

Here is another example:

Example: Parking Spaces per House in Hampton Street

Isabella went up and down the street to find out how many parking spaces each house had. What is the mean number of Parking Spaces?

Notation (where f is frequency and x is the value): Mean = Σfx / Σf
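The multiply-then-divide shortcut can be sketched in Python as below. The table is illustrative only: its first two rows follow the text (score 1 occurred 2 times, score 2 occurred 5 times) and the remaining rows are assumed for the sake of the example. The function name is also made up.

```python
def mean_from_frequencies(table):
    """Mean of a frequency table given as {value: frequency}."""
    total = sum(value * freq for value, freq in table.items())  # sum of f * x
    count = sum(table.values())                                 # sum of f
    return total / count

# Hypothetical frequency table: score -> how many times it occurred
scores = {1: 2, 2: 5, 3: 4, 4: 2, 5: 1}
print(round(mean_from_frequencies(scores), 2))   # 2.64

# A plain list is just a table where every frequency is 1
print(mean_from_frequencies({6: 1, 11: 1, 7: 1}))   # 8.0
```

Here the total is 37 and there are 14 scores altogether, so the mean is 37 / 14, or about 2.64.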

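Standard Deviation Formulas

Deviation just means how far from the normal.

Standard Deviation

The Standard Deviation is a measure of how spread out numbers are. You might like to read this simpler page on Standard Deviation first. But here we explain the formulas.

The symbol for Standard Deviation is σ (the Greek letter sigma). This is the formula for Standard Deviation:

$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$

Say what? OK. Say you have a bunch of numbers like 9, 2, 5, 4, 12, 7, 8, 11. To calculate the standard deviation of those numbers:

1. Work out the Mean (the simple average of the numbers).
2. Then for each number: subtract the Mean and square the result.
3. Then work out the mean of those squared differences.
4. Take the square root of that.

The formula actually says all of that, and I will show you how.

The Formula Explained

First, let us have some example values to work on:

Example: Sam has 20 Rose Bushes. The number of flowers on each bush is

9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4

Work out the Standard Deviation.

Step 1. Work out the mean. In the formula above μ (the Greek letter "mu") is the mean of all our values ...

Example: 9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4

The mean is:

(9 + 2 + 5 + 4 + 12 + 7 + 8 + 11 + 9 + 3 + 7 + 4 + 12 + 5 + 4 + 10 + 9 + 6 + 9 + 4) / 20 = 140 / 20 = 7

So: μ = 7

Step 2. Then for each number: subtract the Mean and square the result. This is the part of the formula that says $(x_i - \mu)^2$. So what is $x_i$? They are the individual values 9, 2, 5, 4, 12, 7, etc. In other words $x_1 = 9$, $x_2 = 2$, $x_3 = 5$, etc.

Step 3.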
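As a worked check, the rose-bush example can be carried through to the end. This sketch assumes the population form of the formula (dividing by N = 20) and uses μ = 7 from Step 1; the sum of squared differences, 178, is computed here from the 20 listed values and is not stated in the text.

```latex
\begin{align*}
\sigma &= \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2} \\
\sum_{i=1}^{20} (x_i - 7)^2 &= (9-7)^2 + (2-7)^2 + (5-7)^2 + \dots + (4-7)^2 = 178 \\
\sigma^2 &= \frac{178}{20} = 8.9 \\
\sigma &= \sqrt{8.9} \approx 2.983
\end{align*}
```

So under these assumptions the Standard Deviation of the flower counts is about 2.983.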
