Analysis of Variance (ANOVA) and F statistics .... MADE EASY!!!

Stats with Brian

10m 34s | 2,212 words | ~12 min read
Transcript source: YouTube auto captions

This transcript was extracted from YouTube's auto-generated caption track.

[0:00]Let's learn about ANOVA. In an introductory statistics class, we learn a few different tests for quantitative data involving averages. First, you might learn about a one sample t-test. We might want to ask a question like, is the average height in the US equal to 66 inches? We are using one sample of data from the United States. Then we move on to a two-sample t-test. And we might want to ask, is the average height in the US equal to the average height in Canada? So we collect two samples of data and compare the averages of the two groups. Here, we can see that Canada is a little taller than the US, but maybe not significantly taller. But what if we want to ask, is the average height equal in all of these countries? Is it equal in the US, Canada, and Mexico, or maybe even more than three countries? Do we just do a three or more sample t-test? Is there such a thing? Well, what we really do is a test that is closely related to a t-test, called the Analysis of Variance or ANOVA. In ANOVA, our null hypothesis is that all the group means are equal. We might write that μ1 is equal to μ2 is equal to μ3. We start off assuming that all the groups have an equal mean. The thing that we might want to prove, the alternative hypothesis, is that at least one mean is different from the other means. It's hard to write this alternative hypothesis in symbols, because this is not the same thing as saying that they are all unequal to each other. We want to reject the null hypothesis even if only one group is different from the others; they don't all have to be different. And again, ANOVA can be used for any number of groups, three or more. So we could have more than three groups here, we could have five or more. Technically, ANOVA could even be used for two groups, though in that case, we would typically just do a two-sample t-test instead.
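One common way to write these two hypotheses in symbols for the three-group case is sketched here. The index notation i, j is an assumption added for illustration and does not come from the video.

```latex
H_0:\ \mu_1 = \mu_2 = \mu_3
\qquad
H_a:\ \mu_i \neq \mu_j \ \text{for at least one pair } i \neq j
```

Written this way, the alternative only requires one pair of means to differ, which is weaker than requiring every mean to differ from every other.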
So here are the null and alternative hypotheses for our specific problem: that the mean height in the US is equal to the mean height in Canada, which is equal to the mean height in Mexico. And the alternative hypothesis is that at least one country has a different average height than the others. When we do a hypothesis test, what we want to do is create a test statistic that measures how weird our data is under the null hypothesis. So when we do a t-test, we create a t-statistic, which measures how far our observed data is from the expected data. When we have more than two groups though, it's not exactly clear how to measure the difference between three or more groups. For instance, here are the means of these three groups. We see that the US is taller than Mexico, which is taller than Canada. So, we can measure the difference between the first and the second group, the first and third, the second and third.

[2:37]But there are a lot of differences here. And if we had more than three groups, there would be lots of pairwise differences. Maybe we could add these up, maybe we could square them and add them up. And that would be some measure of how different these groups are. Similarly, we could also calculate how far each of the means is from the overall mean, which is sometimes called the grand mean. So here, in orange, we have the mean of all three groups. And we can see the green arrows showing us how different each group is from the overall mean. This is some measurement of how well our data meets our null hypothesis. If the null is true, these green arrows, these distances should be very close to zero. So maybe we could just measure these distances or maybe we could square these distances and add them up. Well, when we start to talk about squaring distances from the mean, this is starting to sound a lot like variance, the distance from the mean. So let's talk a little bit about variance. Within each of these three groups, there is a high variance. Right, so we see within each group, the USA, Canada, and Mexico, they all have a lot of variability. In contrast, with this set of data, there is low variance within each group. The height of the people in each group is relatively close together. There's low variability. So this is the within group variance. It describes each group separately, within itself. Low variance, high variance. We also have between group variance. And this describes how different the means are. Okay, so here we have three groups. This is the mean of each group. And here is the overall mean. How much variation is there between the groups? Very little. All three groups are very, very close to the overall mean. But what if the data looked like this? Here is the mean of each group. Now, here is the overall mean. And we see that some of the groups differ quite a bit from the overall mean.
There is a lot of variance between the three groups, between the three countries. So the between group variance can be high or low, and the within group variance can be high or low. Now if the between group variance is low, the groups have similar means. So regardless of whether the within group variance is high, or whether it's low, these three groups have similar means. So if the between group variance is low, the groups have similar means and we do not want to reject the null hypothesis. Now, if the between group variance is high, the groups have different means. We would want to reject the null hypothesis. If the within group variance is small, we can see that these groups are very, very clearly different. There's almost no overlap in the heights of people in the three countries. However, we could also have a high between group variance and a high within group variance, which we see here. Here, we still see that the three groups have very different means, but it is harder to tell because the within group variance is so high. If the within group variance is large, there is more overlap between groups and it becomes harder to tell that the groups have different means. So the groups likely have different means when the variance between groups is much higher than the variance within groups. We measure this with a statistic we call F. The F statistic is the variance between the groups divided by the variance within the groups. The F statistic is always positive because variance is always positive, so we're dividing two positive numbers. When F is small, that means the variance between the groups is small. And that means we do not have evidence that the groups are different because there's not much variation between the groups. But when F is high, that means we have evidence to reject the null hypothesis. We conclude the groups have different means because there is a high variance between groups. And we'll learn to calculate this in more detail in a second.
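As a rough illustration of the ratio just described, the sketch below computes F by hand for two made-up data sets that share the same group means but differ in how spread out each group is. The heights are invented for illustration, the function name `f_statistic` is hypothetical, and the mean-square formulas are the standard one-way ANOVA definitions, assumed here to match what the video goes on to describe.

```python
from statistics import mean


def f_statistic(groups):
    """Between-group mean square divided by within-group mean square."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total sample size, N
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Between groups: squared distance of each group mean from the grand
    # mean, weighted by group size, divided by k - 1.
    ms_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    ) / (k - 1)

    # Within groups: squared distance of each point from its own group
    # mean, divided by N - k.
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (n_total - k)

    return ms_between / ms_within


# Invented heights in inches; both data sets have group means 66, 70, 62.
tight = [[65, 66, 67], [69, 70, 71], [61, 62, 63]]    # low within-group variance
spread = [[60, 66, 72], [64, 70, 76], [56, 62, 68]]   # high within-group variance

f_tight = f_statistic(tight)
f_spread = f_statistic(spread)
print(f"tight groups:  F = {f_tight:.2f}")
print(f"spread groups: F = {f_spread:.2f}")
```

The between-group variance is identical in both data sets, so the much smaller F for the spread-out groups comes entirely from the larger denominator.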
For now, let's focus on one more detail, the variance within the groups. If there are three or more groups, how do we get only one measurement of the variance within groups? We can only do this when all groups have the same variance. So this is an assumption of ANOVA, that all the groups have the same variance. Let's see this. So these groups all have the same variance, which means we can do ANOVA. Similarly, these three groups all have a high variance, so we could also do ANOVA. However, these three groups do not have the same variance. We see that Canada has a very low variance, while Mexico has a very high variance, and the USA is somewhere in between. Since these groups don't have the same variance, we cannot do ANOVA. So, let's look at calculating this F statistic in a little bit more detail. Let's let k be the number of groups. So, for us, this is three. k equals 3 because we have three groups, three countries. And N will be the total sample size of all the groups combined, big N. So here is our data. The between group variance is the average squared deviation of the group means from the overall mean. So, we're going to take all three of our groups, right? So this is a sum from i equals 1 to k, which is 3. And we're going to take these distances between the orange line and each group. These distances measured by the green arrows and we're going to square them and add them up. And then we want to take the average distance, so we're going to divide by k minus 1. So it's not k, it's k minus 1. And this happens a lot when we talk about variance because of the degrees of freedom involved in the problem. Now let's look at the within group variance. The within group variance is the average squared deviation of each data point from its own group mean. Okay, so now we're not involving the overall mean, we're doing within each group. So within each group, these green arrows measure the within group variation of how far the points are from the mean. 
And what we're going to do is, for each of these groups, we're going to take all of these deviations, square them, and add them up. So basically, we're going to take all three of these groups and we're going to take all of the green arrows, we're going to square them and add them up. And we're going to take an average, so we're going to divide by not N, but N minus k. And again, this is related to the degrees of freedom involved in the problem because we have to estimate a lot of parameters. This estimate of the variance is also called the pooled variance. We are pooling all three groups and putting them together. This is basically the same as calculating each group's variance and averaging them, of course, weighted by the sample size, if we have different sample sizes in each group. So our F statistic is the variance between groups over the variance within groups, which, if we expand, can be expressed through this formula. If this number is big, we reject the null hypothesis and conclude our groups have different means. What number is considered too big is determined by looking up a cut-off, called a critical value, using an F table in a statistics textbook or using software like R. To look up the critical value, we must specify the degrees of freedom, which are k minus 1 and N minus k. So here's a visual recap of ANOVA. We have our null hypothesis that all the groups are equal, and we want to prove that at least one country has a different height. And we measure this through the F statistic, the variance between groups over the variance within groups. In this example, we have low variance between groups. All the groups have similar means. Because the variance between groups is small, the F statistic will be small. And we will fail to reject the null hypothesis. Here, we also have a small variance between groups. All groups have the same mean.
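The on-screen formula is not captured by the captions. A sketch of the standard one-way ANOVA form follows, on the assumption that it matches the slide. The symbols n_i (size of group i), \bar{x}_i (mean of group i), \bar{x} (grand mean), and x_{ij} (observation j in group i) are notation introduced here, and the weight n_i is part of the standard formula even though the narration does not mention it.

```latex
F \;=\; \frac{\text{variance between groups}}{\text{variance within groups}}
  \;=\; \frac{\displaystyle \sum_{i=1}^{k} n_i \,(\bar{x}_i - \bar{x})^2 \,\big/\, (k - 1)}
             {\displaystyle \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \,\big/\, (N - k)}
```

The two divisors, k - 1 and N - k, are the same degrees of freedom used to look up the critical value.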
That means the variance between groups is small, our F statistic is small, and we will also fail to reject the null. Here, we have a high variance between groups. The groups have different means. So we have a high variance between groups, different means. And we also have a low variance within groups. So this denominator is very small. And this means our F statistic is going to be very big. And that's going to make us reject the null. When we see this data, we conclude that the US has a different height than the other countries. Now, here we have the tricky case. Here, we do have a high variance between the groups. They have different means. But we also have high variance within the groups. Each group has quite a bit of variability, so it's harder to tell the difference. So it's unclear: we want to reject the null, but the high within group variation makes it harder to tell if the groups are different. And that's why we aren't always just eyeballing results like this visually. That's why we actually want to do an ANOVA, we want to calculate an F statistic so we can see how it compares to the critical value. And we want to know if our results are statistically significant or perhaps just due to random variation. And here, in this case, we said we can't do ANOVA because the within group variability is very different from group to group, which violates the assumptions of ANOVA. Thanks for watching, please like and subscribe to learn more statistics.
