### Math 225 Course Notes


### Chapter 8

#### Contents

Analysis of Variance (ANOVA) is a general technique
for analyzing data in which the variable of interest
(the response variable) is quantitative,
and the explanatory variables are categorical.
We will concentrate only on one specific question,
namely, testing whether the population means for a variable
measured in several different populations are all the same or not.
Each individual is measured (the response variable) and categorized
into one of the populations (the explanatory variable).

The basic idea is simple.
Find the mean from each sample.
If there is a great deal of variation among the sample means,
this is evidence that the population means are not all the same,
since we expect the sample means to be "close" to the population means.
"Close" is determined by the amount of variation within each sample
and the sample sizes.
If there is a great deal of variation within the samples,
this indicates that there is uncertainty about the locations of the population
means, and hence, it is more plausible that the population means are all the
same.
In contrast,
if there is very little variation within the samples,
we have a good idea of where the population means are,
so differences among the sample means are stronger evidence
that the population means are not all the same.
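
To make the idea concrete, here is a minimal Python sketch. The two data sets (`tight` and `noisy`) are hypothetical values invented for illustration, and the helper `summarize` is not part of the course material. Both data sets have the same sample means (10, 12, and 14), but very different variation within each sample.

```python
from statistics import mean, variance

# Hypothetical data, invented for illustration only.
# In both scenarios the three sample means are 10, 12, and 14.
tight = {"A": [9.9, 10.0, 10.1], "B": [11.9, 12.0, 12.1], "C": [13.9, 14.0, 14.1]}
noisy = {"A": [4.0, 10.0, 16.0], "B": [6.0, 12.0, 18.0], "C": [8.0, 14.0, 20.0]}


def summarize(samples):
    """Return {group name: (sample mean, sample variance)}."""
    return {name: (mean(values), variance(values)) for name, values in samples.items()}


for label, samples in (("tight", tight), ("noisy", noisy)):
    print(label)
    for name, (m, v) in summarize(samples).items():
        print(f"  {name}: mean = {m:.2f}, variance = {v:.2f}")
```

In the `tight` scenario the gaps between sample means are large relative to the spread within each sample, so the differences are hard to attribute to chance. In the `noisy` scenario the same gaps are small relative to the spread, so equal population means remain plausible.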

To conduct a formal test,
we compare these two variations.

F = variation among sample means / variation within samples
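
For reference, the usual one-way ANOVA form of this ratio can be written out as follows. The notation is an assumption introduced here, not taken from these notes: there are $k$ samples, sample $i$ has $n_i$ observations $x_{ij}$ with sample mean $\bar{x}_i$, $N = n_1 + \cdots + n_k$ is the total number of observations, and $\bar{x}$ is the mean of all $N$ observations.

$$
F \;=\; \frac{\displaystyle \sum_{i=1}^{k} n_i\,(\bar{x}_i - \bar{x})^2 \,\big/\, (k-1)}
             {\displaystyle \sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \,\big/\, (N-k)}
$$

The numerator measures variation among the sample means and has $k-1$ degrees of freedom. The denominator pools the variation within the samples and has $N-k$ degrees of freedom.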

If this F statistic is large,
this is evidence that the population
means are not all equal (since variation among sample means
is large compared to variation within samples).
If F is not large,
this is consistent with the hypothesis that all population means are equal.
An ANOVA table is a device
to facilitate this computation.
To decide whether F is large or not,
the value of the test statistic should be compared to the
F distribution with the correct numbers of degrees of freedom.
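
The computation that an ANOVA table organizes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the course's own procedure: the function name `one_way_anova`, the dictionary keys, and the data are all hypothetical, and the formulas are the standard one-way ANOVA sums of squares with $k-1$ and $N-k$ degrees of freedom.

```python
from statistics import mean


def one_way_anova(samples):
    """Return the entries of a one-way ANOVA table for a list of samples."""
    k = len(samples)                              # number of samples (populations)
    n_total = sum(len(s) for s in samples)        # total number of observations
    grand_mean = sum(sum(s) for s in samples) / n_total

    # Variation among sample means, weighted by sample size.
    ss_among = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of observations around their own sample mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)

    df_among = k - 1
    df_within = n_total - k
    ms_among = ss_among / df_among
    ms_within = ss_within / df_within

    return {
        "ss_among": ss_among, "df_among": df_among, "ms_among": ms_among,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "F": ms_among / ms_within,
    }


# Hypothetical data, invented for illustration only.
samples = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
table = one_way_anova(samples)
print(f"among:  SS = {table['ss_among']:.1f}, df = {table['df_among']}, MS = {table['ms_among']:.1f}")
print(f"within: SS = {table['ss_within']:.1f}, df = {table['df_within']}, MS = {table['ms_within']:.1f}")
print(f"F = {table['F']:.1f}")
```

For these made-up numbers F = 12 with 2 and 6 degrees of freedom. A standard F table gives a 5% critical value of roughly 5.14 for those degrees of freedom, so an F this large would count as evidence against equal population means.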

Last modified: April 16, 1996

Bret Larget,
larget@mathcs.duq.edu