# Math 225

## Introduction to Biostatistics

### Notes from Lecture #18

1. Multiple comparisons: In an ANOVA problem, once the null hypothesis of equal means is rejected, there is usually more that can be said. Is there a single population mean different from the rest, or are they probably all different? Can we estimate each pairwise difference in population means with confidence? We need to recognize that making multiple comparisons is more likely to find a "significant difference" than making just a single comparison. To compensate for multiple comparisons, we need larger margins of error in simultaneous confidence intervals. There are many ways to approach this problem. Here, we describe the Scheffe and Bonferroni methods of multiple comparisons.

2. Scheffe's method. Scheffe's method is very conservative. It explicitly allows any number of comparisons. These comparisons can be made post hoc (after the fact). Scheffe's method allows for any type of comparison (called contrasts). We will examine only differences in population means.

The basic formula for a confidence interval for a difference in population means from two independent samples is

(difference in sample means) ± (multiplier)(pooled estimate of sd) sqrt(1/n1 + 1/n2)

Scheffe's method uses this same format. The differences are that we will use all of the samples to estimate the common sd, and we will use a multiplier from the F distribution instead of the t distribution.

For simultaneous 95% confidence intervals for the pairwise differences in mean cuckoo egg length (in mm) for the hedge sparrow, robin, and wren host populations, we calculate the multiplier to be

multiplier = sqrt( (g-1) F(1-alpha) ) = sqrt( 5*2.29 ) = 3.38.

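Here g = 6 is the number of groups in the ANOVA, and 2.29 is the 0.95 quantile of the F distribution with g - 1 = 5 and 114 degrees of freedom. Notice that the multiplier is larger than the corresponding t distribution value with 114 degrees of freedom, 1.98.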
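Combining the basic interval format with this multiplier, the Scheffe interval for one pairwise difference can be written compactly. This is only a restatement of the formulas above; the symbols for the sample means, the pooled sd s_p, and the total sample size N (so that N - g = 114) are notation assumed here, not taken from the notes.

```latex
\[
(\bar{y}_i - \bar{y}_j) \;\pm\;
\sqrt{(g-1)\,F_{1-\alpha;\,g-1,\,N-g}}\;\, s_p \,
\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}
\]
```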

For all the pairwise confidence intervals, the estimated sd is 0.91 (see previous notes). The sample sizes are different for each comparison. The three pairwise confidence intervals are:

Sparrow - Robin: (23.12 - 22.58) ± 3.38 * 0.91 * sqrt(1/14 + 1/16) or 0.54 ± 1.13.
Sparrow - Wren: (23.12 - 21.13) ± 3.38 * 0.91 * sqrt(1/14 + 1/15) or 1.99 ± 1.14.
Robin - Wren: (22.58 - 21.13) ± 3.38 * 0.91 * sqrt(1/16 + 1/15) or 1.45 ± 1.11.

We can be at least 95% confident that all three confidence intervals simultaneously contain the true differences. We are at least 95% confident that the mean size of cuckoo eggs laid in wren nests is smaller than the mean size of those laid in robin or hedge sparrow nests, because both intervals involving wrens exclude zero. We do not have sufficient evidence to conclude that the means for robins and hedge sparrows are different, because that interval (0.54 ± 1.13) contains zero.
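The Scheffe arithmetic above can be checked with a short script. This is a minimal sketch using only the numbers quoted in these notes (means, sample sizes, pooled sd, and the tabled F value 2.29); the function and variable names are illustrative, and g = 6 is inferred from g - 1 = 5. Because the script keeps the unrounded multiplier, results may differ from the hand calculation in the third decimal place.

```python
from math import sqrt

# Values quoted in these notes
means = {"sparrow": 23.12, "robin": 22.58, "wren": 21.13}  # sample means (mm)
sizes = {"sparrow": 14, "robin": 16, "wren": 15}           # sample sizes
pooled_sd = 0.91   # pooled estimate of the common sd
g = 6              # number of groups (assumption: inferred from g - 1 = 5)
f_crit = 2.29      # tabled 0.95 quantile of F with 5 and 114 df


def scheffe_multiplier(g, f_crit):
    """Scheffe multiplier: sqrt((g - 1) * F critical value)."""
    return sqrt((g - 1) * f_crit)


def pairwise_interval(a, b, multiplier):
    """Return (difference in sample means, margin of error) for groups a and b."""
    diff = means[a] - means[b]
    margin = multiplier * pooled_sd * sqrt(1 / sizes[a] + 1 / sizes[b])
    return diff, margin


multiplier = scheffe_multiplier(g, f_crit)
pairs = [("sparrow", "robin"), ("sparrow", "wren"), ("robin", "wren")]
scheffe = {pair: pairwise_interval(*pair, multiplier) for pair in pairs}

for (a, b), (diff, margin) in scheffe.items():
    # An interval that excludes zero indicates a difference in population means
    excludes_zero = abs(diff) > margin
    print(f"{a} - {b}: {diff:.2f} +/- {margin:.2f} (excludes 0: {excludes_zero})")
```

Only the two intervals involving wrens should be flagged as excluding zero, matching the conclusion above.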

3. Bonferroni's method. Bonferroni's method is appropriate when all comparisons to be made are specified before the analysis, unlike Scheffe's method, which allows snooping after the fact. Bonferroni's method works by increasing the confidence level of the individual comparisons so that the resulting combined comparison has at least the specified confidence level. If k comparisons are prespecified, each should have a confidence level equal to 1 - alpha/k so that the simultaneous confidence level for all k comparisons is at least 1 - alpha.
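The reason an individual level of 1 - alpha/k is enough follows from a standard probability bound: the chance that at least one of several events occurs is no more than the sum of their individual chances. A sketch of the argument, where "misses" means that an interval fails to contain its true difference:

```latex
\[
P(\text{at least one of the } k \text{ intervals misses})
\;\le\; \sum_{i=1}^{k} P(\text{interval } i \text{ misses})
\;=\; k \cdot \frac{\alpha}{k} \;=\; \alpha
\]
```

So the probability that all k intervals are simultaneously correct is at least 1 - alpha. The bound does not require the comparisons to be independent.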

We can repeat the previous example. In this case, for three comparisons and simultaneous 95% confidence, we would want the individual confidence levels to be 1 - 0.05/3 ≈ 0.9833, or 98 1/3%. This value is not in our table, but we can use S-PLUS to find that the appropriate t value with 114 degrees of freedom is 2.43. The three pairwise confidence intervals are:

Sparrow - Robin: (23.12 - 22.58) ± 2.43 * 0.91 * sqrt(1/14 + 1/16) or 0.54 ± 0.81.
Sparrow - Wren: (23.12 - 21.13) ± 2.43 * 0.91 * sqrt(1/14 + 1/15) or 1.99 ± 0.82.
Robin - Wren: (22.58 - 21.13) ± 2.43 * 0.91 * sqrt(1/16 + 1/15) or 1.45 ± 0.79.
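The same check can be made for the Bonferroni intervals. This sketch takes the t value 2.43 as given in the notes rather than computing it, and again uses illustrative names that are not part of the original example.

```python
from math import sqrt

# Values quoted in these notes
means = {"sparrow": 23.12, "robin": 22.58, "wren": 21.13}  # sample means (mm)
sizes = {"sparrow": 14, "robin": 16, "wren": 15}           # sample sizes
pooled_sd = 0.91   # pooled estimate of the common sd

alpha = 0.05       # 1 - simultaneous confidence level
k = 3              # number of prespecified comparisons
individual_level = 1 - alpha / k   # confidence level of each single interval
t_crit = 2.43      # t value for that level with 114 df, as quoted in the notes


def bonferroni_interval(a, b):
    """Return (difference in sample means, margin of error) for groups a and b."""
    diff = means[a] - means[b]
    margin = t_crit * pooled_sd * sqrt(1 / sizes[a] + 1 / sizes[b])
    return diff, margin


pairs = [("sparrow", "robin"), ("sparrow", "wren"), ("robin", "wren")]
bonferroni = {pair: bonferroni_interval(*pair) for pair in pairs}

print(f"individual confidence level: {individual_level:.4f}")
for (a, b), (diff, margin) in bonferroni.items():
    print(f"{a} - {b}: {diff:.2f} +/- {margin:.2f}")
```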

Notice that the margins of error are smaller for Bonferroni's method than Scheffe's. This will be true when the number of comparisons is small, but not true if the number of comparisons is large enough.
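To see roughly how this trade-off plays out, the sketch below compares the fixed Scheffe multiplier from the example with a Bonferroni multiplier that grows as the number of comparisons k increases. As a loudly stated assumption, it uses a standard normal quantile as a stand-in for the t quantile with 114 degrees of freedom, because the Python standard library has no t distribution. The normal quantile is slightly smaller than the t quantile, so the true crossover comes somewhat earlier than this approximation suggests.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
scheffe_mult = sqrt(5 * 2.29)  # Scheffe multiplier from the example; does not depend on k


def bonferroni_mult_approx(k, alpha=0.05):
    """Approximate two-sided Bonferroni multiplier for k comparisons.

    Assumption: a standard normal quantile replaces the t quantile
    with 114 degrees of freedom.
    """
    return NormalDist().inv_cdf(1 - alpha / (2 * k))


# 15 is the number of all possible pairwise comparisons among 6 groups
for k in (1, 3, 15, 50, 100):
    print(f"k = {k:3d}: Bonferroni multiplier is about {bonferroni_mult_approx(k):.2f}")

# Smallest k at which the approximate Bonferroni multiplier exceeds Scheffe's
crossover = next(
    k for k in range(1, 10_000) if bonferroni_mult_approx(k) > scheffe_mult
)
print(f"Scheffe multiplier: {scheffe_mult:.2f}; approximate crossover at k = {crossover}")
```

Under this approximation, Bonferroni's multiplier stays below Scheffe's even when all 15 pairwise comparisons among the six groups are made.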