Math 225

Introduction to Biostatistics


Notes from Lecture #14

  1. Hypothesis tests for a Mean

    A Motivating Problem. Is the average birth weight in a given population larger than 7 and a half pounds (120 ounces)? Suppose that in a sample of sixteen individuals, the average birth weight is 132 ounces with a standard deviation of 20 ounces.
  2. To test this hypothesis, we could gather data from a sample of individuals. If the sample mean birth weight exceeds 7 and a half pounds by more than chance alone could explain, there is evidence that the population mean birth weight is greater than 7 and a half pounds.

  3. A hypothesis test consists of these parts.
    1. State null and alternative hypotheses.
    2. Calculate a test statistic.
    3. Calculate a p-value.
    4. Summarize your findings in the context of the problem.

  4. We can apply this to the example at hand.
    1. State null and alternative hypotheses. In our population, the mean birth weight is unknown and is represented by the Greek letter mu. We ask whether our sample provides sufficient evidence to conclude that mu exceeds 7 and a half pounds, or 120 ounces. The null hypothesis is that mu is 120, while the alternative is that it is larger. In formal notation,

      H0: mu = 120
      Ha: mu > 120

    2. Calculate a test statistic. We should first check that our sample data is well summarized by a mean and standard deviation. If there is very strong skewness or any extreme outliers, methods based on normal sampling distributions will not be valid.

      Given that examination of the data does not find these problems, the test statistic

      z = (x-bar - mu) / (sigma / sqrt(n))

      has a standard normal distribution when the null hypothesis is true. Unfortunately, we do not know the value of sigma, so we substitute the sample standard deviation s. Because s varies from sample to sample, this adds variability to the denominator as well as the numerator, and the resulting sampling distribution is no longer standard normal. Instead, it has a t distribution with n-1 degrees of freedom. Like the standard normal distribution, the t distributions are symmetric, bell-shaped, and centered at 0, but they are more spread out. The extra spread decreases as the sample size increases.

      The test statistic we use is

      t = (x-bar - mu) / (s / sqrt(n))

      For our data, t = (132 - 120) / (20/sqrt(16)) = 2.40.

    3. Calculate a p-value. For this one-sided alternative, the p-value is the area to the right of 2.40 under a t distribution with 15 degrees of freedom. From the t table with 15 degrees of freedom, we see that the area to the left of 2.249 is 0.98 and the area to the left of 2.602 is 0.99. Therefore the area to the left of 2.40 is between these values, and the area to the right is between 0.01 and 0.02. With software, we could calculate this more precisely as about 0.0149.

    4. Summarize your findings in the context of the problem. If the true population mean birth weight were 120 ounces, we would get a sample mean at least as large as the one we observed in fewer than one sample in 50. Because this is somewhat unlikely, a better explanation might be that the population mean birth weight is larger than 120 ounces.
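
The calculation in steps 2 and 3 can be sketched in Python. This is an illustration only, not part of the original lecture: the function names (t_pdf, t_upper_tail, one_sample_t) are made up for this sketch, and the tail area is approximated by numerically integrating the t density with Simpson's rule, which is just one convenient way to avoid a table lookup using the standard library alone.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Area to the right of t (for t >= 0), using Simpson's rule on [0, t].

    By symmetry the area to the right of 0 is 0.5, so the upper tail is
    0.5 minus the area between 0 and t. The number of steps must be even.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def one_sample_t(xbar, mu0, s, n):
    """t = (x-bar - mu0) / (s / sqrt(n)), where mu0 is the null value of mu."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Birth weight example: n = 16, x-bar = 132, s = 20, H0: mu = 120.
t_stat = one_sample_t(132, 120, 20, 16)
p_value = t_upper_tail(t_stat, df=16 - 1)
print(round(t_stat, 2), round(p_value, 4))  # 2.4 0.0149
```

If SciPy happens to be installed, scipy.stats.t.sf(2.40, 15) should return the same upper-tail area without the hand-written integration.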

  5. What about alpha? Users of hypothesis tests often compare p-values to fixed significance levels. A common choice is alpha=0.05. A p-value smaller than 0.05 is called "significant at the 5% level". At this significance level, one would reject the null hypothesis. We maintain that reporting a p-value gives a better description of the evidence against a null hypothesis than simply stating whether or not it is significant at the 5% level.

    In the example, the p-value of 0.0149 is significant at the 5% level but not at the 1% level. A user of a fixed 1% significance level would not reject the null hypothesis with this data. This does not mean, however, that there is strong evidence that the null hypothesis is actually true. It only means that the data was insufficient to rule out chance alone as the reason for a difference between the observed sample mean and the assumed population mean. You cannot prove the null hypothesis is true. With sufficient data, however, you can say with high confidence that the true population mean lies in a small interval containing the assumed value.
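
As a small illustration of the fixed-level comparison described above, the following Python sketch applies the rule "reject when the p-value is below alpha" at two common levels. The function name decision is made up for this example, and the p-value is simply the one reported in the notes.

```python
def decision(p_value, alpha):
    """Fixed-level decision: reject H0 when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

p_value = 0.0149  # one-sided p-value from the birth weight example
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decision(p_value, alpha)}")
```

The same data lead to opposite decisions at the two levels, which is one reason the notes prefer reporting the p-value itself.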


Last modified: March 27, 2001

Bret Larget, larget@mathcs.duq.edu