Section 6.5: Confidence Interval for a Population Proportion

Key Concepts

It is appropriate to form a confidence interval for a single population proportion when a sample is drawn and the statistic of interest is the proportion of the sample that falls into a given category. The methodology presented here is only appropriate for large samples. As in all examples presented in this chapter, the central limit theorem allows us to conclude that the sampling distribution is approximately normal.

In this situation, the formulas for the mean and standard error are different, but the logic and procedure for constructing confidence intervals remain the same.

Another example shows how to apply these ideas to an estimation problem.

Formula

The sampling distribution of the sample proportion p̂ is summarized by:

mean(p̂) = p

and

SE(p̂) = sqrt( p(1-p)/n )

The shape will be approximately normal for sufficiently large samples. A general rule of thumb is that if np > 5 and n(1-p) > 5, then the distribution will be approximately normal. The techniques of this section should not be used to construct confidence intervals if the sample size is too small.
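
The standard error formula and the rule of thumb can be sketched in a few lines of Python. This is only an illustration: the function names `standard_error` and `large_enough` are hypothetical and do not come from the text, and the threshold of 5 is the rule of thumb stated above.

```python
import math

def standard_error(p, n):
    """Standard error of the sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def large_enough(p, n, threshold=5):
    """Rule of thumb from the text: np > 5 and n(1-p) > 5."""
    return n * p > threshold and n * (1 - p) > threshold

print(standard_error(0.5, 100))  # about 0.05
print(large_enough(0.5, 100))    # True: np = 50 and n(1-p) = 50
print(large_enough(0.01, 100))   # False: np = 1
```

The second call to `large_enough` shows a case where the normal approximation should not be trusted, since only about one success is expected in the sample.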

Example

Suppose that 1000 American women aged 50--54 are randomly selected, and that 18 are found to have breast cancer. Construct a 95% confidence interval for the proportion of American women in this age group with breast cancer.

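Our sample proportion is 18/1000 = .018. We need to check that np and n(1-p) both exceed 5, using the sample proportion in place of the unknown p.

```
1000 * .018 = 18 > 5
1000 * .982 = 982 > 5
```
(This will hold whenever there are more than 5 successes and more than 5 failures in the sample.)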

We may conclude that the sampling distribution for the sample proportion will be approximately normal, by the central limit theorem.
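
One informal way to see this approximation at work is a small simulation. The sketch below assumes, purely for illustration, that the true proportion equals .018; the function name `simulate_sample_proportions`, the number of repetitions, and the seed are all hypothetical choices, not part of the example.

```python
import random
import statistics

def simulate_sample_proportions(p, n, reps, seed=0):
    """Draw `reps` samples of size n and return each sample proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(reps)
    ]

# Assumed true proportion of .018 with samples of size 1000.
p_hats = simulate_sample_proportions(p=0.018, n=1000, reps=2000)

# The mean should land near p, and the standard deviation
# near sqrt(p(1-p)/n), which is roughly .0042 here.
print(statistics.mean(p_hats))
print(statistics.stdev(p_hats))
```

A histogram of `p_hats` would be expected to look roughly bell-shaped, though a simulation like this is only a sanity check, not a proof.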

We estimate the SE by replacing p with the sample proportion, .018, in the SE formula.

```
  sqrt( (.018)(.982)/1000 ) = .00420
```
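Because our sample size is large, we may use a reliability coefficient from the normal distribution. For 95% confidence, this coefficient is 1.96.
```
  .018 +/- (1.96)(.00420)
```
or
```
  .018 +/- .008
```
That is, we are 95% confident that the proportion of American women in this age group with breast cancer lies between about .010 and .026.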
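
The whole calculation can be reproduced with a short Python sketch. The function name `proportion_ci` is hypothetical, and the default coefficient of 1.96 assumes a 95% confidence level, as in the example.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n                     # sample proportion
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
    margin = z * se                           # reliability coefficient times SE
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(18, 1000)
print(f"{low:.3f} to {high:.3f}")  # prints "0.010 to 0.026"
```

This sketch does not check the large-sample condition itself, so the counts of successes and failures should be verified before relying on the result.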