# Math 225

## Introduction to Biostatistics

### Solutions to suggested textbook problems

#### Chapter 1

1-4: d, e, and i are discrete; a, b, c, f, g, h, and j are continuous.

#### Chapter 2

2-3 (a,b): mean = 2.8, median = 2.5.

2-4 (a,b): mean = 41.2, median = 36.

2-3 (e,f,g) and quartiles: range = 7, variance = 3.73, s = 1.93, lower quartile = 2, upper quartile = 4.
All units are children, except for variance which is children squared.
(Note: some people calculate quartiles slightly differently.)

2-4 (e,f,g) and quartiles: range = 96, variance = 609.2, s = 24.7, lower quartile = 29, upper quartile = 44.
All units are hours, except for variance which is hours squared.
(Note: some people calculate quartiles slightly differently.)

#### Chapter 3

3-2: (a) 100*99*98 = 970,200.
(b) 100*99*98/(3*2*1) = 161,700.
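
The two counts in 3-2 can be checked with a short sketch. It assumes, as the arithmetic suggests, that part (a) counts ordered selections of 3 from 100 and part (b) counts unordered ones; the variable names are illustrative only.

```python
import math

# Ordered selections of 3 items from 100 (permutations): 100 * 99 * 98
ordered = math.perm(100, 3)

# Unordered selections (combinations): each group of 3 can be ordered 3! ways
unordered = math.comb(100, 3)

print(ordered)    # 970200
print(unordered)  # 161700
```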

3-5: (a) 1/5; (b) 1/10; (c) 3/10; (d) 1/20.

3-9: (a) 1/24; (b) 1/6.

3-11: 0.0711.

3-12: 63/115 = 0.5478.

3-13: 6/13 = 0.4615.

3-14: (a) 0.05; (b) 0.90; (c) 0.9895; (d) 0.0105; (e) 0.10; (f) 0.8182.

3-15: 0.64.

3-16: 0.824.

3-17: 0.659.

#### Chapter 4

4-6: (a) sigma^2 = 2/3

(b and c)

```
Sample | mean
--------------
3,3    | 3
3,4    | 3.5
3,5    | 4
4,3    | 3.5
4,4    | 4
4,5    | 4.5
5,3    | 4
5,4    | 4.5
5,5    | 5
-------------
```

(d) The mean of the sampling distribution is 4 = the mean of the population.
The variance of the sampling distribution is 1/3 = (2/3)/2.
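
The results in 4-6 can be reproduced by enumerating every sample. This is a minimal sketch that assumes the population is {3, 4, 5} and that samples of size 2 are drawn with replacement, which is what the table of nine samples implies; the helper names are made up for illustration.

```python
from fractions import Fraction
from itertools import product

population = [3, 4, 5]  # assumed from the samples listed in the table


def mean(values):
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def variance(values):
    """Population-style variance (divide by N, not N - 1)."""
    values = list(values)
    m = mean(values)
    return sum((Fraction(v) - m) ** 2 for v in values) / len(values)


# All 9 ordered samples of size 2, drawn with replacement
sample_means = [mean(pair) for pair in product(population, repeat=2)]

pop_var = variance(population)         # 2/3
sampling_mean = mean(sample_means)     # 4
sampling_var = variance(sample_means)  # 1/3 = (2/3)/2

print(pop_var, sampling_mean, sampling_var)
```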

#### Chapter 5

5-1: P(cure) = 0.20. P(four or more cured) = 0.0064 + 0.00032 = 0.00672. This would be unusual if chance alone explained the cures. It would be worth looking into special conditions. Also, does "cured" imply that the patients are disease-free for five years?
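
The two terms in 5-1 are binomial probabilities. The sketch below assumes n = 5 patients, which is inferred from the two terms in the answer rather than stated here; `binom_pmf` is an illustrative helper, not a library function.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for a binomial count with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 5, 0.20  # n = 5 is an assumption inferred from the answer

p_four = binom_pmf(4, n, p)      # 5 * 0.2^4 * 0.8
p_five = binom_pmf(5, n, p)      # 0.2^5
p_four_or_more = p_four + p_five

print(round(p_four, 5), round(p_five, 5), round(p_four_or_more, 5))
```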

5-2a: P(X=5) = 0.2007, P(X=6) = 0.2508, P(X=7) = 0.2150, P(X=5,6,or 7) = 0.6665.

5-3: (a) P(X < 3) = 0.1247, (b) P(X=5) = 0.1755, (c) P(X > 6) = 0.2378, (d) P(X=2 or X=3) = 0.2246.

5-4: P(X=1) = 0.0284. If this person were an average laboratory chemist, the chance of doing so poorly is fairly small. However, the student is probably not as experienced as the population for whom the 70% success rate is figured. The student needs a lot more practice!

5-7: Exact binomial with n=200 and p=0.02 (or approximate Poisson with mu = 4).
(a) P(X <= 10) = 0.9975 (0.9972) (b) P(X=0) = 0.0176 (0.0183)
(c) P(X > 4) = 0.3712 (0.3712) (d) P(X=4) = 0.1973 (0.1954)
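
The paired values in 5-7 can be compared side by side. This sketch uses only the n = 200, p = 0.02, and mu = 4 given above; the helper functions are illustrative.

```python
from math import comb, exp, factorial

n, p = 200, 0.02
mu = n * p  # Poisson mean used for the approximation


def binom_pmf(k):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k):
    return exp(-mu) * mu**k / factorial(k)


def cdf(pmf, k):
    """P(X <= k) by summing the probability mass function."""
    return sum(pmf(i) for i in range(k + 1))


results = {
    "P(X <= 10)": (cdf(binom_pmf, 10), cdf(poisson_pmf, 10)),
    "P(X = 0)": (binom_pmf(0), poisson_pmf(0)),
    "P(X > 4)": (1 - cdf(binom_pmf, 4), 1 - cdf(poisson_pmf, 4)),
    "P(X = 4)": (binom_pmf(4), poisson_pmf(4)),
}

for label, (exact, approx) in results.items():
    print(f"{label}: binomial {exact:.4f}, Poisson {approx:.4f}")
```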

5-8: (a) P(40 < X < 48) = 0.5763 (b) P(X < 42) = 0.3446 (c) P(X > 45) = 0.4207 (d) 35.8 to 52.2 (e) z = -1.20, P(X < 38) = 0.1150 (The NFL needs place kickers too!)

5-9: (a) P(42 < x-bar < 46) = 0.7941 (b) P(x-bar < 40) = 0.0057 (c) P(x-bar > 48) = 0.0057 (d) P(40.9 < x-bar < 47.1) = 0.95 (e) z = -3.79, P(x-bar < 38) = 0.0001, yes this is unusual.
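
Several of the values in 5-8 and 5-9 can be checked with the standard library. The parameters below (mean 44, standard deviation 5, sample size 10) are not stated on this page; they are inferred from the answers, so treat them as assumptions.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 44, 5, 10  # assumed: inferred from the answers, not given here

single = NormalDist(mu, sigma)           # one observation X (5-8)
xbar = NormalDist(mu, sigma / sqrt(n))   # sample mean x-bar (5-9)

p_5_8_a = single.cdf(48) - single.cdf(40)  # P(40 < X < 48)
p_5_8_b = single.cdf(42)                   # P(X < 42)
p_5_8_e = single.cdf(38)                   # P(X < 38)

p_5_9_a = xbar.cdf(46) - xbar.cdf(42)      # P(42 < x-bar < 46)
p_5_9_b = xbar.cdf(40)                     # P(x-bar < 40)
z_5_9_e = (38 - mu) / xbar.stdev           # z-score of x-bar = 38

print(round(p_5_8_a, 4), round(p_5_8_b, 4), round(p_5_8_e, 4))
print(round(p_5_9_a, 4), round(p_5_9_b, 4), round(z_5_9_e, 2))
```

The same value of 38 is unremarkable for a single observation but far out in the tail for a mean of 10 observations, because the standard error shrinks by a factor of sqrt(n).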

5-10: (a) P(96 < X < 104) = 0.2249 (b) P(X < 88) = 0.1957 (c) P(X > 115) = 0.1420 (d) P(X < 123.0) = 0.95 (e) P(X > 77.0) = 0.95 (f) P(72.6 < X < 127.4) = 0.95

5-11: (a) P(X=1) = 0.0914 (b) P(X < 4) = 0.6472 (both exact binomial and Poisson approximation). (c) P(4 <= X <= 12) = 0.8416 (exact) or 0.8437 (normal approximation).

5-12: Assuming the population of monkeys is much larger than the 25 selected, this is a binomial problem. The normal approximation is useful.

Exact answer: P(15 <= X <= 20) = 0.5738.
Normal approximation: P(15 <= X <= 20) = 0.5957.

(Notice that 25*0.2 = 5, so the normal approximation is just barely acceptable.)
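
Both answers to 5-12 can be reproduced with a short sketch. It assumes p = 0.8 (so that 25*0.2 = 5 is the smaller expected count) and a continuity correction in the normal approximation; neither is stated on this page, but both match the numbers above.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 25, 0.8  # p = 0.8 is an assumption inferred from the answers


def binom_pmf(k):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Exact binomial probability P(15 <= X <= 20)
exact = sum(binom_pmf(k) for k in range(15, 21))

# Normal approximation with continuity correction: P(14.5 < X < 20.5)
approx_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))
approx = approx_dist.cdf(20.5) - approx_dist.cdf(14.5)

print(round(exact, 4), round(approx, 4))
```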

#### Chapter 6

6-1: (Leo Rebholz)

xbar=322, s=101, n=25

a)90% CI for u = 322 +- t*SE, where t (24 dF, 90%) = 1.711 and SE = s/sqrt(n). CI = [287.4,356.6]

b)CI using z and sigma = [289.1,354.9]

c)From a, we are 90% confident that u lies in [287.4,356.6]. This does NOT mean that the probability that u is in that interval is .90. u is a constant and not a variable, and thus we cannot put a probability on it. Confidence is an "after the fact" statement, whereas probability is a "before the fact" statement.

d)sample size to estimate u within +-5 on a 95% CI: Then z*SE=5=z*sigma/sqrt(n) -> 1.96*100/sqrt(n)=5 -> n=1537
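
The arithmetic in 6-1 can be sketched as follows. The t critical value 1.711 is the one quoted in part a) (the standard library has no t distribution), and sigma = 100 is taken from part d) and assumed to apply to part b) as well; `interval` is an illustrative helper.

```python
from math import ceil, sqrt
from statistics import NormalDist

xbar, s, n = 322, 101, 25
sigma = 100      # from part d); assumed to be the value used in part b)
t_crit = 1.711   # t(24 dF, 90%) as quoted in part a)


def interval(center, crit, sd, n):
    """Confidence interval center +- crit * sd / sqrt(n)."""
    half_width = crit * sd / sqrt(n)
    return center - half_width, center + half_width


ci_t = interval(xbar, t_crit, s, n)        # part a): t with sample s

z90 = NormalDist().inv_cdf(0.95)           # two-sided 90% -> 1.645
ci_z = interval(xbar, z90, sigma, n)       # part b): z with known sigma

z95 = NormalDist().inv_cdf(0.975)          # two-sided 95% -> 1.96
n_needed = ceil((z95 * sigma / 5) ** 2)    # part d): margin of +-5

print([round(v, 1) for v in ci_t], [round(v, 1) for v in ci_z], n_needed)
```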

6-2: (Leo Rebholz)

a) x-bar is an unbiased point estimate of u, since mean(xbar)=u. s^2 is an unbiased point estimate of sigma^2. xbar=49, s^2=25

b)CI= [45.16,52.84]

6-4: (Leo Rebholz)

a)[10.07,74.32]

b)(s1)^2= (30.61)^2, (s2)^2= (12.61)^2

[.26 < sigma1/sigma2 < 91.64]

6-5: (Leo Rebholz)

a)[-4.61,-1.39]

b)95% ci for sigma=[3.78,5.61]

6-7: (Leo Rebholz)

a)[1.3,2.2]

b)like a one-population interval

c)standard deviation and dF

6-8: (Leo Rebholz)

[-7.325,13.925] Note dF=20, and dF must be computed since we are not assuming equal variance.

6-9: A 99% confidence interval for the proportion in favor of the increase is:
0.460 ± 0.091.
Fifty percent is in the confidence interval. In a two-sided hypothesis test, the null hypothesis p=0.50 would not be rejected at the alpha=0.01 level. Even if no one's opinion changes, there is a chance that the vote for a pay increase will succeed.

6-12: (Leo Rebholz)

sample size so that we have +-.5 n=62

#### Chapter 7

While the textbook (and Leo) takes the approach of rejecting the null hypothesis if the test statistic is in a critical region, you are expected to understand how to calculate a p-value. If the p-value is smaller than a given significance level alpha, your decision would be to reject the null hypothesis at that significance level.

7-1: (Leo Rebholz)

Ho: u=65
Ha:u (not)=65 (two sided)

a)test stat=-4.43 critical values(rejection lines)= +-2.13 Reject Ho. There is a change.

b)H0: u=65 Ha: u<65 (one side) test stat=-4.43 critical val=-1.753 Reject Ho. The mean is lower than 65 at the alpha=.05 significance level.

c)Assumptions are random sample and a normal distribution.

d)test stat=-5.36 and critical values are +-1.96 Reject!

e)critical val=-1.645 Reject!

7-2: (Leo Rebholz)

a)z-test stat=5.32 critical vals=+-2.576 Reject!

b)t-test stat=4.15 critical vals=+-1.98 Reject!

c)part a one sided crit val=1.96 Reject! part b one sided crit val=1.66 Reject!

d)One sided is more appropriate, as we would be more interested in if males are taller than before as opposed to their heights being different. The one sided test would be more likely to lead to rejection.

e)chi-square test stat=162.52 (two sided) crit vals are 73.4 and 128.4 Reject!

7-3: (Leo Rebholz)

Ho: u=14
Ha: u>14
...note this is one-sided as if it is less than 14, that is better.

t-test-stat=3.57 crit val=1.71 Reject! They are liars.

7-5: (Leo Rebholz)

a)test-stat=2.50 0.01 < p-val < 0.02 Rejection will depend on testing level(which is not provided)

b)Assumptions are that these are independent random samples from 2 normally distributed populations with equal variances.

c)z-test-stat=3.984 crit-val=+-1.96 Reject!

7-6: (Leo Rebholz)

a)p-val >.8 Accept!

b)p-val>.8 Accept!

c)assumptions are normal random independent samples with equal variances.

d)I would choose part a test, as the sample variances are close.

7-7: (Leo Rebholz)

a)Ho: u1=u2
Ha: not equal

0.4 < p-val < 0.6

Accept!

b)normal, random and independent samples

7-9: (Leo Rebholz)

a)0.01 b)12 pairs of values -> 12 differences that we test on -> 11dF

7-10(a):

H0: p = 0.90
Ha: p < 0.90

z = -2.67 (use p=0.9 in the formula for SE)

p-value = 0.0038.

This is smaller than 0.05, so is significant at the alpha = 0.05 level. In fact, it is much more significant. By chance, we would expect a result at least this extreme about once every 263 samples. There is fairly strong evidence that the population proportion of four-year-olds with no evidence of dental cavities is less than 90 percent.
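
The p-value step in 7-10(a) can be sketched from the test statistic alone. The sample size and count are not given on this page, so the sketch starts from z = -2.67 as stated; the variable names are illustrative.

```python
from statistics import NormalDist

z = -2.67      # test statistic from the solution above
alpha = 0.05   # significance level used in the discussion

# One-sided (lower-tail) p-value, because Ha is p < 0.90
p_value = NormalDist().cdf(z)

reject_h0 = p_value < alpha

# How many samples, on average, per result at least this extreme under H0
samples_per_extreme = 1 / p_value

print(round(p_value, 4), reject_h0, round(samples_per_extreme))
```

Using the unrounded p-value gives roughly 264 samples; the 263 above comes from dividing 1 by the rounded value 0.0038.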

7-11:

H0: p1 = p2
Ha: p1 not equal to p2

The first sample has 57 out of 110 in favor. The second sample has 41 out of 75 in favor.

If we assume that the two population proportions are equal, the best guess of the common value is p-bar = 98 / 185 = 0.530.

The estimated SE under H0 is sqrt( 0.530*0.470/110 + 0.530*0.470/75 ) = 0.0747.

z = (0.518 - 0.547) / 0.0747 = -0.38.

The two-sided p-value is 0.704. This is not a small probability. The observed data is consistent with no difference in population proportions. There is no evidence that opinion in the two communities differs on this issue.
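
The steps in 7-11 can be sketched as a pooled two-proportion z-test, using the counts given above. Carrying unrounded values through gives a p-value near 0.703; the 0.704 above comes from rounding z to -0.38 first.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 57, 110  # first sample: number in favor, sample size
x2, n2 = 41, 75   # second sample

p1_hat = x1 / n1
p2_hat = x2 / n2

# Pooled estimate of the common proportion under H0: p1 = p2
p_bar = (x1 + x2) / (n1 + n2)

# Estimated standard error of the difference under H0
se = sqrt(p_bar * (1 - p_bar) / n1 + p_bar * (1 - p_bar) / n2)

z = (p1_hat - p2_hat) / se

# Two-sided p-value: both tails beyond |z|
p_value = 2 * NormalDist().cdf(-abs(z))

print(round(p_bar, 3), round(se, 4), round(z, 2), round(p_value, 3))
```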

#### Chapter 9

(Solutions by Leo Rebholz.)

```
9-1:
a)
ss	df	ms	f
3.052	2	1.53	.77
23.65	12	1.97
26.7	14
p-val>.25
No difference!

d)
placebo vs drug a = 1.535
"	"	b = .487
a	"	b = .269

9-2:
(Lab Homework)
F=57.84
There is a difference between mean counts.

9-6
a) i is the treatment # (or group number)
j is the person # for a given treatment
ij denotes a particular person from a particular treatment

b) Ho: means are all equal
Ha: means are not all equal

c)
ss	df	ms	f	p-val
200	3	66.67	6.67	.001

```

Bret Larget,
larget@mathcs.duq.edu