Math 225

Introduction to Biostatistics


Highlights from Lecture #4

We solved two problems and, in doing so, introduced the concepts of conditional probability and Bayes' Rule. A tree diagram is a useful way to organize the calculations in this type of problem.

Chapter 3

  1. A conditional probability is the probability of an event given information about another event. By definition,

    P(A|B) = P(A and B) / P(B)

    provided that P(B) is not 0. This formula may be visualized with a Venn diagram (see page 55 in the textbook). The expression P(A|B) is read "the probability of A given B".

  2. After some algebra, we find that

    P(A and B) = P(B) * P(A|B)

    It is also true that

    P(A and B) = P(A) * P(B|A)

    We can think of these formulas as saying that for events A and B both to occur, one of them must occur and, given that it does, the other must occur as well.

  3. A review of formulas for probabilities of combinations of events is in order.
    1. P(A or B) = P(A) + P(B) - P(A and B)
      Notice that when events A and B are mutually exclusive, P(A and B) = 0 and this simplifies to

      P(A or B) = P(A) + P(B)

    2. P(A and B) = P(A) * P(B|A)
      Notice that when events A and B are independent, P(B|A) = P(B) and the expression simplifies to

      P(A and B) = P(A) * P(B)
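The formulas above can be checked numerically on a small sample space. The sketch below is only an illustration and is not from the lecture: the fair six-sided die, the events A, B, C, and D, and the helper names prob and cond_prob are all assumptions chosen for the example.

```python
from fractions import Fraction

# Hypothetical example (not from the lecture): one roll of a fair die.
OUTCOMES = frozenset(range(1, 7))


def prob(event):
    """Probability of an event (a set of outcomes), all outcomes equally likely."""
    return Fraction(len(set(event) & OUTCOMES), len(OUTCOMES))


def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B), defined only when P(B) is not 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must not be 0")
    return prob(set(a) & set(b)) / prob(b)


A = {2, 4, 6}  # roll is even
B = {4, 5, 6}  # roll is at least 4
C = {1, 2}     # roll is at most 2 (independent of A)
D = {1, 3}     # roll is 1 or 3 (mutually exclusive with A)

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)
# Mutually exclusive events: P(A and D) = 0, so the rule simplifies.
assert prob(A | D) == prob(A) + prob(D)

# Multiplication rule, conditioning in either order
assert prob(A & B) == prob(A) * cond_prob(B, A)
assert prob(A & B) == prob(B) * cond_prob(A, B)
# Independent events: P(C|A) = P(C), so P(A and C) = P(A) * P(C).
assert cond_prob(C, A) == prob(C)
assert prob(A & C) == prob(A) * prob(C)
```

Exact fractions are used rather than floating-point numbers so that the equalities can be tested without rounding error.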

Two Problems

Problem 1:

The following relative frequencies are known from review of literature on the subject of strokes and high blood pressure in the elderly.

  1. Ten percent of people aged 70 will suffer a stroke within five years.
  2. Of those individuals who had their first stroke within five years after turning 70, forty percent had high blood pressure prior to their stroke.
  3. Of those individuals who did not have a stroke by age 75, twenty percent had high blood pressure.
What is the probability that a 70-year-old patient with high blood pressure will have a stroke within five years?

Solution:

First, translate the statements above into the language of probability.
Let H = {individual has high blood pressure at age 70}.
Let S = {individual has a stroke between ages 70 and 75}.
Then,

  1. P(S) = 0.10;
  2. P(H|S) = 0.40;
  3. P(H|not S) = 0.20.

A tree diagram helps organize the solution. The probability at the end of each path is the product of the branch probabilities along that path.

              high blood pressure      .04
             /
          .4/
           /
    stroke<
   /       \
.1/       .6\
 /           \_low blood pressure      .06
<             
 \                high blood pressure  .18
.9\              / 
   \          .2/
    \          /
     no stroke<
               \
              .8\
                 \
                  low blood pressure   .72

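Then, since P(H) is the sum of the probabilities of the two paths that end in high blood pressure,

P(S|H) = P(S and H) / P(H) = 0.04/(0.04+0.18) = 0.04/0.22 = 2/11, or about 0.182.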
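The tree calculation can be reproduced in a few lines of Python. This is a minimal sketch; the variable names are assumptions made for the illustration, and the three input probabilities are the ones given in the problem statement.

```python
from fractions import Fraction

# Given in the problem statement
p_S = Fraction(10, 100)              # P(S): stroke between ages 70 and 75
p_H_given_S = Fraction(40, 100)      # P(H|S)
p_H_given_not_S = Fraction(20, 100)  # P(H|not S)

# Multiply along each path of the tree that ends in high blood pressure.
p_S_and_H = p_S * p_H_given_S                # 0.04
p_not_S_and_H = (1 - p_S) * p_H_given_not_S  # 0.18

# P(H) is the sum over the two paths; then apply the definition of P(S|H).
p_H = p_S_and_H + p_not_S_and_H              # 0.22
p_S_given_H = p_S_and_H / p_H

print(p_S_given_H, round(float(p_S_given_H), 3))  # 2/11 0.182
```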

Problem 2:

A single gene in mice has a dominant allele A and a recessive allele a. A cross of AA with aa leads to F1 offspring of type Aa. Two of these F1 mice are crossed to produce the F2 generation, some of which are AA, some of which are Aa, and some of which are aa. A male with the dominant trait is randomly selected from the F2 generation. He is either homozygous dominant (AA) or heterozygous (Aa). He is mated with a homozygous recessive (aa) female, and they have one offspring, which has the dominant trait. What is the probability that the father is heterozygous?

Solution:

We begin with a review of some basic genetics.

In a cross of two Aa individuals, the three possible genotypes are AA, Aa, and aa.

    A  | a
-----------
A | AA | Aa
-----------
a | Aa | aa
-----------

Each of the four cells in the table is equally likely, so the three probabilities are 0.25, 0.50, and 0.25, respectively. Because the father has the dominant phenotype, the father's genotype is either AA or Aa.

P(father is AA | father is AA or Aa)
= P(father is AA and (father is AA or Aa)) / P(father is AA or Aa)
= P(father is AA) / P(father is AA or Aa)
= (1/4) / (3/4) = 1/3.

Also, P(father is Aa | father is AA or Aa) = 2/3.
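The same conditional probabilities can be obtained by enumerating the four cells of the table. The sketch below assumes, as in the review above, that each Aa parent passes on A or a with equal probability; the names cells and prob are hypothetical helpers introduced for the illustration.

```python
from fractions import Fraction
from itertools import product

# One allele from each Aa parent; the four combinations are equally likely.
# Sorting puts the uppercase allele first, so "aA" is recorded as "Aa".
cells = ["".join(sorted(pair)) for pair in product("Aa", repeat=2)]
# cells == ["AA", "Aa", "Aa", "aa"]


def prob(genotypes):
    """Probability that an F2 individual has one of the given genotypes."""
    return Fraction(sum(cell in genotypes for cell in cells), len(cells))


p_dominant = prob({"AA", "Aa"})  # father shows the dominant trait

# "AA" and "Aa" are each contained in the event "AA or Aa", so the
# numerator P(genotype and dominant) is just P(genotype).
p_AA_given_dominant = prob({"AA"}) / p_dominant
p_Aa_given_dominant = prob({"Aa"}) / p_dominant

print(p_dominant, p_AA_given_dominant, p_Aa_given_dominant)  # 3/4 1/3 2/3
```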

There are two possible crosses with the recessive mother. If the father is homozygous dominant (AA), the offspring will be Aa with probability 1. If the father is heterozygous (Aa), the offspring is equally likely to be Aa or aa.

Now define D = {offspring has the dominant trait} and H = {father is heterozygous}. From the calculations above, P(H) = 2/3, P(D|H) = 1/2, and P(D|not H) = 1.

We wish to find P(H|D). We will use Bayes' Rule and a tree.

                 D         1/3
                /
            0.5/
              /
       H     <
      /       \
 2/3 /      0.5\
    /           \_not D    1/3
   <             
    \             D        1/3
 1/3 \           / 
      \        1/
       \       /
        not H <
               \
               0\
                 \
                  not D    0
   

Thus,

P(father is heterozygous | offspring is dominant)
= P(H and D) / P(D)
= (1/3) / (1/3 + 1/3)
= 1/2
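One way to sanity-check this answer is by simulation. The sketch below is an assumption-laden illustration rather than part of the lecture: it repeats the experiment many times, keeps only the trials in which the father shows the dominant trait and the offspring is dominant, and reports the fraction of those in which the father is heterozygous. The function name, the number of trials, and the seed are arbitrary choices.

```python
import random


def estimate_p_heterozygous(trials=200_000, seed=0):
    """Estimate P(father is Aa | offspring is dominant) by simulation."""
    rng = random.Random(seed)
    dominant_offspring = 0
    heterozygous_and_dominant = 0
    for _ in range(trials):
        # F2 male: one allele drawn at random from each Aa parent.
        father = (rng.choice("Aa"), rng.choice("Aa"))
        if father == ("a", "a"):
            continue  # discard: the father must show the dominant trait
        # The mother is aa, so the offspring shows the dominant trait
        # exactly when the allele inherited from the father is A.
        if rng.choice(father) == "A":
            dominant_offspring += 1
            if "a" in father:
                heterozygous_and_dominant += 1
    return heterozygous_and_dominant / dominant_offspring


# The estimate varies with the seed but should be close to 1/2.
print(estimate_p_heterozygous())
```

The simulation conditions on the observed events by discarding the trials that do not match them, which mirrors dividing by P(D) in the formula.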


Last modified: January 24, 2001

Bret Larget, larget@mathcs.duq.edu