### Section 2.3: Grouped Data - The Frequency Distribution

#### Key Concepts

A frequency table shows the count (or the relative proportion) of observations in each class interval.

#### Frequency Tables

A frequency table is simply a count of the number of observations that fall into each of several class intervals.

How to Make a Frequency Table

1. Pick two numbers that span the data.
2. Divide this range into a reasonable number of equally spaced class intervals.
3. Count the number of observations in each class interval.
4. Graph it so that neighboring boxes touch.

Generally speaking, there should be around 5 to 15 class intervals. Use a number toward the high end of that range if the data are fairly skewed. There should be no gaps between the class intervals.

Make a frequency table with data from Exercise 2.3.8.

```
Class Interval   Frequency   Rel. Frequency
-------------------------------------------
10 - 19              2           .07
20 - 29              4           .13
30 - 39              9           .30
40 - 49              7           .23
50 - 59              6           .20
60 - 69              2           .07
-------------------------------------------
Total               30          1.00
```
The relative frequencies are determined by dividing the frequencies by the total number of observations.
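
The counting and dividing steps can be sketched in a few lines of Python. This is only an illustration: the observations are read off the stem-and-leaf display given for Exercise 2.3.8 in this section, the function name `frequency_table` and its parameters are made up for this sketch, and the code assumes whole-number data with equally wide class intervals.

```python
# Observations read off the stem-and-leaf display for Exercise 2.3.8.
data = [10, 12, 21, 22, 22, 23, 32, 34, 35, 35, 36, 36, 37, 38, 39,
        43, 44, 45, 45, 45, 45, 46, 53, 54, 55, 55, 56, 57, 60, 64]

def frequency_table(values, start, width, n_classes):
    """Return (low, high, count, relative frequency) for each class interval.

    Hypothetical helper: assumes integer data and equal-width intervals.
    """
    counts = [0] * n_classes
    for v in values:
        index = (v - start) // width  # which class interval v falls into
        if not 0 <= index < n_classes:
            raise ValueError(f"{v} lies outside the chosen range")
        counts[index] += 1
    total = len(values)
    # Relative frequency = frequency / total number of observations.
    return [(start + i * width, start + (i + 1) * width - 1, c, c / total)
            for i, c in enumerate(counts)]

table = frequency_table(data, start=10, width=10, n_classes=6)
for low, high, count, rel in table:
    print(f"{low} - {high}   {count:2d}   {rel:.2f}")
```

The printed counts and relative frequencies should agree with the table above.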

#### How to Make a Histogram

A histogram is a graph of the information in a frequency table.

For each class interval, draw a bar that spans the entire class interval and whose height equals the frequency (or the relative frequency).

The area of a bar over a class interval relative to the total area corresponds to the proportion of the data in that class interval.

Make a histogram with the data from Exercise 2.3.8. The boundaries should be at 9.5, 19.5, 29.5, 39.5, 49.5, 59.5, and 69.5 for the frequency table listed above.
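
As a rough sketch of how the bars come out of those boundaries, the Python below counts the observations between each pair of neighboring boundaries and prints a sideways text histogram. The data are read off the stem-and-leaf display for Exercise 2.3.8; the helper name `bar_heights` is an assumption of this sketch, and a plotting library would normally draw the real graph.

```python
from bisect import bisect_right

# Observations read off the stem-and-leaf display for Exercise 2.3.8.
data = [10, 12, 21, 22, 22, 23, 32, 34, 35, 35, 36, 36, 37, 38, 39,
        43, 44, 45, 45, 45, 45, 46, 53, 54, 55, 55, 56, 57, 60, 64]
boundaries = [9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5]

def bar_heights(values, edges):
    """Count the observations between each pair of neighboring boundaries."""
    heights = [0] * (len(edges) - 1)
    for v in values:
        i = bisect_right(edges, v) - 1  # index of the bar that contains v
        if not 0 <= i < len(heights):
            raise ValueError(f"{v} lies outside the boundaries")
        heights[i] += 1
    return heights

heights = bar_heights(data, boundaries)
total = sum(heights)
for left, right, h in zip(boundaries, boundaries[1:], heights):
    # The bars are equally wide, so each bar's share of the total area
    # equals its share of the observations.
    print(f"{left:4.1f} to {right:4.1f} | {'#' * h}  ({h / total:.2f})")
```

Because the boundaries sit at half-units and the data are whole numbers, no observation can land exactly on a boundary.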

#### Interpreting Histograms

You should be able to do things such as:
1. Identify if a variable has outliers.
2. Get a rough idea of the center and the spread.
3. Describe the shape (skewness and symmetry).
4. Find the proportion of observations in intervals.
5. Find intervals that contain a given proportion.

An outlier is an observation that sticks out from the overall pattern of the graph. There are no outliers in this example.

A histogram is approximately symmetric if the left and right sides are approximately mirror images of one another.

A histogram is skewed to the right if the right half is stretched out much farther than the left half.

A histogram is skewed to the left if the left half is stretched out much farther than the right half.

The histogram from this example is approximately symmetric. Describing it with a very mild skewness to the right is also acceptable.
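
Finding proportions and intervals from a histogram amounts to adding up bar heights. The sketch below, using the frequencies and boundaries for Exercise 2.3.8, shows one way to do that arithmetic; the function names `proportion_between` and `boundary_reaching` are hypothetical, and the first function only handles intervals that start and end on class boundaries.

```python
# Class boundaries and frequencies from the frequency table for Exercise 2.3.8.
boundaries = [9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5]
frequencies = [2, 4, 9, 7, 6, 2]

def proportion_between(lower, upper):
    """Proportion of observations in the classes lying within [lower, upper]."""
    total = sum(frequencies)
    inside = sum(f for left, right, f
                 in zip(boundaries, boundaries[1:], frequencies)
                 if left >= lower and right <= upper)
    return inside / total

def boundary_reaching(proportion):
    """Smallest class boundary with at least `proportion` of the data below it."""
    total = sum(frequencies)
    running = 0
    for right, f in zip(boundaries[1:], frequencies):
        running += f
        if running / total >= proportion:
            return right
    return boundaries[-1]

print(proportion_between(29.5, 49.5))  # 16 of the 30 observations
print(boundary_reaching(0.5))          # half of the data lies below this boundary
```

Half of the observations fall below 39.5, which gives a rough idea of where the center of the data lies.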

#### Making Stem-and-Leaf Displays

A stem-and-leaf display is a method to write down the data in a way that describes its shape. A vertical line divides the stems from the leaves.

The last significant digit of each observation is called the leaf. There is exactly one leaf per observation written on the right side of the display. For each row (stem) the leaves are ordered from smallest to largest. The leaves should line up exactly, and be flush with the vertical line, to give a visual description of how many observations are in each row.

The remaining portion of each observation is called the stem. Each stem is written only once, on the left side of the vertical line. Even if there are no observations for a stem, the stem should be written to show that there is a gap in the data.

An example is the easiest way to learn how to make one. Work through the data from Exercise 2.3.8 in class.

```
1|02
2|1223
3|245566789
4|3455556
5|345567
6|04
```
There are 30 observations. The fourth smallest is 22, while the largest is 64.
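
The rules for stems and leaves translate fairly directly into code. The following is a minimal sketch, assuming non-negative whole numbers whose last digit is the leaf; the function name `stem_and_leaf` is made up for this illustration, and the data are the 30 observations from the display above.

```python
from collections import defaultdict

# The 30 observations from the display for Exercise 2.3.8.
data = [10, 12, 21, 22, 22, 23, 32, 34, 35, 35, 36, 36, 37, 38, 39,
        43, 44, 45, 45, 45, 45, 46, 53, 54, 55, 55, 56, 57, 60, 64]

def stem_and_leaf(values):
    """Return the display as a list of lines, one line per stem."""
    leaves = defaultdict(list)
    for v in sorted(values):           # sorting puts the leaves in order
        stem, leaf = divmod(v, 10)     # last digit is the leaf, the rest is the stem
        leaves[stem].append(str(leaf))
    lowest, highest = min(leaves), max(leaves)
    width = len(str(highest))          # right-align stems of different lengths
    # Every stem from lowest to highest is written, even when it has no
    # leaves, so that gaps in the data remain visible.
    return [f"{stem:>{width}}|{''.join(leaves[stem])}"
            for stem in range(lowest, highest + 1)]

lines = stem_and_leaf(data)
print("\n".join(lines))
```

Calling `stem_and_leaf([171, 195])` returns three lines, with an empty row for stem 18 showing the gap.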

When the data have too many significant digits, rounding the data first gives a better display. Look at the first two columns of data from Exercise 2.3.9. Here is a stem-and-leaf display after rounding.

```
17|1
18|
19|5
20|13
21|04
22|2
23|
24|089
```
If the data will all fall onto a very small number of stems, stems can be split. The options are splitting each stem into two (with leaves of 0-4 on the top stem and leaves of 5-9 on the bottom stem) or into five (with leaves of 0-1, 2-3, 4-5, 6-7, 8-9). Here are the first three columns of Exercise 2.3.5 shown both ways.
```
0|11334
0|5578
1|01223
1|7
```
or
```
0|11
0|33
0|455
0|7
0|8
1|01
1|223
1|
1|7
```
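
Splitting stems can be sketched in the same spirit. The Python below is only an illustration under stated assumptions: the 15 observations are read off the Exercise 2.3.5 displays, the function name `split_stem_and_leaf` is hypothetical, and the code assumes non-negative whole numbers with single-digit stems.

```python
# The 15 observations read off the split-stem displays for Exercise 2.3.5.
data = [1, 1, 3, 3, 4, 5, 5, 7, 8, 10, 11, 12, 12, 13, 17]

def split_stem_and_leaf(values, parts):
    """Return display lines with each stem split into `parts` rows (2 or 5)."""
    if parts not in (2, 5):
        raise ValueError("split each stem into 2 or 5 rows")
    leaves_per_row = 10 // parts       # 5 leaf values per row, or 2
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        key = (stem, leaf // leaves_per_row)   # (stem, which row of that stem)
        rows.setdefault(key, []).append(str(leaf))
    first, last = min(rows), max(rows)
    lines = []
    for stem in range(first[0], last[0] + 1):
        for part in range(parts):
            # Keep empty rows between the first and last occupied rows so
            # that gaps show, but drop empty rows outside that range.
            if first <= (stem, part) <= last:
                leaves = "".join(rows.get((stem, part), []))
                lines.append(f"{stem}|{leaves}")
    return lines

print("\n".join(split_stem_and_leaf(data, 2)))
print()
print("\n".join(split_stem_and_leaf(data, 5)))
```

The two printed displays should match the two shown above, including the empty row for leaves 4-5 on stem 1.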