### Math 225 Course Notes

### Chapter 9

#### Contents

Regression is a general technique for analyzing multivariate data.
In this course, we will restrict discussion
to the case where we wish to predict
a quantitative response variable on the basis of a single
quantitative explanatory variable.
Furthermore,
we will assume that
the relationship between the two variables
is summarized well by a straight line.
This does not mean that all observed data must be exactly on a line.
It does mean that
a straight line through the center of a scatterplot of the points
is a good description of the trend of the data,
and is not substantially improved by drawing a curve through the data.
Correlation measures the linear
relationship between two variables
on a scale from -1 to 1.
The sign gives the direction of the relationship,
and its strength increases as the correlation moves away from
zero.
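
As a rough illustration of the scale described above, the following Python sketch computes the correlation coefficient for two short lists of numbers. The function name `correlation` and the sample data are made up for this illustration and are not part of the course materials.

```python
from math import sqrt


def correlation(xs, ys):
    """Return the correlation coefficient r of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: points exactly on a rising line, then on a falling line.
r_up = correlation([1, 2, 3], [2, 4, 6])
r_down = correlation([1, 2, 3], [6, 4, 2])
```

Points that fall exactly on a rising line give a correlation at the upper end of the scale, and points exactly on a falling line give one at the lower end; scattered data land somewhere in between.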

Correlation and ordinary least squares regression
(OLS)
are intimately related to one another.
OLS is one of several methods available for finding the "best" line
to describe a set of points.
It has theoretical and computational advantages that make it the most widely used
method.
The result of OLS is a fitted regression line.
This line can be found from the means and standard deviations
of each variable and the correlation coefficient.
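
A minimal sketch of that calculation, assuming the standard OLS relations (slope equal to the correlation times the ratio of the standard deviation of the response to that of the explanatory variable, and a line that passes through the point of means). The function name `fit_line` and the data are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Return (slope, intercept) of the OLS line from summary statistics."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = stdev(xs), stdev(ys)  # sample standard deviations
    if sd_x == 0 or sd_y == 0:
        raise ValueError("both variables must vary")
    # Correlation coefficient from deviations and standard deviations.
    r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (
        (n - 1) * sd_x * sd_y
    )
    slope = r * sd_y / sd_x
    # The fitted line passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data for illustration only.
slope, intercept = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
predicted = intercept + slope * 6  # predicted response at x = 6
```

Only five summary numbers enter the fit: the two means, the two standard deviations, and the correlation.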

Analyzing data by regression without a computer is very tedious.
In an example,
we will learn how to pull information from the regression output
of a statistics software package.

Last modified: April 16, 1996

Bret Larget,
larget@mathcs.duq.edu