Lesson 4: Fitting a Line to Data

Student Summary

Sometimes we can use a linear function as a model of the relationship between two variables. For example, here is a scatter plot that shows the heights and weights of 25 dogs, together with the graph of a linear function that models the relationship between a dog’s height and its weight.

Figure: scatter plot of dog height in inches (horizontal axis, 6 to 30) and dog weight in pounds (vertical axis, 0 to 112), shown first on its own and then with a line through (9, 0) and (27, 80).

For some dogs, we can see that the model does a good job of predicting the weight given the height. These correspond to points on or near the line. The model doesn’t do a very good job of predicting the weight given the height for the dogs whose points are far from the line.

For example, there is a dog that is about 20 inches tall and weighs a little more than 16 pounds. The model predicts that the weight would be about 48 pounds. We say that the model overpredicts the weight of this dog. There is also a dog that is 27 inches tall and weighs about 110 pounds. The model predicts that its weight will be a little less than 80 pounds. We say the model underpredicts the weight of this dog. For most of the dogs in this data set, though, the model does a good job of predicting the weight from the height.
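The comparison above can be sketched in a few lines of Python. This is only an illustration, assuming the model line passes through (9, 0) and (27, 80) as the figure description states. Those points are approximate readings, so the computed predictions may differ slightly from the values quoted in the text. The function names and the weight of 17 pounds (standing in for "a little more than 16 pounds") are assumptions made for this example.

```python
# Two points on the model line, taken from the figure description (approximate).
X1, Y1 = 9, 0      # height in inches, weight in pounds
X2, Y2 = 27, 80


def predicted_weight(height_in):
    """Weight in pounds that the linear model predicts for a height in inches."""
    return Y1 + (Y2 - Y1) * (height_in - X1) / (X2 - X1)


def compare(height_in, actual_weight_lb):
    """Say whether the model overpredicts or underpredicts a dog's actual weight."""
    predicted = predicted_weight(height_in)
    if predicted > actual_weight_lb:
        return "overpredicts"
    if predicted < actual_weight_lb:
        return "underpredicts"
    return "matches"


# The two dogs discussed in the text; 17 is an assumed stand-in for
# "a little more than 16 pounds".
for height, weight in [(20, 17), (27, 110)]:
    print(height, weight, round(predicted_weight(height), 1), compare(height, weight))
```

The difference between the actual weight and the predicted weight shows how far a point sits above or below the line: a large difference means the point is far from the line.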

Sometimes a data point is far away from the other points or doesn’t fit a trend that all the other points fit. We call such a point an outlier.

Standards

Addressing

8.SP.1

Construct and interpret scatter plots for bivariate measurement data to investigate patterns of association between two quantities. Describe patterns such as clustering, outliers, positive or negative association, linear association, and nonlinear association.

8.SP.2

Understand that straight lines are widely used to model relationships between two quantitative variables. For scatter plots that suggest a linear association, informally fit a straight line, and informally assess the model fit by judging the closeness of the data points to the line.