
Predictions from Bivariate Data

Lesson Plan

Objectives

This lesson explores scatter plots, specifically how to create a line of best fit and use that line to answer contextual questions about the data in the plot. Students will:

  • identify a line of best fit by hand (estimation) and using a calculator.
  • use the equation of the line of best fit to explore the meaning of the y-intercept and the slope.

Essential Questions

  • What does it mean to estimate or analyze numerical quantities?
  • What makes a tool and/or strategy appropriate for a given task?
  • How can data be organized and represented to provide insight into the relationship between quantities?
  • How does the type of data influence the choice of display?
  • How can probability and data analysis be used to make predictions?

Vocabulary

  • Bivariate Data: Pairs of linked numerical observations. Example: a list of heights and weights for each player on a football team.
  • Clustering: When many data points on a scatter plot are grouped closely together.
  • Linear Association: When the relationship between two variables shows a linear trend. On a scatter plot, data points that have a linear association can clearly be modeled by a line of best fit.
  • Line of Best Fit: The line that most closely approximates the data in a scatter plot (provided the data demonstrates a linear association).
  • Negative Correlation: Describes a relationship between two variables such that as the values of one variable increase, the values of the other variable decrease.
  • Nonlinear Association: When the relationship between two variables does not show a linear trend. On a scatter plot, data points that have a nonlinear association cannot be modeled by a line of best fit, as there is no obvious linear pattern.
  • Outlier: A data point that diverges greatly from the overall pattern of the data.
  • Positive Correlation: Describes a relationship between two variables such that as the values of one variable increase, the values of the other also increase.
  • Scatter Plot: A graph with points plotted to show a relationship between two variables.

Duration

60–90 minutes

Materials

  • Scatter Plot worksheets (M-8-7-3_Scatter Plot.docx) for each student and one for the overhead projector
  • graphing calculator or access to a spreadsheet program (Excel, etc.)

Related Materials & Resources

The possible inclusion of commercial websites below is not an implied endorsement of their products, which are not free, and are not required for this lesson plan.

  • NCTM’s Mean and Median: http://illuminations.nctm.org/ActivityDetail.aspx?ID=160
  • NCTM’s Line of Best Fit: http://illuminations.nctm.org/ActivityDetail.aspx?ID=146
  • Line of Best Fit Worksheet: http://mdk12.org/share/clgtoolkit/lessonplans/BikingHome.pdf
  • PowerPoint presentation (scatter plot worksheet): http://blue.utb.edu/qttm/at/ppt/Linear%20Fts/Scatter%20Plots.pptx

Suggested Instructional Supports

    Active Engagement, Metacognition, Modeling
    W: Students will work on determining how to use given data points to make predictions about values for which data points do not exist. They will also learn about the limitations of those predictions. 
    H: Students will be presented with data that is easily accessible—height based on age—and should enjoy making predictions that may seem “strange” like predicting the height of a 40-year-old using a line of best fit with a result of 12 feet. 
    E: Students will be engaged in discussions about and encouraged to think about seemingly nonmathematical topics like growth rates, limits on growth, and so on. Students will come up with their own lines of best fit and make their own predictions. 
    R: Students will revise their thinking about the predictions based on the line of best fit they drew. They will also have opportunities to create and revise their lines based on better techniques (i.e., “eyeballing” the line, drawing it using a straightedge, and calculating the actual equation of the line).  
    E: Activity 3 allows students to evaluate their understanding of the entire concept, from creating a scatter plot to estimating and/or calculating the equation of the line of best fit to interpreting their equation in the context of the problem. In addition, a PowerPoint worksheet is provided to aid in evaluating student mastery. 
    T: Students are given multiple methods to draw their line of best fit and are encouraged to think about the pros and cons of each method (ease vs. accuracy). The Extension section can be used to tailor the lesson to meet the needs of students. Use the Routine section for suggestions of ways to review lesson concepts during the year. The Small Group section can be used for students who need additional practice. The Expansion section contains ideas for students who are ready for a greater challenge. 
    O: While the lesson concepts are really about graphing lines to approximate data, students get to engage with real-world topics such as age vs. height and also come up with their own topics for exploration. They get the opportunity to make up “realistic” data and provide reasoning to explain their data.  

Instructional Procedures

    Activity 1

    Provide students with a Scatter Plot worksheet (M-8-7-3_Scatter Plot.docx) and project the following scatter plot for students to see (the plot below is identical to the one on the worksheet):

    “This scatter plot shows the heights of 21 different kids from ages 1 to 7 years.” Review how to read a scatter plot (e.g., ask students to estimate the heights of the three 1-year-olds, the 5-year-olds, etc.) to ensure that they are reading the scatter plot correctly.

    “Does the scatter plot tell us anything about the heights of any 8-year-olds?” (No) “Can we use the data from the scatter plot to make an educated guess about the height an 8-year-old might have?” (Yes)

    Have students write down their extrapolated guess based on the data in the plot for the height, in inches, of an 8-year-old. Then, ask students how many guessed 40 inches. Most likely, no students will have made a guess of 40 inches; if any did, just note the number of guesses for 40 inches on the board. Otherwise, ask, “Why did nobody guess 40 inches?” Students should point out that 40 inches looks like an average height for a 5-year-old and the heights of kids increase each year.

    “Is it possible for an 8-year-old to be 40 inches tall?” Guide students toward the understanding that any prediction we make is just that—a prediction. It’s not a guarantee; it’s just a guess about what might happen in the future or with a population we have not actually surveyed.

    Continue asking about predictions, increasing by 1 or 2 inches each time (e.g., next ask if anyone predicted 42 inches, and so on). Determine the most popular prediction. Ask students who gave that prediction to explain how they arrived at it. Ask other students who had the same prediction how they arrived at theirs. Collect as many methods for making a prediction as possible.

    Steer the conversation towards using a ruler, the edge of a piece of notebook paper, or some other straightedge to help make a prediction. If no students made their prediction by using a straightedge, demonstrate how such a prediction could be made.

    “What makes using a straightedge easy with this particular set of data?” Students should recognize that the data essentially lies on a straight line. “So, we can use that straight line and imagine extending it to the right to see how tall an 8-year-old might be. If everyone drew a straight line to approximate the data, would everyone draw exactly the same line?” (No) “Why not?”

    Help students realize that our line is just an approximation. We’re trying to draw it as close as possible to all the points. But just using our estimation skills, it’s not possible to draw the perfect line, one that minimizes the distance from all the points at once. Sketch a line like the following on the scatter plot:

     

    “Let’s use this as an approximation. Your line might be a little different, which is fine. We’ll call this a line of best fit.” Write this term down somewhere so students can see it. “It’s the line that best fits our data, at least by our guess. Again, it might not be perfect. Based on this line of best fit, what do you think the height of an 8-year-old might be?” (50 inches)

    “Depending on which source you look at, the average height for 8-year-old boys is about 51 inches, and is about 50.75 inches for girls. So, our estimate is pretty accurate. Based on the line of best fit, what do you think the height of a newborn infant might be?” (25 inches)

    “Again, depending on what source you look at, the average height for newborns is 20 inches for boys and 19 inches for girls. So, our estimate was somewhat close, but not quite as close as our estimate for 8-year-olds. Why might our estimate not be as accurate?”

    (Possible reasons include the heights represented on our scatter plot were slightly greater than average; newborns grow at a faster rate from years 0–1 than during other intervals; children do not grow at a constant rate throughout childhood, etc.) Use this as an opportunity to discuss the validity of predictions made based on lines of best fit.
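
    To support the discussion of how well an eyeballed line fits, one option is to add up the squared vertical distances from each data point to the line, which is the measure a least-squares fit minimizes. The Python sketch below does this for the data listed in Activity 2. It assumes the eyeballed line passes through (0, 25) and (8, 50), the two estimates read from the sketched line above, and compares it with the calculator line y = 2.96x + 25.14 given in Activity 2. All function and variable names are illustrative.

```python
# Heights (inches) of three children at each age from 1 to 7 (Activity 2 data).
heights_by_age = {
    1: [28, 29, 27],
    2: [31, 33, 30],
    3: [33, 35, 32],
    4: [37, 38, 36],
    5: [39, 40, 45],
    6: [42, 41, 44],
    7: [44, 45, 48],
}
points = [(age, h) for age, hs in heights_by_age.items() for h in hs]


def sum_squared_error(slope, intercept, data):
    """Add up the squared vertical distances from each point to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in data)


# Assumption: the eyeballed line passes through (0, 25) and (8, 50).
eyeball_slope = (50 - 25) / (8 - 0)
eyeball_sse = sum_squared_error(eyeball_slope, 25, points)

# Line reported in Activity 2 (coefficients rounded to two decimals).
calculator_sse = sum_squared_error(2.96, 25.14, points)

print(f"eyeballed line:  {eyeball_sse:.2f}")
print(f"calculator line: {calculator_sse:.2f}")
```

    A smaller total means the line sits closer to the points overall. Students can substitute the slope and intercept of their own sketched lines to see how they compare.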

    Activity 2

    Have students look at the second part of their worksheet and post the following data somewhere for the entire class to see:

    Age (years)    Height (inches)
    1              28
    1              29
    1              27
    2              31
    2              33
    2              30
    3              33
    3              35
    3              32
    4              37
    4              38
    4              36
    5              39
    5              40
    5              45
    6              42
    6              41
    6              44
    7              44
    7              45
    7              48

    “This data is the same data that was used to generate the scatter plot we’ve been talking about. Using the actual numbers, we can come up with the actual line of best fit. The ones we drew were really just estimates, and it isn’t possible to determine points on our line very accurately just by looking at it. For example, was the predicted height of an 8-year-old exactly 50 inches? Was it 51? Or 50.5? But, if we have the actual equation, we can use it to find the exact point on our line.”

    If students have graphing calculators, have them enter the data in the list portion of their calculator (for TI calculators), and then guide students toward having the calculator produce a line of best fit (through the stat menu). Students may need help interpreting the output of the calculator’s linear regression (LinReg) feature, which typically reports a value for a and a value for b to be substituted into the equation y = ax + b, where a is the slope and b is the y-intercept.

    “You can also find the equation of the line of best fit using something like Microsoft Excel or any other spreadsheet program, or by hand, although doing it by hand is very time-consuming. The equation for the line of best fit for our data is y = 2.96x + 25.14.” Write this equation on the board for reference.
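
    For classes without graphing calculators, the same equation can be reproduced with a short script. The sketch below is one possible approach, not the only one: it applies the standard least-squares formulas (slope = sum of products of deviations divided by sum of squared x-deviations; intercept = mean of y minus slope times mean of x) to the data table above. The function name least_squares_line is illustrative.

```python
# Age (years) and height (inches) pairs from the Activity 2 table.
ages = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7]
heights = [28, 29, 27, 31, 33, 30, 33, 35, 32, 37, 38, 36,
           39, 40, 45, 42, 41, 44, 44, 45, 48]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(ages, heights)
print(f"y = {slope:.2f}x + {intercept:.2f}")  # prints "y = 2.96x + 25.14"
```

    Python 3.10 and later also include statistics.linear_regression, which returns the same slope and intercept without writing out the formulas.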

    “In our equation, x represents the age in years, as shown by the x-axis on the scatter plot and y represents the height in inches. So, use this equation to determine the predicted height of an 8-year-old.” Have students work with partners if necessary and provide additional instruction depending on how students are doing.

    Ask students to provide their answers. If there are numerous answers, work through the method with the class to arrive at the correct answer and to clear up any mistakes students may have made. The predicted height is 48.82 inches, since 2.96(8) + 25.14 = 48.82.

    “What prediction can we make about the height of a newborn based on our equation?” Guide students towards the realization that when 0 is substituted for x, we’re just left with 25.14, which is the prediction. Draw a point representing this on the scatter plot.

    “What do we call a point like (0, 25.14)?” (The y-intercept) “So, when we look at our equation, we already have the y-intercept; it’s just the number being added to the x term. What about 2.96? What does that represent?” If students don’t remember, remind them that the form y = mx + b is called “slope-intercept” form. Students should determine that the slope of the line is 2.96.

    “What does 2.96 mean in the context of the problem? We already know what the y-intercept means: it’s the height when the age is 0. How about the slope?” Guide students through a discussion that slope is also called the rate of change, so it must be a rate of how fast one quantity is changing based on another quantity. Students can also be reminded of how to calculate slope given two points: the numerator is always the difference of the y-coordinates, while the denominator is always the difference of the x-coordinates.
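
    As a worked example of the two-point calculation, the slope can be recovered from two points already found on the line of best fit: the y-intercept (0, 25.14) and the prediction for an 8-year-old, (8, 48.82). The choice of these two points is only for illustration; any two points on the line give the same result.

```latex
m = \frac{y_2 - y_1}{x_2 - x_1}
  = \frac{48.82 - 25.14}{8 - 0}
  = \frac{23.68}{8}
  = 2.96 \ \text{inches per year}
```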

    “So, when we talk about slope, the numerator represents the y-values. What quantity do our y-values represent? Age or height?” Students should know that, for this plot, the y-value represents height. “Therefore, our slope is going to represent the change in height per year; a slope of 2.96 means that a child’s height is changing by 2.96 inches per year. Does that mean each year his/her height changes by exactly 2.96 inches?” (No) “Why not?” (Answers here should touch on the fact that children grow at different rates, our slope is the slope of a line of best fit, which is an estimate, etc.) “But, we can continue to use that as an estimate. We already know, based on our line of best fit, that an 8-year-old has an expected height of 48.82 inches and that his/her height increases by 2.96 inches per year. Based on that data, what should we expect for the height of a 9-year-old?” (51.78 inches)

    Point out to students that they could have arrived at the answer by simply adding 2.96 to 48.82 or by plugging x = 9 into the equation and determining the associated value of y. Depending on the class, ask students to predict heights for 10- and/or 11-year-olds. (54.74 and 57.70 inches, respectively)
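
    The two methods described above (substituting into the equation, or repeatedly adding the slope) can be checked against each other with a few lines of Python. This is a minimal sketch using the rounded equation y = 2.96x + 25.14 from the lesson; the names predicted_height, by_substitution, and by_addition are illustrative.

```python
SLOPE = 2.96       # inches per year
INTERCEPT = 25.14  # predicted height in inches at age 0


def predicted_height(age_years):
    """Evaluate the line of best fit y = 2.96x + 25.14 at the given age."""
    return SLOPE * age_years + INTERCEPT


# Method 1: substitute each age into the equation.
by_substitution = {age: round(predicted_height(age), 2) for age in range(8, 12)}

# Method 2: start from the age-8 prediction and add the slope once per year.
by_addition = {}
height = predicted_height(8)
for age in range(8, 12):
    by_addition[age] = round(height, 2)
    height += SLOPE

for age, inches in by_substitution.items():
    print(f"age {age}: {inches:.2f} inches")
```

    Both dictionaries hold the same values, which illustrates why the slope can be read as the change in predicted height for each additional year.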

    “Based on our line of best fit, how tall can we expect a 40-year-old person to be?” Here, students must recognize that they should revert to using the equation rather than continually adding 2.96. They should arrive at an answer of 143.54 inches. “That is about 144 inches, or 12 feet tall. So what went wrong?”

    Guide students toward the realization that the line of best fit can only be used to make predictions up to a certain point, depending on the situation. “Why doesn’t our line of best fit work for predicting the height of 40-year-olds?” Students should recognize that people stop growing at a certain point (and may even start to shrink), and so our line won’t be an accurate predictor.
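
    One way to make this limitation visible is to label each prediction according to whether the age falls inside the range of the collected data (ages 1 to 7). The sketch below is an illustration under that assumption, not a rule for when a prediction is trustworthy; how far beyond the data a line remains useful is a judgment call that depends on the situation. The names OBSERVED_AGES and predict_with_caution are hypothetical.

```python
OBSERVED_AGES = (1, 7)  # youngest and oldest ages in the scatter plot data


def predict_with_caution(age_years, slope=2.96, intercept=25.14):
    """Return (predicted height in inches, True if the age is outside the data)."""
    height_in = slope * age_years + intercept
    low, high = OBSERVED_AGES
    extrapolated = not (low <= age_years <= high)
    return height_in, extrapolated


for age in (5, 8, 40):
    height_in, extrapolated = predict_with_caution(age)
    note = "outside the observed ages" if extrapolated else "within the observed ages"
    print(f"age {age}: {height_in:.2f} in ({height_in / 12:.1f} ft), {note}")
```

    Ages 8 and 40 are both flagged as outside the data, yet only the second prediction is clearly unreasonable. This can prompt a discussion about why predictions become less reliable the farther they are from the observed values.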

    Activity 3

    Have students work in small groups and brainstorm quantities that they think share a linear relationship. It doesn’t have to be a perfectly linear relationship, and the two quantities could even have a negative correlation. Have them come up with a list and then select one as a group. All members in the group should research the relationship to come up with some reasonable values. For instance, if they choose shoe size and height, they can use their own shoe sizes and heights as data points. Have students generate 12 to 20 data points and create a scatter plot to show the relationship. Then have students determine the line of best fit, either by estimating it or by using a graphing calculator.

    Bring the class back together for discussion. Have one group describe the quantities and identify which quantity they put on the x-axis and which they put on the y-axis, and then have them give their equation for a line of best fit. If they didn’t generate an equation, ask them what the y-intercept of their estimated line was and also by how much the y-value changed for each unit of change in the x-direction. Ask other groups what the slope represents. They should express their answers using phrases like “the amount that [quantity] changes for each [quantity].” For example, “the amount that a person’s shoe size increases for each foot s/he grows.”

    Go through all the groups’ scatter plots, discussing the equation of best fit and in particular what the slope and y-intercept represent (with units). This is also an opportunity to discuss limitations to their line of best fit.

    An option for assessing student comprehension of scatter plots, correlation, and prediction is by using the following PowerPoint as a work/study packet or class activity. Review and evaluate student responses individually or have students trade papers and correct them as a class: http://blue.utb.edu/qttm/at/ppt/Linear%20Fts/Scatter%20Plots.pptx

    An alternative or additional tool for checking student comprehension is the worksheet at the following Web site: http://mdk12.org/share/clgtoolkit/lessonplans/BikingHome.pdf

    Extension:

    • Routine: To review the concepts of scatter plots and lines of best fit throughout the year, periodically present students with a real-world scatter plot with a line of best fit. Have students make predictions about the data and discuss the validity and likelihood of their predictions.
    • Small Group: Have students collect bivariate data and create a scatter plot. Have them generate a line of best fit and use it to make predictions about their data, as well as examining the validity and limitations of the line of best fit. Students may use NCTM’s Line of Best Fit: http://illuminations.nctm.org/ActivityDetail.aspx?ID=146
    • Expansion: Have students explore data that is not related linearly and work with polynomials of best fit of higher orders. Students can compare the predictions based on each model. Students can also use exponential curves of best fit. Even though they haven’t been exposed to many (or any) functions of this type, they can still interpret the function for different x-values.
    • Technology: Have students work with graphing calculators and spreadsheet software to explore lines of best fit as well as higher-order polynomials of best fit.

Final 04/26/13