Correlation

Lesson Plan

Objectives

The lesson focuses on the ideas of correlation and scatter plot representations. Students will:

  • create and explore scatter plots.
  • analyze strength of the relationship via the correlation coefficient.
  • plot and analyze a real-world data set and make further analyses/connections.
  • develop a study involving two variables.

Essential Questions

  • What does it mean to estimate or analyze numerical quantities?
  • What makes a tool and/or strategy appropriate for a given task?
  • How can data be organized and represented to provide insight into the relationship between quantities?
  • How does the type of data influence the choice of display?
  • How can probability and data analysis be used to make predictions?

Vocabulary

  • Clustering: When many data points on a scatter plot are grouped closely together.
  • Linear Association: When the relationship between two variables shows a linear trend. On a scatter plot, data points that have a linear association can clearly be modeled by a line of best fit.
  • Line of Best Fit: The line that most closely approximates the data in a scatter plot (provided the data demonstrates a linear association).
  • Negative Correlation: Describes a relationship between two variables such that as the values of one variable increase, the values of the other variable decrease.
  • Nonlinear Association: When the relationship between two variables does not show a linear trend. On a scatter plot, data points that have a nonlinear association cannot be modeled by a line of best fit, as there is no obvious linear pattern.
  • Outlier: A data point that diverges greatly from the overall pattern of the data.
  • Positive Correlation: Describes a relationship between two variables such that as the values of one variable increase, the values of the other also increase.
  • Scatter Plot: A graph with points plotted to show a relationship between two variables.

Duration

120–150 minutes

Related Materials & Resources

The possible inclusion of commercial websites below is not an implied endorsement of their products, which are not free, and are not required for this lesson plan.

  • NCTM’s Line of Best Fit

http://illuminations.nctm.org/ActivityDetail.aspx?ID=146

  • Interactives from Annenberg Media

http://www.learner.org/interactives/statistics/index.html

  • Short quiz on correlation and scatter plots

http://www.mathopolis.com/questions/q.php?id=3072&site=1&ref=/data/scatter-xy-plots.html&qs=3072_3073_3074_3075_3772_3773_3774_3775_3776

  • Optional online quiz to test overall understanding of scatter plots

http://www.regentsprep.org/Regents/math/ALGEBRA/AD4/PracPlot.htm

Formative Assessment

    • Use results of the applet exploration to determine the level of students’ understanding.
    • Assess the quality of students’ studies to gauge student comprehension. Judge students’ specification of the problem, approach to gathering data, and type of analysis selected. The data presentation must also suit the kind of data collected and support the findings.
    • Use the Lesson 2 Exit Ticket (M-8-7-2_Lesson 2 Exit Ticket.docx) to assess student mastery of general correlation concepts.
    • The quiz at the following Web site may be used to determine students’ overall grasp of the topic of scatter plots:

    http://www.regentsprep.org/Regents/math/ALGEBRA/AD4/PracPlot.htm

Suggested Instructional Supports

    Active Engagement, Modeling, Explicit Instruction
    W: The lesson discusses the ways in which data are related to one another as revealed through analysis, conclusions, and predictions. Students are given the tools to create and explore scatter plots, develop an understanding of correlation, and explore effects of outliers on correlation/relationship between the variables. In order for students to make meaningful interpretations of the degree to which data correlate with one another, it is important for them to understand how to classify data and how to discriminate between correlation and causation. 
    H: Students will plot points and determine correlation coefficients (thus creating a connection between the way data looks when plotted and the descriptor of the quantities’ relationship) using one of NCTM’s applets. Students can also estimate the correlation coefficient and then view the actual r-value (correlation coefficient). The exploration is engaging and moves quickly to capture and maintain student interest. 
    E: Students explore relationships between quantities they come up with in real-world contexts, and they also explore key ideas through the use of the NCTM applet (or graphing calculator), which allows them the freedom to create fictional data sets and gauge the relationships between their fictional variables. 
    R: The small-group (4 to 5 students) presentations to the class give students a way to develop their individual ideas into a unified plan, collaborate with their peers, learn from mistakes, and evaluate what they have prepared both individually and collectively. 
    E: In developing their study and presentations, students can use their creative energy to incorporate skills and knowledge from the lesson into a form that can be directly evaluated by the class and the teacher. 
    T: Use the Extension section to tailor the lesson to meet the needs of students. The Routine section suggests opportunities to review lesson concepts throughout the year. The Small Group section provides opportunities for relearning or additional practice. The Expansion section includes a challenge for students who are prepared to move beyond the requirements of the standard. 
    O: This lesson uses an exploratory and hands-on approach. Students are encouraged to problem-solve, make deductions, brainstorm, and apply knowledge. The culminating activity ties together all explorations visited within the lesson. 

Instructional Procedures

    Activity 1

    Tell students that they will be exploring the relationship between different sets of data. “Suppose you surveyed 100 kids from ages 3 to 18 and, for each of them, you recorded their age and their height. What kind of relationship would you expect to see between a person’s age and his/her height?” Students should recognize that as people get older, their height increases.

    Sketch a quick scatter plot to represent fictional data about age and height. The scatter plot should show positive correlation, but should not be precisely linear. Depending on the class, you can ask students if the relationship will be linear, i.e., “Does a person grow the same amount each year?”

    Label the scatter plot Positive Correlation and tell students a positive correlation can be thought of as “Whenever one of the quantities increases, the other quantity also increases.” Remind them that the quantities don’t have to increase by the same amount, just that they’re both increasing. “What are some other real-life examples of two things that might show a positive correlation?” Encourage students to brainstorm examples of a positive correlation. (Some possibilities include outside temperature vs. air conditioning bill, study time vs. test scores, fat grams vs. calories, etc.)

    Ask students to work in pairs and list two or three different pairs of quantities that would have a positive correlation. Remind students that the key is that when one of the quantities increases, the other one also increases. After each pair has a couple of ideas, have students list them on the board. Depending on class size, have each pair list one or more ideas. Ask students if there are any combinations on the board that are “stronger” than others. In other words, are there any pairs for which, when one quantity increases, the other quantity always increases, maybe even by a fixed amount? If nothing on the board meets these criteria, you can suggest the relationship between the number of hours driven and the distance driven by someone driving 60 miles per hour. Explore the relationship between the quantities with a strong positive correlation and sketch another scatter plot to represent fictional data about these two quantities, making sure the data is more closely clumped together around an imaginary line with positive slope.

    “This data definitely has a positive correlation, but we can also say it has a strong positive correlation. In other words, it is much clearer with this data that when one quantity increases, the other quantity also increases. For our example with age and height, it’s possible that when people age one year, their height barely increases, and sometimes their height increases by an average amount, and sometimes they grow incredibly fast. In general, there is still a positive correlation, but not as strong a correlation as in our second example.”

    (Note: Students who have difficulty understanding how to interpret correlation often lose sight of the presence of both variables because they tend to think about correlation as one thing. Emphasize for them that there are two directions that each variable can move. Positive correlation means the second variable moves in a positive direction as the first variable moves in a positive direction. Positive correlation also requires that the second variable moves in a negative direction as the first variable moves in a negative direction. Highlight the qualitative rather than the quantitative characteristics before using real data.)

    Activity 2

    Repeat this activity with negative correlation, starting with an example such as the amount of money in people’s bank accounts and the length of the vacation they go on. “As the length of their vacation increases, what happens to the amount of money in their bank account?” Students should note that it decreases. “Does it always decrease by the same amount?” Have students provide reasons for variance in the rate at which the amount of money would decrease (airline flights, changing hotels, eating at fancier restaurants, etc.).

    “These two quantities have a negative correlation: when one quantity increases, like the length of their vacation, the other quantity decreases.” Sketch some fictional data on a scatter plot to represent a negative correlation. Make sure the data is not strongly correlated (it should have some spread to it). Have students work in pairs to come up with quantities with negative correlation and share their ideas as before. If there isn’t one that has a strong negative correlation, suggest the height of a candle and the length of time for which it has been burning, or, reversing the strong positive correlation idea, the distance a person is from his/her destination based on how long s/he has been driving at a constant rate. Sketch some fictional data for the situation with a strong negative correlation, and explain that this data has a strong negative correlation.

    Activity 3

    Now ask students about the relationship between a person’s height and the number of movies s/he has seen in the theater in the last year (or two other unrelated quantities). Ask students to describe the relationship between the two quantities. They may come up with some ideas (for instance, an extremely tall person may be less likely to go to the theater because s/he doesn’t want to obstruct someone else’s view), but point out that in general, for 99% of the population, there is no real relationship between the data. Sketch a scatter plot that could represent this data and point out that there is no pattern. Make the data cluster around particular values of average height and a reasonable number of movies to have seen in the last year. Also, include an outlier or two to represent extremely tall people who don’t visit the theater. “Because one quantity doesn’t seem to increase or decrease based on the other quantity, we say this data has no correlation.” Circle the cluster of data and tell students, “This data also appears to cluster around a particular area, indicating that most of the people represented by this scatter plot are around the same height and went to the movies around the same number of times. This is an example of clustering: when lots of data points are all grouped together in the same area. What about these other data points?” Indicate the outliers. “What do they represent?”

    Even though students may not use the term “outliers,” they should recognize that they “lie” on the outer edges of the data represented by the scatter plot and are far from the cluster of data.

    Activity 4

    “In mathematics, we generally want to use numbers to quantify or explain relationships. It’s okay to say that a set of data shows a positive correlation, for example, but we like to have a way to quantify that correlation, to assign a number to it.”

    Have students work in groups with the applet at http://illuminations.nctm.org/ActivityDetail.aspx?ID=146.

    (This activity can also be done with a graphing calculator, but the applet makes it much easier to plot points and move them (by checking “Move Points”) and watch the results update in real time.)

    Some groups should plot data that has a positive correlation, some that has a negative correlation, and some that has no correlation at all. As students plot the data, have them click on the “Computer Fit” check box and note the value of r that the Web site returns.

    “What happens to the value of r as your positive correlation becomes stronger and stronger?” (The value of r increases.) “What is the greatest value of r you can achieve?” Students should be able to get to 0.9 quite easily, but it’s doubtful they will get to exactly 1.0. Tell students, “The greatest possible correlation is 1. When the correlation is 1, what do you think the data looks like?” Based on the fact that the Web site applet also draws a line of best fit, guide students toward the recognition that data that all falls on the same line with positive slope has a correlation of 1.

    Repeat the same line of questions about negative correlations, noting that the smallest possible correlation is −1.
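To see the two extremes concretely, the following is a minimal Python sketch (not part of the original lesson materials) that computes r for the lesson's own strong-correlation examples: hours driven versus distance at a constant 60 miles per hour, and candle height versus burn time. The helper name `pearson_r` and the candle measurements are assumptions made for illustration, and the applet is assumed to report the standard Pearson correlation coefficient.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Driving at a constant 60 miles per hour: every point lies on one rising line.
hours = [1, 2, 3, 4, 5]
miles = [60 * h for h in hours]

# Hypothetical candle that starts at 20 cm and loses 2 cm every 10 minutes:
# every point lies on one falling line.
minutes = [0, 10, 20, 30, 40]
height_cm = [20, 18, 16, 14, 12]

r_driving = pearson_r(hours, miles)
r_candle = pearson_r(minutes, height_cm)
print(round(r_driving, 6), round(r_candle, 6))
```

Up to floating-point rounding, the driving data gives the largest possible value of r and the candle data gives the smallest, because in each case all of the points fall on a single sloped line.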

    “What do you think the value of r is for data that is not correlated at all?” Students should guess that it is 0.

    “The value of r is called the correlation coefficient and is a numerical measure of the correlation of our data. It’s difficult to compute by hand, but calculators and spreadsheet programs like Excel can compute it quite easily (as can this Web site).”
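For teachers who want to show what the calculator or spreadsheet is doing, here is a minimal Python sketch of the usual (Pearson) formula for r: the sum of the products of each point's deviations from the two means, divided by the square root of the product of the two sums of squared deviations. The age and height values are fictional, in the spirit of the Activity 1 sketch, and the function name `correlation_coefficient` is an illustrative choice rather than anything from the lesson.

```python
from math import sqrt

def correlation_coefficient(xs, ys):
    """Return Pearson's r for paired data given as two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 points")

    # Step 1: find the mean of each variable.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Step 2: measure how far each value sits from its mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # Step 3: add up the products of the paired deviations.
    # Positive when the variables tend to rise together, negative otherwise.
    sum_products = sum(a * b for a, b in zip(dx, dy))

    # Step 4: add up the squared deviations of each variable.
    sum_sq_x = sum(a * a for a in dx)
    sum_sq_y = sum(b * b for b in dy)
    if sum_sq_x == 0 or sum_sq_y == 0:
        raise ValueError("r is undefined when one variable never changes")

    # Step 5: scale so the result always lands between -1 and 1.
    return sum_products / sqrt(sum_sq_x * sum_sq_y)

# Fictional ages (years) and heights (inches) for six children.
ages_years = [3, 6, 9, 12, 15, 18]
heights_in = [37, 45, 52, 59, 67, 69]

r = correlation_coefficient(ages_years, heights_in)
print(f"r = {r:.2f}")
```

Because the made-up heights do not rise by the same amount for every three years of age, the points do not sit exactly on a line, so r comes out close to, but below, 1.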

    Have students work in small groups to develop a study in which they determine two quantities they want to measure; design a plan; collect, analyze, and interpret data; and make predictions. Students will present their findings and visual representations to the class. Discussion of the relationship between the variables and correlation coefficient must be included. Students must connect the findings to the context of the study and thus make real-world interpretations.

    An optional quiz on scatter plots is available at the following Web site. The quiz may be used to assess student understanding of the topic of scatter plots:

    http://www.regentsprep.org/Regents/math/ALGEBRA/AD4/PracPlot.htm

    Extension:

    • Routine: Asking students to create and develop their own data based on their individual interests is a motivating force. Topics such as current movie box office sales, top-ten lists of video games, popular music downloads, and sports records all have large data troves that are relatively easy to search. Use partner grouping to help students understand the relationship between variables, line of best fit, and correlation coefficient.

    Another way to review student understanding of correlations in scatter plots is to give the online quiz at the following Web site:

    http://www.mathopolis.com/questions/q.php?id=3072&site=1&ref=/data/scatter-xy-plots.html&qs=3072_3073_3074_3075_3772_3773_3774_3775_3776

    • Small Group: Assign groups to develop and write lists of pairs of data categories they believe will have strong correlations, for example, student grade level and average shoe size, or daily high temperature and ice cream sales. For students with more skill in representing correlations, assign lists where negative correlation is more likely, for example, average daily temperatures and winter clothing sales.
    • Expansion: Have students do additional experimentation with the applet at http://illuminations.nctm.org/ActivityDetail.aspx?ID=146. Then ask students to form some general conclusions about strongly correlated data and weakly correlated data.
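One conclusion students may reach is that a single point can change r dramatically. The following minimal Python sketch imitates dragging one point in the applet. The data values and the helper name `pearson_r` are made up for illustration, and the applet's own computation is assumed to be the standard Pearson coefficient.

```python
from math import sqrt

def pearson_r(points):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / sqrt(sxx * syy)

# Five points that sit exactly on the line y = x.
on_a_line = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]

# The same data after "dragging" the last point down to (5, 0).
one_point_moved = on_a_line[:-1] + [(5, 0)]

r_before = pearson_r(on_a_line)
r_after = pearson_r(one_point_moved)
print(round(r_before, 3), round(r_after, 3))
```

With this particular data, moving one point out of five takes r from 1 down to 0, even though the other four points still lie on a line. Small data sets are especially sensitive to an outlier in this way.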

Final 04/26/13