When you receive information about events that occur over time, do you look for patterns or relationships in the information? For example, when you see a tall couple walking down the street, do you think their children will be tall? When you learn that a new children’s toy is becoming popular, do you expect the toy’s price to increase? When you hear that someone is highly educated, do you assume the person is also wealthy?

When we receive and process information, we frequently look for meaningful relationships in it. Doing so is usually helpful because we learn about the individual events and develop knowledge we can use to make predictions. At the same time, it is important to remember that even if two events are related, that does not mean one necessarily *causes* the other.

**Are These Two Variables Correlated?**

Because of our desire to know whether two events are related, and if so, how closely, we have developed methods for measuring the direction and strength of the relationship between phenomena. One such method is the Pearson correlation coefficient, which measures the degree to which there is a *linear* relationship between two variables. The Pearson correlation coefficient ranges from +1 to –1.

If the correlation coefficient is close to +1 there is a strong *positive* relationship, meaning that as one variable increases the other tends to *increase* as well. If the correlation coefficient is close to –1 there is a strong *negative* relationship, meaning that as one variable increases the other tends to *decrease*. Finally, a correlation coefficient close to zero signals the lack of a linear relationship between the two variables.
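To make the two extremes concrete, here is a minimal Python sketch. The helper name `pearson_r` and the toy data are assumptions chosen for illustration, not part of the survey example. The helper uses the mean-deviation form of the coefficient, which is algebraically equivalent to the formula given in the footnote at the end of this article.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation via deviations from the mean (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy data (made up): y_up rises in lockstep with x, y_down falls in lockstep.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]
y_down = [10, 8, 6, 4, 2]

print(pearson_r(x, y_up))    # 1.0  (perfect positive linear relationship)
print(pearson_r(x, y_down))  # -1.0 (perfect negative linear relationship)
```

Real data rarely fall exactly on a line, so in practice the coefficient lands somewhere between these two extremes.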

**Perception of Government Quality and Willingness to Pay Taxes**

Let’s say you’re interested in understanding the relationship between how highly your community rates its local government and your community’s willingness to pay additional taxes. You’ve collected survey data from the community in each of the past five years. The overall ratings for the local government in those years were 91, 87, 83, 92, and 89, respectively (scores range from 0 to 100, where 0 is awful and 100 is wonderful).

Over the same five years, your community’s willingness-to-pay scores were 62, 10, 55, 63, and 91, respectively (scores range from 0 to 100, where 0 is unwilling to pay additional taxes under any circumstances and 100 is completely willing to pay additional taxes). Given these data, what can you say about the relationship between how highly your community rates its local government and your community’s willingness to pay additional taxes?

The Pearson correlation coefficient gives you the direction and strength of the linear relationship between these two variables. In this case, the correlation coefficient is 0.31*, so you can say there is a weak positive relationship between how highly your community rates its local government and your community’s willingness to pay additional taxes (remember, these data are made up). A weak positive relationship means that as your community’s rating of the local government increases, its willingness to pay additional taxes tends to increase as well, but only loosely.

**Correlation Does Not Imply Causation**

An important point to remember is that the correlation coefficient only provides information about the direction and strength of the *linear* relationship between two variables. It does not provide information about a non-linear relationship between two variables, and it does not imply that one variable causes the other. It is tempting to conclude that because two variables are correlated, one must cause the other, but a correlation alone cannot justify that conclusion: a third factor may influence both variables, or the direction of influence may run the opposite way. Remember, correlation does not imply causation.
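The point about non-linear relationships can be illustrated with a small Python sketch. The data below are invented for illustration, and `pearson_r` is a hypothetical helper name: y is completely determined by x (y = x squared), yet the Pearson coefficient comes out as zero because the relationship is not linear.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation via deviations from the mean (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: y depends entirely on x, but along a U-shaped curve.
x = [-2, -1, 0, 1, 2]
y = [value ** 2 for value in x]  # [4, 1, 0, 1, 4]

r = pearson_r(x, y)
print(r)  # 0.0: no linear relationship, despite a perfect non-linear one
```

A coefficient near zero therefore rules out only a linear relationship; plotting the data is a simple way to check for other patterns.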

**Guard Against Assumptions of Causality**

Whether at home or at work, we are frequently taking in and processing information. We are often trying to identify meaningful patterns and relationships in the data so we can understand what we’re interpreting and improve our ability to make predictions with the data. The Pearson correlation coefficient is an important tool you can use to measure the relationship between two variables because it provides you with the direction and strength of the linear relationship between the variables. While the correlation coefficient is a useful measure of association, it is important to remember that correlation does not imply causation. Guard against the urge to assume, or be easily persuaded, that a causal relationship exists simply because two events are correlated. By doing so, you will reduce the likelihood of making an unfounded (and perhaps costly) assumption that one variable causes another and increase your chances of making informed, defensible decisions.

\* **Pearson correlation coefficient (r):**

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^{2} - \left(\sum x\right)^{2}\right]\left[n\sum y^{2} - \left(\sum y\right)^{2}\right]}}$$

where:

n = number of pairs of scores

Sum xy ($\sum xy$) = sum of the products of paired scores

Sum x ($\sum x$) = sum of x scores

Sum y ($\sum y$) = sum of y scores

Sum x^2 ($\sum x^{2}$) = sum of squared x scores

Sum y^2 ($\sum y^{2}$) = sum of squared y scores

**In the example:**

n = 5

Sum xy = 24,972

Sum x = 442

Sum y = 281

Sum x^{2} = 39,124

Sum y^{2} = 19,219
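These sums can be reproduced from the raw survey scores given earlier. The following is a minimal Python sketch; the variable names (`ratings`, `willingness`, and so on) are assumptions introduced here for illustration.

```python
# Survey scores from the example (made-up data, as noted in the article).
ratings = [91, 87, 83, 92, 89]      # x: rating of local government
willingness = [62, 10, 55, 63, 91]  # y: willingness to pay additional taxes

n = len(ratings)
sum_xy = sum(x * y for x, y in zip(ratings, willingness))
sum_x = sum(ratings)
sum_y = sum(willingness)
sum_x2 = sum(x * x for x in ratings)
sum_y2 = sum(y * y for y in willingness)

print(n, sum_xy, sum_x, sum_y, sum_x2, sum_y2)
# 5 24972 442 281 39124 19219
```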

Therefore, the **Pearson correlation coefficient, r,** equals:

r = [5(24,972) – (442)(281)] / square root([5(39,124) – (442*442)][5(19,219) – (281*281)])

r = (124,860 – 124,202) / square root([195,620 – 195,364][96,095 – 78,961])

r = 658 / square root([256][17,134])

r = 658 / square root(4,386,304)

r = 658 / 2,094

**r = 0.31**

In this case, the correlation for the made-up data is 0.31, which indicates a weak positive relationship between how highly a community rates its local government and the community’s willingness to pay additional taxes.
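The arithmetic above can be checked with a short Python sketch that plugs the six quantities into the footnote formula. The function name `pearson_from_sums` and its parameter names are hypothetical, chosen only to mirror the symbols in the formula.

```python
import math


def pearson_from_sums(n, sum_xy, sum_x, sum_y, sum_x2, sum_y2):
    """Apply the computational formula for r to precomputed sums."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Sums taken from the worked example above.
r = pearson_from_sums(
    n=5, sum_xy=24972, sum_x=442, sum_y=281, sum_x2=39124, sum_y2=19219
)
print(round(r, 2))  # 0.31
```

With only five pairs of scores, a coefficient of this size is best treated as a rough description of the sample rather than firm evidence about the community.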