PEARSON function

2 years ago

3 minutes

The PEARSON function in Excel is used to calculate the Pearson correlation coefficient between two sets of data. The Pearson correlation coefficient (denoted as $r$ ) measures the strength and direction of the linear relationship between two variables. It ranges from -1 to 1, where:

1 indicates a perfect positive linear relationship,
-1 indicates a perfect negative linear relationship,
0 indicates no linear relationship.

Syntax:

PEARSON(array1, array2)

Arguments:

array1: Required. The first set of data values (the independent variable).
array2: Required. The second set of data values (the dependent variable).

Both arrays must have the same number of data points, and the data points should correspond to each other (i.e., the $n$ -th data point of array1 should be paired with the $n$ -th data point of array2).

Output:

The function returns a single numeric value, which is the Pearson correlation coefficient $r$ between the two sets of data.

How It Works:

The Pearson correlation coefficient is calculated using the following formula: $r = \frac{\sum{(X_i – \bar{X})(Y_i – \bar{Y})}}{\sqrt{\sum{(X_i – \bar{X})^2} \sum{(Y_i – \bar{Y})^2}}}$ Where:
- $X_i$ and $Y_i$ are the individual data points from array1 and array2,
- $\bar{X}$ and $\bar{Y}$ are the means of array1 and array2, respectively.

Example 1: Finding the Pearson Correlation Coefficient

Suppose you have two sets of data:

array1: [1, 2, 3, 4, 5]
array2: [2, 4, 6, 8, 10]

To calculate the Pearson correlation coefficient between these two arrays:

Use the formula:

=PEARSON(A1:A5, B1:B5)

Where A1:A5 contains the data from array1 and B1:B5 contains the data from array2.

Since these two arrays have a perfect positive linear relationship, the result will be 1.

Example 2: Negative Correlation

Suppose you have two sets of data:

array1: [1, 2, 3, 4, 5]
array2: [10, 8, 6, 4, 2]

Use the formula:

=PEARSON(A1:A5, B1:B5)

The result will be -1, indicating a perfect negative linear relationship.

Example 3: No Correlation

Suppose you have two sets of data:

array1: [1, 2, 3, 4, 5]
array2: [2, 3, 1, 5, 4]

Use the formula:

=PEARSON(A1:A5, B1:B5)

The result will be a value closer to 0, indicating that there is no linear correlation between these two sets of data.

Key Points:

The Pearson correlation coefficient quantifies the linear relationship between two variables.
1 indicates a perfect positive linear relationship, -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship.
The function is only appropriate for detecting linear relationships; it does not capture non-linear relationships.

Use Cases:

Statistics: Determine the strength and direction of the relationship between two variables.
Finance: Analyze the correlation between two financial assets (e.g., stock prices or returns).
Research: Identify correlations between variables in scientific and social science studies.
Quality control: Measure the correlation between variables such as input factors and output results.

Notes:

The PEARSON function works only for linear correlations. It does not assess non-linear relationships.
The data arrays must have the same number of data points. If they don’t, the function will return an error.
If either of the arrays has constant values (i.e., no variation), the function will return a #DIV/0! error, as the correlation cannot be calculated for constant data.

Post Views: 27

Syntax:

Arguments:

Output:

How It Works:

Example 1: Finding the Pearson Correlation Coefficient

Example 2: Negative Correlation

Example 3: No Correlation

Key Points:

Use Cases:

Notes:

Related posts: