RSQ function

The RSQ function in Excel calculates the coefficient of determination (R2R^2) for the relationship between two sets of data. This statistic is a measure of how well the variation in one dataset can be explained by the variation in another dataset, often used in regression analysis.


Syntax

=RSQ(known_y's, known_x's)

Parameters

  1. known_y’s (required): The dependent data points (Y-values).
  2. known_x’s (required): The independent data points (X-values).

Key Points

  • R2R^2 (the coefficient of determination) is the square of the correlation coefficient (r).
  • Values range from 0 to 1:
    • 0: No relationship between the variables.
    • 1: Perfect linear relationship.
  • Both arrays (known_y’s and known_x’s) must have the same length and contain numeric values.
  • Non-numeric values, empty cells, or logical values in the arrays are ignored.

Calculation Formula

The R2R^2 value is calculated as:

R2=SSregSStotalR^2 = \frac{\text{SSreg}}{\text{SStotal}}

Where:

  • SSreg: Regression sum of squares (explained variance).
  • SStotal: Total sum of squares (total variance).

Alternatively, R2R^2 is the square of the Pearson correlation coefficient (rr):

R2=(r)2R^2 = (r)^2


Examples

1. Simple Calculation

Suppose you have the following datasets:

  • Y-values (dependent) in A1:A5: {10, 20, 30, 40, 50}
  • X-values (independent) in B1:B5: {1, 2, 3, 4, 5}

To calculate R2R^2:

=RSQ(A1:A5, B1:B5)

Result: 1

  • This indicates a perfect linear relationship between X and Y.

2. Non-Perfect Relationship

For Y-values {3, 7, 8, 12, 15} in C1:C5 and X-values {1, 2, 3, 4, 5} in D1:D5:

=RSQ(C1:C5, D1:D5)

Result: 0.986 (approximately)

  • This indicates a strong, but not perfect, linear relationship.

3. Handling Unequal Data

If Y-values {10, 20, 30} and X-values {1, 2, 3, 4}, Excel will return a #N/A error because the arrays are of unequal length.

4. Data with Noise

For Y-values {5, 15, 25, 35, 50} and X-values {1, 2, 3, 4, 5}:

=RSQ(E1:E5, F1:F5)

Result: A value less than 1, reflecting the presence of noise in the data.


Notes

  • Error Handling:
    • If arrays have different lengths, Excel returns a #N/A error.
    • If the arrays are empty or contain fewer than two data points, Excel returns a #DIV/0! error.
  • Use Cases:
    • Assessing the goodness-of-fit for linear regression.
    • Evaluating the strength of the relationship between two variables.
  • Compatible with Charts: Use with scatter plots and trendlines to display the R2R^2 value visually.

Related Functions

  • CORREL: Calculates the correlation coefficient (rr).
  • LINEST: Provides a more detailed linear regression analysis.
  • SLOPE: Returns the slope of the regression line.
  • INTERCEPT: Returns the intercept of the regression line.
  • FORECAST.LINEAR: Predicts future values based on a linear trend.

The RSQ function is particularly helpful for determining how well one variable predicts another, making it a key tool in statistical and predictive analysis.

Leave a Reply 0

Your email address will not be published. Required fields are marked *