You are working with the text-only light edition of "H.Lohninger: Teach/Me Data Analysis, Springer-Verlag, Berlin-New York-Tokyo, 1999. ISBN 3-540-14743-8".

Regression
Coefficient of Determination

When calculating a regression model, we are interested in a measure of the usefulness of the model. There are several such measures, one of them being the coefficient of determination. The concept behind this coefficient is to calculate how much the error of prediction is reduced when the information provided by the x values is included in the calculation.

So we have to look at two cases:

(1) we assume that x does not contribute to the prediction of y: the best guess for the predicted value of y is then the mean of all y values, $\bar{y}$. The sum of squared errors is given as

$SS_{tot} = \sum_{i=1}^{n} (y_i - \bar{y})^2$

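(2) we include the information provided by x for the prediction of y: this means that the errors are reduced, since the regression line represents a best fit to the data. The sum of squared errors is then given as

$SS_{reg} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

where $\hat{y}_i$ is the value the regression line predicts for $x_i$.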

The coefficient of determination is then the relative reduction of the error when the information in x is included in the model:

$r^2 = \frac{SS_{tot} - SS_{reg}}{SS_{tot}}$

Thus the coefficient of determination (sometimes also called goodness of fit) specifies the fraction of the sample variation in y that is explained by x. For simple linear regression the coefficient of determination is simply the square of the correlation coefficient between y and x (equivalently, between y and the predicted values $\hat{y}$).
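
The calculation can be sketched in a few lines of Python. This is a minimal illustration and not part of the original text: the data values and the function names coefficient_of_determination and pearson_r are made up for the example, and the regression line is assumed to be fitted by ordinary least squares.

```python
import math
from statistics import mean


def coefficient_of_determination(x, y):
    """Return (ss_tot, ss_reg, r2) for a least-squares line fitted to (x, y)."""
    x_mean, y_mean = mean(x), mean(y)

    # Least-squares slope b and intercept a of the line y_hat = a + b * x
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_mean - b * x_mean

    # Case (1): x is ignored, every y is predicted by the mean of y
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)

    # Case (2): y is predicted by the regression line
    ss_reg = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

    # Relative reduction of the squared error
    r2 = (ss_tot - ss_reg) / ss_tot
    return ss_tot, ss_reg, r2


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    x_mean, y_mean = mean(x), mean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Illustrative data (invented for this sketch)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ss_tot, ss_reg, r2 = coefficient_of_determination(x, y)
print("SStot:", ss_tot)
print("SSreg:", ss_reg)
print("r2:", r2)
print("squared correlation:", pearson_r(x, y) ** 2)
```

For these nearly linear data the error left after regression is small compared with the error around the mean, so r2 comes out close to 1, and it agrees with the squared correlation coefficient up to rounding.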

Last Update: 2005-Jul-16