The coefficient of determination is a measure used in the analysis of statistical models. It indicates how well a model explains observed outcomes and predicts future ones. It is another measure of how well the least squares equation performs as a predictor of y.

For a least squares fit, the value of R$^{2}$ lies between 0 and 1, and the higher the R$^{2}$, the more useful the model. The coefficient of determination is symbolized by r$^{2}$ because it is the square of the coefficient of correlation, symbolized by r.

It is useful in trend analysis and serves as an important tool for determining the degree of linear correlation between variables in regression analysis. It can also be used as a guideline to measure the accuracy of the model.

## Definition

The coefficient of determination is a statistic used in statistical models whose purpose is either to predict future outcomes or to test hypotheses. It indicates the level of explained variability in the model. If r$^{2}$ = 1, the model is said to be a perfect fit: it explains all the variability of the response data around its mean and can be used as a reliable model for future forecasts. If r$^{2}$ = 0, the model explains none of that variability and fails to model the data set accurately. R$^{2}$ should be used in conjunction with residual plots, because R$^{2}$ alone might not tell the whole story of your data set. For multiple regression it is also known as the coefficient of multiple determination.
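The two extreme cases can be checked numerically. The sketch below is only an illustration: the function name `r_squared` and the data values are made up for this example, and it uses the sums-of-squares form 1 - SSE / SS$_{yy}$ from the Formula section.

```python
def r_squared(observed, predicted):
    """Return 1 - SSE / SS_yy for paired observed and predicted values."""
    mean_y = sum(observed) / len(observed)
    # Total variation of the observations about their mean.
    ss_yy = sum((y - mean_y) ** 2 for y in observed)
    # Variation left unexplained by the predictions.
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - sse / ss_yy


# Made-up observations, used only for illustration.
observed = [2.0, 4.0, 6.0, 8.0]

# Predictions that reproduce every observation: all variability is explained.
perfect = r_squared(observed, [2.0, 4.0, 6.0, 8.0])

# Predicting the mean of y everywhere: none of the variability is explained.
mean_only = r_squared(observed, [5.0, 5.0, 5.0, 5.0])

print(perfect, mean_only)  # 1.0 0.0
```

A model that does no better than the mean of y scores 0, and a model that reproduces every observation scores 1.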

## Formula

The coefficient of determination is the square of the correlation coefficient r, which is given by:

Correlation (r) =  $\frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^{2})-(\sum x)^{2}}\sqrt{n(\sum y^{2})-(\sum y)^{2}}}$

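where n = number of observations. Dividing the numerator and the denominator by n gives the equivalent form in terms of deviations from the means:

r = $\frac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\sum (x-\bar{x})^{2}}\sqrt{\sum (y-\bar{y})^{2}}}$

$\bar{x}$ = mean of the x values.

$\bar{y}$ = mean of the y values.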
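As a rough illustration, the correlation formula can be evaluated directly from the raw sums. The function name `correlation` and the five data points are assumptions chosen for this sketch, not part of the original text.

```python
import math


def correlation(xs, ys):
    """Correlation coefficient r computed from raw sums of x, y, xy, x^2 and y^2."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Made-up sample data, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
r2 = r ** 2  # coefficient of determination

print(round(r, 4), round(r2, 4))  # 0.7746 0.6
```

For this sample, about 60% of the variation in y is explained by its linear relationship with x.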

An alternative formula for R$^{2}$ is

R$^{2}$ = $\frac{SS_{yy} - SSE}{SS_{yy}}$

= 1 - $\frac{SSE}{SS_{yy}}$

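Here SS$_{yy}$ = $\sum (y - \bar{y})^{2}$ is the total variation of y about its mean, and SSE = $\sum (y - \hat{y})^{2}$ is the sum of squared errors about the fitted line, where $\hat{y}$ is the value predicted by the least squares equation. The smaller the SSE, the more reliable the predictions obtained from the model.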
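The sums-of-squares form can be sketched in the same way. The example below fits a simple least squares line and then computes 1 - SSE / SS$_{yy}$. The helper names `least_squares_fit` and `r_squared_from_fit` and the data are hypothetical, and the sketch assumes a single predictor x.

```python
def least_squares_fit(xs, ys):
    """Return (slope, intercept) of the least squares line for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = ss_xy / ss_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared_from_fit(xs, ys):
    """Compute R^2 = 1 - SSE / SS_yy for the least squares line through (xs, ys)."""
    slope, intercept = least_squares_fit(xs, ys)
    mean_y = sum(ys) / len(ys)
    # Total variation of y about its mean.
    ss_yy = sum((y - mean_y) ** 2 for y in ys)
    # Squared errors of the observations about the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - sse / ss_yy


# Made-up sample data, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2 = r_squared_from_fit(xs, ys)
print(round(r2, 4))  # 0.6
```

For simple linear regression this value equals the square of the correlation coefficient r computed from the same data.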

## Interpreting Coefficient of Determination

A model is said to be a good fit to the data if the differences between the observed values and the model's predicted values are small and unbiased.
One needs to be aware of the following points while interpreting the coefficient of determination.

1) As the number of variables in the model increases, the value of R$^{2}$ also increases (it never decreases), whether or not the added variables are useful. This is a drawback of R$^{2}$, as one might keep adding variables just to increase its value. To avoid this, one can use adjusted R-squared and predicted R-squared.

2) R$^{2}$ gives the proportion of the variance of one variable that is explained by the other variable. It is therefore a measure of how certain one can be in making predictions from a graph.

3) R$^{2}$ tells us how much better we do by using the regression equation rather than just $\bar{y}$ to predict y.
SSE and SS$_{yy}$ will be nearly identical if x contributes no information about y, so R$^{2}$ will be close to 0. SSE will be much smaller than SS$_{yy}$ when x contributes a lot of information about y, so R$^{2}$ will be close to 1.

4) When plotting, check how closely the regression line passes through the points on the scatter plot. If it passes through all the points, it explains all the variation in the data set; the farther it lies from the points, the less of the variability in the data set it is able to explain.
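Point 1 mentions adjusted R-squared without giving a formula. A minimal sketch, assuming the usual textbook definition 1 - (1 - R$^{2}$)(n - 1)/(n - p - 1), where p is the number of predictor variables, is shown below. This formula is not given in the text above, and the function name and input values are hypothetical.

```python
def adjusted_r_squared(r2, n_observations, n_predictors):
    """Adjusted R^2, assuming the standard formula
    1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    degrees_of_freedom = n_observations - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n_observations - 1) / degrees_of_freedom


# Made-up values: adding four more predictors raises R^2 only slightly.
one_predictor = adjusted_r_squared(0.60, n_observations=20, n_predictors=1)
five_predictors = adjusted_r_squared(0.62, n_observations=20, n_predictors=5)

print(round(one_predictor, 4), round(five_predictors, 4))  # 0.5778 0.4843
```

In this made-up comparison the raw R$^{2}$ rises from 0.60 to 0.62, but the adjusted value falls, which suggests the extra variables do not earn their place in the model.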