In the previous article of this series, we learned how to calculate the values of the regression coefficients and how to test hypotheses about the slope coefficients.

Let us continue where we left off.

In this article we will learn about:

- ANOVA
- Coefficient of Determination

Let's start with ANOVA:

#### What is ANOVA?

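The basic idea behind ANOVA (analysis of variance), that of partitioning variation, is fundamental to experimental statistics. ANOVA belies its name in that it is not concerned with analyzing variances as such, but rather with analyzing the variation among means. In regression, the total variation in the dependent variable, SS(t), is split into a part explained by the model, SS(reg), and an unexplained residual part, SS(e).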
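To make the idea of partitioning variation concrete, here is a minimal sketch for a simple linear regression. The data values and the helper names `fit_line` and `sums_of_squares` are made up for illustration; the point is only that the total sum of squares equals the regression part plus the residual part when the line is fitted by least squares with an intercept.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1


def sums_of_squares(x, y):
    """Return (SS(t), SS(reg), SS(e)) for the fitted line."""
    b0, b1 = fit_line(x, y)
    y_mean = sum(y) / len(y)
    fitted = [b0 + b1 * xi for xi in x]
    ss_t = sum((yi - y_mean) ** 2 for yi in y)                # total variation
    ss_reg = sum((fi - y_mean) ** 2 for fi in fitted)         # explained by the line
    ss_e = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # left in the residuals
    return ss_t, ss_reg, ss_e


# Illustrative data only (not taken from the article).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ss_t, ss_reg, ss_e = sums_of_squares(x, y)
print("SS(t)           =", ss_t)
print("SS(reg) + SS(e) =", ss_reg + ss_e)
```

The two printed numbers should agree up to floating-point rounding, which is the partition that the ANOVA table for a regression is built on.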

There are two types of ANOVA:

- One way ANOVA
- Two way ANOVA

I have explained one-way and two-way ANOVA separately.

Now let's discuss the **Coefficient of Determination**.

#### What is Coefficient of Determination?

The coefficient of determination, denoted R² or r² and pronounced "R-squared", is the ratio of the regression sum of squares, SS(reg), to the total sum of squares, SS(t):

$$R^2 = \frac{\mathrm{SS(reg)}}{\mathrm{SS(t)}}$$

*R²* is a statistic that gives some information about the goodness of fit of a model. It measures how well the independent variables explain the variation in the dependent variable.

- For a least-squares fit with an intercept, *R²* lies in the interval [0, 1].
- An *R²* of 1 indicates that the model explains 100% of the variability in the dependent variable.
- An *R²* of 0.8 means that the model explains 80% of the variability in the dependent variable.
- An *R²* of 0 indicates that there is no linear relationship between the variables.
- *R²* does not tell you that the independent variable is the cause of the change in the dependent variable.
- *R²* does not tell you whether the correct regression model was used.
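The two extreme cases in the list above can be checked with a small sketch. The function `r_squared` and both data sets are assumptions made up for this illustration; the function fits a straight line by least squares and returns SS(reg) / SS(t).

```python
def r_squared(x, y):
    """R^2 = SS(reg) / SS(t) for a least-squares line fitted to (x, y)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    fitted = [intercept + slope * xi for xi in x]
    ss_reg = sum((fi - y_mean) ** 2 for fi in fitted)
    ss_t = sum((yi - y_mean) ** 2 for yi in y)
    return ss_reg / ss_t


x = [1.0, 2.0, 3.0, 4.0, 5.0]

# Every point lies exactly on y = 2x + 1, so the line explains everything.
perfect_line = [2.0 * xi + 1.0 for xi in x]

# A symmetric peak: y clearly depends on x, but not linearly.
symmetric_peak = [1.0, 2.0, 3.0, 2.0, 1.0]

r2_line = r_squared(x, perfect_line)
r2_peak = r_squared(x, symmetric_peak)
print("perfect line  :", r2_line)
print("symmetric peak:", r2_peak)
```

The second data set is worth a look: its *R²* is essentially zero even though y obviously depends on x. A low *R²* for a straight-line fit only rules out a linear relationship, which is one reason *R²* alone cannot tell you whether the right model was used.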

R² never decreases when an extra regressor variable is added to the model, even if that variable has nothing to do with the dependent variable, so we cannot rely on R² alone when comparing models.
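This behaviour can be seen in a small simulation. The sketch below is only an illustration under assumed, randomly generated data: it fits a least-squares model through the normal equations (with a hand-written Gaussian elimination so that nothing outside the standard library is needed), then adds a pure-noise column as a second regressor. The helper names `solve` and `r_squared` are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the given regressor columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    y_mean = sum(y) / n
    ss_t = sum((yi - y_mean) ** 2 for yi in y)
    ss_e = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    # With an intercept in the model, 1 - SS(e)/SS(t) equals SS(reg)/SS(t).
    return 1.0 - ss_e / ss_t


random.seed(0)  # fixed seed so the illustration is repeatable
n = 30
x1 = [float(i) for i in range(n)]
y = [2.0 + 0.5 * xi + random.gauss(0.0, 1.0) for xi in x1]
noise = [random.gauss(0.0, 1.0) for _ in range(n)]  # unrelated to y by construction

r2_one = r_squared([x1], y)
r2_two = r_squared([x1, noise], y)
print("R^2 with x1 only      :", r2_one)
print("R^2 with x1 and noise :", r2_two)
```

The second value should be at least as large as the first, even though the extra column carries no information about y.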

If R² alone is not a reliable guide, is there another way to measure how well the model fits? Yes, there is a solution known as adjusted R².

Most of the properties listed above for R² also hold for adjusted R², with the following differences:

- The adjusted *R²* can be negative, and its value will always be less than or equal to that of *R²*.
- The adjusted *R²* increases only when the increase in *R²* (due to the addition of a new regressor variable) is more than one would expect to see by chance.

The adjusted *R²* is defined as

$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}$$

where

- *p* is the total number of regressor variables in the model (not including the constant term).
- *n* is the sample size.
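As a quick sketch, the standard definition of adjusted *R²*, 1 − (1 − R²)(n − 1)/(n − p − 1), can be wrapped in a small function. The function name `adjusted_r_squared` and the input values below are made up for illustration.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 from R^2, sample size n and number of regressors p (intercept excluded)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 so the error degrees of freedom are positive")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Illustrative inputs only.
adj_good = adjusted_r_squared(0.80, n=30, p=3)
adj_poor = adjusted_r_squared(0.05, n=10, p=3)
print("R^2 = 0.80, n = 30, p = 3 ->", adj_good)
print("R^2 = 0.05, n = 10, p = 3 ->", adj_poor)
```

The second call shows how a weak fit with few observations and several regressors can push the adjusted value below zero.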

Adjusted *R²* can also be written as

$$\bar{R}^2 = 1 - \frac{\mathrm{SS(e)} / df_e}{\mathrm{SS(t)} / df_t}$$

where

- SS(e) is the residual (error) sum of squares and SS(t) is the total sum of squares.
- $df_t$ is the total degrees of freedom, *n* − 1, of the estimate of the population variance of the dependent variable.
- $df_e$ is the error degrees of freedom, *n* − *p* − 1, of the estimate of the underlying population error variance.
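The two ways of writing adjusted *R²* agree. A short derivation, assuming a least-squares fit with an intercept so that SS(t) = SS(reg) + SS(e), and writing SS(e) for the residual sum of squares:

```latex
\begin{aligned}
\bar{R}^2
  &= 1 - \frac{\mathrm{SS(e)} / df_e}{\mathrm{SS(t)} / df_t}
   = 1 - \frac{\mathrm{SS(e)}}{\mathrm{SS(t)}} \cdot \frac{n - 1}{n - p - 1} \\
  &= 1 - \left(1 - \frac{\mathrm{SS(reg)}}{\mathrm{SS(t)}}\right) \frac{n - 1}{n - p - 1}
   = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
\end{aligned}
```

The first step substitutes the degrees of freedom, *n* − 1 and *n* − *p* − 1; the second uses SS(e) = SS(t) − SS(reg).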

Next come model adequacy checking, multicollinearity, and selecting significant explanatory variables.

We will discuss these remaining topics in the next article of this series. Until then, if you have any doubts or suggestions, please feel free to email me at khanirfan.khan21@gmail.com or leave a comment.

Categories: Machine Learning, R