We come across a number of interrelated events in our day-to-day life. For instance, the yield of a crop depends on the rainfall, the price of a product depends on production and advertising expenditure, the demand for a product depends on its price, a person's expenditure depends on his income, and so on.

Regression analysis confined to the study of only two variables at a time is called simple regression. Quite often, however, a phenomenon is affected by a multiplicity of factors. Regression analysis that studies more than two variables at a time is known as multiple regression. In this section we discuss linear regression.

## Linear Regression Definition

Line of regression of y on x is the line which gives the best estimate for the value of y for any specified value of x.
Similarly, the line of regression of x on y is the line which gives the best estimate for the value of x for any specified value of y.

Linear Regression Equation:
Line of regression of y on x: Let $(x_{1}, y_{1}), (x_{2}, y_{2}), \ldots, (x_{n}, y_{n})$ be n pairs of observations on the two variables x and y under study.
Then the linear regression of y on x is given by
y = a + b x

where a = $\frac{(\sum x^{2})(\sum y)-(\sum x)(\sum xy)}{n\sum x^{2}-(\sum x)^{2}}$ and

b = $\frac{n\sum xy-(\sum x)\sum y}{n\sum x^{2}-(\sum x)^{2}}$
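
These expressions can be traced back to the least-squares idea of a "best estimate". As a brief sketch (the symbol S for the sum of squared errors is introduced here only for illustration), a and b are chosen to minimise

$S = \sum (y - a - bx)^{2}$

Setting the partial derivatives of S with respect to a and b equal to zero gives the two normal equations

$\sum y = na + b\sum x$

$\sum xy = a\sum x + b\sum x^{2}$

Solving this pair of equations simultaneously for a and b yields the formulas stated above.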

### Linear Regression Table:

| x | y | xy | $x^{2}$ | $y^{2}$ |
|---|---|---|---|---|
| $\sum x$ = | $\sum y$ = | $\sum xy$ = | $\sum x^{2}$ = | $\sum y^{2}$ = |

Substituting the sums of x, y, xy, $x^{2}$ and $y^{2}$ from the table into the above formulas for a and b gives their numerical values.

Substituting these numerical values of a and b into the equation y = a + bx gives the equation of the regression line.
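
The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the function name `fit_line` and the three sample points are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Return (a, b) for the regression line y = a + b*x of y on x."""
    n = len(xs)
    # Column totals, exactly as they would appear in the working table.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are equal; the slope is undefined")

    a = (sum_x2 * sum_y - sum_x * sum_xy) / denom
    b = (n * sum_xy - sum_x * sum_y) / denom
    return a, b


# Hypothetical check: three points lying exactly on y = 1 + 2x.
a, b = fit_line([0, 1, 2], [1, 3, 5])
print(a, b)  # 1.0 2.0
```

The sum of $y^{2}$ is not needed for the line of y on x, so the sketch does not compute it.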

Simple Linear Regression: In simple linear regression, y depends on a single variable x, so that as x varies, the estimated value of y varies along a straight line.
The equation is given by y = a + bx

Multiple Linear Regression: In multiple linear regression, y depends on more than one variable $x_{1}, x_{2}, \ldots$, which can be expressed as
$y = a + b_{1} x_{1} + b_{2} x_{2} + \cdots$
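
The coefficients of a multiple linear regression can be estimated by the same least-squares idea. The following is a rough sketch under stated assumptions: it solves the normal equations by Gauss-Jordan elimination with exact fractions, the function name `fit_multiple` is hypothetical, and the sample data are made up so that they lie exactly on $y = 1 + 2x_{1} + 3x_{2}$.

```python
from fractions import Fraction


def fit_multiple(rows, ys):
    """Return [a, b1, ..., bk] for y = a + b1*x1 + ... + bk*xk."""
    # Design matrix: a leading 1 for the intercept a, then the predictors.
    X = [[Fraction(1)] + [Fraction(v) for v in row] for row in rows]
    y = [Fraction(v) for v in ys]
    p = len(X[0])

    # Normal equations (X^T X) beta = X^T y as an augmented matrix.
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(p)
    ]

    # Gauss-Jordan elimination; next() raises StopIteration if the
    # system is singular (for example, duplicated predictors).
    for col in range(p):
        pivot = next(r for r in range(col, p) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]

    return [A[i][p] for i in range(p)]


# Made-up data generated from y = 1 + 2*x1 + 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [1, 3, 4, 6, 8]
coeffs = fit_multiple(rows, ys)
print([int(c) for c in coeffs])  # [1, 2, 3]
```

Exact fractions keep the illustration free of rounding noise; for real data a numerical linear-algebra library would normally be preferred.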

## Linear Regression Example

### Solved Examples

Question 1: Obtain the regression line from the following data.
| X | 6 | 2 | 10 | 4 | 8 |
|---|---|---|---|---|---|
| Y | 9 | 11 | 5 | 8 | 7 |

Solution:

Regression equation of Y on X is Y = a + b X

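| X | Y | XY | $X^{2}$ | $Y^{2}$ |
|---|---|---|---|---|
| 6 | 9 | 54 | 36 | 81 |
| 2 | 11 | 22 | 4 | 121 |
| 10 | 5 | 50 | 100 | 25 |
| 4 | 8 | 32 | 16 | 64 |
| 8 | 7 | 56 | 64 | 49 |
| $\sum X$ = 30 | $\sum Y$ = 40 | $\sum XY$ = 214 | $\sum X^{2}$ = 220 | $\sum Y^{2}$ = 340 |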

a = $\frac{(\sum x^{2})(\sum y)-(\sum x)(\sum xy)}{n\sum x^{2}-(\sum x)^{2}}$

= $\frac{(220)(40)-(30)(214)}{5(220)-30^{2}}$ = $\frac{8800-6420}{1100-900}$ = $\frac{2380}{200}$

a = 11.9

b = $\frac{n\sum xy-(\sum x)\sum y}{n\sum x^{2}-(\sum x)^{2}}$

= $\frac{5(214)-(30)(40)}{5(220)-30^{2}}$ = $\frac{1070-1200}{1100-900}$ = $\frac{-130}{200}$

b = - 0.65
Substituting the values of a and b in the equation Y = a + b X we get
Y = 11.9 + (- 0.65) X
=> Y = 11.9 - 0.65 X
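
The arithmetic in this solution can be checked with a short script. This is only a sketch: the variable names are chosen for the example, and the residual check at the end is an extra sanity test that is not part of the original solution.

```python
X = [6, 2, 10, 4, 8]
Y = [9, 11, 5, 8, 7]
n = len(X)

# Column totals of the working table.
sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)

denom = n * sum_x2 - sum_x ** 2
a = (sum_x2 * sum_y - sum_x * sum_xy) / denom
b = (n * sum_xy - sum_x * sum_y) / denom

# Residuals Y - (a + bX); for a least-squares line they sum to zero
# (up to floating-point rounding).
residuals = [y - (a + b * x) for x, y in zip(X, Y)]

print(sum_x, sum_y, sum_xy, sum_x2)  # 30 40 214 220
print(round(a, 2), round(b, 2))      # 11.9 -0.65
```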

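Question 2: The data corresponding to heights of fathers and sons in inches are given below. Obtain the equation of the line of regression of the sons' heights (Y) on the fathers' heights (X).

| Heights of Fathers | 65 | 66 | 67 | 68 | 69 | 70 | 72 | 67 |
|---|---|---|---|---|---|---|---|---|
| Heights of Sons | 67 | 68 | 65 | 72 | 72 | 69 | 71 | 68 |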

Solution:

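| X | Y | x = X - $\overline{X}$ = X - 68 | y = Y - $\overline{Y}$ = Y - 69 | $x^{2}$ | $y^{2}$ | xy |
|---|---|---|---|---|---|---|
| 65 | 67 | -3 | -2 | 9 | 4 | 6 |
| 66 | 68 | -2 | -1 | 4 | 1 | 2 |
| 67 | 65 | -1 | -4 | 1 | 16 | 4 |
| 68 | 72 | 0 | 3 | 0 | 9 | 0 |
| 69 | 72 | 1 | 3 | 1 | 9 | 3 |
| 70 | 69 | 2 | 0 | 4 | 0 | 0 |
| 72 | 71 | 4 | 2 | 16 | 4 | 8 |
| 67 | 68 | -1 | -1 | 1 | 1 | 1 |
| $\sum X$ = 544 | $\sum Y$ = 552 | $\sum x$ = 0 | $\sum y$ = 0 | $\sum x^{2}$ = 36 | $\sum y^{2}$ = 44 | $\sum xy$ = 24 |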

$\overline{X}$ = $\frac{544}{8}$ = 68

$\overline{Y}$ = $\frac{552}{8}$ = 69

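Here $\sigma _{yx}$ denotes the regression coefficient of Y on X (the slope of the line), computed from the deviations x and y:

$\sigma _{yx}$  = $\frac{n(\sum xy)-(\sum x)(\sum y)}{n\sum x^{2}-\left (\sum x \right)^{2}}$

= $\frac{(8)(24)-0}{8(36)-0}$ = $\frac{192}{288}$

≈ 0.67

Rounded to one decimal place, $\sigma _{yx}$ = 0.7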
The regression line of Y on X is given by the formula
Y - $\overline{Y}$ = $\sigma _{yx}$ (X - $\overline{X})$
Substituting for $\sigma_{yx}$, $\overline{X}$ and $\overline{Y}$, we get

Y - 69 = 0.7 (X - 68)
Y = 0.7 X + 21.4, which is the required equation of regression.
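
The deviation method used in this solution can be reproduced as follows. This is a minimal sketch with exact fractions so that the effect of rounding the slope is visible; the variable names are chosen for the example.

```python
from fractions import Fraction

fathers = [65, 66, 67, 68, 69, 70, 72, 67]
sons = [67, 68, 65, 72, 72, 69, 71, 68]
n = len(fathers)

mean_x = Fraction(sum(fathers), n)  # 68
mean_y = Fraction(sum(sons), n)     # 69

# Deviations from the means, as in the working table.
dx = [x - mean_x for x in fathers]
dy = [y - mean_y for y in sons]

sum_dxdy = sum(u * v for u, v in zip(dx, dy))  # 24
sum_dx2 = sum(u * u for u in dx)               # 36

# Because the deviations sum to zero, the slope reduces to
# sum(xy) / sum(x^2).
slope = sum_dxdy / sum_dx2           # 2/3
intercept = mean_y - slope * mean_x  # 71/3

print(slope, intercept)  # 2/3 71/3
```

If the slope is kept as 2/3 rather than rounded to 0.7, the intercept comes out near 23.67 instead of 21.4. Both versions of the line pass through the point of means (68, 69), so they agree most closely for heights near 68 inches.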