A measure of central tendency describes where the center of a data set lies, but it is not sufficient to describe the distribution completely. If the means of two distributions are equal, it cannot be concluded that the individual values in the two distributions are spread in a similar way. In one distribution most of the values might cluster around the mean, while in the other the values might be spread over a much wider range. We therefore need a measure that conveys the spread of the data set in order to understand the distribution better. Such measures are called measures of variability.

A measure of variability for a data distribution is a number that conveys how widely the values in the data set are spread, typically around the mean. Such measures are also called measures of variation.
The following measures of variability are used in statistics:
  1. Range
  2. Mean Absolute Deviation
  3. Interquartile Range
  4. Variance
  5. Standard Deviation
  6. Coefficient of Variation
Of these, the three most commonly used measures are:
  1. Range
  2. Variance
  3. Standard Deviation

Range:
Range is defined to be the difference between the highest and the lowest values in a data distribution.

Range = H - L where H represents the highest value and L the lowest value in the data set.

Range is a measure of dispersion that is simple to understand and easy to calculate. It is not based on all the values, however: it depends only on the highest and the lowest values and is unaffected by every other item in the data set. For the same reason it is greatly affected by outliers, and it loses its significance when one of the end values is extremely high or extremely low.
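
The sensitivity of the range to a single extreme value can be seen in a short Python sketch. The function name `data_range` and the added value 95 are assumptions made for illustration only; the first list reuses the values of the first sample from the solved example further down.

```python
def data_range(values):
    """Return the range: highest value minus lowest value."""
    return max(values) - min(values)


scores = [8, 12, 14, 18, 20, 24, 26, 28]

# Hypothetical extreme value appended purely for illustration.
with_outlier = scores + [95]

print(data_range(scores))        # 20
print(data_range(with_outlier))  # 87
```

One added value more than quadruples the range, even though the other eight values have not changed.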

Variance:
A data distribution is generally described by its mean, a measure of central tendency, and its variance, a measure of variability.
The variance of a data set is defined as the average of the squared deviations from the mean. For a population the variance is exactly this average, with divisor N. For a sample the sum of squared deviations is divided by n - 1 rather than n, so that the sample variance gives an unbiased estimate of the population variance. The two formulas used to compute the population and sample variances are given below.

Population Variance $\sigma^{2}$ = $\frac{\sum (X-\mu )^{2}}{N}$ where μ is the population mean and N the size of the population.

Sample Variance $S^{2}$ = $\frac{\sum (X-\overline{X})^{2}}{n-1}$ where $\overline{X}$ is the sample mean and n the sample size.
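
The two formulas differ only in the divisor, which a minimal Python sketch makes explicit. The function names and the data list are hypothetical, chosen so that the arithmetic is easy to follow by hand (mean 5, sum of squared deviations 32); the final lines cross-check the results against `pvariance` and `variance` from Python's standard `statistics` module.

```python
import math
import statistics


def population_variance(values):
    """Average squared deviation from the mean, with divisor N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the sample mean, with divisor n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


# Hypothetical data, chosen only so the arithmetic is easy to check by hand.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))        # 4.0
print(round(sample_variance(data), 4))  # 4.5714

# Cross-check against the standard library implementations.
assert math.isclose(population_variance(data), statistics.pvariance(data))
assert math.isclose(sample_variance(data), statistics.variance(data))
```

For the same data the sample variance is always slightly larger than the population variance, because the same sum is divided by a smaller number.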

Standard Deviation:

Standard deviation is the positive square root of the variance of a data set. Unlike the variance, it is expressed in the same units as the data. The formulas for the population and sample standard deviations are:

Population Standard Deviation σ = $\sqrt{\frac{\sum (X-\mu )^{2}}{N}}$

Sample Standard Deviation S = $\sqrt{\frac{\sum (X-\overline{X})^{2}}{n-1}}$
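
As a rough illustration, both standard deviations can be computed by taking the square root of the corresponding variance. The function names and the data list below are hypothetical; `pstdev` and `stdev` from Python's standard `statistics` module are used only as a cross-check.

```python
import math
import statistics


def population_std(values):
    """Square root of the population variance (divisor N)."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_std(values):
    """Square root of the sample variance (divisor n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample standard deviation needs at least two values")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


# Hypothetical data: mean 5, sum of squared deviations 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_std(data))        # 2.0
print(round(sample_std(data), 4))  # 2.1381

# Cross-check against the standard library implementations.
assert math.isclose(population_std(data), statistics.pstdev(data))
assert math.isclose(sample_std(data), statistics.stdev(data))
```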


Both the variance and the standard deviation are based on all the values in the data distribution, and are therefore generally considered more reliable measures of spread than the range. They indicate how consistent a variable is: the smaller the value, the more consistent the data. When two data sets are compared, these measures show which distribution is more variable. Variance and standard deviation are also used extensively in inferential statistics.

The following example illustrates these calculations.

Solved Example

Question: Three sample data sets with the same range and mean are given below. Compute the standard deviation of each set and compare their variability.

Sample one:    8  12  14  18  20  24  26  28
Sample two:    8  10  10  15  26  26  27  28
Sample three:  8  14  16  18  20  22  24  28

Solution:
Each of the three data sets has 8 values that add up to 150, so the mean of each set = $\frac{150}{8}$ = 18.75

and the Range of each set = Highest Value - Lowest Value = 28 - 8 = 20

Let us calculate the standard deviation for each of the samples.

Variance for Sample one $S_{1}^{2}$ = $\frac{(8-18.75)^{2}+(12-18.75)^{2}+(14-18.75)^{2}+(18-18.75)^{2}+(20-18.75)^{2}+(24-18.75)^{2}+(26-18.75)^{2}+(28-18.75)^{2}}{7}$ 

= $\frac{351.5}{7}$ ≈ 50.21

Standard Deviation of Sample One $S_{1}$ = $\sqrt{50.21}$ ≈ 7.09

Variance for Sample two $S_{2}^{2}$ = $\frac{(8-18.75)^{2}+(10-18.75)^{2}+(10-18.75)^{2}+(15-18.75)^{2}+(26-18.75)^{2}+(26-18.75)^{2}+(27-18.75)^{2}+(28-18.75)^{2}}{7}$

= $\frac{541.5}{7}$ ≈ 77.36

Standard Deviation of Sample Two $S_{2}$ = $\sqrt{77.36}$ ≈ 8.80

Variance for Sample Three $S_{3}^{2}$ = $\frac{(8-18.75)^{2}+(14-18.75)^{2}+(16-18.75)^{2}+(18-18.75)^{2}+(20-18.75)^{2}+(22-18.75)^{2}+(24-18.75)^{2}+(28-18.75)^{2}}{7}$ 

= $\frac{271.5}{7}$ ≈ 38.79

Standard Deviation of Sample Three $S_{3}$ = $\sqrt{38.79}$ ≈ 6.23

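Of the three samples with the same mean and range, Sample three has the smallest standard deviation and is therefore the least variable, while Sample two is the most variable. The mean and the range alone could not distinguish between the three samples.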
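
The hand arithmetic in this example can be checked with a short Python sketch. It relies on `mean`, `variance` and `stdev` from Python's standard `statistics` module, where `variance` and `stdev` use the sample divisor n - 1. The dictionary layout and variable names are assumptions made for illustration.

```python
import statistics

samples = {
    "Sample one":   [8, 12, 14, 18, 20, 24, 26, 28],
    "Sample two":   [8, 10, 10, 15, 26, 26, 27, 28],
    "Sample three": [8, 14, 16, 18, 20, 22, 24, 28],
}

summary = {}
for name, values in samples.items():
    summary[name] = {
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance, divisor n - 1
        "sd": statistics.stdev(values),           # sample standard deviation
    }
    s = summary[name]
    print(f"{name}: mean={s['mean']}, range={s['range']}, "
          f"variance={s['variance']:.2f}, sd={s['sd']:.2f}")

# The sample with the smallest standard deviation is the least variable.
least_variable = min(summary, key=lambda name: summary[name]["sd"])
print("Least variable:", least_variable)
```

Running the sketch prints the mean, range, variance and standard deviation of each sample, followed by the name of the least variable one.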