Hypothesis testing helps us judge a claim by asking: can the observed difference be attributed to chance? A statistical hypothesis test is used to make decisions from data, which may come from a controlled experiment or an observational study. A result is said to be statistically significant if it is unlikely to have occurred by chance alone. Statistical tests help us decide the outcome of an experiment, where we either reject or fail to reject the null hypothesis.

A statistical hypothesis is an assumption about a population parameter; the assumption may or may not be true. The only way to determine with certainty whether a statistical hypothesis is true would be to examine the entire population.

Because examining the whole population is usually tedious and impractical, we examine a random sample from the population instead. The hypothesis is rejected if the sample data are not consistent with it. If a statistical hypothesis specifies the population distribution completely, it is termed a simple statistical hypothesis; otherwise it is called a composite statistical hypothesis. For example, for a normal population with known variance, $\mu = 100$ is a simple hypothesis, while $\mu \neq 100$ is composite.
There are two types of hypothesis:

Null hypothesis: The hypothesis that the sample observations result purely from chance. It is denoted by $H_{0}$ and is tested for possible rejection under the assumption that it is true. It is usually a claim of no difference or no effect, and it is the hypothesis most commonly put to the test.

Alternative hypothesis: The hypothesis that the sample observations are influenced by some non-random cause, that is, that the observed effect is real. It is denoted by $H_{1}$.

A hypothesis test proceeds through the following steps:
  • Frame the null hypothesis and the alternative hypothesis for the given data.
  • Identify a test statistic to examine the null hypothesis. Several test statistics exist, and the appropriate one must be chosen for the given problem. A test statistic that is large in magnitude leads us to reject the null hypothesis in favor of the alternative hypothesis.
  • Calculate the p-value: the probability of obtaining a difference at least as large as the one observed in the outcome measure, given that no difference exists between treatments in the population. The smaller the p-value, the stronger the evidence against the null hypothesis. We have the following cases:
P-value     Interpretation
> 0.10      The observed difference is 'not significant'
≤ 0.10      The observed difference is 'marginally significant'
≤ 0.05      The observed difference is 'significant'
≤ 0.01      The observed difference is 'highly significant'

Here 'significant' means that the observed difference is unlikely to be due to chance alone.
  • Decision and conclusion: Compare the p-value to the level of significance $\alpha$. If $p \leq \alpha$ we reject the null hypothesis; otherwise we fail to reject it. The conclusion is then stated in terms of the given problem.
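
The labelling and decision steps can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function names `describe_p_value` and `decide` are hypothetical, and the thresholds are simply the conventional ones listed in the table above.

```python
def describe_p_value(p: float) -> str:
    """Map a p-value to the conventional label from the table above."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie in [0, 1]")
    # Check the strictest threshold first so each p-value gets one label.
    if p <= 0.01:
        return "highly significant"
    if p <= 0.05:
        return "significant"
    if p <= 0.10:
        return "marginally significant"
    return "not significant"


def decide(p: float, alpha: float) -> str:
    """Apply the decision rule: reject H0 when p <= alpha."""
    return "reject H0" if p <= alpha else "fail to reject H0"


# Example: a p-value of 0.03 tested at the 5% level of significance.
print(describe_p_value(0.03), "->", decide(0.03, alpha=0.05))
```

Note that the label depends only on the p-value, whereas the decision also depends on the level of significance chosen before looking at the data.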

Solved Examples

Question 1: A study of 18 participants was carried out to examine body weight in the population relative to the ideal body weight.

The data are 107 119 99 114 120 104 88 114 124 116 101 121 152 100 128 114 95 117, where 100 represents the ideal body weight.

Test whether the data support a non-ideal mean body weight in the population. Given $\frac{\sigma}{\sqrt{n}}$ = 3.40.
Set the null and alternative hypotheses:
Hypotheses: Null hypothesis $H_{0}$: $\mu = 100$
            Alternative hypothesis $H_{1}$: $\mu \neq 100$ (two-sided)

Test Statistic:
 Assuming the data are normally distributed, the test statistic, which is compared against t tables, is given by:

 $t_{cal}$ = $\frac{\bar x -\mu }{\frac{\sigma }{\sqrt{n}}}$

$\bar x$ = $\frac{(107 + 119 + 99 + 114 + 120 + 104 + 88 + 114 + 124 + 116 + 101 + 121 + 152 + 100 + 128 + 114 +95 + 117)}{18}$ = 112.94

$\mu$ = 100 (given)

$\frac{\sigma}{\sqrt{n}}$ = 3.40 (given)

$t_{cal}$ = $\frac{(112.94 - 100)}{3.40}$ = 3.81

From t tables with $18 - 1 = 17$ degrees of freedom, $t_{cal} = 3.81$ exceeds the two-sided 5% critical value $t_{tabu} = 2.110$, and the two-sided p-value is approximately 0.0016.

Since the p-value is below 0.01, there is strong evidence to reject $H_{0}$, and we conclude that the mean body weight differs significantly from the ideal value of 100.
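
The arithmetic in Question 1 can be checked with a short Python sketch. The variable names are illustrative, and the 5% two-sided level of significance (table value 2.110 for 17 degrees of freedom) is an assumption made here, since the question does not state a level.

```python
# Data and given quantities from Question 1.
weights = [107, 119, 99, 114, 120, 104, 88, 114, 124,
           116, 101, 121, 152, 100, 128, 114, 95, 117]
mu_0 = 100.0           # mean under the null hypothesis
standard_error = 3.40  # sigma / sqrt(n), given in the question

# Assumption: two-sided test at the 5% level, so the t-table
# critical value for 17 degrees of freedom is 2.110.
t_critical = 2.110

n = len(weights)
df = n - 1
sample_mean = sum(weights) / n
t_cal = (sample_mean - mu_0) / standard_error

# Two-sided rule: reject H0 when |t_cal| exceeds the critical value.
reject_h0 = abs(t_cal) > t_critical

print(f"n = {n}, df = {df}")
print(f"sample mean = {sample_mean:.2f}")
print(f"t_cal = {t_cal:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Because the critical value is read from a table rather than computed, the sketch needs no statistics library; an exact p-value would require a t-distribution routine from a third-party package.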

Question 2: A research firm wants to study the time it takes to reach the Cambridge University campus from the parking lot. Some people claim that it takes 30 minutes on average, but many dispute this figure and believe the time is shorter. Test the claim given a sample mean $\bar x = 20$, population standard deviation $\sigma = 6$, sample size $n = 5$ and level of significance 0.10.
The null and alternative hypotheses are
             $H_{0} : \mu \geq 30$

            $H_{1} : \mu < 30$

Test statistic
     $Z_{calc}$ = $\frac{\bar x - \mu _{0}}{\frac{\sigma }{\sqrt{n}}}$ = $\frac{20-30}{\frac{6}{\sqrt{5}}}$ = -3.727

Decision rule

For a left-tailed test at the 0.10 level of significance, the Normal tables give the critical value $Z_{tabu} = -1.28$; the null hypothesis is rejected if $Z_{calc} < Z_{tabu}$.

As $Z_{calc} = -3.727 < Z_{tabu} = -1.28$, the test statistic lies in the rejection region. We therefore reject the null hypothesis and conclude that there is sufficient evidence, at the 0.10 level of significance, that the mean time to reach campus from the parking lot is less than 30 minutes.
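
The same left-tailed Z test can be reproduced with Python's standard library, as a rough check on the table lookup. This sketch assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Quantities given in Question 2.
x_bar = 20.0   # sample mean
mu_0 = 30.0    # boundary value of the null hypothesis
sigma = 6.0    # population standard deviation
n = 5          # sample size
alpha = 0.10   # level of significance

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Test statistic: about -3.727, as in the worked solution.
z_calc = (x_bar - mu_0) / (sigma / sqrt(n))

# Left-tailed critical value: the point with area alpha to its left.
z_tabu = standard_normal.inv_cdf(alpha)

# Left-tailed p-value: P(Z <= z_calc) when mu equals mu_0.
p_value = standard_normal.cdf(z_calc)

reject_h0 = z_calc < z_tabu

print(f"z_calc = {z_calc:.3f}, z_tabu = {z_tabu:.2f}")
print(f"p-value = {p_value:.5f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Comparing `z_calc` with `z_tabu` and comparing `p_value` with `alpha` are equivalent ways of applying the decision rule, so both lead to the same conclusion.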