Categorical data is a statistical data type consisting of categorical variables. It is data that can be divided into groups but cannot be measured numerically.

It is often more informative to classify variables into a relatively small number of groups, such as race, sex, colour, or educational qualification. Such variables are qualitative in nature, whereas other data are quantitative and are measured numerically.

Categorical data is divided into two main types:
  1. Simple attribute data
  2. Manifold attribute data

Simple attribute data: Data classified on the basis of a single characteristic is called simple attribute data.

Example: The number of students in a classroom classified by sex. Here the only characteristic is sex, with the categories boys and girls, so this is simple attribute data.

Manifold attribute data: Data classified on the basis of more than one characteristic is called manifold attribute data.

Example: The number of students in a classroom classified by sex and marks. Here there are two characteristics: sex (boys and girls) and marks (below average and above average).
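The difference between the two types can be seen by counting the same students in two ways. The sketch below is only an illustration: the roster is hypothetical, and the names `students`, `by_sex` and `by_sex_and_marks` are assumptions made for this example.

```python
from collections import Counter

# Hypothetical class roster: one (sex, marks) pair per student.
students = [
    ("boy", "above average"),
    ("girl", "above average"),
    ("girl", "below average"),
    ("boy", "below average"),
    ("girl", "above average"),
]

# Simple attribute data: classify on a single characteristic (sex).
by_sex = Counter(sex for sex, _ in students)

# Manifold attribute data: classify on both characteristics at once.
by_sex_and_marks = Counter(students)

print(by_sex)
print(by_sex_and_marks)
```

Counting on one characteristic gives one number per category, while counting on two gives one number per combination of categories.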

Categorical data analysis often uses two-way data tables, which present categorical data by counting the number of observations that fall into each combination of groups for two variables. One variable is shown in the rows and the other in the columns.

Suppose we survey 34 individuals and record their hair colour and eye colour. The two-way data table will then look as shown below:
 Eye Colour  Brown  Green  Black  Blue  Total 
 Hair Colour          
 Brown  1
 Black  4 11 
Grey   2
 Total 10  14   34
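A two-way table like this can be built by counting (row, column) pairs and then adding the row and column totals. The following is a minimal sketch, assuming a small hypothetical set of responses rather than the 34 survey answers; the function name `two_way_table` is also an assumption made for illustration.

```python
from collections import Counter

# Hypothetical survey responses: (hair colour, eye colour) per person.
responses = [
    ("Brown", "Brown"), ("Black", "Brown"), ("Black", "Black"),
    ("Black", "Black"), ("Grey", "Blue"), ("Brown", "Green"),
]

cell_counts = Counter(responses)
hair_levels = sorted({hair for hair, _ in responses})
eye_levels = sorted({eye for _, eye in responses})

def two_way_table(counts, rows, cols):
    """Return one list per row, ending with a Total column and a Total row."""
    table = []
    for r in rows:
        cells = [counts[(r, c)] for c in cols]
        table.append([r] + cells + [sum(cells)])
    col_totals = [sum(counts[(r, c)] for r in rows) for c in cols]
    table.append(["Total"] + col_totals + [sum(col_totals)])
    return table

table = two_way_table(cell_counts, hair_levels, eye_levels)
print(["Hair / Eye"] + eye_levels + ["Total"])
for row in table:
    print(row)
```

The bottom-right cell is the grand total, which should always equal the number of individuals surveyed.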
A table that displays the categories along with their frequencies and relative frequencies is called a frequency table of categorical data. The problem below illustrates this concept:

Solved Example

Question: The total scores obtained by the students in an examination are: 113, 118, 120, 125, 128, 130, 135, 140. Count the number of items in each class and record the frequency next to its tally marks.
Given data: 113, 118, 120, 125, 128, 130, 135, 140

Step 1:
Arrange the data into classes and mark one tally symbol for each score that falls in a class.

 Score(x)  Tally Mark
 113 - 120  III
 125 - 130  III
 135 - 140  II

Step 2:

Count the tally marks and record the frequencies (result).

 Score(x)  Tally Mark  Frequency
 113 - 120  III  3
 125 - 130  III  3
 135 - 140  II  2
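The two steps above can be sketched in a few lines of code. The scores and class limits are taken from the example; the function name `frequency_table` and the extra relative frequency value (frequency divided by the number of scores) are additions made for illustration.

```python
scores = [113, 118, 120, 125, 128, 130, 135, 140]
classes = [(113, 120), (125, 130), (135, 140)]  # inclusive class limits

def frequency_table(values, intervals):
    """Return (class label, tally marks, frequency, relative frequency) rows."""
    rows = []
    for low, high in intervals:
        freq = sum(1 for v in values if low <= v <= high)
        rows.append((f"{low} - {high}", "I" * freq, freq, freq / len(values)))
    return rows

rows = frequency_table(scores, classes)
for label, tally, freq, rel in rows:
    print(f"{label:<10} {tally:<5} {freq:>3} {rel:.3f}")
```

A quick check on any frequency table is that the frequencies add up to the number of observations, here 8.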

The bar chart is one of the most commonly used diagrammatic displays of data. It gives a clear picture of how the categories compare. It is constructed from a set of bars placed on a common base line, either horizontally or vertically, with equal spacing between them.

Six commonly described types of bar graph are:
  1. Simple Bar Graph
  2. Multiple Bar Graph
  3. Percentage Bar Graph
  4. Subdivided Bar Graph
  5. Deviation Bar Graph
  6. Stacked Bar Graph

Among these, the simple bar graph is the one most commonly used to display categorical data.

Simple Bar Graph:

In a simple bar graph, a separate bar is drawn for each category, and the length of the bar represents the value for that category.

Solved Example

Question: The following data represent the population of a country classified by sex. Draw a simple bar diagram displaying the data.

Sex   Population
 Male   35000
 Female   48000

The simple bar graph is shown as follows:

Bar Chart of Categorical Data
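The idea behind a simple bar graph, one bar per category with length proportional to the value, can be sketched as a text chart. This is only an illustration using the population figures from the question; the function name `text_bar_chart` and the chart width of 40 characters are assumptions.

```python
population = {"Male": 35000, "Female": 48000}

def text_bar_chart(data, width=40):
    """Render one horizontal bar per category, scaled to the largest value."""
    largest = max(data.values())
    lines = []
    for category, value in data.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{category:<8}| {bar} {value}")
    return lines

chart = text_bar_chart(population)
print("\n".join(chart))
```

All bars start from the same base line, so their lengths can be compared directly. A plotting library could draw the same bars graphically.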

A pie chart gives a circular display of the data. The circle is divided into disjoint sectors, and the size of each sector is proportional to the share of the category it represents.

Solved Example

Question: Draw a Pie chart for the given data.
 Name of Game  % of students playing the game in school
 Volleyball  15%
 Cricket  30%
 Football  50%
 Throwball  5%

The pie chart for the above data is given as follows:

Pie Chart of Categorical Data
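To draw the pie chart by hand, each percentage has to be converted into a sector angle, using the fact that the full circle of 360 degrees represents 100%. A minimal sketch of that conversion, using the percentages from the question (the function name `sector_angles` is an assumption):

```python
shares = {"Volleyball": 15, "Cricket": 30, "Football": 50, "Throwball": 5}

def sector_angles(percentages):
    """Convert each category's share into the angle of its sector in degrees."""
    total = sum(percentages.values())
    return {name: pct * 360 / total for name, pct in percentages.items()}

angles = sector_angles(shares)
for game, angle in angles.items():
    print(f"{game:<11}{angle:6.1f} degrees")
```

Football, with half of the students, takes half of the circle, and the four angles together make up the full 360 degrees.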

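ANOVA, or Analysis of Variance, helps us determine whether groups differ on some numeric variable, where the groups are defined by a categorical variable. ANOVA itself is a parametric method for score data; ranking/ordering data call for non-parametric counterparts such as the Kruskal-Wallis test.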

Types of ANOVA:
  1. One-way between groups: This is the simplest form of ANOVA. It compares a variable across different groups.
  2. One-way repeated measures: This form of ANOVA is used when the same group is measured several times.
  3. Two-way between groups: This form of ANOVA is used for more complex groupings, where the groups are defined by two factors.
  4. Two-way repeated measures: This form of ANOVA is used when a repeated-measures design also includes a second factor.
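For the simplest case in the list, one-way between groups, the F statistic compares the variation between group means with the variation inside the groups. The sketch below computes it by hand under stated assumptions: the three groups of scores are hypothetical, and the function name `one_way_anova_f` is made up for this example.

```python
# Hypothetical scores for three groups (not taken from the text).
groups = {
    "A": [4, 5, 6],
    "B": [6, 7, 8],
    "C": [8, 9, 10],
}

def one_way_anova_f(samples):
    """Return the F statistic for a one-way between-groups ANOVA."""
    all_values = [x for g in samples for x in g]
    grand_mean = sum(all_values) / len(all_values)
    k, n = len(samples), len(all_values)

    # Between-groups sum of squares: group means against the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in samples)
    # Within-groups sum of squares: each value against its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in samples for x in g)

    # Divide each sum of squares by its degrees of freedom, then take the ratio.
    return (ss_between / (k - 1)) / (ss_within / (n - k))

f_stat = one_way_anova_f(list(groups.values()))
print(f"F = {f_stat:.2f}")
```

A large F value suggests that the group means differ by more than the spread inside the groups would explain. In practice a statistics library such as SciPy (`scipy.stats.f_oneway`) would normally be used, since it also reports the p-value.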

Given below is an example problem based on categorical data.

Solved Example

Question: A class consists of 70 students. Find the percentage of students whose marks fall in each interval.

 Marks  Number of students
 0 - 5  5
 6 - 10  8
 11 - 16  7
 17 - 22  10
 23 - 28  13
 29 - 34  18
 35 - 40  9

Given: Total number of students = 70
Here, the intervals into which the students' marks in maths fall are the categories.
Now, let's find the percentage of students in each category by dividing the number of students in the interval by 70 and multiplying by 100.

 Marks  Number of students  Percentage (%)
 0 - 5  5  7.143
 6 - 10  8  11.429
 11 - 16  7  10.000
 17 - 22  10  14.286
 23 - 28  13  18.571
 29 - 34  18  25.714
 35 - 40  9  12.857
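The percentages in the table can be checked with a short calculation. This sketch uses the counts from the question; the variable names `marks` and `percentages` are assumptions.

```python
marks = {
    "0 - 5": 5, "6 - 10": 8, "11 - 16": 7, "17 - 22": 10,
    "23 - 28": 13, "29 - 34": 18, "35 - 40": 9,
}

total = sum(marks.values())  # total number of students

# Percentage for a category = (count in category / total) * 100
percentages = {interval: count * 100 / total for interval, count in marks.items()}

for interval, pct in percentages.items():
    print(f"{interval:<8} {marks[interval]:>3} {pct:7.3f}")
```

Because every student falls into exactly one interval, the percentages should add up to 100.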