In statistics, a variable can be classified as either categorical or quantitative. A categorical variable is also known as a qualitative variable. It is commonly called a nominal variable as well, and it takes on values that are labels.

For example: the color of a bag, the name of a fruit, etc.
A categorical variable records a response as one of a set of categories, so it can take on only one of a limited number of possible values.

There is no intrinsic ordering among the categories. In a purely categorical variable we can assign observations to categories, but it is impossible to order the categories. If the categories have a clear ordering, the variable is of the ordinal type.

## Definition

A categorical variable identifies the group to which an observation belongs. It describes a quality or characteristic of a data unit, such as what type or which category. Its values tend to be non-numeric and fall into mutually exclusive and exhaustive categories. A categorical variable is sometimes stored as a string.
With a categorical variable we can classify persons according to their race, ethnicity, city, location, etc. The set of possible values is countable and usually small, as in a survey response (yes, no, don't know).
For example: the different blood groups,
A, B, AB and O, or with the Rh factor A+, O-, AB+, etc.

## Notation

Numerical values from 1 to n can be assigned to the categories of an n-way categorical variable. These numbers are arbitrary labels: their magnitude and order carry no meaning. The variable exists on a nominal scale, and each value logically represents a separate concept.
Only a few mathematical operations can be applied. The central tendency of a categorical variable is given by its mode, which is found by counting how often each category occurs and taking the most frequent one.
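The coding and the mode described above can be sketched in a few lines of Python. The sample of blood groups is made up for illustration, and the alphabetical assignment of the codes 1 to n is just one arbitrary choice among many.

```python
from collections import Counter

# Hypothetical sample of blood groups recorded in a study.
blood_groups = ["A", "O", "B", "O", "AB", "A", "O", "B", "O", "A"]

# Assign arbitrary integer codes 1..n to the n categories.
# The numbers are only labels; their order and size mean nothing.
categories = sorted(set(blood_groups))
codes = {label: i for i, label in enumerate(categories, start=1)}
coded = [codes[g] for g in blood_groups]

# Counting is meaningful for nominal data:
# the mode is the most frequent category.
counts = Counter(blood_groups)
mode_label, mode_count = counts.most_common(1)[0]

print(codes)
print(mode_label, mode_count)
```

Taking the mean of `coded` would run without error, but the result would change under a different (equally valid) assignment of codes, which is why the mode is used instead.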

## Categorical Independent Variable

Every study involves some kind of variables. A variable can be something that we wish to measure, something that we manipulate, or something that we control. This section discusses independent variables in experimental and non-experimental studies.
In an experiment, an independent variable is manipulated in order to observe its effect on a dependent variable, also called an outcome variable. An independent variable is also known as an experimental or predictor variable, and it is the antecedent. The ordinary least squares approach can be used to estimate the parameters. The values of the independent variable are under the experimenter's control. For that reason, it is better to avoid the term "independent variable" when writing about non-experimental designs.

## Categorical Dependent Variable

A variable that depends on the independent variable is known as the dependent variable. A categorical dependent variable can be a binary, ordinal, nominal, or event-count variable. The dependent variable is the consequent.
In an experimental study with a categorical dependent variable, the ordinary least squares method is not sufficient, because it no longer produces the best linear unbiased estimator. A better alternative is a regression model designed for categorical outcomes, such as logistic regression.

The parameters of such models are estimated by maximum likelihood. In an experiment it is not possible to have a dependent variable without an independent variable. The dependent variable is what we measure in the experiment and what is affected during the experiment.
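As an illustration of maximum likelihood estimation for a binary dependent variable, the sketch below fits a logistic regression with one predictor by plain gradient ascent on the log-likelihood. The data, the function names `sigmoid` and `fit_logistic`, and the step size and iteration count are all assumptions made for this example; statistical packages typically use faster methods such as Newton-type iterations.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.3, steps=10000):
    """Estimate b0 and b1 in P(y = 1) = sigmoid(b0 + b1 * x)
    by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # observed minus predicted
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Hypothetical data: hours studied (x) and pass = 1 / fail = 0 (y).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
passed = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(hours, passed)
probs = [sigmoid(b0 + b1 * x) for x in hours]
print(b0, b1)
```

The made-up outcomes overlap (a pass at 2.0 hours, a fail at 3.5 hours) on purpose: if the two classes were perfectly separated by `x`, the likelihood would have no finite maximum and the estimates would keep growing.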

## Multiple Regression Categorical Variables

Multiple regression finds the linear transformation of the X variables for which the sum of squared deviations between the observed and predicted Y is minimized.
A multiple regression model has the form

$Y_{i}^{'} = b_{0} + b_{1}X_{1i} + b_{2}X_{2i} + \dots + b_{k}X_{ki}$

In the above model the b values are called regression weights, and they are computed so as to minimize the sum of squared deviations

$\sum_{i = 1}^{N}(Y_{i}-Y_{i}^{'})^{2}$

When a categorical variable with more than two levels is included in a multiple regression model, it must be coded so that the results remain interpretable. This involves recoding the variable into a number of separate dichotomous variables, a procedure called dummy coding.
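A minimal sketch of dummy coding follows. The helper `dummy_code` and the color data are hypothetical and written only for this example. A variable with k levels is recoded into k - 1 dichotomous (0/1) variables; the omitted level is the reference category, which is coded as all zeros.

```python
def dummy_code(values, reference=None):
    """Recode a categorical variable with k levels into k - 1
    dichotomous (0/1) variables, leaving out the reference level."""
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]
    if reference not in levels:
        raise ValueError("reference must be one of the observed levels")
    kept = [lv for lv in levels if lv != reference]
    rows = [[1 if v == lv else 0 for lv in kept] for v in values]
    return kept, rows

# Hypothetical three-level variable.
colors = ["red", "green", "blue", "green", "red"]
columns, coded = dummy_code(colors, reference="blue")

print(columns)
for color, row in zip(colors, coded):
    print(color, row)
```

Each row of `coded` supplies the values of $X_{1i}, \dots, X_{ki}$ for one observation, so the recoded columns can be entered into the regression model like any other predictors.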

Categorical variables can be incorporated into regression analysis only when they are properly prepared and interpreted. To perform a regression with a categorical variable, the variable must first be coded. It is also helpful to examine the relationship between a categorical and a continuous variable visually through a box plot.

## Regression with Categorical Variables

A categorical variable can enter a regression analysis as an independent variable, or it can serve as the dependent variable in logistic regression. There are three main coding systems for categorical variables in regression: dummy coding, effects coding, and contrast coding.
Depending on the study, any of the three can be used, and the regression equation has the form

$Y = mx + c$

where m is the slope, x is the explanatory variable, and c is the Y-intercept.
The values of m and c take on different meanings depending on the coding system. Under dummy coding, for example, c is the mean of Y in the reference category and m is the difference between the coded category's mean and the reference mean.
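How the meaning of m and c changes with the coding system can be checked on a small example. The sketch below fits $Y = mx + c$ by ordinary least squares to the same made-up two-group data, once with dummy coding (0/1) and once with effects coding (-1/+1). The data and the helper `fit_line` are assumptions for illustration, and contrast coding is not shown.

```python
def fit_line(xs, ys):
    """Ordinary least squares for Y = m*x + c with one explanatory variable."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    c = y_bar - m * x_bar
    return m, c

# Hypothetical data: a two-level categorical variable and a continuous response.
group = ["control", "control", "control", "treated", "treated"]
score = [10.0, 12.0, 14.0, 18.0, 22.0]

# Dummy coding: reference category = 0, other category = 1.
dummy = [0 if g == "control" else 1 for g in group]
# Effects coding: reference category = -1, other category = +1.
effects = [-1 if g == "control" else 1 for g in group]

m_d, c_d = fit_line(dummy, score)    # c_d: control mean; m_d: treated - control
m_e, c_e = fit_line(effects, score)  # c_e: mean of the two group means;
                                     # m_e: treated mean minus c_e

print("dummy coding:  ", m_d, c_d)
print("effects coding:", m_e, c_e)
```

Both fits predict the same group means (12 for control, 20 for treated); only the interpretation of the slope and intercept differs between the two codings.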