When probability distributions are applied to pairs (or groups) of random variables, they give rise to joint probability distributions. On this page we focus on the two-dimensional case, although the ideas extend to higher dimensions as well.
Joint probability distributions describe situations in which two outcomes, each represented by a random variable, occur together.

A joint probability distribution can be expressed in terms of a joint probability mass function (for discrete variables) or a joint probability density function (for continuous variables). In the discrete case it is defined by
    $f(x, y) = P(X = x, Y = y)$.

Discrete Probability Distributions
When discrete random variables are paired, they give rise to discrete joint probability distributions.
The probability function, also known as the joint probability mass function, $f(x, y)$, has the following properties (a short numerical sketch follows the list):
1) $f(x, y)$ $\geq$ 0 for all $(x, y)$

2) $\sum_{x}$$\sum_{y}$ $f(x, y)$ = 1

3) $f(x, y)$ = $P(X = x, Y = y)$

4) If $X$ and $Y$ are independent, then $f(x, y)$ = $f_{X}(x) \times f_{Y}(y)$, where $f_{X}$ and $f_{Y}$ are the marginal probability functions of $X$ and $Y$.
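As a minimal sketch of these properties, assuming Python and a made-up joint distribution of two independent fair coin tosses (so $f(x, y)$ = 0.25 for each of the four pairs), the conditions above can be checked numerically:

```python
from itertools import product

# Hypothetical example: X and Y are two independent fair coin tosses (0 = tails, 1 = heads).
# The dict itself plays the role of property 3: f(x, y) = P(X = x, Y = y) = 0.25 for each pair.
f = {(x, y): 0.25 for x, y in product([0, 1], repeat=2)}

# Property 1: every probability is non-negative
assert all(p >= 0 for p in f.values())

# Property 2: the probabilities sum to 1 over all (x, y)
assert abs(sum(f.values()) - 1.0) < 1e-12

# Property 4: for independent X and Y, f(x, y) equals the product of the marginals
f_x = {x: sum(f[(x, y)] for y in [0, 1]) for x in [0, 1]}
f_y = {y: sum(f[(x, y)] for x in [0, 1]) for y in [0, 1]}
assert all(abs(f[(x, y)] - f_x[x] * f_y[y]) < 1e-12 for x, y in f)
```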
Continuous Probability Distributions
Continuous joint probability distributions arise from pairs (or groups) of continuous random variables.
They are characterized by a joint density function $f(x, y)$, which must satisfy the following conditions (a numerical sketch follows the list):

1) $f(x, y)$ $\geq$ 0 for all $(x, y)$

2) $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\, dx\, dy = 1$

3) For any region $A$ lying in the $xy$-plane, $P[(X, Y) \in A] = \int \int_{A} f(x, y)\, dx\, dy$
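As a rough numerical sketch of these properties, assuming Python with SciPy and a made-up joint density that is uniform on the unit square ($f(x, y)$ = 1 for 0 ≤ $x$, $y$ ≤ 1 and 0 elsewhere):

```python
from scipy import integrate

# Hypothetical joint density: uniform on the unit square [0, 1] x [0, 1]
def f(y, x):
    return 1.0 if (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0) else 0.0

# Property 2: the density integrates to 1 over its support
total, _ = integrate.dblquad(f, 0.0, 1.0, lambda x: 0.0, lambda x: 1.0)
print(total)  # ~1.0

# Property 3: P[(X, Y) in A] for the region A = {x < 0.5, y < 0.5}
p_A, _ = integrate.dblquad(f, 0.0, 0.5, lambda x: 0.0, lambda x: 0.5)
print(p_A)  # ~0.25
```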
Marginal Probability Functions
Let $X$ and $Y$ be two discrete random variables defined on the same sample space, and let $p(x, y)$ be a function such that

$p(x, y)$ = $P(X = x ; Y = y)$

Then $p(x, y)$ is called the joint probability function of $X$ and $Y$.

Let $p_{1}(x)$ be the probability function of $X$ and let $p_{2}(y)$ be the probability function of $Y$. Then $p_{1}(x)$ and $p_{2}(y)$ are called the marginal probability functions of $X$ and $Y$, respectively.

Here $p_{1}(x)$ = $P[X = x]$

 = $\sum_{y} P[X = x ; Y = y]$

 = $\sum_{y} p(x, y)$

$p_{2}(y)$ = $P[Y = y]$

 = $\sum_{x} P[X = x ; Y = y]$

 = $\sum_{x} p(x, y)$

Thus, $p_{1}(x)$ = $\sum_{y} p(x, y)$ and $p_{2}(y)$ = $\sum_{x} p(x, y)$
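A brief sketch in Python (the joint table below is made up purely for illustration) of how the marginal functions $p_{1}(x)$ and $p_{2}(y)$ are obtained by summing the joint probabilities:

```python
# Hypothetical joint probability table p(x, y) for X in {0, 1} and Y in {0, 1, 2}
p = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}

xs = sorted({x for x, _ in p})
ys = sorted({y for _, y in p})

# Marginal of X: p1(x) = sum over y of p(x, y)
p1 = {x: sum(p[(x, y)] for y in ys) for x in xs}

# Marginal of Y: p2(y) = sum over x of p(x, y)
p2 = {y: sum(p[(x, y)] for x in xs) for y in ys}

print(p1)  # approximately {0: 0.4, 1: 0.6}
print(p2)  # approximately {0: 0.25, 1: 0.45, 2: 0.3}
```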
Given below is a worked example on the expectation and variance of a probability distribution.

Example: For the following probability distribution, find $E(3X + 5)$, $E(2X^{2} - 5)$, $Var(X)$, $Var(2X)$, $Var(4X - 3)$, $Var(-X)$, $Var(-5X + 2)$, $SD(X)$ and $SD(2X)$.

$x$       10     12
$p(x)$    0.8    0.2

Solution: Given below is the table of computations.

$x$     $p(x)$        $x^{2}$    $x \cdot p(x)$     $x^{2} p(x)$
10      0.8           100        8.0                80
12      0.2           144        2.4                28.8
        $\sum$ = 1               $\sum$ = 10.4      $\sum$ = 108.8

$E(X)$ = $\sum$ $x$ $\times$ $p(x)$
= 10.4

$E(X^{2})$ = $\sum$ $x^{2} p(x)$
= 108.8

$E(3X + 5 )$ = 3 $E(X)$ + 5
= 3 $\times$ 10.4 + 5
= 36.2

$E(2X^{2} - 5)$ = 2$E(X^{2})$ - 5
= 2 $\times$ 108.8 - 5
= 212.6

$Var(X)$ = $E(X^{2})$ - $[E(X)]^{2}$
= 108.8 - (10.4)$^{2}$
= 0.64

$Var(2X)$ = 2$^{2}$ $\times Var(X)$
= 4 $\times$ 0.64
= 2.56

$Var(4X - 3)$ = 4$^{2}$ $\times Var(X)$
=  4$^{2}$ $\times$  0.64
= 10.24

$Var(-X)$ = (-1)$^{2}$ $\times Var(X)$ 
= $Var(X)$
= 0.64

$Var(-5X + 2)$ = (-5)$^{2}$ $\times Var(X)$
= 25 $\times$ 0.64
= 16

$S.D (X)$ = $\sqrt{Var(X)}$
= $\sqrt{0.64}$
= 0.8

$S.D (2X)$ = $\sqrt{Var(2X)}$
= $\sqrt{2.56}$
= 1.6
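
As a quick sanity check (not part of the original solution), the same numbers can be reproduced in Python from the given table; small floating-point rounding aside, the results match the values above:

```python
from math import sqrt

# The distribution from the example: P(X = 10) = 0.8, P(X = 12) = 0.2
dist = {10: 0.8, 12: 0.2}

E_X = sum(x * p for x, p in dist.items())        # E(X)   = 10.4
E_X2 = sum(x ** 2 * p for x, p in dist.items())  # E(X^2) = 108.8
var_X = E_X2 - E_X ** 2                          # Var(X) = 0.64

print(3 * E_X + 5)       # E(3X + 5)    = 36.2
print(2 * E_X2 - 5)      # E(2X^2 - 5)  = 212.6
print(var_X)             # Var(X)       = 0.64
print(4 * var_X)         # Var(2X)      = 2.56
print(16 * var_X)        # Var(4X - 3)  = 10.24
print(var_X)             # Var(-X)      = 0.64
print(25 * var_X)        # Var(-5X + 2) = 16
print(sqrt(var_X))       # SD(X)        = 0.8
print(sqrt(4 * var_X))   # SD(2X)       = 1.6
```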