Joint Probability Distributions
In the section on
probability distributions, we looked at discrete and continuous distributions
but we only focused on single random variables. Probability distributions can, however,
be applied to grouped random variables which gives rise to joint probability distributions.
Here we’re going to focus on two-dimensional distributions (i.e. only two random variables)
but higher dimensions (more than two variables) are also possible.
Since all random variables are divided into discrete and continuous random variables,
we end up having both discrete and continuous joint probability distributions.
These distributions are not so different from the one-variable distributions we
just looked at, but understanding some of the concepts may require some knowledge
of multivariable calculus.
Essentially, joint probability distributions describe situations where both outcomes
represented by the random variables occur together. Whereas before we used only X to
represent the random variable, we now have X and Y as the pair of random variables.
Joint probability distributions are defined in the form below:
f(x,y) = P(X = x, Y = y)
where the above represents the probability that the events X = x and Y = y
occur at the same time.
The Cumulative Distribution Function (CDF) for a joint probability distribution
is given by:
F(x,y) = P(X ≤ x, Y ≤ y)
Discrete Joint Probability Distributions
Discrete random variables when paired give rise to discrete joint probability distributions.
As with a discrete probability distribution for a single random variable, a discrete joint
probability distribution can be tabulated as in the example below.
The table below represents the joint probability distribution obtained for the outcomes
when a die is rolled and a coin is tossed.
f(x,y)  1  2  3  4  5  6  Row Totals 
Heads  a  b  c  d  e  f  α 
Tails  g  h  i  j  k  l  β 
Column Totals  γ  δ  ε  ζ  θ  ψ  ω 
In the table above, x = 1, 2, 3, 4, 5, 6 are the outcomes when the die is rolled,
while y = Heads, Tails are the outcomes when the coin is tossed. The letters
a through l represent the joint probabilities of the different events
formed from the combinations of x and y, while the Greek letters represent
the totals; ω should equal 1. The row sums and column sums are referred
to as the marginal probability distribution functions (marginal PDFs).
We shall see in a moment how to obtain the different probabilities but first let
us define the probability mass function for a joint discrete probability
distribution.
The probability function, also known as the probability mass function, for a joint
probability distribution f(x,y) is defined such that:

f(x,y) ≥ 0 for all (x,y)
This means that every joint probability must be greater than or equal to zero, as
dictated by the fundamental rule of probability.
∑_{x} ∑_{y} f(x,y) = 1
This means that the joint probabilities must sum to one over the whole
sample space.
f(x,y) = P(X = x, Y = y)
This means that f(x,y) is the probability that X takes the value x and Y takes the
value y at the same time.
The probability mass function f(x,y) can be calculated in a number of different
ways, depending on the relationship between the random variables X and Y.
As we saw in the section on
probability concepts, these two variables can be either independent or dependent.
If X and Y are Independent:
In the example we gave above, rolling a die and tossing a coin are independent:
the outcome of one does not in any way affect the outcome of the other.
Assuming that the coin and die are both fair, the probabilities
given by a through l can be obtained by multiplying the probabilities
of the individual x and y outcomes, i.e. f(x,y) = P(X = x) · P(Y = y).
For example, P(X = 2, Y = Tails) is given by
P(X = 2) · P(Y = Tails) = ^{1}⁄_{6} × ^{1}⁄_{2} = ^{1}⁄_{12}
Since we claimed that the coin and the die are fair, the probabilities a
through l should be the same.
The marginal PDFs, represented by the Greek letters, should be the probabilities
you would expect for each outcome on its own.
For example:
P(Y = Heads) = α = 6 × ^{1}⁄_{12} = ^{1}⁄_{2} and P(X = 1) = γ = ^{1}⁄_{12} + ^{1}⁄_{12} = ^{1}⁄_{6}
The table thus becomes:
f(x,y)  1  2  3  4  5  6  Row Totals 
Heads  ^{1}⁄_{12}  ^{1}⁄_{12}  ^{1}⁄_{12}  ^{1}⁄_{12}  ^{1}⁄_{12}  ^{1}⁄_{12}  ^{1}⁄_{2} 
Tails  ^{1}⁄_{12}  ^{1}⁄_{12}  ^{1}⁄_{12}  ^{1}⁄_{12}  ^{1}⁄_{12}  ^{1}⁄_{12}  ^{1}⁄_{2} 
Column Totals  ^{1}⁄_{6}  ^{1}⁄_{6}  ^{1}⁄_{6}  ^{1}⁄_{6}  ^{1}⁄_{6}  ^{1}⁄_{6}  1 
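The table above can be reproduced with a short Python sketch. It assumes a fair die and a fair coin, as in the text; the names `die`, `coin` and `joint` are illustrative only. Exact fractions are used so the totals can be compared with the table directly.

```python
from fractions import Fraction
from itertools import product

# Marginal distributions of a fair die (X) and a fair coin (Y).
die = {x: Fraction(1, 6) for x in range(1, 7)}
coin = {"Heads": Fraction(1, 2), "Tails": Fraction(1, 2)}

# For independent variables the joint pmf is the product of the marginals.
joint = {(x, y): die[x] * coin[y] for x, y in product(die, coin)}

# Row totals recover the coin's marginal, column totals the die's marginal.
row_totals = {y: sum(joint[(x, y)] for x in die) for y in coin}
col_totals = {x: sum(joint[(x, y)] for y in coin) for x in die}
grand_total = sum(joint.values())

print(joint[(2, "Tails")])   # 1/12
print(row_totals["Tails"])   # 1/2
print(col_totals[1])         # 1/6
print(grand_total)           # 1
```

Each row total is ^{1}⁄_{2} and each column total is ^{1}⁄_{6}, which is a quick way to check a hand-built table.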
If X and Y are Dependent:
If X and Y are dependent variables, their joint probabilities are
calculated from the specific relationship between them, as in the example below.
Given a bag containing 3 black balls, 2 blue balls and 3 green balls, a random sample
of 4 balls is selected. Given that X is the number of black balls and Y
is the number of blue balls, find the joint probability distribution of X
and Y.
Solution:
The random variables X and Y are dependent since the balls are picked from the same
bag, so the number of black balls picked affects the probability of picking a given
number of blue balls. So we solve this problem by using combinations.
X has 4 possible values, {0,1,2,3}, since you can pick none, one, two or three
black balls; similarly, Y has 3 possible values, {0,1,2}, i.e. none, one or two
blue balls.
The joint probability distribution is given by the table below:
f(x,y)  0  1  2  3  Row Totals 
0  
1  
2  
Column Totals  1 
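To fill out the table, where the columns are the values of x and the rows are the
values of y, we need to calculate the different entries. We know the total
number of black balls to be 3, the total number of blue balls to be 2, the
sample size to be 4 and the total number of balls in the bag to be 3+2+3 = 8.
We find the joint probability mass function f(x,y) using combinations as:
f(x,y) = [C(3,x) · C(2,y) · C(3, 4−x−y)] / C(8,4)
where C(n,k) is the number of ways of choosing k items from n. The numerator counts
the ways of picking x black balls, y blue balls and 4−x−y green balls, and the
denominator, C(8,4) = 70, counts all the ways of picking 4 balls from 8. We
substitute the different values of x (0,1,2,3) and y (0,1,2) and solve, e.g.
f(1,1) = [C(3,1) · C(2,1) · C(3,2)] / C(8,4) = (3 · 2 · 3) / 70 = ^{18}⁄_{70}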
f(0,0) is a special case. We don’t calculate it; the probability of obtaining
zero black balls and zero blue balls is zero. This is because of the sample size
relative to the number of green balls. We need 4 balls from a bag of 8; to pick
neither black nor blue balls, there would have to be at least 4 green balls. But
there are only 3 green balls, so the sample must contain at least one black or
blue ball.
f(3,2) is also zero, since 3 black balls and 2 blue balls would make 5 balls and
the sample holds only 4.
From the above, we obtain the joint probability distribution as:
f(x,y)  0  1  2  3  Row Totals 
0  0  ^{3}⁄_{70}  ^{9}⁄_{70}  ^{3}⁄_{70}  ^{15}⁄_{70} 
1  ^{2}⁄_{70}  ^{18}⁄_{70}  ^{18}⁄_{70}  ^{2}⁄_{70}  ^{40}⁄_{70} 
2  ^{3}⁄_{70}  ^{9}⁄_{70}  ^{3}⁄_{70}  0  ^{15}⁄_{70} 
Column Totals  ^{5}⁄_{70}  ^{30}⁄_{70}  ^{30}⁄_{70}  ^{5}⁄_{70}  1 
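As a cross-check on the table, the entries can be computed by counting combinations. This is a minimal sketch: the constant names and the function `joint_pmf` are illustrative, and the counting rule (ways to pick x black, y blue and the remaining green balls, divided by the ways to pick any 4 of the 8) is the one described in the solution.

```python
from fractions import Fraction
from math import comb

BLACK, BLUE, GREEN, SAMPLE = 3, 2, 3, 4
TOTAL_WAYS = comb(BLACK + BLUE + GREEN, SAMPLE)  # C(8,4) = 70

def joint_pmf(x, y):
    """P(X = x black balls, Y = y blue balls) in a sample of 4."""
    green = SAMPLE - x - y          # the rest of the sample must be green
    if green < 0 or green > GREEN:  # impossible, e.g. (0,0) or (3,2)
        return Fraction(0)
    ways = comb(BLACK, x) * comb(BLUE, y) * comb(GREEN, green)
    return Fraction(ways, TOTAL_WAYS)

# Full table: columns are x = 0..3, rows are y = 0..2.
table = {(x, y): joint_pmf(x, y) for x in range(4) for y in range(3)}

# Marginals: sum over the other variable.
marginal_x = {x: sum(table[(x, y)] for y in range(3)) for x in range(4)}
marginal_y = {y: sum(table[(x, y)] for x in range(4)) for y in range(3)}

for y in range(3):
    print(y, [str(table[(x, y)]) for x in range(4)], marginal_y[y])
```

Note that `Fraction` reduces its results, so ^{18}⁄_{70} is printed as 9/35; the values are the same as in the table.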
Continuous Joint Probability Distribution
Continuous Joint Probability Distributions arise from groups of continuous random
variables.
Continuous joint probability distributions are characterized by the Joint Density
Function, which is similar to that of a single variable case, except that
this is in two dimensions.
The joint density function f(x,y) is characterized by the following:
 f(x,y) ≥ 0, for all (x,y)
 ∫^{∞}_{−∞} ∫^{∞}_{−∞} f(x,y) dx dy = 1
 For any region A lying in the xy plane, P[(X,Y) ∈ A] = ∫∫_{A} f(x,y) dx dy
The marginal probability density functions are given by
g(x) = ∫^{∞}_{−∞} f(x,y) dy
where the above is the probability distribution of the random variable X alone.
The probability distribution of the random variable Y alone, known as its marginal
PDF, is given by
h(y) = ∫^{∞}_{−∞} f(x,y) dx
Example:
A certain farm produces two kinds of eggs on any given day; organic and nonorganic.
Let these two kinds of eggs be represented by the random variables X and Y respectively.
Given that the joint probability density function of these variables is given by
a) Find the marginal PDF of X
b) Find the marginal PDF of Y
c) Find P(X ≤ ^{1}⁄_{2}, Y ≤ ^{1}⁄_{2})
Solution:
a) The marginal PDF of X is given by g(x) where
b) The marginal PDF of Y is given by h(y) where
c) P(X ≤ ^{1}⁄_{2}, Y ≤ ^{1}⁄_{2})
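The three kinds of calculation in this example (a marginal PDF of X, a marginal PDF of Y and a probability over a rectangle) can be sketched numerically. The density used here, f(x,y) = x + y on the unit square, is an assumption chosen purely for illustration and is not the density of the egg example. The helper `integrate` is a simple midpoint rule, also illustrative.

```python
def density(x, y):
    # Hypothetical joint density: f(x,y) = x + y on [0,1] x [0,1], else 0.
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def integrate(func, lower, upper, steps=200):
    # Midpoint rule: sum func at the centre of each of `steps` equal slices.
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width

def marginal_x(x):
    # g(x): integrate the joint density over all y.
    return integrate(lambda y: density(x, y), 0.0, 1.0)

def marginal_y(y):
    # h(y): integrate the joint density over all x.
    return integrate(lambda x: density(x, y), 0.0, 1.0)

def prob_rectangle(x_max, y_max):
    # P(X <= x_max, Y <= y_max): double integral over [0,x_max] x [0,y_max].
    return integrate(
        lambda x: integrate(lambda y: density(x, y), 0.0, y_max), 0.0, x_max
    )

total = prob_rectangle(1.0, 1.0)   # should be close to 1 for a valid density
g_quarter = marginal_x(0.25)       # analytically g(x) = x + 1/2
p_half = prob_rectangle(0.5, 0.5)  # analytically 1/8
print(total, g_quarter, p_half)
```

For this assumed density the integrals can also be done by hand, giving g(x) = x + ^{1}⁄_{2}, h(y) = y + ^{1}⁄_{2} and P(X ≤ ^{1}⁄_{2}, Y ≤ ^{1}⁄_{2}) = ^{1}⁄_{8}, which the numerical values should match closely.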
Mixed Joint Probability Distribution
So far we’ve looked at pairs of random variables where both variables are either discrete
or continuous. A joint pair of random variables can also be composed of one discrete
and one continuous random variable. This gives rise to what is known as a mixed
joint probability distribution.
The density function for a mixed probability distribution is given by
f(x,y) = P(Y = y | X = x) · g(x)
where X is a continuous random variable, Y is a discrete random variable and
g(x) is the marginal pdf of X.
The cumulative distribution function is given by
F(x,y) = P(X ≤ x, Y ≤ y) = ∑_{t ≤ y} ∫^{x}_{−∞} f(s,t) ds
Conditional Probability Distribution
Conditional probability distributions arise from joint probability distributions
when we need to know the probability of one event given that the other event
has happened, and the random variables behind these events are jointly distributed.
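Conditional probability distributions can be discrete or continuous, but they follow
the same notation, i.e.
f(x|y) = f(x,y) / h(y), for h(y) > 0
where the above is the conditional probability distribution of X given that Y = y,
and h(y) is the marginal distribution of Y.
The conditional probability distribution of Y given that X = x is given by:
f(y|x) = f(x,y) / g(x), for g(x) > 0
where g(x) is the marginal distribution of X.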
The conditional probability for a discrete pair of random variables
can be found from:
P(a < X < b | Y = y) = ∑_{a < x < b} f(x|y)
where the above is the probability that X lies between a and b given
that Y = y.
For a pair of continuous random variables, the above probability is given as:
P(a < X < b | Y = y) = ∫^{b}_{a} f(x|y) dx
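Two random variables are said to be statistically independent if and only if their
joint probability distribution factors as follows:
f(x,y) = g(x) · h(y), for all (x,y)
where g(x) is the marginal pdf of X and h(y) is the marginal pdf of Y. Equivalently,
f(x|y) = g(x) and f(y|x) = h(y), i.e. conditioning on one variable does not change
the distribution of the other.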
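The conditional distribution and the independence test can both be sketched for a discrete joint table. The code below rebuilds the black and blue ball example from earlier; the function names `marginals`, `conditional_x_given_y` and `is_independent` are illustrative, not from any library.

```python
from fractions import Fraction
from math import comb

# Joint pmf of the ball example: x black and y blue balls in a sample of 4
# drawn from 3 black, 2 blue and 3 green balls. Impossible pairs get 0.
joint = {
    (x, y): (
        Fraction(comb(3, x) * comb(2, y) * comb(3, 4 - x - y), comb(8, 4))
        if 0 <= 4 - x - y <= 3
        else Fraction(0)
    )
    for x in range(4)
    for y in range(3)
}

def marginals(table):
    """Return (g, h): the marginal pmfs of X and Y as dictionaries."""
    g, h = {}, {}
    for (x, y), p in table.items():
        g[x] = g.get(x, Fraction(0)) + p
        h[y] = h.get(y, Fraction(0)) + p
    return g, h

def conditional_x_given_y(table, y):
    """f(x|y) = f(x,y) / h(y); assumes h(y) > 0."""
    _, h = marginals(table)
    return {x: p / h[y] for (x, yy), p in table.items() if yy == y}

def is_independent(table):
    """True when f(x,y) = g(x) * h(y) for every cell of the table."""
    g, h = marginals(table)
    return all(p == g[x] * h[y] for (x, y), p in table.items())

given_one_blue = conditional_x_given_y(joint, 1)
print({x: str(p) for x, p in given_one_blue.items()})
print(is_independent(joint))  # False: f(0,0) = 0 but g(0) * h(0) > 0
```

Given one blue ball in the sample, the conditional probabilities of 0, 1, 2 and 3 black balls are ^{2}⁄_{40}, ^{18}⁄_{40}, ^{18}⁄_{40} and ^{2}⁄_{40}, i.e. the y = 1 row of the table divided by its row total.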