Probability/Joint Distributions and Independence

Motivation

Suppose we are given a pmf of a discrete random variable $X$ and a pmf of a discrete random variable $Y$. For example, $f_{X} (x) = (𝟏 \{x = 0\} + 𝟏 \{x = 1\}) / 2$ and $f_{Y} (y) = (𝟏 \{y = 0\} + 𝟏 \{y = 2\}) / 2$. With only this information, we cannot tell how $X$ and $Y$ are related: they may or may not be related.

For example, the random variable $X$ may be defined as $X = 1$ if heads comes up and $X = 0$ otherwise when tossing a fair coin, and the random variable $Y$ may be defined as $Y = 2$ if heads comes up and $Y = 0$ otherwise when tossing the coin a second time. In this case, $X$ and $Y$ are unrelated.

Another possibility is that the random variable $Y$ is defined as $Y = 2 X$, that is, $Y = 2$ if heads comes up in the first coin toss and $Y = 0$ otherwise. In this case, $X$ and $Y$ are related.

Yet, in both of the above examples, the pmfs of $X$ and $Y$ are exactly the same.
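The two coin examples above can be checked by direct enumeration. The following is a minimal sketch, assuming fair coins and using exact fractions; the function names are hypothetical and only serve this illustration. It builds the table of probabilities $ℙ (X = x \cap Y = y)$ for each example and confirms that the individual pmfs agree while the tables themselves differ.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def joint_two_tosses():
    """Table of P(X = x, Y = y) when X uses the first toss and Y a second toss."""
    joint = {}
    for first, second in product([0, 1], repeat=2):  # 1 = heads, 0 = tails
        x, y = first, 2 * second
        joint[(x, y)] = joint.get((x, y), 0) + HALF * HALF
    return joint

def joint_one_toss():
    """Table of P(X = x, Y = y) when both X and Y = 2X use the same toss."""
    joint = {}
    for first in [0, 1]:
        x, y = first, 2 * first
        joint[(x, y)] = joint.get((x, y), 0) + HALF
    return joint

def marginals(joint):
    """pmf of X and pmf of Y, obtained by summing the table over the other variable."""
    f_x, f_y = {}, {}
    for (x, y), p in joint.items():
        f_x[x] = f_x.get(x, 0) + p
        f_y[y] = f_y.get(y, 0) + p
    return f_x, f_y

unrelated = joint_two_tosses()
related = joint_one_toss()
print(marginals(unrelated) == marginals(related))  # True: same pmfs of X and Y
print(unrelated == related)                        # False: different joint behaviour
```

In the second example the outcome $(X, Y) = (1, 0)$ never occurs, whereas in the first it has probability $1 / 4$.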

Therefore, to tell the relationship between $X$ and $Y$, we define the joint cumulative distribution function, or joint cdf.

Joint distributions

Template:Colored definition Sometimes, we may want to know the random behaviour of one of the random variables involved in a joint cdf. We can do this by computing the marginal cdf from the joint cdf. The definition of the marginal cdf is as follows: Template:Colored definition Template:Colored remark Template:Colored proposition

Proof. When we set every argument other than the $i$-th one to $\infty$ (where, e.g., $X_{1} \leq \infty$ is understood as the limit of $X_{1} \leq x$ as $x \to \infty$), the joint cdf becomes $\begin{aligned} ℙ (X_{1} \leq \infty \cap \dots \cap X_{i - 1} \leq \infty \cap X_{i} \leq x \cap X_{i + 1} \leq \infty \cap \dots \cap X_{n} \leq \infty) & = ℙ (X_{i} \leq x) \\ & = F_{X_{i}} (x), \end{aligned}$ since each event $\{X_{j} \leq \infty\}$ with $j \neq i$ is the whole sample space, so it has probability 1 and does not restrict the intersection. $◻$

Template:Colored remark Template:Colored example Similar to the one-variable case, we have joint pmf and joint pdf. Also, analogously, we have marginal pmf and marginal pdf.

Template:Colored definition Template:Colored definition Template:Colored proposition

Proof. Consider the case in which there are only two random variables, say $X$ and $Y$. Then, we have $\sum_{y} f (x, y) = \sum_{y} ℙ (X = x \cap Y = y) = ℙ (X = x)$ by the law of total probability. Similarly, in the general case, we have $\begin{aligned} \sum_{u_{n}} f (u_{1}, \dots, u_{i - 1}, x, u_{i + 1}, \dots, u_{n}) & = \sum_{u_{n}} ℙ (X_{1} = u_{1} \cap \dots \cap X_{i - 1} = u_{i - 1} \cap X_{i} = x \cap X_{i + 1} = u_{i + 1} \cap \dots \cap X_{n - 1} = u_{n - 1} \cap X_{n} = u_{n}) \\ & = ℙ (X_{1} = u_{1} \cap \dots \cap X_{i - 1} = u_{i - 1} \cap X_{i} = x \cap X_{i + 1} = u_{i + 1} \cap \dots \cap X_{n - 1} = u_{n - 1}) \end{aligned}$ by the law of total probability. Then, we repeat the same process for each of the other variables ($n - 2$ left), adding one more summation sign each time. Thus, in total we have $n - 1$ summation signs, and we finally get the desired result. $◻$
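The summation in the proof above translates directly into code. Below is a minimal sketch, assuming the joint pmf is stored as a dictionary from outcome tuples to probabilities; the function name `marginal_pmf` and the three-variable example are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import product

def marginal_pmf(joint, i):
    """Sum the joint pmf over every coordinate except the i-th (0-based index)."""
    marginal = {}
    for outcome, p in joint.items():
        marginal[outcome[i]] = marginal.get(outcome[i], 0) + p
    return marginal

# Hypothetical example: X1 and X2 are fair coin flips (0 or 1) and X3 = X1 + X2.
joint = {(a, b, a + b): Fraction(1, 4) for a, b in product([0, 1], repeat=2)}

print(marginal_pmf(joint, 0))  # marginal pmf of X1
print(marginal_pmf(joint, 2))  # marginal pmf of X3
```

Looping once over all outcomes and accumulating by the $i$-th coordinate is equivalent to the $n - 1$ nested sums in the proof, because every outcome contributes to exactly one value of $X_{i}$.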

Template:Colored remark Template:Colored example Template:Colored exercise Template:Colored exercise Template:Hide For jointly continuous random variables, the definition is a generalized version of the one for continuous random variables (univariate case). Template:Colored definition Template:Colored remark Template:Colored definition Template:Colored proposition

Proof. Recall the proposition about obtaining the marginal cdf from the joint cdf. We have $\begin{aligned} F_{X_{i}} (x) & = F (\infty, \dots, \infty, \overbrace{x}^{i\text{-th position}}, \infty, \dots, \infty) \\ \Rightarrow \int_{- \infty}^{x} f_{X_{i}} (u) \, d u & = \int_{- \infty}^{\infty} \dots \int_{- \infty}^{x} \dots \int_{- \infty}^{\infty} f (u_{1}, \dots, u_{n}) \, d u_{n} \dots d u_{i} \dots d u_{1} && \text{by definitions} \\ \Rightarrow \frac{d}{d x} \int_{- \infty}^{x} f_{X_{i}} (u) \, d u & = \frac{d}{d x} \int_{- \infty}^{\infty} \dots \int_{- \infty}^{x} \dots \int_{- \infty}^{\infty} f (u_{1}, \dots, u_{n}) \, d u_{n} \dots d u_{i} \dots d u_{1} \\ \Rightarrow f_{X_{i}} (x) & = \underbrace{\int_{- \infty}^{\infty} \dots \int_{- \infty}^{\infty}}_{n - 1 \text{ integrations}} f (u_{1}, \dots, u_{i - 1}, x, u_{i + 1}, \dots, u_{n}) \, d u_{1} \dots d u_{i - 1} \, d u_{i + 1} \dots d u_{n} && \text{by the fundamental theorem of calculus.} \end{aligned}$

$◻$
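To see the result of this proof numerically, we can integrate out one variable of a joint pdf. The following is a minimal sketch under stated assumptions: the joint pdf $f (x, y) = x + y$ on the unit square is a hypothetical example (it integrates to 1), and the integral is approximated with a simple midpoint rule rather than a library routine. The exact marginal pdf is $f_{X} (x) = x + 1 / 2$ for $0 \leq x \leq 1$.

```python
def joint_pdf(x, y):
    """Illustrative joint pdf f(x, y) = x + y on the unit square, 0 elsewhere."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def marginal_pdf_x(x, steps=1000):
    """Approximate f_X(x) = integral of f(x, y) dy over [0, 1] by the midpoint rule."""
    h = 1.0 / steps
    return sum(joint_pdf(x, (k + 0.5) * h) for k in range(steps)) * h

for x in (0.0, 0.25, 1.0):
    # Numerical marginal next to the exact value x + 1/2
    print(x, marginal_pdf_x(x), x + 0.5)
```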

Template:Colored proposition

Proof. It follows from applying the fundamental theorem of calculus $n$ times.

$◻$

Template:Colored example Template:Colored exercise

Independence

Recall that, by definition, multiple events are independent if the probability of their intersection equals the product of the probabilities of the individual events. Since $\{X \in A\}$ is also an event, we have a natural definition of independence for random variables as follows: Template:Colored definition Template:Colored remark Template:Colored theorem

Proof. Partial:

Only if part: If random variables $X_{1}, X_{2}, \dots, X_{n}$ are independent, then $ℙ (X_{1} \in A_{1} \cap \dots \cap X_{n} \in A_{n}) = ℙ (X_{1} \in A_{1}) \dots ℙ (X_{n} \in A_{n})$ for all subsets $A_{1}, A_{2}, \dots, A_{n} \subseteq ℝ$. Setting $A_{1} = (- \infty, x_{1}], \dots, A_{n} = (- \infty, x_{n}]$, we have $ℙ (X_{1} \leq x_{1} \cap \dots \cap X_{n} \leq x_{n}) = ℙ (X_{1} \leq x_{1}) \dots ℙ (X_{n} \leq x_{n}) \implies F (x_{1}, \dots, x_{n}) = F_{X_{1}} (x_{1}) \dots F_{X_{n}} (x_{n}) .$ Thus, we obtain the result for the joint cdf part.

For the joint pdf part, $\begin{aligned} F (x_{1}, \dots, x_{n}) & = F_{X_{1}} (x_{1}) \dots F_{X_{n}} (x_{n}) \\ \Rightarrow \frac{\partial^{n}}{\partial x_{1} \dots \partial x_{n}} F (x_{1}, \dots, x_{n}) & = \frac{\partial^{n}}{\partial x_{1} \dots \partial x_{n}} (F_{X_{1}} (x_{1}) \dots F_{X_{n}} (x_{n})) \\ \Rightarrow f (x_{1}, \dots, x_{n}) & = f_{X_{n}} (x_{n}) \frac{\partial^{n - 1}}{\partial x_{1} \dots \partial x_{n - 1}} (F_{X_{1}} (x_{1}) \dots F_{X_{n - 1}} (x_{n - 1})) \\ & = f_{X_{n}} (x_{n}) f_{X_{n - 1}} (x_{n - 1}) \frac{\partial^{n - 2}}{\partial x_{1} \dots \partial x_{n - 2}} (F_{X_{1}} (x_{1}) \dots F_{X_{n - 2}} (x_{n - 2})) \\ & = \dots = f_{X_{1}} (x_{1}) \dots f_{X_{n}} (x_{n}) . \end{aligned}$

$◻$
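For discrete random variables, the analogous factorization check can be carried out mechanically. The sketch below assumes the pmf version of the criterion, namely that $X$ and $Y$ are independent exactly when $f (x, y) = f_{X} (x) f_{Y} (y)$ for every pair $(x, y)$; the helper name `is_independent` is hypothetical. It is applied to the two coin examples from the motivation section.

```python
from fractions import Fraction

def is_independent(joint):
    """Return True if f(x, y) = f_X(x) * f_Y(y) for every pair (x, y)."""
    f_x, f_y = {}, {}
    for (x, y), p in joint.items():
        f_x[x] = f_x.get(x, 0) + p
        f_y[y] = f_y.get(y, 0) + p
    # Pairs missing from the table have joint probability 0.
    return all(joint.get((x, y), 0) == f_x[x] * f_y[y] for x in f_x for y in f_y)

quarter, half = Fraction(1, 4), Fraction(1, 2)
two_tosses = {(0, 0): quarter, (0, 2): quarter, (1, 0): quarter, (1, 2): quarter}
one_toss = {(0, 0): half, (1, 2): half}  # Y = 2X, read off the same toss

print(is_independent(two_tosses))  # True
print(is_independent(one_toss))    # False
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.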

Template:Colored remark Template:Colored example Template:Colored exercise Template:Colored proposition Template:Colored example Template:Colored exercise

Sum of independent random variables (optional)

In general, we use the joint cdf, pdf or pmf to determine the distribution of a sum of independent random variables from first principles. In particular, there are some interesting results related to the distribution of sums of independent random variables. Template:Collapse top Template:Colored proposition

Proof.

Continuous case:

cdf: $\begin{aligned} F_{X + Y} (z) & = ℙ (X + Y \leq z) && \text{by definition} \\ & = \iint_{x + y \leq z} f_{X} (x) f_{Y} (y) \, d x \, d y && \text{by definition and independence} \\ & = \int_{- \infty}^{\infty} \int_{- \infty}^{z - y} f_{X} (x) f_{Y} (y) \, d x \, d y && \text{by Fubini's theorem} \\ & = \int_{- \infty}^{\infty} \left(\int_{- \infty}^{z - y} f_{X} (x) \, d x\right) f_{Y} (y) \, d y \\ & = \int_{- \infty}^{\infty} F_{X} (z - y) f_{Y} (y) \, d y && \text{by definition.} \end{aligned}$

/\                                     
//\ y                                
///\|
////*
////|\
////|/\
////|//\ x+y=z <=> x=z-y
////|///\
////|////\
----*-----*--------------- x 
////|//////\
////|///////\

-->: -infty to z-y
^
|: -infty to infty
 
*--*
|//| : x+y <= z
*--*

pdf: $\begin{aligned} f_{X + Y} (z) & = \frac{d}{d z} \int_{- \infty}^{\infty} F_{X} (z - y) f_{Y} (y) \, d y \\ & = \int_{- \infty}^{\infty} \frac{d}{d z} F_{X} (z - y) f_{Y} (y) \, d y && \text{interchanging differentiation and integration} \\ & = \int_{- \infty}^{\infty} f_{X} (z - y) f_{Y} (y) \, d y && \text{by the fundamental theorem of calculus.} \end{aligned}$

$◻$
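The convolution integral $\int_{- \infty}^{\infty} f_{X} (z - y) f_{Y} (y) \, d y$ can be approximated numerically. The following is a minimal sketch, assuming two independent $U (0, 1)$ random variables as a hypothetical example and a plain midpoint rule over a finite range that covers the supports. The exact pdf of the sum is triangular: $z$ for $0 \leq z \leq 1$ and $2 - z$ for $1 \leq z \leq 2$.

```python
def uniform_pdf(x):
    """pdf of the uniform distribution on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def convolve_pdf(f_x, f_y, z, lo=-1.0, hi=2.0, steps=6000):
    """Approximate the integral of f_X(z - y) f_Y(y) dy over [lo, hi] (midpoint rule)."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        y = lo + (k + 0.5) * h
        total += f_x(z - y) * f_y(y)
    return total * h

for z in (0.5, 1.0, 1.5):
    # Approximate pdf of X + Y at z; the exact values are 0.5, 1.0 and 0.5
    print(z, convolve_pdf(uniform_pdf, uniform_pdf, z))
```

The integration range `[lo, hi]` only needs to contain the support of $f_{Y}$; it is an assumption of this sketch, not part of the proposition.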

Template:Colored remark Template:Colored example Template:Colored proposition

Proof.

For each nonnegative integer $n$, let $E_{i} = \{X = i\} \cap \{Y = n - i\}$ for $i = 0, 1, \dots, n$. Then,

$\{X + Y = n\} = E_{0} \cup E_{1} \cup \dots \cup E_{n} .$

Since $E_{i} \cap E_{j} = \emptyset$ for each $i \neq j$, the $E_{i}$'s are pairwise disjoint.
Hence, by extended P3 and independence of $X$ and $Y$, $ℙ (X + Y = n) = ℙ (X = 0) ℙ (Y = n) + ℙ (X = 1) ℙ (Y = n - 1) + \dots + ℙ (X = n) ℙ (Y = 0) .$
The result follows by definition.

$◻$
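The sum in this proof is a discrete convolution, which is short to compute. Below is a minimal sketch, assuming each pmf is stored as a dictionary from values to probabilities; the function name `convolve_pmf` and the two-dice example are hypothetical choices for illustration.

```python
from fractions import Fraction

def convolve_pmf(f_x, f_y):
    """pmf of X + Y for independent X and Y given as {value: probability} dicts."""
    f_sum = {}
    for x, p in f_x.items():
        for y, q in f_y.items():
            # P(X = x and Y = y) = P(X = x) P(Y = y) by independence
            f_sum[x + y] = f_sum.get(x + y, 0) + p * q
    return f_sum

die = {face: Fraction(1, 6) for face in range(1, 7)}
two_dice = convolve_pmf(die, die)
print(two_dice[7])  # 1/6
```

Grouping the products by the value of $x + y$ is the same as summing $ℙ (X = i) ℙ (Y = n - i)$ over $i$ for each fixed $n$.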

Template:Colored example Template:Colored proposition

Proof.

The pmf of $X_{1} + X_{2}$ is

$\begin{aligned} f_{X_{1} + X_{2}} (n) & = \sum_{k = 0}^{n} \frac{e^{- λ_{1}} λ_{1}^{k}}{k!} \cdot \frac{e^{- λ_{2}} λ_{2}^{n - k}}{(n - k)!} && \text{by the proposition about convolution of pmf's} \\ & = e^{- λ_{1} - λ_{2}} \sum_{k = 0}^{n} \frac{λ_{1}^{k} \cdot λ_{2}^{n - k}}{k! (n - k)!} \\ & = \frac{e^{- (λ_{1} + λ_{2})}}{n!} \underbrace{\sum_{k = 0}^{n} \frac{n!}{k! (n - k)!} \cdot λ_{1}^{k} \cdot λ_{2}^{n - k}}_{= (λ_{1} + λ_{2})^{n}} && \text{by the binomial theorem.} \end{aligned}$

This is the pmf of $\operatorname{Pois} (λ_{1} + λ_{2})$, and so $X_{1} + X_{2} \sim \operatorname{Pois} (λ_{1} + λ_{2})$.
We can extend this result to any finite number of independent Poisson r.v.'s by induction.

$◻$
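The identity in this proof can be checked numerically for particular rates. The following is a minimal sketch; the rates 1.5 and 2.0 are arbitrary illustrative values and the function names are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(n, lam):
    """pmf of the Poisson distribution with rate lam, evaluated at n."""
    return exp(-lam) * lam ** n / factorial(n)

def pmf_of_sum(n, lam1, lam2):
    """Convolution sum over k = 0..n of P(X1 = k) * P(X2 = n - k)."""
    return sum(poisson_pmf(k, lam1) * poisson_pmf(n - k, lam2) for k in range(n + 1))

for n in range(5):
    # The two columns should agree up to floating-point rounding.
    print(n, pmf_of_sum(n, 1.5, 2.0), poisson_pmf(n, 3.5))
```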

Template:Colored example Template:Collapse bottom

Order statistics

Template:Colored definition Template:Colored proposition

Proof.

Consider the event $\{X_{(k)} \leq x\}$.

                          Possible positions of x
                      |<--------------------->
    *---*----...------*----*------...--------*
X  (1)  (2)          (k)  (k+1)             (n)
                      |----------------------> when x moves RHS like this, >=k X_i are at the LHS of x

We can see from the above figure that $\{X_{(k)} \leq x\} = \{\text{at least } k \text{ of the } X_{i}\text{'s are} \leq x\}$.
Let $N$ be the number of $X_{i}$'s that are less than or equal to $x$.
Then $N \sim \operatorname{Binom} (n, ℙ (X_{i} \leq x)) \overset{\text{def}}{=} \operatorname{Binom} (n, F (x))$ (because for each $X_{i}$, we can treat $X_{i} \leq x$ and $X_{i} > x$ as the two outcomes of a Bernoulli trial).
Hence, the cdf is

$ℙ (X_{(k)} \leq x) = ℙ (N \geq k) = \sum_{j = k}^{n} \binom{n}{j} (F (x))^{j} (1 - F (x))^{n - j} .$

$◻$
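The cdf of $X_{(k)}$ is the binomial tail probability $ℙ (N \geq k)$ with $N \sim \operatorname{Binom} (n, F (x))$, which is easy to evaluate. Below is a minimal sketch; the function name and the values $n = 5$, $F (x) = 0.3$ are hypothetical. As a sanity check, it compares the formula with two well-known special cases: the maximum ($k = n$), whose cdf is $(F (x))^{n}$, and the minimum ($k = 1$), whose cdf is $1 - (1 - F (x))^{n}$.

```python
from math import comb

def order_statistic_cdf(k, n, F_x):
    """P(X_(k) <= x) for n i.i.d. variables, where F_x = F(x) is the common cdf at x."""
    # P(N >= k) with N ~ Binom(n, F_x)
    return sum(comb(n, j) * F_x ** j * (1 - F_x) ** (n - j) for j in range(k, n + 1))

n, F_x = 5, 0.3
print(order_statistic_cdf(n, n, F_x), F_x ** n)            # maximum
print(order_statistic_cdf(1, n, F_x), 1 - (1 - F_x) ** n)  # minimum
```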

Template:Colored example

Poisson process

Template:Colored definition There are several important properties of the Poisson process. Template:Colored proposition

Proof.

The time to the $n$-th event is $X_{1} + \dots + X_{n}$, where the interarrival times $X_{1}, \dots, X_{n}$ are independent and each follows $\operatorname{Exp} (λ)$.
It suffices to prove that $X_{1} + X_{2} \sim \operatorname{Gamma} (2, λ)$, and then the desired result follows by induction.
For $z \geq 0$, $\begin{aligned} f_{X_{1} + X_{2}} (z) & = λ^{2} \int_{- \infty}^{\infty} 𝟏 \{\underbrace{z - x \geq 0}_{x \leq z}\} 𝟏 \{x \geq 0\} e^{- λ (z - x)} e^{- λ x} \, d x && \text{by the proposition about convolution of pdf's} \\ & = λ^{2} \int_{0}^{z} e^{- λ (z - x) - λ x} \, d x \\ & = λ^{2} \int_{0}^{z} e^{- λ z} \, d x \\ & = λ^{2} z e^{- λ z} \\ & = \frac{λ^{2} z e^{- λ z}}{Γ (2)} && \text{since } Γ (2) = 1! = 1, \end{aligned}$

which is the pdf of $\operatorname{Gamma} (2, λ)$, as desired.

$◻$

Template:Colored remark Template:Colored proposition

Proof. Let $t$ be the length of the fixed time interval, and treat its start as time zero (which we can do because of the memoryless property). For each positive integer $n$, let $W$ be the time to the $n$-th event and $V$ be the interarrival time between the $n$-th and $(n + 1)$-th arrivals. The joint pdf of $(V, W)$ is $\begin{aligned} f (v, w) & = f_{V} (v) f_{W} (w) && \text{by independence} \\ & = \underbrace{(λ e^{- λ v})}_{\text{pdf of } \operatorname{Exp} (λ)} \underbrace{\left(\frac{λ^{n} w^{n - 1} e^{- λ w}}{(n - 1)!}\right)}_{\text{pdf of } \operatorname{Gamma} (n, λ)} . \end{aligned}$ Let $N$ be the number of arrivals within the fixed time interval. The pmf of $N$ is $\begin{aligned} ℙ (N = n) & = ℙ (W \leq t \cap \underbrace{V + W > t}_{V > t - W}) \\ & = \int_{0}^{t} \int_{t - w}^{\infty} \underbrace{f (v, w)}_{\text{joint pdf of } (V, W)} \, d v \, d w \\ & = \int_{0}^{t} \int_{t - w}^{\infty} (λ e^{- λ v}) \left(\frac{λ^{n} w^{n - 1} e^{- λ w}}{(n - 1)!}\right) d v \, d w \\ & = \int_{0}^{t} \frac{λ^{n} w^{n - 1} e^{- λ w}}{(n - 1)!} \int_{t - w}^{\infty} λ e^{- λ v} \, d v \, d w \\ & = \frac{λ^{n}}{(n - 1)!} \int_{0}^{t} w^{n - 1} e^{- λ w} (0 - (- e^{- λ (t - w)})) \, d w \\ & = \frac{λ^{n} e^{- λ t}}{(n - 1)!} \int_{0}^{t} w^{n - 1} \, d w \\ & = \frac{λ^{n} e^{- λ t}}{(n - 1)!} \cdot \left(\frac{t^{n}}{n} - 0\right) \\ & = \frac{e^{- λ t} (λ t)^{n}}{n!} \end{aligned}$ which is the pmf of $\operatorname{Pois} (λ t)$ at $n$. For $n = 0$, $ℙ (N = 0)$ is the probability that the first interarrival time exceeds $t$, which is $e^{- λ t}$ and again matches the pmf of $\operatorname{Pois} (λ t)$. The result follows.

$◻$
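This result can also be illustrated by simulation. The following is a rough sketch, assuming independent $\operatorname{Exp} (λ)$ interarrival times generated with the standard library's `random.expovariate`; the rate, interval length, seed and number of runs are arbitrary illustrative choices. Since a $\operatorname{Pois} (λ t)$ random variable has mean and variance both equal to $λ t$, the empirical mean and variance of the counts should both be close to $λ t$.

```python
import random

def count_arrivals(rate, t, rng):
    """Count arrivals in [0, t] when interarrival times are i.i.d. Exp(rate)."""
    count, clock = 0, rng.expovariate(rate)
    while clock <= t:
        count += 1
        clock += rng.expovariate(rate)
    return count

rng = random.Random(0)  # fixed seed so that the run is reproducible
rate, t, runs = 2.0, 3.0, 20000
counts = [count_arrivals(rate, t, rng) for _ in range(runs)]
mean = sum(counts) / runs
variance = sum((c - mean) ** 2 for c in counts) / runs
print(mean, variance, rate * t)  # mean and variance should both be near rate * t
```

A simulation only gives approximate agreement, so this is a plausibility check rather than a substitute for the proof.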

Template:Colored proposition

Proof. For each $t > 0$, $\begin{aligned} ℙ (T > t) & = ℙ (T_{1} > t \cap \dots \cap T_{n} > t) \\ & = ℙ (T_{1} > t) \dots ℙ (T_{n} > t) && \text{by independence} \\ & = [1 - \underbrace{(1 - e^{- λ_{1} t})}_{\text{cdf of } \operatorname{Exp} (λ_{1})}] \dots [1 - \underbrace{(1 - e^{- λ_{n} t})}_{\text{cdf of } \operatorname{Exp} (λ_{n})}] \\ & = e^{- t (λ_{1} + \dots + λ_{n})} \\ \Rightarrow ℙ (T \leq t) & = 1 - e^{- t (λ_{1} + \dots + λ_{n})} \\ \Rightarrow T & \sim \operatorname{Exp} (λ_{1} + λ_{2} + \dots + λ_{n}) . \end{aligned}$

$◻$
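A quick simulation can make this result concrete. The sketch below assumes, as the first line of the proof suggests, that $T$ is the minimum of independent $T_{i} \sim \operatorname{Exp} (λ_{i})$; the rates, seed and number of runs are hypothetical values chosen for illustration. It compares the empirical mean of $T$ with $1 / (λ_{1} + \dots + λ_{n})$ and the empirical value of $ℙ (T > t)$ with $e^{- t (λ_{1} + \dots + λ_{n})}$.

```python
import random
from math import exp

def sample_minimum(rates, rng):
    """Draw T = min(T_1, ..., T_n) with independent T_i ~ Exp(rates[i])."""
    return min(rng.expovariate(rate) for rate in rates)

rng = random.Random(1)  # fixed seed so that the run is reproducible
rates = [1.0, 2.0, 3.0]
runs = 20000
samples = [sample_minimum(rates, rng) for _ in range(runs)]

empirical_mean = sum(samples) / runs
empirical_survival = sum(s > 0.1 for s in samples) / runs
print(empirical_mean, 1 / sum(rates))               # mean of Exp(sum of rates)
print(empirical_survival, exp(-0.1 * sum(rates)))   # P(T > 0.1)
```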

Template:Colored example Template:Colored exercise
