Probability/Random Variables

Random variable

Motivation

In many experiments, there may be so many possible outcomes in the sample space that we may want to work with a "summary variable" for those outcomes instead. For example, suppose a poll asks 100 different people whether they agree with a certain proposal. To keep a complete record of the answers from those 100 people, we may first use a number to indicate each response:

number "1" for "agree".
number "0" for "disagree".

(For simplicity, we assume that only these two responses are available.) Then, to record which person gave which response, we use a vector of 100 numbers, for example $(1, 0, 1, 0, 0, \dots, 1, 0, 0)$. Since each coordinate of the vector has two choices, "0" or "1", there are in total $2^{100} \approx 1.268 \times 10^{30}$ different vectors in the sample space (denoted by $Ω$)! Working with that many outcomes in the sample space $Ω$ is tedious and complicated. Moreover, we are often interested only in how many "agree" and "disagree" responses there are, not in which person gave which response, since these counts determine whether the proposal is supported by a majority, and thus capture the essence of the poll.

Hence, it is more convenient to define a variable $X$ which gives, for every outcome in the sample space $Ω$, the number of "1"s among its 100 coordinates. Then, $X$ can only take 101 possible values: 0, 1, 2, ..., 100, which is far fewer than the number of outcomes in the original sample space.

Through this, we can change the original experiment to a new experiment, where the variable $X$ takes one of the 101 possible values according to certain probabilities. For this new experiment, the sample space becomes $\{0, 1, \dots, 100\}$.

During the above process of defining the variable $X$ (called a random variable), we have actually (implicitly) defined a function whose domain is the original sample space and whose range is $\{0, 1, \dots, 100\}$. Usually, we take the codomain of a random variable to be the set of all real numbers $ℝ$. That is, we define the random variable $X : Ω \to ℝ$ by $X(ω) = \text{number of 1s in the coordinates of } ω$ for every $ω \in Ω$.
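The reduction described above can be checked on a smaller poll. The following is a minimal sketch, assuming a hypothetical poll of only 10 people so that the sample space can be enumerated; the names `n`, `omega_space`, `X` and `range_of_X` are illustrative and do not come from the text.

```python
from itertools import product

n = 10  # assumed poll size (the text uses 100, which is too large to enumerate)

# Sample space: every vector of n responses, 1 = "agree", 0 = "disagree".
omega_space = list(product((0, 1), repeat=n))


def X(omega):
    """Random variable: number of 1s in the coordinates of the outcome."""
    return sum(omega)


# Range of X: the values it actually takes on the sample space.
range_of_X = sorted({X(omega) for omega in omega_space})

print(len(omega_space))  # number of outcomes, 2**n
print(range_of_X)        # the n + 1 values 0, 1, ..., n
```

Even for this small poll, the function $X$ collapses $2^{10}$ outcomes into 11 values, which mirrors the reduction from $2^{100}$ outcomes to 101 values in the text.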

Definition

To define a random variable formally, we need the concept of a measurable function.

By defining a random variable $X : Ω \to ℝ$ from a probability space $(Ω, ℱ, ℙ)$, we actually induce a new probability space $(𝒳, ℱ_{X}, ℙ_{X})$, where

The induced sample space $𝒳$ is the range of the random variable $X$: $𝒳 = \{X(ω) : ω \in Ω\} \subseteq ℝ$.
The induced event space $ℱ_{X}$ is a $σ$-algebra on $𝒳$. (Here, we follow our previous convention: $ℱ_{X} = 𝒫(𝒳)$ when $𝒳$ is countable.)
The induced probability measure $ℙ_{X} : ℱ_{X} \to [0, 1]$ is defined by $ℙ_{X}(E) = ℙ(\{X \in E\})$ for every $E \in ℱ_{X}$.
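As an illustration of the induced probability measure, the sketch below assumes a hypothetical experiment that is not taken from the text: a fair coin is tossed three times, every outcome is equally likely, and $X$ is the number of heads. The helper names `P` and `P_X` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumed experiment: three tosses of a fair coin (1 = head, 0 = tail),
# with all 8 outcomes equally likely.
omega_space = list(product((0, 1), repeat=3))


def P(event):
    """Probability of an event (a collection of outcomes) under equal likelihood."""
    return Fraction(len(set(event)), len(omega_space))


def X(omega):
    """Random variable: number of heads in the outcome."""
    return sum(omega)


def P_X(E):
    """Induced measure: P_X(E) = P({omega : X(omega) in E})."""
    return P([omega for omega in omega_space if X(omega) in E])


induced_space = {X(omega) for omega in omega_space}  # the range {0, 1, 2, 3}

print(P_X({2}))            # probability of exactly two heads
print(P_X({2, 3}))         # probability of at least two heads
print(P_X(induced_space))  # total probability of the induced sample space
```

Each value of `P_X` is computed by going back to the original sample space, which is exactly what the defining formula $ℙ_{X}(E) = ℙ(\{X \in E\})$ prescribes.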

It turns out that the induced probability measure satisfies all the probability axioms. Once this result is proved, it follows that all properties of probability measures discussed previously also apply to the induced probability measure $ℙ_{X}$. Hence, we can use the properties of probability measures to calculate the probability $ℙ_{X}(E)$, and hence $ℙ(X \in E)$, for every set $E \in ℱ_{X}$.

More generally, to calculate the probability $ℙ(X \in B)$ for every $B \in ℬ$ ($B$ does not necessarily belong to $ℱ_{X}$), we notice that $\{X \in B\} = \{X \in B \cap 𝒳\}$, and it turns out that $B \cap 𝒳 \in ℱ_{X}$. Hence, we can calculate $ℙ(X \in B)$ by considering $ℙ_{X}(B \cap 𝒳)$.

Sometimes, even if it is infeasible to list all sample points in the sample space, we can still determine probabilities related to the random variable.

A special kind of random variable that is quite useful is the indicator random variable.

Cumulative distribution function

For every random variable $X$, there is a function associated with it, called the cumulative distribution function (cdf) of $X$, given by $F(x) = ℙ(X \leq x)$.

A cdf is not necessarily continuous: it may have discontinuities at jump points. However, at each jump point the cdf takes the value at the top of the jump, by the definition of the cdf (the inequality involved also includes equality). Loosely speaking, this suggests that the cdf is right-continuous. However, the cdf is not left-continuous in general.
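The behaviour of a cdf at a jump point can be checked numerically. The sketch below assumes a hypothetical random variable that is not taken from the text, namely the number of heads in two tosses of a fair coin, and evaluates $F(x) = ℙ(X \leq x)$ just to the left of, at, and just to the right of the jump point $x = 1$. The names `pmf`, `F` and `h` are illustrative.

```python
from fractions import Fraction

# Assumed distribution: number of heads in two tosses of a fair coin.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def F(x):
    """cdf: F(x) = P(X <= x), summing the probabilities of values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


h = Fraction(1, 1000)  # a small positive step

# At the jump point x = 1 the cdf already takes the upper value,
# and values slightly to the right agree with it.
print(F(1), F(1 + h))
# Slightly to the left the cdf still has the lower value,
# so the cdf is not continuous from the left at x = 1.
print(F(1 - h))
```

The size of the jump, `F(1) - F(1 - h)`, equals the probability that the random variable takes the value 1.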

In the following, we will discuss three properties of the cdf.

Proof. "Only if" part ($F$ is a cdf $\Rightarrow$ these three properties):

(i) This follows from the axioms of probability, since $F(x)$ is defined to be a probability.

(ii) $\begin{array}{rll} x \leq y & \Rightarrow \{X \leq x\} \subseteq \{X \leq y\} & \\ & \Rightarrow ℙ(X \leq x) \leq ℙ(X \leq y) & \text{by monotonicity} \\ & \Rightarrow F(x) \leq F(y) & \text{by definition} \end{array}$

(iii) Fix an arbitrary positive sequence $ϵ_{1} > ϵ_{2} > \dots$ with $\lim_{n \to \infty} ϵ_{n} = 0$. Define $E_{n} = \{X \leq x + ϵ_{n}\}$ for each positive integer $n$. It follows that $E_{1} \supseteq E_{2} \supseteq \dots$ and that $\lim_{n \to \infty} E_{n} = \bigcap_{n = 1}^{\infty} E_{n} = \{X \leq x\}$. Then, by the continuity of probability for decreasing sequences of events, $ℙ(X \leq x) = ℙ\left(\lim_{n \to \infty} E_{n}\right) = ℙ\left(\lim_{n \to \infty} (E_{1} \cap E_{2} \cap \dots \cap E_{n})\right) = \lim_{n \to \infty} ℙ(E_{1} \cap \dots \cap E_{n}) = \lim_{n \to \infty} ℙ(E_{n}) = \lim_{n \to \infty} ℙ(X \leq x + ϵ_{n}).$ It follows that $F(x) = \lim_{n \to \infty} F(x + ϵ_{n})$ for every such sequence. That is, $\lim_{h \to 0^{+}} F(x + h) = F(x)$, which is the definition of right-continuity.

The "if" part is more complicated, and the following outline is optional:

Draw an arbitrary curve satisfying the three properties.
Throw a fair coin infinitely many times.
Encode each result into a binary number, e.g. $H H T \dots \to 0.110 \dots$
Transform each binary number to a decimal number, e.g. $0.110 \dots \to 1 (2^{- 1}) + 1 (2^{- 2}) = 0.75 \dots$ . Then, the decimal number is a random variable $U \in [0, 1]$ .
Use this decimal number as the input of the inverse function of the arbitrarily drawn curve, and we get a value, which is also a random variable, say $X$ .
Then, the random variable $X$ has cdf $ℙ(X \leq x) = ℙ(U \leq F(x)) = F(x)$, where $F$ is the drawn curve, provided that the fair coin is thrown infinitely many times (so that $U$ is uniformly distributed on $[0, 1]$).

$◻$
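The outline of the "if" part can be imitated numerically. The following is a rough sketch under stated assumptions, not the construction itself: the infinitely many coin flips are truncated to 32, and the "arbitrary curve" is taken to be the cdf of an exponential distribution with rate 1, chosen only because its inverse has a closed form. The function names are illustrative.

```python
import math
import random


def uniform_from_coin_flips(rng, flips=32):
    """Approximate U in [0, 1) from finitely many fair coin flips.

    Each flip gives one binary digit (H -> 1, T -> 0); the text uses
    infinitely many flips, which is truncated here to `flips` digits.
    """
    return sum(rng.getrandbits(1) * 2.0 ** -(k + 1) for k in range(flips))


def F(x):
    """Assumed target curve: cdf of an exponential distribution with rate 1."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0


def F_inverse(u):
    """Inverse of F on [0, 1)."""
    return -math.log(1.0 - u)


rng = random.Random(0)  # fixed seed so that the run is reproducible
samples = [F_inverse(uniform_from_coin_flips(rng)) for _ in range(20000)]

# The empirical proportion of samples <= x should be close to F(x).
for x in (0.5, 1.0, 2.0):
    empirical = sum(s <= x for s in samples) / len(samples)
    print(x, round(empirical, 3), round(F(x), 3))
```

The agreement between the empirical proportions and $F(x)$ reflects the identity $ℙ(X \leq x) = ℙ(U \leq F(x))$ from the last step of the outline.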

Sometimes, we are only interested in the values $x$ such that $ℙ(X = x) \neq 0$, which are more 'important'. Roughly speaking, these values are the elements of the support of $X$.

Discrete random variables

Often, for a discrete random variable, we are interested in the probability that the random variable takes a specific value. So, we have a function that gives the corresponding probability for each specific value taken, namely the probability mass function (pmf).

Continuous random variables

Suppose $X$ is a discrete random variable. Partitioning $S$ into small disjoint intervals $[x_{1}, x_{1} + Δx_{1}], \dots$ gives $ℙ(X \in S) = ℙ\left(X \in \bigcup_{i} [x_{i}, x_{i} + Δx_{i}]\right) = \sum_{i} ℙ(X \in [x_{i}, x_{i} + Δx_{i}]) = \sum_{i} \underbrace{\frac{ℙ(X \in [x_{i}, x_{i} + Δx_{i}])}{Δx_{i}}}_{\text{probability per unit}} \cdot Δx_{i}.$ In particular, the probability per unit can be interpreted as the density of the probability of $X$ over the interval. (The higher the density, the more probability is distributed (or allocated) to that interval.)

Taking the limit, $\lim_{Δx_{i} \to 0} \sum_{i} \underbrace{\frac{ℙ(X \in [x_{i}, x_{i} + Δx_{i}])}{Δx_{i}}}_{\text{density}} \cdot Δx_{i} = \int_{S} \underbrace{f(x)}_{\text{density}} \, dx,$ in which, intuitively and non-rigorously, $f(x) \, dx$ can be interpreted as the probability over the 'infinitesimal' interval $[x, x + dx]$, i.e. $ℙ(X \in [x, x + dx])$, and $f(x)$ can be interpreted as the density of the probability over the 'infinitesimal' interval, i.e. $\frac{ℙ(X \in [x, x + dx])}{dx}$.
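The interpretation of the density as "probability per unit" can be checked numerically. The sketch below assumes, purely for illustration, an exponential distribution with rate 1, whose cdf `F` and density `f` are known in closed form. It compares the probability per unit length over $[x, x + Δx]$ with $f(x)$ as $Δx$ shrinks, and then compares a Riemann sum of the density over $S = [0, 2]$ with $ℙ(X \in S)$.

```python
import math


def F(x):
    """Assumed example: cdf of an exponential distribution with rate 1."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0


def f(x):
    """Corresponding density."""
    return math.exp(-x) if x >= 0 else 0.0


x = 1.0
for dx in (0.5, 0.1, 0.01, 0.001):
    prob = F(x + dx) - F(x)     # P(X in [x, x + dx])
    print(dx, prob / dx, f(x))  # probability per unit length vs. density

# Riemann sum of the density over S = [0, 2], compared with P(X in S).
n = 10000
width = 2.0 / n
riemann = sum(f(i * width) * width for i in range(n))
print(riemann, F(2.0) - F(0.0))
```

As the interval length decreases, the probability per unit length approaches the density at the left endpoint, and the sum of density times width approaches the probability of the whole set.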

These motivate the following definition. The name continuous r.v. comes from the result that the cdf of this kind of r.v. is continuous.

Proof. Since $\lim_{h \to 0} F(x + h) = \lim_{h \to 0} \int_{-\infty}^{x + h} f(u) \, du = \int_{-\infty}^{x} f(u) \, du = F(x)$ (an integral is continuous as a function of its upper limit), the cdf is continuous.

$◻$


Proof. This follows from the fundamental theorem of calculus: $F'(x) = \frac{d}{dx} \int_{-\infty}^{x} f(u) \, du = f(x).$

$◻$

Without further assumptions, a pdf is not unique, i.e. a random variable may have multiple pdfs. For example, we may set the value of the pdf at a single point outside its support to be any nonnegative real number: this does not affect any probability, since the integral of the pdf over a single point is zero regardless of the value there, and so it gives another valid pdf for the random variable. To tackle this, we conventionally set $f(x) = 0$ for each $x \notin \operatorname{supp}(X)$ to make the pdf unique and the calculations more convenient.

Mixed random variables

After reading the previous two sections, you may think that a random variable must be either discrete or continuous. Actually, this is wrong: a random variable can be neither discrete nor continuous. An example of such a random variable is a mixed random variable, which is discussed in this section.

An example of a singular random variable is one whose cdf is the Cantor distribution function (sometimes known as the Devil's Staircase). The pattern in its graph keeps repeating when you enlarge the graph.
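To make the idea of a random variable that is neither discrete nor continuous concrete, the sketch below uses a hypothetical example that is not taken from the text: $X = \min(T, 1)$, where $T$ is assumed to be exponentially distributed with rate 1. Its cdf increases continuously on $[0, 1)$ and then jumps at $x = 1$. The names `F`, `jump_at_1` and `jump_at_half` are illustrative.

```python
import math


def F(x):
    """cdf of X = min(T, 1), with T exponential of rate 1 (assumed example)."""
    if x < 0:
        return 0.0
    if x < 1:
        return 1.0 - math.exp(-x)  # continuous part, same as the cdf of T
    return 1.0                     # all remaining probability sits at x = 1


h = 1e-9
jump_at_1 = F(1.0) - F(1.0 - h)     # approximately P(X = 1) = P(T >= 1) = exp(-1)
jump_at_half = F(0.5) - F(0.5 - h)  # approximately 0: no point mass at 0.5

print(jump_at_1, math.exp(-1.0))
print(jump_at_half)
```

In this example, the cdf has a jump at 1, so $X$ is not continuous, while the point mass at 1 is only $e^{-1} < 1$ and the rest of the probability is spread over $[0, 1)$ without any point mass, so $X$ is not discrete either.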
