Probability/Conditional Distributions


Motivation

Suppose there is an earthquake. Let X be the number of casualties and Y be the magnitude of the earthquake on the Richter scale.

(a) Given no further information, what is the distribution of X?

(b) Given that Y=1, what is the distribution of X?

(c) Given that Y=9, what is the distribution of X?

Template:Colored remark Are your answers to (a),(b),(c) different?

In (b) and (c), we have the conditional distribution of X given Y=1, and the conditional distribution of X given Y=9, respectively.

In general, we have the conditional distribution of X given Y (before observing the value of Y), or of X given Y=y (after observing the value of Y).

Conditional distributions

Recall the definition of conditional probability: $\mathbb{P}(A|B)=\frac{\mathbb{P}(A\cap B)}{\mathbb{P}(B)}$, in which $A,B$ are events, with $\mathbb{P}(B)>0$. Applying this definition to discrete random variables $X,Y$, we have

$$\mathbb{P}(X=x|Y=y)=\frac{\mathbb{P}(X=x\cap Y=y)}{\mathbb{P}(Y=y)}=\frac{f(x,y)}{f_Y(y)},$$

where $f(x,y)$ is the joint pmf of $X$ and $Y$, and $f_Y(y)$ is the marginal pmf of $Y$. It is natural to call such a conditional probability a conditional pmf, right? We will denote it by $f_{X|Y}(x|y)$. This is basically the definition of the conditional pmf: the conditional pmf of $X$ given $Y=y$ is the conditional probability $\mathbb{P}(X=x|Y=y)$. Naturally, we expect the conditional pdf to be defined similarly. This is indeed the case:

Template:Colored definition

Template:Colored remark

To understand the definition more intuitively for the continuous case, consider the following diagram.

Top view:
     
        |
        |
        *---------------* 
        |               |
        |               |
fixed y *===============* <--- corresponding interval
        |               |
        |               |
        *---------------*
        |
        *---------------- x

Side view:

          *  
         / \ 
        *\  *  /                                           
       /|#\   \
   |  / |##\ / *---------*
   | *  |###\            /\
   | |\ |##/#\----------/--\     
   | | \|#/###*--------*   /                             
   | |  \/############/#\ /                              
   | |y *\===========/===*                               
   | | /  *---------*   /                                
   | |/              \ /                                 
   | *----------------*                                  
   |/                                                    
   *------------------------- x                          


Front view:
             
    |
    |
    |               
    *\     
    |#\    
    |##\   
    |###\             
    |####\   <------ Area: f_Y(y)
    |#####*--------*  
    |###############\ 
    *================*-------------- x

*---*
|###| : corresponding cross section from joint pdf
*---*   

We can see that when we condition on $Y=y$, we take a "slice" out of the region under the joint pdf. The area of the whole slice is the area between the univariate function $f(x,y)$ (the joint pdf with $y$ fixed and $x$ variable) and the $x$-axis. This area is $\int_{-\infty}^{\infty}f(x,y)\,dx=f_Y(y)$, whereas according to the probability axioms the area under a pdf should equal 1. Hence, we rescale the slice by a factor of $f_Y(y)$, dividing the univariate function $f(x,y)$ by $f_Y(y)$. The curve at the top of the rescaled slice is then the graph of the conditional pdf $\frac{f(x,y)}{f_Y(y)}$.
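The "slice and rescale" idea is perhaps easiest to check in the discrete case, where the slice is a row of a table and the area is a sum. The following is a minimal sketch only: the joint pmf `joint` and the helper names `marginal_y` and `conditional_pmf` are made up for illustration and are not part of the text above.

```python
from fractions import Fraction

# Hypothetical joint pmf f(x, y) of two discrete random variables,
# stored as {(x, y): probability}; the numbers are made up for illustration.
joint = {
    (0, 0): Fraction(1, 8), (1, 0): Fraction(3, 8),
    (0, 1): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def marginal_y(joint, y):
    """f_Y(y): sum the joint pmf over all x (the total mass of the slice)."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def conditional_pmf(joint, y):
    """f_{X|Y}(x|y) = f(x, y) / f_Y(y), defined only when f_Y(y) > 0."""
    fy = marginal_y(joint, y)
    if fy == 0:
        raise ValueError("f_Y(y) must be positive")
    return {x: p / fy for (x, yy), p in joint.items() if yy == y}

cond = conditional_pmf(joint, 0)
print(cond)                # conditional pmf of X given Y = 0
print(sum(cond.values()))  # the rescaled slice has total mass 1
```

Exact fractions are used so that the rescaled slice sums to exactly 1 rather than approximately 1.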

Now, we have discussed the cases where both random variables are discrete or both are continuous. How about the case where one of them is discrete and the other is continuous? In this case, there is no "joint probability function" of the two random variables, since one is discrete and the other is continuous. However, we can still define the conditional probability function in another way. Suppose first that $X$ is continuous and $Y$ is discrete. To motivate the following definition, let $F_{X|Y}(x|y)$ be the conditional probability $\mathbb{P}(X\le x|Y=y)$. Then, differentiating $F_{X|Y}(x|y)$ with respect to $x$ should yield the conditional pdf $f_{X|Y}(x|y)$. So, we have

$$\begin{aligned}
f_{X|Y}(x|y)&=\frac{d}{dx}F_{X|Y}(x|y)\\
&=\lim_{h\to 0}\frac{\mathbb{P}(X\le x+h|Y=y)-\mathbb{P}(X\le x|Y=y)}{h}\\
&=\lim_{h\to 0}\frac{\mathbb{P}(x<X\le x+h|Y=y)}{h}\\
&=\lim_{h\to 0}\frac{\mathbb{P}(Y=y|x<X\le x+h)\,\mathbb{P}(x<X\le x+h)}{h\,\mathbb{P}(Y=y)}\\
&=\frac{\lim_{h\to 0}\mathbb{P}(Y=y|x\le X\le x+h)}{\mathbb{P}(Y=y)}\cdot\lim_{h\to 0}\frac{\mathbb{P}(x<X\le x+h)}{h}\\
&=\frac{\mathbb{P}(Y=y|X=x)\,\frac{d}{dx}F_X(x)}{\mathbb{P}(Y=y)}\\
&=\frac{\mathbb{P}(Y=y|X=x)\,f_X(x)}{\mathbb{P}(Y=y)}.
\end{aligned}$$

Thus, it is natural to have the following definition.

Template:Colored definition

Now, how about the case where $X$ is discrete and $Y$ is continuous? In this case, we use the above definition as motivation, interchanging $X$ and $Y$ so that its assumptions are still satisfied. Then, we get

$$f_{Y|X}(y|x)=\frac{\mathbb{P}(X=x|Y=y)\,f_Y(y)}{\mathbb{P}(X=x)}.$$

Here $X$ is discrete, so it is natural to define the conditional pmf of $X$ given $Y=y$ as the $\mathbb{P}(X=x|Y=y)$ in this expression. After rearranging the terms, we get

$$\mathbb{P}(X=x|Y=y)=\frac{f_{Y|X}(y|x)\,\mathbb{P}(X=x)}{f_Y(y)}.$$

Thus, we have the following definition.

Template:Colored definition

Based on the definitions of conditional probability functions, it is natural to define the conditional cdf as follows.

Template:Colored definition

Template:Colored remark

Graphical illustration of the definition (continuous random variables):

Top view:
     
        |
        |
        *---------------* 
        |               |
        |               |
fixed y *=========@=====* <--- corresponding interval
        |         x     |
        |               |
        *---------------*
        |
        *---------------- 

Side view:

          *  
         / \ 
        *\  *  /                                           
       /|#\   \
   |  / |##\ / *---------*
   | *  |###\            /\
   | |\ |##/#\----------/--\     
   | | \|#/###*--------*   /                             
   | |  \/#########   / \ /                              
   | |y *\========@==/===*                               
   | | /  *-------x-*   /                                
   | |/              \ /                                 
   | *----------------*                                  
   |/                                                    
   *------------------------- x                          


Front view:

    |
    |
    |
    *\      
    |#\    
    |##\              
    |###\             
    |####\   <------------- Area: f_Y(y)         
    |#####*--------*  
    |###########    \ 
    *==========@=====*--------------  
               x
*---*
|###| : the desired region from the cross section from joint pdf, whose area is the probability from the cdf
*---*   
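For the mixed case discussed earlier, where X is discrete and Y is continuous, the conditional pmf $\mathbb{P}(X=x|Y=y)=f_{Y|X}(y|x)\mathbb{P}(X=x)/f_Y(y)$ can be evaluated numerically. The sketch below rests on assumptions that are not in the text: a hypothetical model with X taking values in {0, 1} and Y given X=x following $\mathcal{N}(x,1)$, and the marginal pdf computed as $f_Y(y)=\sum_x f_{Y|X}(y|x)\mathbb{P}(X=x)$.

```python
import math

def normal_pdf(y, mean, sd):
    """Pdf of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Hypothetical model (an assumption for illustration):
# X is discrete with pmf `prior`, and Y | X = x ~ N(x, 1).
prior = {0: 0.7, 1: 0.3}

def conditional_pmf_x_given_y(y):
    """P(X = x | Y = y) = f_{Y|X}(y|x) P(X = x) / f_Y(y)."""
    # Numerators f_{Y|X}(y|x) P(X = x), one per value of x.
    weights = {x: normal_pdf(y, mean=x, sd=1.0) * p for x, p in prior.items()}
    # Assumed marginal pdf of Y: f_Y(y) = sum over x of f_{Y|X}(y|x) P(X = x).
    f_y = sum(weights.values())
    return {x: w / f_y for x, w in weights.items()}

for y in (-1.0, 0.5, 2.0):
    print(y, conditional_pmf_x_given_y(y))
```

At y = 0.5 the two normal densities coincide by symmetry, so observing Y there leaves the pmf of X unchanged; larger values of y shift mass towards X = 1.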

If Y=𝟏{A} for some event A, we have some special notations for simplicity:

  • the conditional probability function of X given Y=y becomes

$$f_{X|Y}(x|y)=\begin{cases}f(x|A),&y=1;\\f(x|A^c),&y=0.\end{cases}$$

  • the conditional cdf of X given Y=y becomes

$$F_{X|Y}(x|y)=\mathbb{P}(X\le x|Y=y)=\begin{cases}F(x|A),&y=1;\\F(x|A^c),&y=0.\end{cases}$$

Template:Colored proposition

Proof. Recall the definition of independence between two random variables:

X,Y are independent if

$$f(x,y)=f_X(x)f_Y(y)$$

for each $x,y$.

Since

$$f_{X|Y}(x|y)=\frac{f(x,y)}{f_Y(y)}=\frac{f_X(x)f_Y(y)}{f_Y(y)}=f_X(x)\quad\text{and}\quad f_{Y|X}(y|x)=\frac{f(x,y)}{f_X(x)}=\frac{f_Y(y)f_X(x)}{f_X(x)}=f_Y(y)$$

for each $x,y$, we have the desired result.
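The computation in this proof can be replayed on a small table. The sketch below is an illustration under made-up assumptions (the marginal pmfs `f_x` and `f_y` are hypothetical): it builds a joint pmf as the product of the marginals and checks that every conditional pmf of X given Y=y coincides with the marginal pmf of X.

```python
from fractions import Fraction
from itertools import product

# Hypothetical marginal pmfs, chosen only for illustration.
f_x = {0: Fraction(1, 3), 1: Fraction(2, 3)}
f_y = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Independence: the joint pmf factorises as f(x, y) = f_X(x) f_Y(y).
joint = {(x, y): f_x[x] * f_y[y] for x, y in product(f_x, f_y)}

def conditional_x_given_y(joint, y):
    """f_{X|Y}(x|y) = f(x, y) / f_Y(y), with f_Y(y) summed from the joint pmf."""
    fy = sum(p for (_, yy), p in joint.items() if yy == y)
    return {x: p / fy for (x, yy), p in joint.items() if yy == y}

# Conditioning on any value of Y gives back the marginal pmf of X.
all_equal = all(conditional_x_given_y(joint, y) == f_x for y in f_y)
print(all_equal)  # True
```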

Template:Colored remark

We can extend the definitions of the conditional probability function and the conditional cdf to groups of random variables, using joint cdfs and joint probability functions, as follows:

Template:Colored definition

Then, we also have a similar proposition for determining independence of two random vectors.

Template:Colored proposition

Proof. The definition of independence between two random vectors is

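  • $\mathbf{X}=(X_1,\dotsc,X_r)^T,\mathbf{Y}=(Y_1,\dotsc,Y_s)^T$ are independent if

$$f(x_1,\dotsc,x_r,y_1,\dotsc,y_s)=f_{\mathbf{X}}(x_1,\dotsc,x_r)f_{\mathbf{Y}}(y_1,\dotsc,y_s)$$

for each $x_1,\dotsc,x_r,y_1,\dotsc,y_s$.

Since

$$f_{\mathbf{X}|\mathbf{Y}}(x_1,\dotsc,x_r|y_1,\dotsc,y_s)=\frac{f(x_1,\dotsc,x_r,y_1,\dotsc,y_s)}{f_{\mathbf{Y}}(y_1,\dotsc,y_s)}=\frac{f_{\mathbf{X}}(x_1,\dotsc,x_r)f_{\mathbf{Y}}(y_1,\dotsc,y_s)}{f_{\mathbf{Y}}(y_1,\dotsc,y_s)}=f_{\mathbf{X}}(x_1,\dotsc,x_r)$$

and

$$f_{\mathbf{Y}|\mathbf{X}}(y_1,\dotsc,y_s|x_1,\dotsc,x_r)=\frac{f(x_1,\dotsc,x_r,y_1,\dotsc,y_s)}{f_{\mathbf{X}}(x_1,\dotsc,x_r)}=\frac{f_{\mathbf{Y}}(y_1,\dotsc,y_s)f_{\mathbf{X}}(x_1,\dotsc,x_r)}{f_{\mathbf{X}}(x_1,\dotsc,x_r)}=f_{\mathbf{Y}}(y_1,\dotsc,y_s)$$

for each $x_1,\dotsc,x_r,y_1,\dotsc,y_s$, we have the desired result.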

Conditional distributions of bivariate normal distribution

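Recall from the [[../Important Distributions]] chapter that the joint pdf of $\mathcal{N}_2(\boldsymbol{\mu},\boldsymbol{\Sigma})$ is

$$f(x,y)=\frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}}\exp\left(-\frac{1}{2(1-\rho^2)}\left(\left(\frac{x-\mu_X}{\sigma_X}\right)^2-2\rho\left(\frac{x-\mu_X}{\sigma_X}\right)\left(\frac{y-\mu_Y}{\sigma_Y}\right)+\left(\frac{y-\mu_Y}{\sigma_Y}\right)^2\right)\right),\quad(x,y)\in\mathbb{R}^2,$$

in which $\rho=\rho(X,Y)$ and $\sigma_X,\sigma_Y$ are positive, and that $X\sim\mathcal{N}(\mu_X,\sigma_X^2)$ and $Y\sim\mathcal{N}(\mu_Y,\sigma_Y^2)$ in this case.

Template:Colored proposition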

Proof.

  • First, the conditional pdf

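$$\begin{aligned}
f_{X|Y}(x|y)&\overset{\text{def}}{=}\frac{f(x,y)}{f_Y(y)}\\
&=\frac{\dfrac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}}\exp\left(-\dfrac{1}{2(1-\rho^2)}\left(\left(\dfrac{x-\mu_X}{\sigma_X}\right)^2-2\rho\left(\dfrac{x-\mu_X}{\sigma_X}\right)\left(\dfrac{y-\mu_Y}{\sigma_Y}\right)+\left(\dfrac{y-\mu_Y}{\sigma_Y}\right)^2\right)\right)}{\dfrac{1}{\sqrt{2\pi\sigma_Y^2}}\exp\left(-\dfrac{(y-\mu_Y)^2}{2\sigma_Y^2}\right)}\\
&=\frac{1}{\sqrt{2\pi\sigma_X^2(1-\rho^2)}}\exp\left(-\frac{1}{2(1-\rho^2)}\left(\left(\frac{x-\mu_X}{\sigma_X}\right)^2-2\rho\left(\frac{x-\mu_X}{\sigma_X}\right)\left(\frac{y-\mu_Y}{\sigma_Y}\right)+\left(\frac{y-\mu_Y}{\sigma_Y}\right)^2\right)+\frac{(y-\mu_Y)^2}{2\sigma_Y^2}\right)\\
&=\frac{1}{\sqrt{2\pi\sigma_X^2(1-\rho^2)}}\exp\left(-\frac{1}{2(1-\rho^2)}\left(\left(\frac{x-\mu_X}{\sigma_X}\right)^2-2\rho\left(\frac{x-\mu_X}{\sigma_X}\right)\left(\frac{y-\mu_Y}{\sigma_Y}\right)+\left(\frac{y-\mu_Y}{\sigma_Y}\right)^2-(1-\rho^2)\left(\frac{y-\mu_Y}{\sigma_Y}\right)^2\right)\right)\\
&=\frac{1}{\sqrt{2\pi\sigma_X^2(1-\rho^2)}}\exp\left(-\frac{1}{2\sigma_X^2(1-\rho^2)}\left((x-\mu_X)^2-2\rho\frac{\sigma_X}{\sigma_Y}(x-\mu_X)(y-\mu_Y)+\left(\rho\frac{\sigma_X}{\sigma_Y}(y-\mu_Y)\right)^2\right)\right)\\
&=\frac{1}{\sqrt{2\pi\sigma_X^2(1-\rho^2)}}\exp\left(-\frac{1}{2\sigma_X^2(1-\rho^2)}\left((x-\mu_X)-\left(\rho\frac{\sigma_X}{\sigma_Y}(y-\mu_Y)\right)\right)^2\right)\\
&=\frac{1}{\sqrt{2\pi\sigma_X^2(1-\rho^2)}}\exp\left(-\frac{1}{2\sigma_X^2(1-\rho^2)}\left(x-\mu_X-\rho\frac{\sigma_X}{\sigma_Y}(y-\mu_Y)\right)^2\right).
\end{aligned}$$

  • Then, we can see that $X|(Y=y)\sim\mathcal{N}\left(\mu_X+\rho\frac{\sigma_X}{\sigma_Y}(y-\mu_Y),\ \sigma_X^2(1-\rho^2)\right)$,
  • and by symmetry (interchanging $X$ and $Y$, and also interchanging $x$ and $y$), $Y|(X=x)\sim\mathcal{N}\left(\mu_Y+\rho\frac{\sigma_Y}{\sigma_X}(x-\mu_X),\ \sigma_Y^2(1-\rho^2)\right)$.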
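As a numerical sanity check of this result, one can compare the ratio $f(x,y)/f_Y(y)$ with the pdf of $\mathcal{N}\left(\mu_X+\rho\frac{\sigma_X}{\sigma_Y}(y-\mu_Y),\ \sigma_X^2(1-\rho^2)\right)$ at a few points. The sketch below is only an illustration: the parameter values in `params` and all function names are arbitrary choices, not taken from the text.

```python
import math

def normal_pdf(x, mean, var):
    """Pdf of N(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Joint pdf of the bivariate normal distribution, as recalled above."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    q = (zx ** 2 - 2 * rho * zx * zy + zy ** 2) / (1 - rho ** 2)
    norm = 2 * math.pi * sigma_x * sigma_y * math.sqrt(1 - rho ** 2)
    return math.exp(-q / 2) / norm

# Arbitrary illustrative parameters (an assumption, not from the text).
params = dict(mu_x=1.0, mu_y=-2.0, sigma_x=2.0, sigma_y=0.5, rho=0.6)

def conditional_pdf_by_ratio(x, y):
    """f_{X|Y}(x|y) computed from the definition f(x, y) / f_Y(y)."""
    f_y = normal_pdf(y, params["mu_y"], params["sigma_y"] ** 2)
    return bivariate_normal_pdf(x, y, **params) / f_y

def conditional_pdf_by_formula(x, y):
    """Pdf of N(mu_X + rho sigma_X / sigma_Y (y - mu_Y), sigma_X^2 (1 - rho^2))."""
    slope = params["rho"] * params["sigma_x"] / params["sigma_y"]
    mean = params["mu_x"] + slope * (y - params["mu_y"])
    var = params["sigma_x"] ** 2 * (1 - params["rho"] ** 2)
    return normal_pdf(x, mean, var)

for x, y in [(0.3, -1.7), (2.5, -2.4), (-1.0, -1.0)]:
    print(conditional_pdf_by_ratio(x, y), conditional_pdf_by_formula(x, y))
```

The two printed columns should agree up to floating-point rounding.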

Conditional version of concepts

We can obtain conditional versions of the concepts previously established for 'unconditional' distributions by replacing the 'unconditional' cdf, pdf or pmf, i.e. $F(\cdot)$ or $f(\cdot)$, with their conditional counterparts, i.e. $F(\cdot|\cdot)$ or $f(\cdot|\cdot)$.

Conditional independence

Template:Colored definition Template:Colored remark Template:Colored example Template:Colored example

Conditional expectation

Template:Colored definition Template:Colored remark Similarly, we have a conditional version of the law of the unconscious statistician. Template:Colored proposition Template:Colored proposition

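Proof.

$$\mathbb{E}[g(X)|Y]=\begin{cases}\displaystyle\sum_x g(x)f_{X|Y}(x|Y)=\sum_x g(x)f_X(x)=\mathbb{E}[g(X)],&X\text{ is discrete};\\[1ex]\displaystyle\int_{-\infty}^{\infty}g(x)f_{X|Y}(x|Y)\,dx=\int_{-\infty}^{\infty}g(x)f_X(x)\,dx=\mathbb{E}[g(X)],&X\text{ is continuous}.\end{cases}$$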

Template:Colored remark Template:Colored example The properties of $\mathbb{E}[\cdot]$ still hold for conditional expectations $\mathbb{E}[\cdot|Y]$, with 'unconditional' expectation replaced by conditional expectation and some suitable modifications, as follows: Template:Colored proposition

Proof. The proof is similar to the one for 'unconditional' expectations.

Template:Colored remark The following theorem about conditional expectation is quite important. Template:Colored theorem

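Proof.

$$\mathbb{E}[\mathbb{E}[g(X)|Y]]=\begin{cases}\displaystyle\sum_y\mathbb{E}[g(X)|Y=y]f_Y(y)=\sum_x\left(\sum_y g(x)\underbrace{f_{X|Y}(x|y)}_{f(x,y)/f_Y(y)}f_Y(y)\right)=\sum_x g(x)\Bigg(\underbrace{\sum_y f(x,y)}_{f_X(x)}\Bigg)=\mathbb{E}[g(X)],&X\text{ is discrete};\\[1ex]\displaystyle\int_{-\infty}^{\infty}\mathbb{E}[g(X)|Y=y]f_Y(y)\,dy=\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty}g(x)\underbrace{f_{X|Y}(x|y)}_{f(x,y)/f_Y(y)}\,dx\right)f_Y(y)\,dy=\int_{-\infty}^{\infty}g(x)\Bigg(\underbrace{\int_{-\infty}^{\infty}f(x,y)\,dy}_{f_X(x)}\Bigg)\,dx=\mathbb{E}[g(X)],&X\text{ is continuous}.\end{cases}$$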
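The discrete branch of this proof can be checked on a small example. The following sketch assumes a made-up joint pmf `joint` and the function g(x) = x², neither of which comes from the text: it computes $\mathbb{E}[g(X)]$ directly and again as $\sum_y\mathbb{E}[g(X)|Y=y]f_Y(y)$.

```python
from fractions import Fraction

# Hypothetical joint pmf f(x, y), stored as {(x, y): probability}.
joint = {
    (0, 0): Fraction(1, 8), (1, 0): Fraction(3, 8),
    (2, 1): Fraction(1, 4), (3, 1): Fraction(1, 4),
}

def g(x):
    return x ** 2

def expectation_direct(joint, g):
    """E[g(X)] as the sum of g(x) f(x, y) over all (x, y)."""
    return sum(g(x) * p for (x, _), p in joint.items())

def expectation_by_conditioning(joint, g):
    """E[E[g(X)|Y]] as the sum over y of E[g(X)|Y=y] f_Y(y)."""
    total = Fraction(0)
    for y in {yy for (_, yy) in joint}:
        f_y = sum(p for (_, yy), p in joint.items() if yy == y)
        # E[g(X)|Y=y] = sum over x of g(x) f(x, y) / f_Y(y)
        cond_exp = sum(g(x) * p / f_y for (x, yy), p in joint.items() if yy == y)
        total += cond_exp * f_y
    return total

print(expectation_direct(joint, g))           # 29/8
print(expectation_by_conditioning(joint, g))  # 29/8
```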

Template:Colored remark Template:Colored corollary

Proof.

  • First,

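$$\mathbb{E}[\mathbf{1}\{A\}|Y]=1\cdot\mathbb{P}(\mathbf{1}\{A\}=1|Y)+0\cdot\mathbb{P}(\mathbf{1}\{A\}=0|Y)=\mathbb{P}(A|Y).$$

  • Then, using the law of total expectation,

$$\mathbb{E}_Y[\mathbb{P}(A|Y)]\overset{\text{above}}{=}\mathbb{E}_Y[\mathbb{E}[\mathbf{1}\{A\}|Y]]=\mathbb{E}[\mathbf{1}\{A\}]=\mathbb{P}(A).$$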

Template:Colored remark Template:Colored corollary

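Proof. Define $Y=i$ if $A_i$ occurs, in which $i$ is a positive integer. Then,

$$\mathbb{E}[X]=\mathbb{E}_Y[\mathbb{E}_X[X|Y]]=\sum_{i\ge 1}\mathbb{E}_X[X|Y=i]\,\mathbb{P}(Y=i)=\sum_{i\ge 1}\mathbb{E}[X|A_i]\,\mathbb{P}(A_i).$$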

Template:Colored remark Template:Colored example Template:Colored corollary

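Proof. By the formula of expectation computed by weighted average of conditional expectations,

$$\mathbb{E}[X\mathbf{1}\{A\}]=\mathbb{E}[X\underbrace{\mathbf{1}\{A\}}_{1}|A]\,\mathbb{P}(A)+\mathbb{E}[X\underbrace{\mathbf{1}\{A\}}_{0}|A^c]\,\mathbb{P}(A^c)=\mathbb{E}[X|A]\,\mathbb{P}(A),$$

and the result follows if $\mathbb{P}(A)>0$.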
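As a small worked illustration of the last two results, consider a setting chosen here for concreteness and not taken from the text: let X be the outcome of one roll of a fair six-sided die, and let A be the event that X is even, so that $\mathbb{P}(A)=\mathbb{P}(A^c)=1/2$ and $\mathbb{E}[X|A^c]=(1+3+5)/3=3$.

```latex
\begin{align*}
\mathbb{E}[X\mathbf{1}\{A\}] &= \frac{2+4+6}{6} = 2, \\
\mathbb{E}[X|A] &= \frac{\mathbb{E}[X\mathbf{1}\{A\}]}{\mathbb{P}(A)} = \frac{2}{1/2} = 4, \\
\mathbb{E}[X] &= \mathbb{E}[X|A]\,\mathbb{P}(A)+\mathbb{E}[X|A^c]\,\mathbb{P}(A^c)
              = 4\cdot\frac{1}{2}+3\cdot\frac{1}{2} = \frac{7}{2}.
\end{align*}
```

The last line agrees with the direct computation $(1+2+3+4+5+6)/6=7/2$.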

Template:Colored remark After defining conditional expectation, we can also define conditional variance, covariance and correlation coefficient, since variance, covariance, and correlation coefficient are built upon expectation.

Conditional expectations of bivariate normal distribution

Template:Colored proposition

Proof.

  • The result follows readily from the proposition about conditional distributions of the bivariate normal distribution.


Conditional variance

Template:Colored definition Similarly, conditional variance has properties similar to those of variance. Template:Colored proposition

Proof. The proof is similar to the one for properties of variance.

Besides the law of total expectation, we also have the law of total variance, as follows: Template:Colored proposition

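Proof.

$$\begin{aligned}
\mathbb{E}[\operatorname{Var}(X|Y)]+\operatorname{Var}(\mathbb{E}[X|Y])&=\mathbb{E}\left[\mathbb{E}[X^2|Y]-(\mathbb{E}[X|Y])^2\right]+\mathbb{E}\left[(\mathbb{E}[X|Y])^2\right]-\left(\mathbb{E}[\mathbb{E}[X|Y]]\right)^2\\
&=\mathbb{E}[\mathbb{E}[X^2|Y]]-\mathbb{E}\left[(\mathbb{E}[X|Y])^2\right]+\mathbb{E}\left[(\mathbb{E}[X|Y])^2\right]-\left(\mathbb{E}[\mathbb{E}[X|Y]]\right)^2\\
&=\mathbb{E}[X^2]-(\mathbb{E}[X])^2\qquad\text{by the law of total expectation}\\
&=\operatorname{Var}(X).
\end{aligned}$$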
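The law of total variance can also be verified exactly on a small discrete example. The sketch below is an illustration under assumptions: the joint pmf `joint` is made up, and the helper names are hypothetical. It computes $\operatorname{Var}(X)$, $\mathbb{E}[\operatorname{Var}(X|Y)]$ and $\operatorname{Var}(\mathbb{E}[X|Y])$ separately.

```python
from fractions import Fraction

# Hypothetical joint pmf f(x, y), stored as {(x, y): probability}.
joint = {
    (0, 0): Fraction(1, 8), (1, 0): Fraction(3, 8),
    (2, 1): Fraction(1, 4), (3, 1): Fraction(1, 4),
}

def marginal_y(joint, y):
    """f_Y(y), obtained by summing the joint pmf over x."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def moment(joint, k):
    """E[X^k]."""
    return sum(x ** k * p for (x, _), p in joint.items())

def cond_moment(joint, y, k):
    """E[X^k | Y = y], using f_{X|Y}(x|y) = f(x, y) / f_Y(y)."""
    fy = marginal_y(joint, y)
    return sum(x ** k * p / fy for (x, yy), p in joint.items() if yy == y)

ys = sorted({y for (_, y) in joint})

# Var(X) = E[X^2] - (E[X])^2
var_x = moment(joint, 2) - moment(joint, 1) ** 2

# E[Var(X|Y)]: average the conditional variances with weights f_Y(y).
mean_cond_var = sum(
    (cond_moment(joint, y, 2) - cond_moment(joint, y, 1) ** 2) * marginal_y(joint, y)
    for y in ys
)

# Var(E[X|Y]) = E[(E[X|Y])^2] - (E[E[X|Y]])^2
var_cond_mean = (
    sum(cond_moment(joint, y, 1) ** 2 * marginal_y(joint, y) for y in ys)
    - sum(cond_moment(joint, y, 1) * marginal_y(joint, y) for y in ys) ** 2
)

print(var_x, mean_cond_var, var_cond_mean)  # 63/64 7/32 49/64
```

Here 7/32 + 49/64 = 63/64, matching the decomposition in the proof.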

Template:Colored remark

Conditional variances of bivariate normal distribution

Template:Colored proposition

Proof.

  • The result follows readily from the proposition about conditional distributions of the bivariate normal distribution.

Template:Colored remark

Conditional covariance

Template:Colored definition Template:Colored proposition

Conditional correlation coefficient

Template:Colored definition Template:Colored remark

Conditional quantile

Template:Colored definition Template:Colored remark
