Probability/Properties of Distributions


Template:Nav


Introduction

Recall that the pdf (or cdf) describes the random behaviour of a random variable Template:Colored em. However, we may sometimes find the pdf (or cdf) too complicated, and only want to know some Template:Colored em about the random variable. In view of this, we study some properties of distributions in this chapter, which provide Template:Colored em descriptions of the random behaviour of the random variable.

Some examples of such partial descriptions include

  • location (e.g. is the pdf 'located' to the left or to the right?),
  • dispersion (e.g. is the pdf 'sharp' or 'flat'?),
  • skewness (e.g. is the pdf symmetric, skewed to the left, or skewed to the right?), and
  • tail property (e.g. does the pdf have 'light' or 'heavy' tails?).

We can Template:Colored em describe them, but such descriptions are quite subjective and inaccurate. To give a more objective and accurate measure to such descriptions, we evaluate them Template:Colored em using some quantitative measures derived from the pdf (or cdf) of the random variable.

We will discuss some of these quantitative measures in this chapter. Among them, the Template:Colored em is the most important one, since many of the other properties are based upon the concept of Template:Colored em.

Expectation

Expectation has several alternative names, e.g. expected value and mean. Template:Colored definition Template:Colored remark Template:Colored example Template:Colored example Template:Colored exercise In the following, we introduce a useful result that gives the relationship between expectation and probability; using this result, we can employ expectation to ease the computation of probabilities. Template:Colored proposition

Proof. Let X=๐Ÿ{E}. Since X=๐Ÿ{E}Ber(โ„™(E)) (which is a discrete random variable), ๐”ผ[X]=0[โ„™(X=0)]+1[โ„™(X=1)]=โ„™(๐Ÿ{E}=1)=โ„™(E).

When there are multiple random variables involved, we may derive the joint pmf or pdf first to compute the expectation, but doing so can be quite difficult and complicated. In practice, we use the following theorem more often. Template:Colored theorem Template:Colored remark The proof is quite complicated, and hence we skip it. In the following, we introduce several properties of expectation that can help us simplify computations of expectations.

Template:Colored proposition

Proof.

Template:Colored em:

for continuous random variables $X,Y$,
\[
\mathbb{E}[\alpha X+\beta Y+\gamma]
=\iint(\alpha x+\beta y+\gamma)\underbrace{f(x,y)}_{\text{joint pdf}}\,dx\,dy
=\alpha\int x\underbrace{\int f(x,y)\,dy}_{\text{marginal pdf of }X}dx
+\beta\int y\underbrace{\int f(x,y)\,dx}_{\text{marginal pdf of }Y}dy
+\gamma\underbrace{\iint f(x,y)\,dx\,dy}_{=1}
=\alpha\underbrace{\int xf_X(x)\,dx}_{\mathbb{E}[X]}+\beta\underbrace{\int yf_Y(y)\,dy}_{\mathbb{E}[Y]}+\gamma
=\alpha\mathbb{E}[X]+\beta\mathbb{E}[Y]+\gamma.
\]
Similarly, for discrete random variables $X,Y$,
\[
\mathbb{E}[\alpha X+\beta Y+\gamma]
=\sum_x\sum_y(\alpha x+\beta y+\gamma)f(x,y)
=\alpha\sum_x x\sum_y f(x,y)+\beta\sum_y y\sum_x f(x,y)+\gamma\underbrace{\sum_x\sum_y f(x,y)}_{=1}
=\alpha\sum_x xf_X(x)+\beta\sum_y yf_Y(y)+\gamma
=\alpha\mathbb{E}[X]+\beta\mathbb{E}[Y]+\gamma.
\]
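Linearity can be checked exactly on a small example: with a finite joint pmf, both sides of the identity are finite sums. The following Python sketch uses an arbitrary, hypothetical joint pmf table $f(x,y)$.

```python
# Exact check of E[aX + bY + c] = a E[X] + b E[Y] + c on a small,
# arbitrary joint pmf f(x, y) (the table and constants are made up).
f = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
a, b, c = 2.0, -3.0, 5.0

E_X = sum(x * p for (x, y), p in f.items())
E_Y = sum(y * p for (x, y), p in f.items())
E_lin = sum((a * x + b * y + c) * p for (x, y), p in f.items())

assert abs(E_lin - (a * E_X + b * E_Y + c)) < 1e-12
```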

Template:Colored em:

For a continuous random variable $X$, \[X\ge 0\implies\mathbb{E}[X]=\int_0^\infty xf_X(x)\,dx\ge 0.\] Similarly, for a discrete random variable $X$, \[X\ge 0\implies\mathbb{E}[X]=\sum_{x\ge 0}xf_X(x)\ge 0.\]

Template:Colored em:

For random variables $X,Y$ that are either both discrete or both continuous, \[X\ge Y\implies X-Y\ge 0\implies\mathbb{E}[X]-\mathbb{E}[Y]\overset{\text{linearity}}{=}\mathbb{E}[X-Y]\overset{\text{nonnegativity}}{\ge}0.\]

Template:Colored em:

\[-|X|\le X\le|X|\overset{\text{monotonicity}}{\implies}-\mathbb{E}[|X|]\le\mathbb{E}[X]\le\mathbb{E}[|X|]\implies|\mathbb{E}[X]|\le\mathbb{E}[|X|].\]

Template:Colored em:

For independent continuous random variables $X,Y$,
\[
\mathbb{E}[XY]=\iint xy\underbrace{f(x,y)}_{\text{joint pdf}}\,dx\,dy
=\iint xy\underbrace{f_X(x)f_Y(y)}_{\text{marginal pdfs}}\,dx\,dy
=\int yf_Y(y)\underbrace{\int xf_X(x)\,dx}_{\text{independent of }y}dy
=\int xf_X(x)\,dx\int yf_Y(y)\,dy
=\mathbb{E}[X]\mathbb{E}[Y].
\]
Similarly, for independent discrete random variables $X,Y$,
\[
\mathbb{E}[XY]=\sum_x\sum_y xy\underbrace{f(x,y)}_{\text{joint pmf}}
=\sum_y\sum_x xy\underbrace{f_X(x)f_Y(y)}_{\text{marginal pmfs}}
=\Big(\sum_x xf_X(x)\Big)\Big(\sum_y yf_Y(y)\Big)
=\mathbb{E}[X]\mathbb{E}[Y].
\]

Template:Colored remark

Mean of some distributions of a discrete random variable

Template:Colored proposition

Proof.

  • $\mathbb{E}[X]=\underbrace{0\cdot\mathbb{P}(X=0)}_{=0}+\underbrace{1\cdot\mathbb{P}(X=1)}_{=p}=p$.
  • Since $Y=X_1+\cdots+X_n$, in which $X_1,\dots,X_n$ are i.i.d. and follow $\operatorname{Ber}(p)$ [1],
  • $\mathbb{E}[Y]=\mathbb{E}[X_1+\cdots+X_n]\overset{\text{linearity}}{=}\mathbb{E}[X_1]+\cdots+\mathbb{E}[X_n]=\underbrace{p+\cdots+p}_{n\text{ times}}=np$.

Template:Colored proposition

Proof. ๐”ผ[X]=k=0k(λkeλk!)โ„™(X=k)=λ(0+k=1k1=0k(λk1eλk(k1)!)โ„™(X=k1))=λ(0+1)=λ.

Template:Colored proposition

Proof.

  • Since

\[
\mathbb{E}[X]=\sum_{k=0}^\infty k\underbrace{(1-p)^kp}_{\mathbb{P}(X=k)}
=\sum_{k=0}^\infty(k-1)(1-p)^kp+\underbrace{\sum_{k=0}^\infty(1-p)^kp}_{=1}
=\underbrace{(0-1)(1-p)^0p}_{=-p}+(1-p)\sum_{j=0}^\infty j(1-p)^{j}p+1
=-p+(1-p)\mathbb{E}[X]+1,
\]

  • it follows that $p\,\mathbb{E}[X]=1-p\implies\mathbb{E}[X]=\dfrac{1-p}{p}$.
  • Since $Y=X_1+\cdots+X_k$, in which $X_1,\dots,X_k$ are i.i.d. and follow $\operatorname{Geo}(p)$ [2],
  • $\mathbb{E}[Y]=\mathbb{E}[X_1]+\cdots+\mathbb{E}[X_k]=\underbrace{\frac{1-p}{p}+\cdots+\frac{1-p}{p}}_{k\text{ times}}=\frac{k(1-p)}{p}$.

Template:Colored proposition

Proof.

  • Since $X=X_1+\cdots+X_n$, in which $X_1,\dots,X_n\sim\operatorname{Ber}(K/N)$ (each of the Bernoulli r.v.'s indicates whether the corresponding draw is a type-1 ball, which has probability $K/N$ without knowing the results of the other draws [3], since each draw is equally likely to be any of the $N$ balls) [4],
  • it follows that $\mathbb{E}[X]=\mathbb{E}[X_1]+\cdots+\mathbb{E}[X_n]=\underbrace{\frac{K}{N}+\cdots+\frac{K}{N}}_{n\text{ times}}=\frac{nK}{N}$.
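A simulation sketch of the hypergeometric mean (Python, arbitrary parameters): draw $n$ balls without replacement from $N$ balls of which $K$ are of type 1, and average the type-1 counts.

```python
import random

# Simulate hypergeometric draws: sample n balls without replacement
# from N balls (K of them of "type 1") and count type-1 balls.
random.seed(2)

N, K, n, trials = 20, 8, 5, 40_000
balls = [1] * K + [0] * (N - K)

sample_mean = sum(sum(random.sample(balls, n)) for _ in range(trials)) / trials
# sample_mean should be close to n*K/N = 2.0
```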


Mean of some distributions of a continuous random variable

We will introduce the formulas for the means of some distributions of a Template:Colored em random variable, which are relatively simple. Template:Colored proposition

Proof. ๐”ผ[X]=abxbadx=12(ba)(b2a2)=(ba)(b+a)2(ba).

Template:Colored proposition

Proof.

  • It suffices to prove the formula for mean of gamma r.v.'s, since exponential and chi-squared r.v.'s are essentially special cases of gamma r.v.'s, and thus we can simply substitute some values into the formula for mean of gamma r.v.'s to obtain the formulas for them.
  • \[\mathbb{E}[X]=\int_0^\infty x\cdot\frac{\lambda^\alpha x^{\alpha-1}e^{-\lambda x}}{\Gamma(\alpha)}\,dx=\frac{\alpha}{\lambda}\underbrace{\int_0^\infty\frac{\lambda^{\alpha+1}x^{(\alpha+1)-1}e^{-\lambda x}}{\Gamma(\alpha+1)}\,dx}_{=F(\infty)=1,\ F\text{ is the cdf of }\operatorname{Gamma}(\alpha+1,\lambda)}=\frac{\alpha}{\lambda}.\]
  • Since $\operatorname{Exp}(\lambda)=\operatorname{Gamma}(1,\lambda)$, $\mathbb{E}[Y]=1/\lambda$ by substituting $\alpha=1$.
  • Since $\chi_\nu^2=\operatorname{Gamma}(\nu/2,1/2)$, $\mathbb{E}[Z]=\frac{\nu/2}{1/2}=\nu$ by substituting $\alpha=\nu/2$ and $\lambda=1/2$.
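A simulation sketch of the gamma mean $\alpha/\lambda$ (Python, arbitrary parameters). Note that Python's `random.gammavariate` is parametrized by shape and *scale*, so the rate $\lambda$ enters as $1/\lambda$.

```python
import random

# Simulate Gamma(alpha, lam) draws (rate parametrization) via
# random.gammavariate(shape, scale) with scale = 1/lam.
random.seed(3)

alpha, lam, trials = 3.0, 2.0, 50_000
sample_mean = sum(random.gammavariate(alpha, 1 / lam) for _ in range(trials)) / trials
# sample_mean should be close to alpha/lam = 1.5
```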

Template:Colored proposition

Proof.

  • We use an approach similar to that of the previous proof.

\[\mathbb{E}[X]=\int_0^1 x\cdot\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}(1-x)^{\beta-1}\,dx=\frac{\alpha}{\alpha+\beta}\underbrace{\int_0^1\frac{\Gamma(\alpha+\beta+1)}{\Gamma(\alpha+1)\Gamma(\beta)}x^{(\alpha+1)-1}(1-x)^{\beta-1}\,dx}_{=F(1)=1,\ F\text{ is the cdf of }\operatorname{Beta}(\alpha+1,\beta)}=\frac{\alpha}{\alpha+\beta}.\]

Template:Colored proposition

Proof. ๐”ผ[X]=๐”ผ[Xθ]+θby linearity,=θ+1π(xθ)11+(xθ)2dx=θ+1πu1+u2du,let u=xθdu=dx,=θ+1π(0u1+u2du+0u1+u2du)=θ+1π(12[ln(1+u2)]u=u=0+12[ln(1+u2)]u=0u=)=θ+1π(+undefined).

Template:Colored proposition

Proof.

  • Let $Z=\frac{X-\mu}{\sigma}\sim\mathcal{N}(0,1)$.
  • \[\mathbb{E}[Z]=\int_{-\infty}^\infty x\varphi(x)\,dx=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty xe^{-x^2/2}\,dx=\frac{1}{\sqrt{2\pi}}\Big[-e^{-x^2/2}\Big]_{x=-\infty}^{x=\infty}=\frac{1}{\sqrt{2\pi}}(0-0)=0.\]
  • It follows that $\mathbb{E}[X]=\mathbb{E}[\sigma Z+\mu]=\sigma\underbrace{\mathbb{E}[Z]}_{=0}+\mu=\mu$.
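A simulation sketch of the normal mean via the standardization $X=\sigma Z+\mu$ used in the proof (Python; $\mu$ and $\sigma$ are arbitrary).

```python
import random

# Simulate X = sigma*Z + mu with Z ~ N(0, 1) and compare the
# sample mean against mu. Parameters are arbitrary.
random.seed(4)

mu, sigma, trials = -1.5, 2.0, 50_000
sample_mean = sum(sigma * random.gauss(0, 1) + mu for _ in range(trials)) / trials
# sample_mean should be close to mu = -1.5
```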

Examples

Template:Colored example Template:Colored exercise Let us illustrate the usefulness of the fundamental bridge between probability and expectation by giving a proof of the inclusion-exclusion principle using this bridge. Template:Colored example

Probability generating functions

An application of expectation is Template:Colored em. As suggested by its name, it can Template:Colored em probabilities in some sense. Template:Colored definition Template:Colored remark

Variance (and standard deviation)

Indeed, Template:Colored em is a special case of Template:Colored em, and is related to Template:Colored em in some sense. Template:Colored definition Template:Colored definition Template:Colored definition Since $(X-\mathbb{E}[X])^2$ is the squared deviation of the value of $X$ from its mean, in view of the definition of variance, we can see that variance measures the Template:Colored em (or Template:Colored em) of a distribution, since it is what we would Template:Colored em of the squared deviation if we were to take an observation of the random variable.

Another closely related term is Template:Colored em. Template:Colored definition Template:Colored remark Template:Colored proposition

Proof.

  • alternative expression for variance:
Let μ=๐”ผ[X] for clearer expression.

๐”ผ[(Xμ)2]=๐”ผ[X22Xμ+μ2]=๐”ผ[X2]2μ๐”ผ[X]μ+μ2=๐”ผ[X2]μ2, and the result follows.

  • invariance under change in location parameter:

Var(X+a)=๐”ผ[(X+a๐”ผ[X+a]๐”ผ[X]+a)2]=๐”ผ[(X๐”ผ[X])2]=Var(X).

  • nonnegativity: it follows from $(X-\mathbb{E}[X])^2\ge 0$ and the nonnegativity of expectation.
  • zero variance implies non-randomness:
Let μ=๐”ผ[X] for clearer expression. Consider the event En={|Xμ|n1}, in which n is a positive integer.
Since 0=Var(X)=๐”ผ[(Xμ)2]๐”ผ[(Xμ)2๐Ÿ{En}1]=๐”ผ[|Xa|2๐Ÿ{En}]๐”ผ[n2constant๐Ÿ{En}]=n20โ„™(En)00,
we have 0n2โ„™(En)00โ„™(En)0โ„™(En)=0.
Thus,

โ„™(|Xμ|>0Xμ)=โ„™(n=1En)=a lemmalimnโ„™(En)0=0โ„™(X=μ)=1โ„™(Xμ)0=1

  • additivity under independence:
For random variables $X$ and $Y$ that are independent with means $\mu,\nu$ respectively,

\[
\operatorname{Var}(X+Y)=\mathbb{E}[(X+Y-\mathbb{E}[X+Y])^2]
\overset{\text{linearity}}{=}\mathbb{E}[(X-\mu+Y-\nu)^2]
=\underbrace{\mathbb{E}[(X-\mu)^2]}_{\operatorname{Var}(X)}+\underbrace{\mathbb{E}[(Y-\nu)^2]}_{\operatorname{Var}(Y)}+2\,\mathbb{E}[(X-\mu)(Y-\nu)]
\overset{\text{linearity}}{=}\operatorname{Var}(X)+\operatorname{Var}(Y)+2\big(\mathbb{E}[XY]-\nu\,\mathbb{E}[X]-\mu\,\mathbb{E}[Y]+\mu\nu\big)
\overset{\text{independence of }X,Y}{=}\operatorname{Var}(X)+\operatorname{Var}(Y)+2\big(\underbrace{\mathbb{E}[X]\mathbb{E}[Y]}_{\mu\nu}-\nu\mu-\mu\nu+\mu\nu\big)
=\operatorname{Var}(X)+\operatorname{Var}(Y).
\] Thus, inductively, $\operatorname{Var}(X_1+\cdots+X_n)=\operatorname{Var}(X_1)+\operatorname{Var}(X_2+\cdots+X_n)=\cdots=\operatorname{Var}(X_1)+\cdots+\operatorname{Var}(X_n)$ if $X_1,\dots,X_n$ are independent.

Variance of some distributions of a discrete random variable

Template:Colored proposition

Proof.

  • $\mathbb{E}[X^2]=0\cdot\mathbb{P}(X=0)+1\cdot\underbrace{\mathbb{P}(X^2=1)}_{\mathbb{P}(X=1)}=p$, since $X$ is nonnegative.
  • It follows that $\operatorname{Var}(X)=\mathbb{E}[X^2]-(\mathbb{E}[X])^2=p-p^2=p(1-p)$.
  • Similar to the proof for the means of the Bernoulli and binomial r.v.'s, $Y=X_1+\cdots+X_n$, in which $X_1,\dots,X_n$ are i.i.d. and follow $\operatorname{Ber}(p)$.
  • Because of the Template:Colored em (from the i.i.d. property), $\operatorname{Var}(Y)=\underbrace{\operatorname{Var}(X_1)+\cdots+\operatorname{Var}(X_n)}_{n\text{ times}}=np(1-p)$.

Template:Colored proposition

Proof.

  • \[\mathbb{E}[X^2]=\sum_{k=0}^\infty k^2\underbrace{\left(\frac{\lambda^ke^{-\lambda}}{k!}\right)}_{\mathbb{P}(X=k)}=\lambda\Bigg(0+\sum_{k=1}^\infty\frac{k\,\lambda^{k-1}e^{-\lambda}}{(k-1)!}\Bigg)=\lambda\Bigg(\underbrace{\sum_{j=0}^\infty j\cdot\frac{e^{-\lambda}\lambda^{j}}{j!}}_{\mathbb{E}[X]=\lambda}+\underbrace{\sum_{j=0}^\infty\frac{e^{-\lambda}\lambda^{j}}{j!}}_{=1}\Bigg)=\lambda(\lambda+1),\] where we reindex with $j=k-1$ (so that $k=j+1$).
  • Hence, $\operatorname{Var}(X)=\mathbb{E}[X^2]-(\mathbb{E}[X])^2=\lambda(\lambda+1)-\lambda^2=\lambda$.

Template:Colored proposition

Proof.

  • Since

\[
\mathbb{E}[X^2]=\sum_{k=0}^\infty k^2\underbrace{(1-p)^kp}_{\mathbb{P}(X=k)}
=\sum_{k=0}^\infty((k-1)+1)^2(1-p)^kp
=\sum_{k=0}^\infty(k-1)^2(1-p)^kp+\sum_{k=0}^\infty 2(k-1)(1-p)^kp+\underbrace{\sum_{k=0}^\infty(1-p)^kp}_{=1}
=\underbrace{(0-1)^2(1-p)^0p}_{=p}+(1-p)\sum_{j=0}^\infty j^2(1-p)^{j}p+\underbrace{2(0-1)(1-p)^0p}_{=-2p}+2(1-p)\sum_{j=0}^\infty j(1-p)^{j}p+1
=p+(1-p)\mathbb{E}[X^2]-2p+2(1-p)\underbrace{\mathbb{E}[X]}_{(1-p)/p}+1
=(1-p)\mathbb{E}[X^2]+\frac{2(1-p)^2}{p}+1-p,
\]

  • it follows that $p\,\mathbb{E}[X^2]=\frac{2(1-p)^2}{p}+1-p\implies\mathbb{E}[X^2]=\frac{2(1-p)^2+p(1-p)}{p^2}$.
  • Hence, $\operatorname{Var}(X)=\mathbb{E}[X^2]-(\mathbb{E}[X])^2=\frac{2(1-p)^2+p(1-p)}{p^2}-\frac{(1-p)^2}{p^2}=\frac{(1-p)^2+p(1-p)}{p^2}=\frac{(1-p)(1-p+p)}{p^2}=\frac{1-p}{p^2}$.
  • Similarly, $Y=X_1+\cdots+X_k$, in which $X_1,\dots,X_k$ are i.i.d. and follow $\operatorname{Geo}(p)$ [5].
  • Because of the independence, $\operatorname{Var}(Y)=\operatorname{Var}(X_1)+\cdots+\operatorname{Var}(X_k)=\underbrace{\frac{1-p}{p^2}+\cdots+\frac{1-p}{p^2}}_{k\text{ times}}=\frac{k(1-p)}{p^2}$.

Variance of some distributions of a continuous random variable

Template:Colored proposition

Proof. Var(X)=๐”ผ[X2](๐”ผ[X])2=abx2badx(b+a2)2=1ba(b3/3a3/3)(a+b2)2=13(ba)(b3a3)(a+b2)2=13(ba)(ba)(b2+ba+a2)a2+2ab+b24=4b2+4ab+4a23b262ab3a212=b22ab+a212=(ba)212.

Template:Colored proposition

Proof.

  • Similarly, it suffices to prove the formula for variance of gamma r.v.'s.
  • \[\mathbb{E}[X^2]=\int_0^\infty x^2\cdot\frac{\lambda^\alpha x^{\alpha-1}e^{-\lambda x}}{\Gamma(\alpha)}\,dx=\frac{(\alpha+1)\alpha}{\lambda^2}\underbrace{\int_0^\infty\frac{\lambda^{\alpha+2}x^{(\alpha+2)-1}e^{-\lambda x}}{\Gamma(\alpha+2)}\,dx}_{=F(\infty)=1,\ F\text{ is the cdf of }\operatorname{Gamma}(\alpha+2,\lambda)}=\frac{(\alpha+1)\alpha}{\lambda^2}.\]
  • It follows that $\operatorname{Var}(X)=\mathbb{E}[X^2]-(\mathbb{E}[X])^2=\frac{(\alpha+1)\alpha}{\lambda^2}-\frac{\alpha^2}{\lambda^2}=\frac{\alpha}{\lambda^2}$.
  • Since $\operatorname{Exp}(\lambda)=\operatorname{Gamma}(1,\lambda)$, $\operatorname{Var}(Y)=1/\lambda^2$ by substituting $\alpha=1$.
  • Since $\chi_\nu^2=\operatorname{Gamma}(\nu/2,1/2)$, $\operatorname{Var}(Z)=\frac{\nu/2}{(1/2)^2}=2\nu$ by substituting $\alpha=\nu/2$ and $\lambda=1/2$.

Template:Colored proposition

Proof.

  • \[\mathbb{E}[X^2]=\int_0^1 x^2\cdot\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}(1-x)^{\beta-1}\,dx=\frac{(\alpha+1)\alpha}{(\alpha+\beta+1)(\alpha+\beta)}\underbrace{\int_0^1\frac{\Gamma(\alpha+\beta+2)}{\Gamma(\alpha+2)\Gamma(\beta)}x^{(\alpha+2)-1}(1-x)^{\beta-1}\,dx}_{=F(1)=1,\ F\text{ is the cdf of }\operatorname{Beta}(\alpha+2,\beta)}=\frac{(\alpha+1)\alpha}{(\alpha+\beta+1)(\alpha+\beta)}.\]
  • It follows that \[\operatorname{Var}(X)=\mathbb{E}[X^2]-(\mathbb{E}[X])^2=\frac{(\alpha+1)\alpha}{(\alpha+\beta+1)(\alpha+\beta)}-\frac{\alpha^2}{(\alpha+\beta)^2}=\frac{(\alpha+1)\alpha(\alpha+\beta)-\alpha^2(\alpha+\beta+1)}{(\alpha+\beta)^2(\alpha+\beta+1)}=\frac{\alpha(\alpha^2+\alpha\beta+\alpha+\beta-\alpha^2-\alpha\beta-\alpha)}{(\alpha+\beta)^2(\alpha+\beta+1)}=\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}.\]

Template:Colored proposition

Proof. It follows from the proposition about the undefined mean of Cauchy r.v.'s and the formula $\operatorname{Var}(X)=\mathbb{E}[X^2]-(\mathbb{E}[X])^2$ (a term minus an undefined term is undefined).

Template:Colored proposition

Proof.

  • Let $Z=\frac{X-\mu}{\sigma}\sim\mathcal{N}(0,1)$.
  • \[\mathbb{E}[Z^2]=\int_{-\infty}^\infty x^2\varphi(x)\,dx=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty x^2e^{-x^2/2}\,dx=-\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty x\,d\big(e^{-x^2/2}\big)=-\frac{1}{\sqrt{2\pi}}\left(\Big[xe^{-x^2/2}\Big]_{-\infty}^{\infty}-\int_{-\infty}^\infty e^{-x^2/2}\,dx\right)\text{ by integration by parts}=-\frac{1}{\sqrt{2\pi}}\left(\underbrace{0-0}_{\substack{\text{since the exponential decays much faster}\\ \text{than linear growth, or by L'Hôpital's rule}}}-\int_{-\infty}^\infty e^{-x^2/2}\,dx\right)=\underbrace{\int_{-\infty}^\infty\varphi(x)\,dx}_{=\Phi(\infty)=1}=1.\]
  • It follows that $\operatorname{Var}(Z)=\mathbb{E}[Z^2]-(\mathbb{E}[Z])^2=1-0=1$.
  • Hence, $\operatorname{Var}(X)=\operatorname{Var}(\sigma Z+\mu)=\sigma^2\operatorname{Var}(Z)=\sigma^2$.

Template:Colored exercise

Coefficient of variation

Template:Colored definition Template:Colored remark Template:Colored example Template:Colored remark

Quantile

Then, we will discuss Template:Colored em. In particular, Template:Colored em and Template:Colored em range are quite related to Template:Colored em. Template:Colored definition Template:Colored remark The following are some terminologies related to Template:Colored em. Template:Colored definition Template:Colored example Template:Colored definition Template:Colored definition Template:Colored example Template:Colored definition Template:Colored em and Template:Colored em measure centrality and dispersion respectively. Recall that Template:Colored em and Template:Colored em measure the same things respectively. One advantage of Template:Colored em and Template:Colored em is Template:Colored em, since they are always defined, while Template:Colored em and Template:Colored em can be infinite, and they fail to measure centrality and dispersion on those occasions. However, Template:Colored em and Template:Colored em also have some disadvantages, e.g. they may be more difficult to compute, and may not be very accurate. Template:Colored example Template:Colored exercise

Mode

Mode is another measure of centrality. Template:Colored definition Template:Colored remark Template:Colored example Template:Colored remark

Covariance and correlation coefficients

In this section, we will discuss two important properties of Template:Colored em distributions, namely Template:Colored em and Template:Colored em. As we will see, covariance is related to variance in some sense, and the correlation coefficient is closely related to correlation. Template:Colored definition Template:Colored definition Both Template:Colored em and Template:Colored em measure Template:Colored em between $X$ and $Y$. As we will see, $\rho(X,Y)\in[-1,1]$, $X$ and $Y$ are more highly correlated as $|\rho(X,Y)|$ increases, and $X$ has a linear relationship with $Y$ if $|\rho(X,Y)|=1$.

Template:Colored proposition

Proof.

(i) \[\operatorname{Cov}(X,Y)=\mathbb{E}[(X-\mathbb{E}[X])(Y-\mathbb{E}[Y])]=\mathbb{E}[(Y-\mathbb{E}[Y])(X-\mathbb{E}[X])]=\operatorname{Cov}(Y,X)\] (ii) \[\operatorname{Cov}(X,X)=\mathbb{E}[(X-\mathbb{E}[X])(X-\mathbb{E}[X])]=\mathbb{E}[(X-\mathbb{E}[X])^2]=\operatorname{Var}(X)\] (iii) \[\operatorname{Cov}(X,Y)=\mathbb{E}[(X-\mathbb{E}[X])(Y-\mathbb{E}[Y])]=\mathbb{E}[XY-X\mathbb{E}[Y]-Y\mathbb{E}[X]+\mathbb{E}[X]\mathbb{E}[Y]]\overset{\text{linearity}}{=}\mathbb{E}[XY]-\mathbb{E}[Y]\mathbb{E}[X]-\mathbb{E}[X]\mathbb{E}[Y]+\mathbb{E}[X]\mathbb{E}[Y]=\mathbb{E}[XY]-\mathbb{E}[X]\mathbb{E}[Y]\] (iv) \[\operatorname{Cov}\Big(\sum_{i=1}^n(a_iX_i+c),\sum_{j=1}^m(b_jY_j+d)\Big)=\mathbb{E}\Big[\Big(\sum_{i=1}^n(a_iX_i+c)-\sum_{i=1}^n\mathbb{E}[a_iX_i+c]\Big)\Big(\sum_{j=1}^m(b_jY_j+d)-\sum_{j=1}^m\mathbb{E}[b_jY_j+d]\Big)\Big]=\mathbb{E}\Big[\sum_{i=1}^n(a_iX_i-\mathbb{E}[a_iX_i])\sum_{j=1}^m(b_jY_j-\mathbb{E}[b_jY_j])\Big]=\mathbb{E}\Big[\sum_{i=1}^n\sum_{j=1}^m(a_iX_i-\mathbb{E}[a_iX_i])(b_jY_j-\mathbb{E}[b_jY_j])\Big]\overset{\text{linearity}}{=}\sum_{i=1}^n\sum_{j=1}^m\mathbb{E}[(a_iX_i-a_i\mathbb{E}[X_i])(b_jY_j-b_j\mathbb{E}[Y_j])]=\sum_{i=1}^n\sum_{j=1}^m a_ib_j\,\mathbb{E}[(X_i-\mathbb{E}[X_i])(Y_j-\mathbb{E}[Y_j])]=\sum_{i=1}^n\sum_{j=1}^m a_ib_j\operatorname{Cov}(X_i,Y_j)\] (v) \[\operatorname{Var}\Big(\sum_{i=1}^nX_i\Big)\overset{\text{(ii)}}{=}\operatorname{Cov}\Big(\sum_{i=1}^nX_i,\sum_{j=1}^nX_j\Big)\overset{\text{(iv)}}{=}\sum_{i=1}^n\sum_{j=1}^n\operatorname{Cov}(X_i,X_j)=\sum_{1\le i=j\le n}\operatorname{Cov}(X_i,X_j)+\sum_{1\le i\ne j\le n}\operatorname{Cov}(X_i,X_j)\overset{\text{(ii)}}{=}\sum_{i=1}^n\operatorname{Var}(X_i)+\sum_{1\le i<j\le n}\operatorname{Cov}(X_i,X_j)+\sum_{1\le j<i\le n}\operatorname{Cov}(X_i,X_j)\overset{\text{(i)}}{=}\sum_{i=1}^n\operatorname{Var}(X_i)+2\sum_{1\le i<j\le n}\operatorname{Cov}(X_i,X_j)\]
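Property (iii), $\operatorname{Cov}(X,Y)=\mathbb{E}[XY]-\mathbb{E}[X]\mathbb{E}[Y]$, can be checked exactly on a small example; the joint pmf below is arbitrary.

```python
# Exact check of Cov(X, Y) = E[XY] - E[X]E[Y] on a small,
# arbitrary joint pmf f(x, y).
f = {(0, 1): 0.25, (1, 1): 0.25, (0, 2): 0.10, (1, 2): 0.40}

E_X = sum(x * p for (x, y), p in f.items())
E_Y = sum(y * p for (x, y), p in f.items())
E_XY = sum(x * y * p for (x, y), p in f.items())
cov_def = sum((x - E_X) * (y - E_Y) * p for (x, y), p in f.items())

assert abs(cov_def - (E_XY - E_X * E_Y)) < 1e-12
```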

Then, we will discuss Template:Colored em. The following is the definition of the Template:Colored em between two random variables. Template:Colored definition Template:Colored remark Covariance and the correlation coefficient are Template:Colored em, but they have differences. In particular, $\operatorname{Cov}(X,Y)$ depends on Template:Colored em of $X$ and $Y$, not just their relationship. Thus, this number is affected by the variances, and does not measure their relationship accurately. On the other hand, $\rho(X,Y)$ Template:Colored em for Template:Colored em of $X$ and $Y$, and therefore measures their relationship more Template:Colored em.

The following is one of the most important properties of correlation coefficient. Template:Colored proposition

Proof. For random variables $X$ and $Y$:

Template:Colored em: we prove that \[-1\le\rho(X,Y)\le 1\iff\left|\frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}}\right|\le 1.\] To get rid of the square root and make the proof neater, we square both sides of the inequality, and get \[\frac{\operatorname{Cov}(X,Y)^2}{\operatorname{Var}(X)\operatorname{Var}(Y)}\le 1\iff\operatorname{Cov}(X,Y)^2\le\operatorname{Var}(X)\operatorname{Var}(Y)\iff\operatorname{Var}(X)-\frac{\operatorname{Cov}(X,Y)^2}{\operatorname{Var}(Y)}\ge 0.\]

Recall that $\operatorname{Var}(\cdot)\ge 0$. So, one way to prove the rightmost inequality is to express its left side as $\operatorname{Var}(\cdot)$, as follows: \[\operatorname{Var}(X)-\frac{\operatorname{Cov}(X,Y)^2}{\operatorname{Var}(Y)}=\operatorname{Var}(X)+\left(\frac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(Y)}\right)^2\operatorname{Var}(Y)-2\cdot\frac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(Y)}\cdot\operatorname{Cov}(X,Y)\overset{\text{(iv),(v)}}{=}\operatorname{Var}\left(X-\frac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(Y)}\,Y\right)\ge 0.\] Thus, the result follows.

Template:Colored remark Then, we will define several terminologies related to correlation coefficient. Template:Colored definition

Then, we will state an important result that is related to independence and correlation. Intuitively, you may think that 'independent' is the same as 'uncorrelated'. However, this is wrong. Indeed, 'independent' is Template:Colored em than 'uncorrelated'. Template:Colored proposition

Proof. For independent random variables $X,Y$ with means $\mu,\nu$ respectively, \[\operatorname{Cov}(X,Y)=\mathbb{E}[(X-\mu)(Y-\nu)]\overset{\text{independence}}{=}\mathbb{E}[X-\mu]\,\mathbb{E}[Y-\nu]\overset{\text{linearity}}{=}(\underbrace{\mathbb{E}[X]}_{\mu}-\mu)(\underbrace{\mathbb{E}[Y]}_{\nu}-\nu)=0.\]

However, the converse is Template:Colored em true, as we will see in the following example. Template:Colored example Template:Colored exercise Template:Nav
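One standard example of an uncorrelated-but-dependent pair (stated here as an illustration; it need not coincide with the example above) is $X$ uniform on $\{-1,0,1\}$ with $Y=X^2$: $Y$ is a function of $X$, yet $\operatorname{Cov}(X,Y)=\mathbb{E}[X^3]-\mathbb{E}[X]\mathbb{E}[X^2]=0$. A short Python sketch with exact arithmetic:

```python
from fractions import Fraction

# X uniform on {-1, 0, 1}, Y = X^2: clearly dependent (Y is a
# function of X), yet Cov(X, Y) = E[X^3] - E[X] E[X^2] = 0.
p = Fraction(1, 3)
support = [-1, 0, 1]

E_X = sum(x * p for x in support)           # = 0
E_Y = sum(x * x * p for x in support)       # = 2/3
E_XY = sum(x * x * x * p for x in support)  # E[X * X^2] = E[X^3] = 0

cov = E_XY - E_X * E_Y
assert cov == 0
# Dependence: P(Y = 0 | X = 0) = 1, while P(Y = 0) = 1/3.
```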

  1. ↑ Each of the Bernoulli r.v.'s acts as an indicator for the success of the corresponding trial. Since there are n independent Bernoulli trials, there are n such indicators.
  2. ↑ Each geometric r.v. counts the number of failures before the corresponding success.
  3. ↑ This probability is unconditional; hence the corresponding mean is also unconditional, so that their sum is also an unconditional mean (as in the proposition).
  4. ↑ $X_1,\dots,X_n$ are Template:Colored em, but we can still use the linearity of expectation, since it does not require independence.
  5. ↑ Each geometric r.v. counts the number of failures before the corresponding success.