Probability/Properties of Distributions
Introduction
Recall that the pdf (or cdf) describes the random behaviour of a random variable completely. However, we may sometimes find the pdf (or cdf) too complicated, and only want to know some partial information about the random variable. In view of this, we study in this chapter some properties of distributions, which provide partial descriptions of the random behaviour of the random variable.
Some examples of such partial descriptions include
- location (e.g. is the pdf 'located' to the left or to the right?),
- dispersion (e.g. is the pdf 'sharp' or 'flat'?),
- skewness (e.g. is the pdf symmetric, skewed to the left, or skewed to the right?), and
- tail behaviour (e.g. does the pdf have 'light' or 'heavy' tails?).
We can describe these features verbally, but such descriptions are quite subjective and inaccurate. To give a more objective and accurate measure, we evaluate them numerically, using quantitative measures derived from the pdf (or cdf) of the random variable.
We will discuss some of these quantitative measures in this chapter. Among them, the expectation is the most important one, since many of the other properties are based on the concept of expectation.
Expectation
We have several alternative names for expectation, e.g. expected value and mean. For a discrete random variable X with pmf p, the expectation is defined as E[X] = Σ_x x p(x); for a continuous random variable X with pdf f, it is E[X] = ∫ x f(x) dx (provided the sum or integral converges absolutely). In the following, we introduce a useful result, often called the fundamental bridge, that gives the relationship between expectation and probability: for every event A, P(A) = E[1_A], where 1_A is the indicator random variable of A. We can use this result to ease the computation of probabilities via expectations.
Proof. Let 1_A be the indicator random variable of the event A. Since 1_A takes only the values 0 and 1 (so it is a discrete random variable), E[1_A] = 1 · P(1_A = 1) + 0 · P(1_A = 0) = P(A).
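The bridge can be checked numerically. Below is a minimal sketch (Python, standard library only; the event and the trial count are chosen arbitrarily for illustration) that estimates E[1_A] by simulation and compares it with P(A):

```python
import random

random.seed(0)

# Event A: a fair six-sided die shows 5 or 6, so P(A) = 2/6 = 1/3.
n_trials = 100_000
indicator_sum = sum(1 for _ in range(n_trials) if random.randint(1, 6) >= 5)

estimated_expectation = indicator_sum / n_trials  # estimates E[1_A]
true_probability = 1 / 3                          # P(A)

assert abs(estimated_expectation - true_probability) < 0.01
```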
When there are multiple random variables involved, we may derive the pmf or pdf of the transformed random variable first to compute the expectation, but it can be quite difficult and complicated to do so. Practically, we use the following theorem, often called the law of the unconscious statistician, more often: for a function g, E[g(X, Y)] = Σ_x Σ_y g(x, y) p(x, y) when X and Y are discrete with joint pmf p, and E[g(X, Y)] = ∫∫ g(x, y) f(x, y) dx dy when X and Y are continuous with joint pdf f (an analogous result holds for a single random variable). The proof is quite complicated, and hence we skip it. In the following, we will introduce several properties of expectation that can help us to simplify computations of the expectation.
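To see the theorem in action, here is a small sketch with an invented joint pmf (the values are made up purely for illustration), computing E[g(X, Y)] both directly from the joint pmf and the long way, by first deriving the pmf of g(X, Y):

```python
from fractions import Fraction

# A toy joint pmf for (X, Y); the values are invented for illustration.
joint_pmf = {
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (1, 2): Fraction(1, 2),
}

def g(x, y):
    return x * y

# E[g(X, Y)] computed directly from the joint pmf (the theorem's statement).
lotus = sum(g(x, y) * p for (x, y), p in joint_pmf.items())

# E[g(X, Y)] computed the long way: first derive the pmf of Z = g(X, Y).
pmf_z = {}
for (x, y), p in joint_pmf.items():
    pmf_z[g(x, y)] = pmf_z.get(g(x, y), 0) + p
direct = sum(z * p for z, p in pmf_z.items())

assert lotus == direct == Fraction(5, 4)
```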
Proof.
- additivity: for continuous random variables X and Y with joint pdf f, E[X + Y] = ∫∫ (x + y) f(x, y) dx dy = ∫∫ x f(x, y) dx dy + ∫∫ y f(x, y) dx dy = E[X] + E[Y]. Similarly, for discrete random variables X and Y with joint pmf p, E[X + Y] = Σ_x Σ_y (x + y) p(x, y) = E[X] + E[Y].
- linearity: for a continuous random variable X with pdf f and constants a and b, E[aX + b] = ∫ (ax + b) f(x) dx = a ∫ x f(x) dx + b ∫ f(x) dx = a E[X] + b. Similarly, for a discrete random variable X with pmf p, E[aX + b] = Σ_x (ax + b) p(x) = a E[X] + b.
- monotonicity: for random variables X ≤ Y that are either both discrete or both continuous, Y − X ≥ 0, so E[Y] − E[X] = E[Y − X] ≥ 0, since the expectation of a nonnegative random variable is nonnegative.
- multiplicativity under independence: for independent continuous random variables X and Y, E[XY] = ∫∫ xy f_X(x) f_Y(y) dx dy = (∫ x f_X(x) dx)(∫ y f_Y(y) dy) = E[X] E[Y]. Similarly, for independent discrete random variables X and Y, E[XY] = Σ_x Σ_y xy p_X(x) p_Y(y) = E[X] E[Y].
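The linearity and additivity properties can be verified exactly on small pmfs; the pmfs and constants below are invented for illustration, and independence is used only to build the joint pmf:

```python
from fractions import Fraction
from itertools import product

# Invented marginal pmfs for independent X and Y, for illustration only.
pmf_x = {0: Fraction(1, 2), 1: Fraction(1, 2)}
pmf_y = {1: Fraction(1, 3), 4: Fraction(2, 3)}

def expect(pmf):
    return sum(v * p for v, p in pmf.items())

# Linearity: E[aX + b] = a E[X] + b.
a, b = 3, 7
e_ax_b = sum((a * x + b) * p for x, p in pmf_x.items())
assert e_ax_b == a * expect(pmf_x) + b

# Additivity: E[X + Y] = E[X] + E[Y].
e_sum = sum((x + y) * px * py
            for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()))
assert e_sum == expect(pmf_x) + expect(pmf_y)
```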
Mean of some distributions of a discrete random variable
Proof.
- For X ~ Ber(p), E[X] = 1 · p + 0 · (1 − p) = p.
- For X ~ Bin(n, p), write X = X_1 + ⋯ + X_n, in which X_1, …, X_n are i.i.d. and follow Ber(p) [1]. By linearity of expectation, E[X] = E[X_1] + ⋯ + E[X_n] = np.
Proof.
- Since X = I_1 + ⋯ + I_n, in which I_1, …, I_n follow Ber(K/N) (each of the Bernoulli r.v.'s indicates whether the corresponding draw of a ball is of type 1, which has probability K/N without knowing the results of the other draws [3], since each draw is equally likely to be any one of the N balls) [4],
- it follows that E[X] = E[I_1] + ⋯ + E[I_n] = n · K/N.
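This identity can be checked by simulation. In the sketch below, the parameter names N (total number of balls), K (number of type-1 balls) and n (number of draws without replacement) are our own labels, and the values are chosen arbitrarily:

```python
import random

random.seed(1)

# Hypothetical parameters: N balls, K of type 1, n draws without replacement.
N, K, n = 20, 8, 5

def draw_type1_count():
    balls = [1] * K + [0] * (N - K)   # 1 marks a type-1 ball
    return sum(random.sample(balls, n))

trials = 50_000
estimate = sum(draw_type1_count() for _ in range(trials)) / trials
exact = n * K / N  # the claimed mean

assert abs(estimate - exact) < 0.05
```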
Mean of some distributions of a continuous random variable
We will introduce the formulas for the mean of some distributions of a continuous random variable, which are relatively simple.
Proof. For X ~ U(a, b), E[X] = ∫_a^b x · 1/(b − a) dx = (b² − a²)/(2(b − a)) = (a + b)/2.
Proof.
- It suffices to prove the formula for the mean of gamma r.v.'s, since exponential and chi-squared r.v.'s are essentially special cases of gamma r.v.'s, and thus we can simply substitute the appropriate parameter values into the formula for the mean of gamma r.v.'s to obtain the formulas for them.
- For X ~ Gamma(α, λ) (with rate parameter λ), E[X] = ∫_0^∞ x · λ^α x^(α−1) e^(−λx)/Γ(α) dx = Γ(α + 1)/(λ Γ(α)) = α/λ, using Γ(α + 1) = α Γ(α).
- Since Exp(λ) is Gamma(1, λ), E[X] = 1/λ, by substituting α = 1.
- Since χ²_ν is Gamma(ν/2, 1/2), E[X] = ν, by substituting α = ν/2 and λ = 1/2.
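The gamma-mean formula can be checked by simulation. Note that Python's random.gammavariate takes a shape and a *scale* parameter, so for a distribution parameterized by shape α and rate λ the scale to pass is 1/λ; the parameter values below are arbitrary:

```python
import random

random.seed(2)

alpha, rate = 3.0, 2.0   # shape and rate; the mean should be alpha / rate
trials = 100_000

# random.gammavariate(shape, scale): pass 1/rate as the scale.
sample_mean = sum(random.gammavariate(alpha, 1 / rate)
                  for _ in range(trials)) / trials

assert abs(sample_mean - alpha / rate) < 0.02
```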
Proof.
- We use an approach similar to that in the previous proof.
Proof. For a standard Cauchy r.v. X, E[|X|] = ∫_{−∞}^{∞} |x| · 1/(π(1 + x²)) dx = (2/π) ∫_0^∞ x/(1 + x²) dx = (1/π) [ln(1 + x²)]_0^∞ = ∞, so the mean of X is undefined.
Proof.
- Let z = (x − μ)/σ, so that x = μ + σz and dx = σ dz.
- It follows that E[X] = ∫_{−∞}^{∞} x · 1/(σ√(2π)) e^(−(x−μ)²/(2σ²)) dx = ∫_{−∞}^{∞} (μ + σz) · 1/√(2π) e^(−z²/2) dz = μ + σ · 0 = μ, since ∫ z e^(−z²/2) dz = 0 by the symmetry of the integrand.
Examples
Let us illustrate the usefulness of the fundamental bridge between probability and expectation by giving a proof of the inclusion–exclusion principle using this bridge.
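The indicator identity underlying such a proof, 1_{A∪B} = 1_A + 1_B − 1_{A∩B}, can be checked exhaustively on a small sample space; the events below are chosen arbitrarily:

```python
from fractions import Fraction

# Sample space: outcomes 0..9; two arbitrary events as sets.
omega = range(10)
A = {0, 1, 2, 3}
B = {2, 3, 4, 5, 6}

def indicator(event, outcome):
    return 1 if outcome in event else 0

# Check 1_{A∪B} = 1_A + 1_B − 1_{A∩B} pointwise; taking expectations of
# both sides (the fundamental bridge) then yields P(A∪B) = P(A) + P(B) − P(A∩B).
for w in omega:
    assert indicator(A | B, w) == indicator(A, w) + indicator(B, w) - indicator(A & B, w)

# Under the uniform probability on omega (exact rational arithmetic):
p = lambda event: Fraction(len(event), 10)
assert p(A | B) == p(A) + p(B) - p(A & B)
```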
Probability generating functions
An application of expectation is the probability generating function (pgf). As suggested by its name, it can generate probabilities in some sense: for a random variable X taking values in the nonnegative integers, the pgf is defined as G(s) = E[s^X], and the probabilities can be recovered from it via P(X = k) = G^(k)(0)/k!.
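As a quick sketch (assuming the definition G(s) = E[s^X] for a nonnegative-integer-valued X), the pgf of a Bin(2, 1/2) r.v. evaluated at a few points recovers probabilities and the mean:

```python
from fractions import Fraction

# pmf of X ~ Bin(2, 1/2): P(X=0) = 1/4, P(X=1) = 1/2, P(X=2) = 1/4.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def G(s):
    """Probability generating function G(s) = E[s^X]."""
    return sum(p * s**k for k, p in pmf.items())

assert G(1) == 1       # G(1) = total probability
assert G(0) == pmf[0]  # G(0) = P(X = 0)

# G'(1) = E[X]: differentiate E[s^X] term by term and evaluate at s = 1.
mean = sum(p * k for k, p in pmf.items())
derivative_at_1 = sum(p * k * 1**(k - 1) for k, p in pmf.items())
assert derivative_at_1 == mean == 1
```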
Variance (and standard deviation)
Indeed, variance is a special case of expectation, and is related to expectation in some sense: the variance of X is defined as Var(X) = E[(X − μ)²], where μ = E[X]. Since (X − μ)² is the squared deviation of the value of X from its mean, in view of this definition, we can see that variance measures the dispersion (or spread) of a distribution, since it is what we would expect of the squared deviation if we were to take an observation of the random variable.
Another closely related term is the standard deviation, which is defined as the nonnegative square root of the variance.
Proof.
- alternative expression for variance:
- Let μ = E[X] for a clearer expression. Then
Var(X) = E[(X − μ)²] = E[X² − 2μX + μ²] = E[X²] − 2μ E[X] + μ² = E[X²] − μ², and the result follows.
- invariance under change in location parameter: Var(X + c) = E[(X + c − (μ + c))²] = E[(X − μ)²] = Var(X).
- nonnegativity: it follows from the fact that (X − μ)² ≥ 0, so its expectation is also nonnegative.
- zero variance implies non-randomness:
- Let μ = E[X] for a clearer expression. Consider the event A_n = {|X − μ| > 1/n}, in which n is a positive integer.
- Since 0 = Var(X) = E[(X − μ)²] ≥ E[(X − μ)² 1_{A_n}] ≥ (1/n²) P(A_n),
- we have P(A_n) = 0 for every positive integer n.
- Thus, P(|X − μ| > 0) = P(⋃_n A_n) ≤ Σ_n P(A_n) = 0, and hence P(X = μ) = 1.
- additivity under independence:
- For random variables X and Y that are independent with means μ_X and μ_Y respectively,
Var(X + Y) = E[(X − μ_X + Y − μ_Y)²] = E[(X − μ_X)²] + 2 E[(X − μ_X)(Y − μ_Y)] + E[(Y − μ_Y)²] = Var(X) + 0 + Var(Y), since independence gives E[(X − μ_X)(Y − μ_Y)] = E[X − μ_X] E[Y − μ_Y] = 0. Thus, inductively, Var(X_1 + ⋯ + X_n) = Var(X_1) + ⋯ + Var(X_n) if X_1, …, X_n are independent.
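These properties can be verified exactly on small pmfs (invented for illustration), using exact rational arithmetic:

```python
from fractions import Fraction
from itertools import product

# Invented pmfs for independent X and Y, for illustration only.
pmf_x = {0: Fraction(1, 4), 2: Fraction(1, 2), 5: Fraction(1, 4)}
pmf_y = {1: Fraction(1, 2), 3: Fraction(1, 2)}

def expect(pmf):
    return sum(v * p for v, p in pmf.items())

def var(pmf):
    mu = expect(pmf)
    return sum((v - mu) ** 2 * p for v, p in pmf.items())

# Alternative expression: Var(X) = E[X^2] - (E[X])^2.
assert var(pmf_x) == sum(v**2 * p for v, p in pmf_x.items()) - expect(pmf_x) ** 2

# Invariance under a location shift: Var(X + 10) = Var(X).
shifted = {v + 10: p for v, p in pmf_x.items()}
assert var(shifted) == var(pmf_x)

# Additivity under independence: build the pmf of X + Y from the product pmf.
pmf_sum = {}
for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
    pmf_sum[x + y] = pmf_sum.get(x + y, Fraction(0)) + px * py
assert var(pmf_sum) == var(pmf_x) + var(pmf_y)
```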
Variance of some distributions of a discrete random variable
Proof.
- For X ~ Ber(p), X² = X, since X takes only the values 0 and 1; hence E[X²] = E[X] = p.
- It follows that Var(X) = E[X²] − (E[X])² = p − p² = p(1 − p).
- Similar to the proof for the mean of Bernoulli and binomial r.v.'s, write X = X_1 + ⋯ + X_n, in which X_1, …, X_n are i.i.d. and follow Ber(p).
- Because of the additivity of variance under independence (from the i.i.d. property), Var(X) = Var(X_1) + ⋯ + Var(X_n) = np(1 − p).
Proof.
- For X ~ Pois(λ), E[X(X − 1)] = Σ_{k=2}^∞ k(k − 1) e^(−λ) λ^k/k! = λ² Σ_{k=2}^∞ e^(−λ) λ^(k−2)/(k − 2)! = λ².
- Hence, Var(X) = E[X(X − 1)] + E[X] − (E[X])² = λ² + λ − λ² = λ.
Proof.
- Since E[X(X − 1)] = Σ_{k=0}^∞ k(k − 1)(1 − p)^k p = 2(1 − p)²/p²,
- it follows that E[X²] = E[X(X − 1)] + E[X] = 2(1 − p)²/p² + (1 − p)/p.
- Hence, Var(X) = E[X²] − (E[X])² = 2(1 − p)²/p² + (1 − p)/p − (1 − p)²/p² = (1 − p)/p².
- Similarly, write X = X_1 + ⋯ + X_r, in which X_1, …, X_r are i.i.d., and follow Geo(p) [5].
- Because of the independence, Var(X) = Var(X_1) + ⋯ + Var(X_r) = r(1 − p)/p².
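Assuming the parameterization in which X counts the failures before the first success (each trial succeeding with probability p), the geometric variance formula can be checked by simulation:

```python
import random

random.seed(3)

p = 0.4  # success probability; X counts failures before the first success

def geometric_failures():
    count = 0
    while random.random() >= p:  # keep going while the trial fails
        count += 1
    return count

trials = 100_000
samples = [geometric_failures() for _ in range(trials)]
mean = sum(samples) / trials
variance = sum((s - mean) ** 2 for s in samples) / trials

assert abs(mean - (1 - p) / p) < 0.05        # E[X] = (1 - p)/p
assert abs(variance - (1 - p) / p**2) < 0.15  # Var(X) = (1 - p)/p^2
```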
Variance of some distributions of a continuous random variable
Proof. For X ~ U(a, b), E[X²] = ∫_a^b x²/(b − a) dx = (b³ − a³)/(3(b − a)) = (a² + ab + b²)/3, so Var(X) = (a² + ab + b²)/3 − ((a + b)/2)² = (b − a)²/12.
Proof.
- Similarly, it suffices to prove the formula for the variance of gamma r.v.'s.
- For X ~ Gamma(α, λ), E[X²] = ∫_0^∞ x² · λ^α x^(α−1) e^(−λx)/Γ(α) dx = Γ(α + 2)/(λ² Γ(α)) = α(α + 1)/λ².
- It follows that Var(X) = α(α + 1)/λ² − (α/λ)² = α/λ².
- Since Exp(λ) is Gamma(1, λ), Var(X) = 1/λ², by substituting α = 1.
- Since χ²_ν is Gamma(ν/2, 1/2), Var(X) = 2ν, by substituting α = ν/2 and λ = 1/2.
Proof.
- For X ~ Beta(α, β), E[X²] = B(α + 2, β)/B(α, β) = α(α + 1)/((α + β)(α + β + 1)), by an approach similar to that in the proof for the mean.
- It follows that Var(X) = α(α + 1)/((α + β)(α + β + 1)) − (α/(α + β))² = αβ/((α + β)²(α + β + 1)).
Proof. It follows from the proposition about the undefined mean of Cauchy r.v.'s and the formula Var(X) = E[X²] − (E[X])² (an arbitrary term minus an undefined term is undefined).
Proof.
- Let z = (x − μ)/σ, so that x − μ = σz and dx = σ dz.
- It follows that E[(X − μ)²] = ∫_{−∞}^{∞} (x − μ)² · 1/(σ√(2π)) e^(−(x−μ)²/(2σ²)) dx = σ² ∫_{−∞}^{∞} z² · 1/√(2π) e^(−z²/2) dz = σ² · 1, using integration by parts for the last integral.
- Hence, Var(X) = σ².
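Both the mean and the variance formulas for the normal distribution can be checked by simulation with Python's random.gauss; the parameter values below are arbitrary:

```python
import random

random.seed(4)

mu, sigma = 2.0, 1.5
trials = 100_000
samples = [random.gauss(mu, sigma) for _ in range(trials)]

sample_mean = sum(samples) / trials
sample_var = sum((s - sample_mean) ** 2 for s in samples) / trials

assert abs(sample_mean - mu) < 0.02       # E[X] = mu
assert abs(sample_var - sigma**2) < 0.05  # Var(X) = sigma^2
```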
Coefficient of variation
The coefficient of variation of a random variable X with nonzero mean μ and standard deviation σ is defined as σ/μ. It is a dimensionless measure of dispersion relative to the mean, which allows comparison of the variability of distributions measured on different scales.
Quantile
Then, we will discuss quantiles: the p-th quantile of a random variable X with cdf F is defined as inf{x : F(x) ≥ p}, for 0 < p < 1. In particular, the median (the 0.5-quantile) and the interquartile range (the distance between the 0.25- and 0.75-quantiles) are quite related to quantiles. The median and the interquartile range measure centrality and dispersion respectively. Recall that the mean and the variance measure the same things respectively. One advantage of the median and the interquartile range is that they are always defined, while the mean and the variance can be infinite (or undefined), and they fail to measure centrality and dispersion on those occasions. However, the median and the interquartile range also have some disadvantages, e.g. they may be more difficult to compute, and may not be very accurate.
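Python's statistics module computes the sample analogues of these quantities directly; the data set below is invented, with one outlier included to show how the median resists distortion while the mean does not:

```python
import statistics

# A small invented data set, skewed by one large outlier.
data = [1, 2, 2, 3, 3, 3, 4, 5, 100]

median = statistics.median(data)               # robust measure of centrality
q1, q2, q3 = statistics.quantiles(data, n=4)   # quartile cut points
iqr = q3 - q1                                  # robust measure of dispersion
mean = statistics.mean(data)

assert median == 3
assert iqr == 2.5
assert median < mean  # the single outlier drags the mean far above the median
```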
Mode
The mode is another measure of centrality: it is a value at which the pmf (or pdf) of the random variable attains its maximum.
Covariance and correlation coefficients
In this section, we will discuss two important properties of joint distributions, namely covariance and the correlation coefficient. As we will see, covariance is related to variance in some sense, and the correlation coefficient is closely related to correlation. The covariance of X and Y is defined as Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)], where μ_X = E[X] and μ_Y = E[Y], and the correlation coefficient is defined as ρ(X, Y) = Cov(X, Y)/(σ_X σ_Y), where σ_X and σ_Y are the standard deviations of X and Y. Both covariance and the correlation coefficient measure the (linear) association between X and Y. As we will see, X and Y are more highly correlated as |ρ(X, Y)| increases, and Y has a linear relationship with X if |ρ(X, Y)| = 1.
Proof.
(i) Cov(X, X) = E[(X − μ_X)²] = Var(X).
(ii) Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = E[(Y − μ_Y)(X − μ_X)] = Cov(Y, X) (symmetry).
(iii) Cov(X, Y) = E[XY − μ_X Y − μ_Y X + μ_X μ_Y] = E[XY] − μ_X μ_Y (alternative expression).
(iv) Cov(aX + b, Y) = E[(aX + b − aμ_X − b)(Y − μ_Y)] = a Cov(X, Y) (linearity in each argument).
(v) Var(X + Y) = E[(X − μ_X + Y − μ_Y)²] = Var(X) + Var(Y) + 2 Cov(X, Y).
Then, we will discuss the correlation coefficient. The following is the definition of the correlation coefficient between two random variables. Covariance and the correlation coefficient are similar, but they have differences. In particular, Cov(X, Y) depends on the scales (or units) of X and Y, not just their relationship. Thus, this number is affected by the variances, and does not measure their relationship accurately. On the other hand, the correlation coefficient adjusts for the scales of X and Y, and therefore measures their relationship more accurately.
The following is one of the most important properties of correlation coefficient. Template:Colored proposition
Proof. For random variables X and Y with nonzero standard deviations σ_X and σ_Y,
Goal: prove that −1 ≤ ρ(X, Y) ≤ 1. To get rid of the square root and make the proof neater, we square both sides of the inequality, and get ρ(X, Y)² ≤ 1, i.e. Cov(X, Y)² ≤ Var(X) Var(Y).
Recall that variance is always nonnegative. So, one way to prove the inequality is to express a suitable nonnegative quantity as a variance, as follows: 0 ≤ Var(X/σ_X ± Y/σ_Y) = Var(X)/σ_X² + Var(Y)/σ_Y² ± 2 Cov(X, Y)/(σ_X σ_Y) = 2 ± 2ρ(X, Y). Hence 2 + 2ρ(X, Y) ≥ 0 and 2 − 2ρ(X, Y) ≥ 0, i.e. −1 ≤ ρ(X, Y) ≤ 1. Thus, the result follows.
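The bound can be illustrated numerically: the sketch below simulates a linear-plus-noise pair (an arbitrary construction), computes the sample correlation coefficient, and checks that it lies in [−1, 1]:

```python
import math
import random

random.seed(5)

# Simulate (X, Y) with Y = 2X + noise, so X and Y are strongly correlated.
n = 10_000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [2 * x + random.gauss(0, 1) for x in xs]

mean_x = sum(xs) / n
mean_y = sum(ys) / n
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
rho = cov / (sd_x * sd_y)

assert -1 <= rho <= 1
assert rho > 0.8  # strongly positively correlated by construction
```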
Then, we will define several terminologies related to the correlation coefficient.
Then, we will state an important result that relates independence and correlation. Intuitively, you may think that 'independent' is the same as 'uncorrelated'. However, this is wrong. Indeed, 'independent' is stronger than 'uncorrelated'.
Proof. For independent random variables X and Y with means μ_X and μ_Y respectively, Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = E[X − μ_X] E[Y − μ_Y] = 0 by the multiplicativity of expectation under independence, so X and Y are uncorrelated.
However, the converse is not true: there exist random variables that are uncorrelated but not independent.
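A standard example (not necessarily the one the original text had in mind) is X uniform on {−1, 0, 1} with Y = X²; the computation below verifies exactly that Cov(X, Y) = 0 even though Y is a function of X:

```python
from fractions import Fraction

# X uniform on {-1, 0, 1} and Y = X^2: uncorrelated yet clearly dependent.
pmf_x = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}

e_x = sum(x * p for x, p in pmf_x.items())          # E[X] = 0
e_y = sum(x**2 * p for x, p in pmf_x.items())       # E[Y] = E[X^2] = 2/3
e_xy = sum(x * x**2 * p for x, p in pmf_x.items())  # E[XY] = E[X^3] = 0

cov = e_xy - e_x * e_y
assert cov == 0  # uncorrelated

# Yet X and Y are dependent: P(Y = 0 | X = 0) = 1 while P(Y = 0) = 1/3.
p_y0 = pmf_x[0]      # P(Y = 0) = P(X = 0) = 1/3
p_y0_given_x0 = 1
assert p_y0_given_x0 != p_y0
```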
- ↑ Each of the Bernoulli r.v.'s acts as an indicator for the success of the corresponding trial. Since there are n independent Bernoulli trials, there are n such indicators.
- ↑ Each geometric r.v. gives the number of failures before the corresponding success.
- ↑ This probability is unconditional, because the corresponding mean is also unconditional, so that their sum is also an unconditional mean (as in the proposition).
- ↑ The indicator r.v.'s are not independent, but we can still use the linearity of expectation, since it does not require independence.
- ↑ Each geometric r.v. gives the number of failures before the corresponding success.