Statistics/Point Estimation
Introduction
Usually, a random variable resulting from a random experiment is *assumed* to follow a certain distribution with an unknown (but *fixed*) parameter (vector) $\theta$ [1] [2] (the number of components $k$ is a positive integer, and its value depends on the distribution), taking value in a set $\Theta$, called the parameter space. For example, suppose the random variable is assumed to follow a normal distribution $\mathcal{N}(\mu, \sigma^2)$. Then, in this case, the parameter vector $(\mu, \sigma^2)$ is unknown, and the parameter space is $\Theta = \{(\mu, \sigma^2) : \mu \in \mathbb{R},\ \sigma^2 > 0\}$. It is often useful to *estimate* those unknown parameters in some way to "understand" the random variable better. We would like to make sure the estimation is "good" [3] enough, so that the understanding is more accurate.
Intuitively, the (realization of the) *random sample* should be useful. Indeed, the estimators introduced in this chapter are all based on the random sample in some sense, and this is what *point estimators* mean. To be more precise, let us define *point estimator* and *point estimate*. In the following, we will introduce two well-known point estimators, which are actually quite "good", namely the *maximum likelihood estimator* and the *method of moments estimator*.
Maximum likelihood estimator (MLE)
As suggested by the name of this estimator, it is the estimator that *maximizes* some kind of "likelihood". Now, we would like to know what "likelihood" we should maximize to estimate the unknown parameter(s) (in a "good" way). Also, as mentioned in the introduction section, the estimator is based on the random sample in some sense. Hence, this "likelihood" should also be based on the random sample in some sense.
To motivate the definition of the maximum likelihood estimator, consider a coin-flipping experiment in which we observe particular realizations of the flips. Intuitively, with these particular realizations (fixed), we would like to find a value of the unknown parameter that maximizes the probability of obtaining them, i.e., makes the realizations obtained the ones that are "most probable" or "with maximum likelihood". With this motivation, we can formally define the likelihood function and the MLE, and then find the MLE of the unknown parameter in the coin-flipping example. Sometimes, a constraint is imposed on the parameter when we are finding its MLE; the MLE of the parameter in this case is called a *restricted* MLE. To find the MLE, we sometimes use methods other than the derivative test, and then we do not need to find the log-likelihood function. The same ideas also apply when we find the MLE of a parameter vector.
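As a concrete illustration, here is a minimal numerical sketch (the data, seed, and grid search are illustrative assumptions, not taken from the original example): for $n$ independent coin flips with unknown probability of heads $p$, maximizing the log-likelihood over a grid recovers the closed-form MLE $\hat p = \bar{x}$ given by the derivative test.

```python
# Minimal sketch: MLE of the heads probability p from simulated coin flips.
# The data-generating value 0.7, the seed, and the grid are assumptions.
import numpy as np

rng = np.random.default_rng(0)
data = rng.binomial(1, 0.7, size=100)  # 100 Bernoulli(0.7) flips; 1 = heads

def log_likelihood(p, x):
    # ln L(p; x) = sum_i [x_i ln p + (1 - x_i) ln(1 - p)]
    return np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

grid = np.linspace(0.001, 0.999, 999)  # grid over the parameter space (0, 1)
mle_numeric = grid[np.argmax([log_likelihood(p, data) for p in grid])]
mle_closed_form = data.mean()          # derivative test: p-hat = sample mean

print(mle_numeric, mle_closed_form)    # agree up to the grid resolution
```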
Method of moments estimator (MME)
For maximum likelihood estimation, we need to utilize the likelihood function, which is found from the joint pmf or pdf of the random sample from a distribution. However, we may not know exactly the pmf or pdf of the distribution in practice. Instead, we may just know some information about the distribution, e.g. its mean, variance, and some moments (the $k$th moment of a random variable $X$ is $\mathbb{E}[X^k]$; we denote it by $\mu_k$ for simplicity). Such moments often contain information about the unknown parameter. For example, for a normal distribution $\mathcal{N}(\mu, \sigma^2)$, we know that $\mu_1 = \mu$ and $\mu_2 = \sigma^2 + \mu^2$. Because of this, when we want to estimate the parameters, we can do this through estimating the moments.
Now, we would like to know how to estimate the moments. We let $M_k = \frac{1}{n}\sum_{i=1}^{n} X_i^k$ be the $k$th *sample moment* [4], where the $X_i$'s are independent and identically distributed. By the *weak law of large numbers* (assuming the conditions are satisfied), we have
- $M_1 = \overline{X} \overset{p}{\to} \mu_1$ as $n \to \infty$.
In general, we have $M_k \overset{p}{\to} \mu_k$, since the conditions are still satisfied after replacing the "$X_i$" by "$X_i^k$" in the weak law of large numbers, and so we can still apply it.
Because of these results, we can estimate the $k$th moment $\mu_k$ using the $k$th sample moment $M_k$, and the estimation is "better" when $n$ is large. For example, in the above normal distribution example, we can estimate $\mu_1 = \mu$ by $M_1$ and $\mu_2 = \sigma^2 + \mu^2$ by $M_2$, and the resulting estimators of $\mu$ and $\sigma^2$ are actually called the *method of moments estimators*.
To be more precise, we have the following definition of the *method of moments estimator*: it is obtained by equating the first sample moments to the corresponding (population) moments, and solving the resulting system of equations for the unknown parameters.
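For instance, here is a small sketch of this recipe (the true parameter values and the seed are assumptions): solving the two moment equations $M_1 = \mu$ and $M_2 = \sigma^2 + \mu^2$ gives $\hat\mu = M_1$ and $\hat\sigma^2 = M_2 - M_1^2$.

```python
# Minimal sketch: method of moments estimates for N(mu, sigma^2).
# The true values mu = 2, sigma = 3 and the seed are assumptions.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=3.0, size=10_000)

M1 = np.mean(x)       # first sample moment, estimates mu_1 = mu
M2 = np.mean(x ** 2)  # second sample moment, estimates mu_2 = sigma^2 + mu^2

mu_mme = M1
sigma2_mme = M2 - M1 ** 2  # solve the moment equations for sigma^2
print(mu_mme, sigma2_mme)  # close to 2 and 9 for this sample size
```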
Properties of estimator
In this section, we will introduce some criteria for evaluating how "good" a point estimator is, namely *unbiasedness*, *efficiency* and *consistency*.
Unbiasedness
For $\hat\theta$ to be a "good" estimator of a parameter $\theta$, a desirable property of $\hat\theta$ is that its expected value equals the value of the parameter $\theta$, or is at least close to that value. Because of this, we introduce a value, namely the *bias* $\operatorname{Bias}(\hat\theta) = \mathbb{E}[\hat\theta] - \theta$, to measure how close the mean of $\hat\theta$ is to $\theta$; an estimator with zero bias is called *unbiased*. We will also define some terms related to the bias.
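As an illustration (the normal setup, sample size, and seed below are assumptions), a quick simulation contrasts a biased and an unbiased estimator of $\sigma^2$: dividing the sum of squared deviations by $n$ gives expected value $\frac{n-1}{n}\sigma^2$, while dividing by $n - 1$ gives expected value $\sigma^2$.

```python
# Illustrative sketch: bias of the "divide by n" sample variance vs. the
# unbiased "divide by n - 1" version. The setup below is an assumption.
import numpy as np

rng = np.random.default_rng(2)
n, sigma2, reps = 5, 4.0, 200_000

x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
biased = x.var(axis=1)             # divides by n
unbiased = x.var(axis=1, ddof=1)   # divides by n - 1

print(biased.mean())    # about (n - 1)/n * sigma2 = 3.2, i.e., bias about -0.8
print(unbiased.mean())  # about 4.0, i.e., bias about 0
```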
Efficiency
We have discussed how to evaluate the unbiasedness of estimators. Now, if we are given two unbiased estimators, how should we compare their goodness? Their goodness is the same if we are only comparing them in terms of unbiasedness. Therefore, we need another criterion in this case. One possible way is to compare their *variances*, and the one with the smaller variance is better, since on average that estimator deviates less from its mean, which is the value of the unknown parameter by the definition of an unbiased estimator; thus the one with the smaller variance is more accurate in some deviation sense. Indeed, an unbiased estimator can still have a large variance, and thus deviate a lot from its mean. Such an estimator is unbiased because the positive deviations and negative deviations somehow cancel each other out. This is the idea of *efficiency*.
Actually, for the variance of an unbiased estimator, since the mean of the unbiased estimator is the unknown parameter $\theta$, the variance measures the mean of the squared deviation from $\theta$, and we have a specific term for this quantity, namely the *mean squared error* (MSE): $\operatorname{MSE}(\hat\theta) = \mathbb{E}\big[(\hat\theta - \theta)^2\big]$. Notice that in the definition of the MSE, we do not require $\hat\theta$ to be an unbiased estimator. Thus, $\hat\theta$ in the definition may be biased. We have mentioned that when $\hat\theta$ is unbiased, its variance is actually its MSE. In the following, we will give a more general relationship between $\operatorname{MSE}(\hat\theta)$ and $\operatorname{Var}(\hat\theta)$, not just for unbiased estimators: $\operatorname{MSE}(\hat\theta) = \operatorname{Var}(\hat\theta) + \big(\operatorname{Bias}(\hat\theta)\big)^2$.
Proof. By definition, we have $\operatorname{MSE}(\hat\theta) = \mathbb{E}\big[(\hat\theta - \theta)^2\big]$ and $\operatorname{Var}(\hat\theta) = \mathbb{E}\big[(\hat\theta - \mathbb{E}[\hat\theta])^2\big]$. From these, we are motivated to write
$\operatorname{MSE}(\hat\theta) = \mathbb{E}\big[(\hat\theta - \mathbb{E}[\hat\theta] + \mathbb{E}[\hat\theta] - \theta)^2\big] = \mathbb{E}\big[(\hat\theta - \mathbb{E}[\hat\theta])^2\big] + 2\big(\mathbb{E}[\hat\theta] - \theta\big)\,\mathbb{E}\big[\hat\theta - \mathbb{E}[\hat\theta]\big] + \big(\mathbb{E}[\hat\theta] - \theta\big)^2 = \operatorname{Var}(\hat\theta) + \big(\operatorname{Bias}(\hat\theta)\big)^2,$
since $\mathbb{E}\big[\hat\theta - \mathbb{E}[\hat\theta]\big] = 0$, as desired.
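A quick numerical check of this decomposition (the setup is an assumption, reusing the biased "divide by $n$" variance estimator from above):

```python
# Sketch: verify MSE = Var + Bias^2 by simulation for the "divide by n"
# variance estimator under N(0, 4) sampling. The setup is an assumption.
import numpy as np

rng = np.random.default_rng(3)
n, sigma2, reps = 5, 4.0, 500_000

est = rng.normal(0.0, 2.0, size=(reps, n)).var(axis=1)

mse = np.mean((est - sigma2) ** 2)
var_plus_bias2 = est.var() + (est.mean() - sigma2) ** 2
print(mse, var_plus_bias2)  # the two printed values agree
```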
Another simple but useful observation is the following proposition: $\operatorname{MSE}(\hat\theta) = 0$ if and only if $\operatorname{Var}(\hat\theta) = 0$ and $\operatorname{Bias}(\hat\theta) = 0$.
Proof.
- "if" part is simple. Assume and . Then, .
- "only if" part: we can use proof by contrapositive, i.e., proving that if Template:Colored em , then .
- Case 1: when , it means since the variance is nonnegative. Also, . It follows that , i.e., the MSE does not equal zero.
- Case 2: when , it means . Also, . It follows that , i.e., the MSE does not equal zero.
Uniformly minimum-variance unbiased estimator
Now, we know that the smaller the variance of an unbiased estimator, the more efficient (and "better") it is. Thus, it is natural that we want to know the *most* efficient (i.e., the "best") unbiased estimator, i.e., the unbiased estimator with the smallest variance. We have a specific name for such an unbiased estimator, namely the *uniformly minimum-variance unbiased estimator* (UMVUE) [5]. Indeed, the UMVUE is *unique*, i.e., there is exactly one unbiased estimator with the smallest variance among all unbiased estimators, and we will prove this in the following.
Proof. Assume that $\hat\theta_1$ is an UMVUE of $\theta$, and $\hat\theta_2$ is another UMVUE of $\theta$. Define the estimator $\hat\theta_3 = \frac{1}{2}\big(\hat\theta_1 + \hat\theta_2\big)$. Since $\mathbb{E}[\hat\theta_3] = \frac{1}{2}\big(\mathbb{E}[\hat\theta_1] + \mathbb{E}[\hat\theta_2]\big) = \theta$, $\hat\theta_3$ is an unbiased estimator of $\theta$.
Now, we consider the variance of $\hat\theta_3$:
$\operatorname{Var}(\hat\theta_3) = \tfrac{1}{4}\operatorname{Var}(\hat\theta_1) + \tfrac{1}{4}\operatorname{Var}(\hat\theta_2) + \tfrac{1}{2}\operatorname{Cov}(\hat\theta_1, \hat\theta_2) \le \tfrac{1}{4}\operatorname{Var}(\hat\theta_1) + \tfrac{1}{4}\operatorname{Var}(\hat\theta_2) + \tfrac{1}{2}\sqrt{\operatorname{Var}(\hat\theta_1)\operatorname{Var}(\hat\theta_2)} = \operatorname{Var}(\hat\theta_1),$
using the covariance inequality and $\operatorname{Var}(\hat\theta_1) = \operatorname{Var}(\hat\theta_2)$. Thus, we now have either $\operatorname{Var}(\hat\theta_3) < \operatorname{Var}(\hat\theta_1)$ or $\operatorname{Var}(\hat\theta_3) = \operatorname{Var}(\hat\theta_1)$. If the former is true, then $\hat\theta_1$ is *not* an UMVUE of $\theta$ by definition, since we can find another unbiased estimator, namely $\hat\theta_3$, with smaller variance than it. Hence, we must have the latter, i.e., $\operatorname{Var}(\hat\theta_3) = \operatorname{Var}(\hat\theta_1)$. This implies that when we apply the covariance inequality, the equality holds, i.e., $\operatorname{Cov}(\hat\theta_1, \hat\theta_2) = \sqrt{\operatorname{Var}(\hat\theta_1)\operatorname{Var}(\hat\theta_2)}$, which means $\hat\theta_2$ is increasing linearly with $\hat\theta_1$, i.e., we can write $\hat\theta_2 = a\hat\theta_1 + b$ for some constants $a > 0$ and $b$.
Now, we consider the covariance $\operatorname{Cov}(\hat\theta_1, \hat\theta_2)$. On one hand, $\operatorname{Cov}(\hat\theta_1, \hat\theta_2) = \operatorname{Cov}(\hat\theta_1, a\hat\theta_1 + b) = a\operatorname{Var}(\hat\theta_1)$. On the other hand, since the equality holds in the covariance inequality, and $\operatorname{Var}(\hat\theta_1) = \operatorname{Var}(\hat\theta_2)$ (since they are both UMVUE), $\operatorname{Cov}(\hat\theta_1, \hat\theta_2) = \operatorname{Var}(\hat\theta_1)$. Thus, we have $a = 1$.
It remains to show that $b = 0$ to prove that $\hat\theta_1 = \hat\theta_2$, and therefore conclude that the UMVUE is *unique*.
From above, we currently have $\hat\theta_2 = \hat\theta_1 + b$; taking expectations on both sides gives $\theta = \theta + b$, and hence $b = 0$, as desired.
Cramer-Rao lower bound
Without using some results, it is quite difficult to determine the UMVUE, since there are many (perhaps even infinitely many) possible unbiased estimators, so it is quite hard to ensure that one particular unbiased estimator is more efficient than every other possible unbiased estimator.
Therefore, we will introduce some approaches that help us find the UMVUE. For the first approach, we find a *lower bound* [6] on the variances of all possible unbiased estimators. After getting such a lower bound, if we can find an unbiased estimator whose variance is exactly equal to the lower bound, then the lower bound is the minimum value of the variances, and hence such an unbiased estimator is an UMVUE by definition. A common way to find such a lower bound is to use the *Cramer-Rao lower bound* (CRLB), which we get through the *Cramer-Rao inequality*. Before stating the inequality, let us define some related terms, in particular the *Fisher information*. The regularity conditions which allow the interchange of derivative and integral include:
- the partial derivatives involved should exist, i.e., the natural log of the functions involved should be differentiable
- the integrals involved should be differentiable
- the support does not depend on the parameter(s) involved
We have some results that assist us in computing the Fisher information: for a single observation $X$ with pdf or pmf $f(x;\theta)$, the Fisher information is $I(\theta) = \mathbb{E}\left[\left(\frac{\partial}{\partial\theta}\ln f(X;\theta)\right)^2\right] = -\mathbb{E}\left[\frac{\partial^2}{\partial\theta^2}\ln f(X;\theta)\right]$, and for a random sample of size $n$, $I_n(\theta) = nI(\theta)$.
Proof. The result $I_n(\theta) = nI(\theta)$ follows since the log-likelihood of a random sample is the sum of $n$ independent and identically distributed terms. For the other equality, it suffices to prove that $\mathbb{E}\left[\frac{\partial^2}{\partial\theta^2}\ln f(X;\theta)\right] = -\mathbb{E}\left[\left(\frac{\partial}{\partial\theta}\ln f(X;\theta)\right)^2\right]$, which is true since
$\frac{\partial^2}{\partial\theta^2}\ln f = \frac{\partial}{\partial\theta}\left(\frac{\partial f/\partial\theta}{f}\right) = \frac{\partial^2 f/\partial\theta^2}{f} - \left(\frac{\partial f/\partial\theta}{f}\right)^2 = \frac{\partial^2 f/\partial\theta^2}{f} - \left(\frac{\partial}{\partial\theta}\ln f\right)^2,$
and, under the regularity conditions, $\mathbb{E}\left[\frac{\partial^2 f(X;\theta)/\partial\theta^2}{f(X;\theta)}\right] = \int \frac{\partial^2 f/\partial\theta^2}{f}\, f\,dx = \frac{\partial^2}{\partial\theta^2}\int f\,dx = \frac{\partial^2}{\partial\theta^2}(1) = 0.$
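To make this concrete, here is a small symbolic sketch (the Bernoulli example is an assumption) checking that the two expressions for the Fisher information agree:

```python
# Sketch: for the Bernoulli(p) pmf f(x; p) = p^x (1 - p)^(1 - x), x in {0, 1},
# check E[(d/dp ln f)^2] = -E[d^2/dp^2 ln f] = 1 / (p (1 - p)).
import sympy as sp

p, x = sp.symbols('p x')
log_f = x * sp.log(p) + (1 - x) * sp.log(1 - p)

def E(expr):
    # expectation over x with P(X = 1) = p and P(X = 0) = 1 - p
    return sp.simplify(p * expr.subs(x, 1) + (1 - p) * expr.subs(x, 0))

info_from_score = E(sp.diff(log_f, p) ** 2)
info_from_curvature = -E(sp.diff(log_f, p, 2))
print(info_from_score, info_from_curvature)  # both simplify to 1/(p*(1 - p))
```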
With the Fisher information in hand, we can state the *Cramer-Rao inequality*: under the regularity conditions, every unbiased estimator $\hat\theta$ of $\theta$ satisfies $\operatorname{Var}(\hat\theta) \ge \frac{1}{I_n(\theta)}$, and the right-hand side is the CRLB.
Proof. Since $\hat\theta$ is an unbiased estimator of $\theta$, we have by definition $\mathbb{E}[\hat\theta] = \theta$. By the definition of expectation, we have $\theta = \int\cdots\int \hat\theta\, \mathcal{L}(\theta; x_1,\dots,x_n)\, dx_1\cdots dx_n$, where $\mathcal{L}$ is the likelihood function. Thus, differentiating both sides with respect to $\theta$ (interchanging derivative and integral by the regularity conditions),
$1 = \int\cdots\int \hat\theta\, \frac{\partial \ln \mathcal{L}}{\partial\theta}\, \mathcal{L}\, dx_1\cdots dx_n = \mathbb{E}\left[\hat\theta\, \frac{\partial \ln\mathcal{L}}{\partial\theta}\right] = \operatorname{Cov}\left(\hat\theta, \frac{\partial \ln\mathcal{L}}{\partial\theta}\right)$
(since $\mathbb{E}\left[\frac{\partial\ln\mathcal{L}}{\partial\theta}\right] = 0$ by the remark about the Fisher information). Consider the covariance inequality: $\left(\operatorname{Cov}\left(\hat\theta, \frac{\partial\ln\mathcal{L}}{\partial\theta}\right)\right)^2 \le \operatorname{Var}(\hat\theta)\operatorname{Var}\left(\frac{\partial\ln\mathcal{L}}{\partial\theta}\right)$. We have $\operatorname{Var}\left(\frac{\partial\ln\mathcal{L}}{\partial\theta}\right) = I_n(\theta)$ (by the remark about the Fisher information), and hence $1 \le \operatorname{Var}(\hat\theta)\, I_n(\theta)$, i.e., $\operatorname{Var}(\hat\theta) \ge \frac{1}{I_n(\theta)}$.
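For example, here is a simulation sketch (the Bernoulli setup is an assumption) of an unbiased estimator that attains the CRLB:

```python
# Sketch: for Bernoulli(p), I(p) = 1/(p(1 - p)), so the CRLB for unbiased
# estimators is p(1 - p)/n; the sample mean attains it. Setup is assumed.
import numpy as np

rng = np.random.default_rng(4)
n, p, reps = 20, 0.3, 200_000

xbar = rng.binomial(1, p, size=(reps, n)).mean(axis=1)

crlb = p * (1 - p) / n      # 1 / (n I(p))
print(xbar.var(), crlb)     # both about 0.0105, so X-bar is an UMVUE here
```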
Sometimes, we cannot use the CRLB method for finding the UMVUE, because
- the regularity conditions may not be satisfied, and thus we cannot use the Cramer-Rao inequality, and
- the variance of the unbiased estimator may not be equal to the CRLB, but we cannot conclude that it is not an UMVUE, because it may be the case that the CRLB is not attainable at all, and the smallest variance among all unbiased estimators is actually the variance of that estimator, which is larger than the CRLB.
We will illustrate some examples for these two cases in the following. Since the CRLB is sometimes attainable and sometimes not, it is natural to ask *when* the CRLB can be attained. In other words, we would like to know the *attainment conditions* for the CRLB, which are stated in the following corollary: under the regularity conditions, the variance of an unbiased estimator $\hat\theta$ attains the CRLB if and only if $\frac{\partial \ln\mathcal{L}}{\partial\theta} = a(\theta)\,\big(\hat\theta - \theta\big)$ for some function $a(\theta)$.
Proof. Considering the proof for the Cramer-Rao inequality, the CRLB is attained exactly when equality holds in the covariance inequality there. We can write $\operatorname{Cov}\left(\hat\theta, \frac{\partial\ln\mathcal{L}}{\partial\theta}\right) = \rho\,\sqrt{\operatorname{Var}(\hat\theta)\operatorname{Var}\left(\frac{\partial\ln\mathcal{L}}{\partial\theta}\right)}$ (by the result about covariance), where $\rho$ is the correlation coefficient between the two random variables. Thus, equality holds if and only if $|\rho| = 1$. This means $\frac{\partial\ln\mathcal{L}}{\partial\theta}$ increases or decreases linearly with $\hat\theta$, i.e., $\frac{\partial\ln\mathcal{L}}{\partial\theta} = a(\theta)\big(\hat\theta - \theta\big) + c$ for some constants $a(\theta)$ and $c$ not depending on the sample. Now, it suffices to show that the constant $c$ is actually zero.
We know that $\mathbb{E}[\hat\theta] = \theta$ (since $\hat\theta$ is an unbiased estimator of $\theta$), and $\mathbb{E}\left[\frac{\partial\ln\mathcal{L}}{\partial\theta}\right] = 0$ (from the remark about the Fisher information). Thus, applying expectations on both sides gives $0 = a(\theta)\,(\theta - \theta) + c = c$. Then, the result follows.
We have discussed the MLE previously, and the MLE is actually a "best choice" asymptotically (i.e., as the sample size $n \to \infty$) according to the following theorem: under the regularity conditions, if $\hat\theta_n$ is the MLE based on a random sample of size $n$, then $\sqrt{n}\,\big(\hat\theta_n - \theta\big) \overset{d}{\to} \mathcal{N}\left(0, \frac{1}{I(\theta)}\right)$, i.e., the MLE is asymptotically unbiased and its asymptotic variance matches the CRLB.
Proof. *Sketch*: we consider the Taylor series of order 2 for $\ell'(\hat\theta_n)$, the derivative of the log-likelihood function, about $\theta$, and we will get
$\ell'(\hat\theta_n) = \ell'(\theta) + \ell''(\theta)\,(\hat\theta_n - \theta) + \tfrac{1}{2}\,\ell'''(\theta^*)\,(\hat\theta_n - \theta)^2,$
where $\theta^*$ is between $\hat\theta_n$ and $\theta$. Since $\hat\theta_n$ is the MLE of $\theta$, from the derivative test, we know that $\ell'(\hat\theta_n) = 0$ (we apply regularity conditions to ensure the existence of this derivative). Hence, we have
$\sqrt{n}\,(\hat\theta_n - \theta) = \frac{\ell'(\theta)/\sqrt{n}}{-\ell''(\theta)/n - \ell'''(\theta^*)\,(\hat\theta_n - \theta)/(2n)}.$
By the central limit theorem, $\ell'(\theta)/\sqrt{n} \overset{d}{\to} \mathcal{N}\big(0, I(\theta)\big)$. Furthermore, we apply the weak law of large numbers to show that $-\ell''(\theta)/n \overset{p}{\to} I(\theta)$. It can be shown in a quite complicated way (and using regularity conditions) that $\ell'''(\theta^*)\,(\hat\theta_n - \theta)/(2n) \overset{p}{\to} 0$. Combining the latter two results, using the property of convergence in probability, the denominator converges in probability to $I(\theta)$. Then, applying Slutsky's theorem to the numerator and the denominator, we have $\sqrt{n}\,(\hat\theta_n - \theta) \overset{d}{\to} \frac{Z}{I(\theta)}$, where $Z \sim \mathcal{N}\big(0, I(\theta)\big)$, and hence $\frac{Z}{I(\theta)} \sim \mathcal{N}\left(0, \frac{I(\theta)}{I(\theta)^2}\right) = \mathcal{N}\left(0, \frac{1}{I(\theta)}\right)$. This means $\sqrt{n}\,(\hat\theta_n - \theta) \overset{d}{\to} \mathcal{N}\left(0, \frac{1}{I(\theta)}\right)$, as desired.
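A simulation sketch of this theorem (the Bernoulli setup is an assumption; there the MLE is the sample mean and $1/I(p) = p(1-p)$):

```python
# Sketch: sqrt(n) (p-hat - p) is approximately N(0, p(1 - p)) for large n,
# where p-hat is the Bernoulli MLE (the sample mean). Setup is assumed.
import numpy as np

rng = np.random.default_rng(5)
n, p, reps = 500, 0.3, 20_000

p_hat = rng.binomial(1, p, size=(reps, n)).mean(axis=1)
z = np.sqrt(n) * (p_hat - p)

print(z.mean(), z.var(), p * (1 - p))  # mean about 0, variance about 0.21
```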
Since we are not able to use the CRLB to find the UMVUE in some situations, we will introduce another method to find the UMVUE in the following, which uses the concepts of *sufficiency* and *completeness*.
Sufficiency
Intuitively, a *sufficient statistic* $T = T(X_1, \dots, X_n)$, which is a function of a given random sample $X_1, \dots, X_n$, contains all the information needed for estimating the unknown parameter (vector) $\theta$. Thus, the statistic itself is "sufficient" for estimating the unknown parameter (vector) $\theta$.
Formally, $T$ is a *sufficient statistic* for $\theta$ if the conditional distribution of the random sample, given $T$, does not depend on $\theta$. As a remark, a one-to-one transformation of a sufficient statistic is again a sufficient statistic, and this can be stated formally as a proposition. Now, we discuss a theorem that helps us check the sufficiency of a statistic, namely the (Fisher-Neyman) *factorization theorem*: $T$ is a sufficient statistic for $\theta$ if and only if the joint pmf or pdf of the random sample can be written as $f(\mathbf{x};\theta) = g\big(T(\mathbf{x});\theta\big)\,h(\mathbf{x})$, where $g$ depends on the data only through $T(\mathbf{x})$ and $h$ does not depend on $\theta$.
Proof. Since the proof for the continuous case is quite complicated, we will only give a proof for the discrete case. For simplicity of presentation, let $\mathbf{X} = (X_1, \dots, X_n)$, $\mathbf{x} = (x_1, \dots, x_n)$, $T = T(\mathbf{X})$, and $t = T(\mathbf{x})$. By definition, $f(\mathbf{x};\theta) = \mathbb{P}(\mathbf{X} = \mathbf{x})$. Also, we have $\{\mathbf{X} = \mathbf{x}\} \subseteq \{T = t\}$, so $\mathbb{P}(\mathbf{X} = \mathbf{x}) = \mathbb{P}(\mathbf{X} = \mathbf{x}, T = t)$. Thus, we can write $f(\mathbf{x};\theta) = \mathbb{P}(T = t)\,\mathbb{P}(\mathbf{X} = \mathbf{x} \mid T = t)$.
"only if" ($\Rightarrow$) direction: Assume $T$ is a sufficient statistic. Then, we choose $g(t;\theta) = \mathbb{P}(T = t)$ and $h(\mathbf{x}) = \mathbb{P}(\mathbf{X} = \mathbf{x} \mid T = t)$, which does not depend on $\theta$ by the definition of a sufficient statistic. It remains to verify that the equation actually holds for this choice.
Hence, $f(\mathbf{x};\theta) = \mathbb{P}(T = t)\,\mathbb{P}(\mathbf{X} = \mathbf{x} \mid T = t) = g(t;\theta)\,h(\mathbf{x})$, as required.
"if" ($\Leftarrow$) direction: Assume we can write $f(\mathbf{x};\theta) = g(t;\theta)\,h(\mathbf{x})$. Then, $\mathbb{P}(T = t) = \sum_{\mathbf{y}:\,T(\mathbf{y}) = t} f(\mathbf{y};\theta) = g(t;\theta) \sum_{\mathbf{y}:\,T(\mathbf{y}) = t} h(\mathbf{y})$. Now, we aim to show that $\mathbb{P}(\mathbf{X} = \mathbf{x} \mid T = t)$ does not depend on $\theta$, which means $T$ is a sufficient statistic for $\theta$. We have
$\mathbb{P}(\mathbf{X} = \mathbf{x} \mid T = t) = \frac{\mathbb{P}(\mathbf{X} = \mathbf{x}, T = t)}{\mathbb{P}(T = t)} = \frac{g(t;\theta)\,h(\mathbf{x})}{g(t;\theta) \sum_{\mathbf{y}:\,T(\mathbf{y}) = t} h(\mathbf{y})} = \frac{h(\mathbf{x})}{\sum_{\mathbf{y}:\,T(\mathbf{y}) = t} h(\mathbf{y})},$
which does not depend on $\theta$, as desired.
For some "nice" distributions, which belong to the *exponential family*, sufficient statistics can be found easily and more conveniently using an alternative method. This method works because of the "nice" form of the pdf or pmf of those distributions, which can be characterized as follows: a distribution belongs to the (one-parameter) exponential family if its pdf or pmf can be written in the form $f(x;\theta) = c(\theta)\,h(x)\exp\big(w(\theta)\,t(x)\big)$ for some functions $c$, $h$, $w$ and $t$. In that case, $\sum_{i=1}^n t(X_i)$ is a sufficient statistic for $\theta$.
Proof. Since the distribution belongs to the exponential family, the joint pdf or pmf of $X_1, \dots, X_n$ can be expressed as
$\prod_{i=1}^n f(x_i;\theta) = \big[c(\theta)\big]^n \left[\prod_{i=1}^n h(x_i)\right] \exp\left(w(\theta) \sum_{i=1}^n t(x_i)\right).$
From here, for applying the factorization theorem, we can identify the $g$ part of the function as "$\big[c(\theta)\big]^n \exp\big(w(\theta)\sum_{i=1}^n t(x_i)\big)$", and the $h$ part of the function as "$\prod_{i=1}^n h(x_i)$". We can notice that the $g$ part of the function depends on $\mathbf{x}$ only through $\sum_{i=1}^n t(x_i)$. The result follows.
Now, we will start discussing how sufficient statistics are related to the UMVUE. We begin our discussion with the *Rao-Blackwell theorem*: if $\hat\theta$ is an unbiased estimator of $\theta$ and $T$ is a sufficient statistic for $\theta$, then $\hat\theta^* = \mathbb{E}[\hat\theta \mid T]$ is also an unbiased estimator of $\theta$, and $\operatorname{Var}(\hat\theta^*) \le \operatorname{Var}(\hat\theta)$.
Proof. Assume $\hat\theta$ is an arbitrary unbiased estimator of $\theta$, and $T$ is a sufficient statistic for $\theta$.
First, we prove that $\hat\theta^* = \mathbb{E}[\hat\theta \mid T]$ is an unbiased estimator of $\theta$. Before proving the unbiasedness, we should ensure that $\hat\theta^*$ is actually an estimator, i.e., it is a statistic, which is a function of the random sample, and does not involve $\theta$ (so that it is calculable): since $\hat\theta$ is a function of the random sample, and $T$ is a sufficient statistic, the conditional distribution of $\hat\theta$, given $T$, is *independent* of $\theta$, and so $\hat\theta^*$ does not involve $\theta$. Also, $\hat\theta^*$ is a function of $T$, and thus is also a function of the random sample.
Now, we prove that $\hat\theta^*$ is an unbiased estimator of $\theta$: since $\mathbb{E}[\hat\theta^*] = \mathbb{E}\big[\mathbb{E}[\hat\theta \mid T]\big] = \mathbb{E}[\hat\theta] = \theta$ by the law of total expectation, $\hat\theta^*$ is an unbiased estimator of $\theta$.
Next, we prove that $\operatorname{Var}(\hat\theta^*) \le \operatorname{Var}(\hat\theta)$: by the law of total variance, we have $\operatorname{Var}(\hat\theta) = \mathbb{E}\big[\operatorname{Var}(\hat\theta \mid T)\big] + \operatorname{Var}\big(\mathbb{E}[\hat\theta \mid T]\big) \ge \operatorname{Var}(\hat\theta^*)$, since $\mathbb{E}\big[\operatorname{Var}(\hat\theta \mid T)\big] \ge 0$, as desired.
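A simulation sketch of Rao-Blackwellization (the Bernoulli setup is an assumption): starting from the crude unbiased estimator $X_1$ and conditioning on the sufficient statistic $T = \sum_i X_i$ gives $\mathbb{E}[X_1 \mid T] = T/n$, the sample mean, with much smaller variance.

```python
# Sketch: Rao-Blackwellizing the unbiased estimator X_1 of p using the
# sufficient statistic T = sum(X) for Bernoulli(p). Setup is assumed.
import numpy as np

rng = np.random.default_rng(6)
n, p, reps = 10, 0.3, 200_000

x = rng.binomial(1, p, size=(reps, n))
crude = x[:, 0]                      # unbiased, Var = p(1 - p)
rao_blackwellized = x.mean(axis=1)   # E[X_1 | T] = T / n

print(crude.mean(), rao_blackwellized.mean())  # both about p = 0.3
print(crude.var(), rao_blackwellized.var())    # about 0.21 vs about 0.021
```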
To actually determine the UMVUE, we need another theorem, called the *Lehmann-Scheffé theorem*, which is based on the Rao-Blackwell theorem, and requires the concept of *completeness*.
Completeness
A statistic $T$ is *complete* for $\theta$ if, for every function $g$, $\mathbb{E}[g(T)] = 0$ for all $\theta$ implies $\mathbb{P}\big(g(T) = 0\big) = 1$ for all $\theta$. When a random sample is from a distribution in the exponential family, a complete statistic can also be found easily, similar to the case for sufficient statistics: the statistic $\sum_{i=1}^n t(X_i)$ is also complete.
Proof. Omitted.
With completeness in hand, we can state the *Lehmann-Scheffé theorem*: if $T$ is a complete sufficient statistic for $\theta$ and $\hat\theta$ is an unbiased estimator of $\theta$, then $\hat\theta^* = \mathbb{E}[\hat\theta \mid T]$ is the unique UMVUE of $\theta$.
Proof. Assume $T$ is a *complete sufficient statistic* for $\theta$, and $\hat\theta$ is an unbiased estimator of $\theta$.
Since $T$ is a sufficient statistic for $\theta$, we can apply the Rao-Blackwell theorem. From the Rao-Blackwell theorem, if $\hat\theta$ is an arbitrary unbiased estimator of $\theta$, then $\hat\theta^* = \mathbb{E}[\hat\theta \mid T]$ is another unbiased estimator where $\operatorname{Var}(\hat\theta^*) \le \operatorname{Var}(\hat\theta)$.
To prove that $\hat\theta^*$ is the unique UMVUE of $\theta$, we proceed to show that *regardless* of the choice of the unbiased estimator of $\theta$, we get the *same* $\hat\theta^*$ from the Rao-Blackwell theorem (with probability 1). Then, we will have $\operatorname{Var}(\hat\theta^*) \le \operatorname{Var}(\hat\theta)$ for *every* possible unbiased estimator $\hat\theta$ of $\theta$ (with probability 1) [7], which means $\hat\theta^*$ is the UMVUE, and it is also the *unique* UMVUE since we always get the same $\hat\theta^*$ [8].
Assume that $\hat\theta'$ is *another* unbiased estimator of $\theta$ ($\hat\theta' \ne \hat\theta$). By the Rao-Blackwell theorem again, there is an unbiased estimator $\hat\theta^{*\prime} = \mathbb{E}[\hat\theta' \mid T]$ where $\operatorname{Var}(\hat\theta^{*\prime}) \le \operatorname{Var}(\hat\theta')$. Since both $\hat\theta^*$ and $\hat\theta^{*\prime}$ are unbiased estimators of $\theta$, we have, for each $\theta$, $\mathbb{E}[\hat\theta^* - \hat\theta^{*\prime}] = \theta - \theta = 0$. Since $T$ is a complete statistic and $\hat\theta^* - \hat\theta^{*\prime}$ is a function of $T$, we have $\mathbb{P}\big(\hat\theta^* - \hat\theta^{*\prime} = 0\big) = 1$, which means $\hat\theta^* = \hat\theta^{*\prime}$ (with probability 1), i.e., we get the same $\hat\theta^*$ from the Rao-Blackwell theorem in this case (with probability 1).
Consistency
In the previous sections, we have discussed *unbiasedness* and *efficiency*. In this section, we will discuss another property called *consistency*: an estimator $\hat\theta_n$ (based on a sample of size $n$) is a *consistent estimator* of $\theta$ if $\hat\theta_n \overset{p}{\to} \theta$, i.e., for each $\varepsilon > 0$, $\mathbb{P}\big(|\hat\theta_n - \theta| \ge \varepsilon\big) \to 0$ as $n \to \infty$. A useful sufficient condition is the following proposition: if $\hat\theta_n$ is an (asymptotically) unbiased estimator of $\theta$ and $\operatorname{Var}(\hat\theta_n) \to 0$ as $n \to \infty$, then $\hat\theta_n$ is consistent.
Proof. Assume $\hat\theta_n$ is an (asymptotically) unbiased estimator of an unknown parameter $\theta$ and $\operatorname{Var}(\hat\theta_n) \to 0$ as $n \to \infty$. Since $\hat\theta_n$ is an (asymptotically) unbiased estimator of $\theta$, we have $\operatorname{Bias}(\hat\theta_n) = \mathbb{E}[\hat\theta_n] - \theta \to 0$ (this is true for both an asymptotically unbiased estimator and an unbiased estimator of $\theta$). In addition to this, we have by assumption that $\operatorname{Var}(\hat\theta_n) \to 0$. By the decomposition of the mean squared error, these imply that $\operatorname{MSE}(\hat\theta_n) = \operatorname{Var}(\hat\theta_n) + \big(\operatorname{Bias}(\hat\theta_n)\big)^2 \to 0$. Thus, by Chebyshev's inequality (notice that the means and variances involved exist from the above), for each $\varepsilon > 0$,
$\mathbb{P}\big(|\hat\theta_n - \theta| \ge \varepsilon\big) \le \frac{\mathbb{E}\big[(\hat\theta_n - \theta)^2\big]}{\varepsilon^2} = \frac{\operatorname{MSE}(\hat\theta_n)}{\varepsilon^2} \to 0 \quad \text{as } n \to \infty.$
Since probability is nonnegative, and this probability is less than or equal to an expression that tends to 0 as $n \to \infty$, we conclude that this probability tends to zero as $n \to \infty$. That is, $\hat\theta_n$ is a *consistent estimator* of $\theta$.
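A simulation sketch of consistency (the normal setup is an assumption): the sample mean of $\mathcal{N}(\mu, 1)$ data has bias 0 and variance $1/n \to 0$, so $\mathbb{P}\big(|\overline{X}_n - \mu| \ge \varepsilon\big)$ shrinks as $n$ grows.

```python
# Sketch: P(|sample mean - mu| >= eps) decreases toward 0 as n grows,
# illustrating consistency of the sample mean. Setup is assumed.
import numpy as np

rng = np.random.default_rng(7)
mu, eps, reps = 2.0, 0.1, 10_000

for n in (10, 100, 1000):
    xbar = rng.normal(mu, 1.0, size=(reps, n)).mean(axis=1)
    print(n, np.mean(np.abs(xbar - mu) >= eps))  # roughly 0.75, 0.32, 0.002
```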
- ↑ For the parameter vector, it contains all parameters governing the distribution.
- ↑ We will simply write $\theta$ when we do not know whether it is a parameter vector or just a single parameter. We may write $\boldsymbol{\theta}$ instead if we know it is indeed a parameter vector.
- ↑ We will discuss some criteria for being "good" in the #Properties of estimator section.
- ↑ For each positive integer $k$, the sample moment $M_k$ always exists, unlike the (population) moment $\mu_k$.
- ↑ "Uniformly" means that the variance is minimum compared with other unbiased estimators *uniformly in $\theta$* (i.e., for each possible value of $\theta$). That is, the variance is not just minimum for a particular value of $\theta$, but for all possible values of $\theta$.
- ↑ This is different from the minimum value. A *lower bound* only needs to be smaller than (or equal to) all variances involved, and there may not be any variance that actually achieves this lower bound. However, the minimum value has to be one of the values of the variance.
- ↑ Notice that this is a stronger result than the result in the Rao-Blackwell theorem, where the latter only states that $\operatorname{Var}(\hat\theta^*) \le \operatorname{Var}(\hat\theta)$ *for the particular unbiased estimator $\hat\theta$ being conditioned*.
- ↑ Indeed, we know that the UMVUE must be unique from the previous proposition. However, in this argument, when we show that $\hat\theta^*$ is the UMVUE, we also automatically show that it is unique.