Statistics/Point Estimation


Template:Nav

Introduction

Usually, a random variable X resulting from a random experiment is Template:Colored em to follow a certain distribution with an unknown (but Template:Colored em) parameter (vector) [1] θ ∈ ℝ^k [2] (k is a positive integer whose value depends on the distribution), taking values in a set Θ, called the parameter space. Template:Colored remark For example, suppose the random variable X is assumed to follow a normal distribution 𝒩(μ, σ²). Then, in this case, the parameter vector θ = (μ, σ) ∈ Θ is unknown, and the parameter space is Θ = {(μ, σ) : μ ∈ ℝ, σ > 0}. It is often useful to Template:Colored em those unknown parameters in some way to "understand" the random variable X better. We would like the estimation to be "good" [3] enough, so that the understanding is more accurate.

Intuitively, the (realizations of the) Template:Colored em X_1, …, X_n should be useful. Indeed, the estimators introduced in this chapter are all based on the random sample in some sense, and this is what Template:Colored em means. To be more precise, let us define Template:Colored em and Template:Colored em. Template:Colored definition Template:Colored remark Template:Colored example In the following, we will introduce two well-known point estimators, which are actually quite "good", namely the Template:Colored em and the Template:Colored em.

Maximum likelihood estimator (MLE)

As suggested by its name, this estimator is the one that Template:Colored em some kind of "likelihood". Now, we would like to know what "likelihood" we should maximize in order to estimate the unknown parameter(s) (in a "good" way). Also, as mentioned in the introduction section, the estimator is based on the random sample in some sense. Hence, this "likelihood" should also be based on the random sample in some sense.

To motivate the definition of the maximum likelihood estimator, consider the following example. Template:Colored example Template:Colored remark Intuitively, with these particular realizations (fixed), we would like to find a value of p that maximizes this probability, i.e., makes the realizations obtained the ones that are "most probable" or "with maximum likelihood". Now, let us formally define the terms related to the MLE. Template:Colored definition Template:Colored remark Template:Colored definition Template:Colored remark Now, let us find the MLE of the unknown parameter p in the previous coin-flipping example. Template:Colored example Sometimes, a constraint is imposed on the parameter when we are finding its MLE. The MLE of the parameter in this case is called a Template:Colored em MLE. We will illustrate this in the following example. Template:Colored example To find the MLE, we sometimes use methods other than the derivative test, in which case we do not need to find the log-likelihood function. Let us illustrate this in the following example. Template:Colored example In the following example, we will find the MLE of a parameter vector. Template:Colored example Template:Colored exercise Template:Colored example
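
To make the maximization concrete, here is a minimal Python sketch (not part of the original examples; the simulated data, the assumed true value p = 0.3, and the helper name bernoulli_log_likelihood are our own illustrative choices). It computes the MLE of p for Bernoulli data both via the closed form p̂ = x̄ and by numerically maximizing the log-likelihood over the parameter space (0, 1):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def bernoulli_log_likelihood(p, x):
    """Log-likelihood of i.i.d. Bernoulli(p) observations x (0s and 1s)."""
    s = x.sum()
    return s * np.log(p) + (x.size - s) * np.log(1 - p)

rng = np.random.default_rng(0)
x = rng.binomial(1, 0.3, size=1000)  # simulated coin flips; assumed true p = 0.3

p_closed = x.mean()  # closed-form MLE: the sample proportion of heads

# Numerical MLE: maximize the log-likelihood over (0, 1) by minimizing its negative.
res = minimize_scalar(lambda p: -bernoulli_log_likelihood(p, x),
                      bounds=(1e-6, 1 - 1e-6), method="bounded")
print(p_closed, res.x)  # the two estimates should essentially coincide
```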

Method of moments estimator (MME)

For maximum likelihood estimation, we need to utilize the likelihood function, which is found from the joint pmf or pdf of the random sample from a distribution. However, in practice we may not know the pmf or pdf of the distribution exactly. Instead, we may just know some information about the distribution, e.g. its mean, variance, and some moments (the rth moment of a random variable X is 𝔼[X^r]; we denote it by μ_r for simplicity). Such moments often contain information about the unknown parameter. For example, for a normal distribution 𝒩(μ, σ²), we know that μ = μ_1 and σ² = μ_2 − (μ_1)². Because of this, when we want to estimate the parameters, we can do so by estimating the moments.

Now, we would like to know how to estimate the moments. We let <math>m_r = \frac{\sum_{i=1}^n X_i^r}{n}</math> be the rth Template:Colored em [4], where the X_i's are independent and identically distributed. By the Template:Colored em (assuming the conditions are satisfied), we have

  • <math>\bar X = m_1 \xrightarrow{p} \mathbb{E}[X] = \mu_1</math>
  • <math>m_2 \xrightarrow{p} \mathbb{E}[X^2] = \mu_2</math> (this can be seen by replacing "X" with "X²" in the weak law of large numbers; the conditions are then still satisfied, and so we can still apply the weak law of large numbers)

In general, we have <math>m_r \xrightarrow{p} \mu_r</math>, since the conditions are still satisfied after replacing "X" with "X^r" in the weak law of large numbers.

Because of these results, we can estimate the rth moment μ_r using the rth sample moment m_r, and the estimation is "better" when n is large. For example, in the above normal distribution example, we can estimate μ by m_1 and σ² by m_2 − (m_1)², and these estimators are actually called the Template:Colored em.
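
As a hedged illustration of this recipe (the sample size, the assumed true parameter values, and the helper name mme_normal are our own, not from the original text), the following Python sketch computes the method of moments estimates for a normal sample:

```python
import numpy as np

def mme_normal(x):
    """Method of moments estimates (mu_hat, sigma2_hat) for N(mu, sigma^2)."""
    m1 = np.mean(x)          # first sample moment
    m2 = np.mean(x ** 2)     # second sample moment
    return m1, m2 - m1 ** 2  # estimate mu by m1 and sigma^2 by m2 - m1^2

rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=3.0, size=10_000)  # assumed true mu = 2, sigma = 3
print(mme_normal(x))  # should be close to (2, 9) for this large n
```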

To be more precise, we have the following definition of the Template:Colored em: Template:Colored definition Template:Colored remark Template:Colored example Template:Colored remark Template:Colored example Template:Colored exercise

Properties of estimator

In this section, we will introduce some criteria for evaluating how "good" a point estimator is, namely Template:Colored em, Template:Colored em and Template:Colored em.

Unbiasedness

For θ̂ to be a "good" estimator of a parameter θ, a desirable property of θ̂ is that its expected value equals the value of the parameter θ, or is at least close to that value. Because of this, we introduce a quantity, namely the Template:Colored em, to measure how close the mean of θ̂ is to θ. Template:Colored definition We will also define some terms related to bias. Template:Colored definition Template:Colored definition Template:Colored remark Template:Colored example Template:Colored example
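
A small Monte Carlo sketch may help build intuition here (our own illustrative setup, relying on the standard fact that the variance estimator dividing by n is biased while the one dividing by n − 1 is unbiased):

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps, sigma2 = 10, 100_000, 4.0  # assumed sample size and true variance

biased, unbiased = [], []
for _ in range(reps):
    x = rng.normal(0.0, np.sqrt(sigma2), size=n)
    ss = ((x - x.mean()) ** 2).sum()
    biased.append(ss / n)          # MLE-style variance estimator (divides by n)
    unbiased.append(ss / (n - 1))  # sample variance (divides by n - 1)

# Bias(theta_hat) = E[theta_hat] - theta, approximated by Monte Carlo averages:
print(np.mean(biased) - sigma2)    # roughly -sigma2/n = -0.4 (biased)
print(np.mean(unbiased) - sigma2)  # roughly 0 (unbiased)
```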

Efficiency

We have discussed how to evaluate the unbiasedness of estimators. Now, if we are given two unbiased estimators, θ̂ and θ̃, how should we compare their goodness? Their goodness is the same if we compare them only in terms of unbiasedness. Therefore, we need another criterion in this case. One possible way is to compare their Template:Colored em, and the one with the smaller variance is better: on average, that estimator deviates less from its mean, which is the value of the unknown parameter by the definition of an unbiased estimator, and thus it is more accurate in some deviation sense. Indeed, an unbiased estimator can still have a large variance, and thus deviate a lot from its mean. Such an estimator is unbiased because the positive deviations and negative deviations somehow cancel each other out. This is the idea of Template:Colored em.

Template:Colored definition Template:Colored remark Actually, for the variance of an unbiased estimator, since the mean of the unbiased estimator is the unknown parameter θ, the variance measures the mean of the squared deviation from θ, and we have a specific term for this quantity, namely the Template:Colored em (MSE). Template:Colored definition Template:Colored remark Notice that in the definition of the MSE, we do not require θ̂ to be an unbiased estimator. Thus, θ̂ in the definition may be biased. We have mentioned that when θ̂ is unbiased, its variance is actually its MSE. In the following, we will give a more general relationship between MSE(θ̂) and Var(θ̂), not just for unbiased estimators. Template:Colored proposition

Proof. By definition, we have MSE(θ̂) = 𝔼[(θ̂ − θ)²] and Var(θ̂) = 𝔼[(θ̂ − 𝔼[θ̂])²]. From these, we are motivated to write
<math display="block">\begin{align}
\operatorname{MSE}(\hat\theta) &= \mathbb{E}[(\hat\theta - \theta)^2] \\
&= \mathbb{E}\big[\big((\hat\theta - \mathbb{E}[\hat\theta]) + (\underbrace{\mathbb{E}[\hat\theta] - \theta}_{\text{constant}})\big)^2\big] \\
&= \mathbb{E}\big[(\hat\theta - \mathbb{E}[\hat\theta])^2 + 2(\hat\theta - \mathbb{E}[\hat\theta])(\mathbb{E}[\hat\theta] - \theta) + (\mathbb{E}[\hat\theta] - \theta)^2\big] \\
&= \operatorname{Var}(\hat\theta) + 2(\mathbb{E}[\hat\theta] - \theta)\underbrace{\mathbb{E}\big[\hat\theta - \mathbb{E}[\hat\theta]\big]}_{=\,\mathbb{E}[\hat\theta] - \mathbb{E}[\hat\theta]\,=\,0} + [\operatorname{Bias}(\hat\theta)]^2 \\
&= \operatorname{Var}(\hat\theta) + [\operatorname{Bias}(\hat\theta)]^2,
\end{align}</math>
as desired.
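
This decomposition can also be checked numerically. Below is a minimal Monte Carlo sketch (our own illustrative setup: a deliberately biased estimator 0.9·X̄ of a normal mean) verifying that MSE(θ̂) ≈ Var(θ̂) + [Bias(θ̂)]²:

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n, reps = 5.0, 20, 200_000  # assumed true mean, sample size, replications

# A deliberately biased estimator of theta: shrink the sample mean toward zero.
estimates = 0.9 * rng.normal(theta, 1.0, size=(reps, n)).mean(axis=1)

mse = np.mean((estimates - theta) ** 2)
var = estimates.var()
bias = estimates.mean() - theta
print(mse, var + bias ** 2)  # the two values should agree up to Monte Carlo noise
```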

Template:Colored example Template:Colored proposition

Proof.

  • The "if" part is simple. Assume <math>\lim_{n\to\infty}\operatorname{Var}(\hat\theta) = 0</math> and <math>\lim_{n\to\infty}\operatorname{Bias}(\hat\theta) = 0</math>. Then, <math>\lim_{n\to\infty}\big(\operatorname{Var}(\hat\theta) + (\operatorname{Bias}(\hat\theta))^2\big) = 0 \implies \lim_{n\to\infty}\operatorname{MSE}(\hat\theta) = 0</math>.
  • The "only if" part: we can use proof by contrapositive, i.e., prove that if <math>\lim_{n\to\infty}\operatorname{Var}(\hat\theta) \ne 0</math> Template:Colored em <math>\lim_{n\to\infty}\operatorname{Bias}(\hat\theta) \ne 0</math>, then <math>\lim_{n\to\infty}\operatorname{MSE}(\hat\theta) \ne 0</math>.
  • Case 1: when <math>\lim_{n\to\infty}\operatorname{Var}(\hat\theta) \ne 0</math>, we have <math>\lim_{n\to\infty}\operatorname{Var}(\hat\theta) > 0</math>, since the variance is nonnegative. Also, <math>\lim_{n\to\infty}(\operatorname{Bias}(\hat\theta))^2 \ge 0</math>. It follows that <math>\lim_{n\to\infty}\operatorname{MSE}(\hat\theta) > 0</math>, i.e., the MSE does not tend to zero.
  • Case 2: when <math>\lim_{n\to\infty}\operatorname{Bias}(\hat\theta) \ne 0</math>, we have <math>\lim_{n\to\infty}(\operatorname{Bias}(\hat\theta))^2 > 0</math>. Also, <math>\lim_{n\to\infty}\operatorname{Var}(\hat\theta) \ge 0</math>. It follows that <math>\lim_{n\to\infty}\operatorname{MSE}(\hat\theta) > 0</math>, i.e., the MSE does not tend to zero.
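
As a concrete instance of this proposition (a standard example, not from the original text): for the sample mean X̄ estimating the mean μ of a distribution with finite variance σ²,
<math display="block">\operatorname{Bias}(\bar X) = \mathbb{E}[\bar X] - \mu = 0, \qquad \operatorname{Var}(\bar X) = \frac{\sigma^2}{n} \xrightarrow{n\to\infty} 0, \qquad\text{so}\qquad \operatorname{MSE}(\bar X) = \frac{\sigma^2}{n} + 0^2 \to 0,</math>
and hence the MSE of X̄ tends to zero.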

Template:Colored remark

Uniformly minimum-variance unbiased estimator

Now, we know that the smaller the variance of an unbiased estimator, the more efficient (and "better") it is. Thus, it is natural to ask what the Template:Colored em efficient (i.e., the "best") unbiased estimator is, i.e., the unbiased estimator with the smallest variance. We have a specific name for such an unbiased estimator, namely the Template:Colored em [5]. To be more precise, we have the following definition of the UMVUE: Template:Colored definition Indeed, the UMVUE is Template:Colored em, i.e., there is exactly one unbiased estimator with the smallest variance among all unbiased estimators, and we will prove this in the following. Template:Colored proposition

Proof. Assume that W is an UMVUE of τ(θ), and W′ is another UMVUE of τ(θ). Define the estimator W* = (W + W′)/2. Since 𝔼[W*] = (𝔼[W] + 𝔼[W′])/2 = (τ(θ) + τ(θ))/2 = τ(θ), W* is an unbiased estimator of τ(θ).

Now, we consider the variance of W*:
<math display="block">\begin{align}
\operatorname{Var}(W^*) &= \tfrac{1}{4}\operatorname{Var}(W + W') \\
&= \tfrac{1}{4}\big[\operatorname{Var}(W) + \operatorname{Var}(W') + 2\operatorname{Cov}(W, W')\big] \\
&\le \tfrac{1}{4}\operatorname{Var}(W) + \tfrac{1}{4}\operatorname{Var}(W') + \tfrac{1}{2}\sqrt{\operatorname{Var}(W)\operatorname{Var}(W')} && \text{(covariance inequality)} \\
&= \tfrac{1}{4}\operatorname{Var}(W) + \tfrac{1}{4}\operatorname{Var}(W) + \tfrac{1}{2}\sqrt{(\operatorname{Var}(W))^2} && (\operatorname{Var}(W) = \operatorname{Var}(W')\text{ since }W\text{ and }W'\text{ are both UMVUEs}) \\
&= \tfrac{1}{2}\operatorname{Var}(W) + \tfrac{1}{2}\operatorname{Var}(W) && (\operatorname{Var}(W) > 0) \\
&= \operatorname{Var}(W).
\end{align}</math>
Thus, we have either Var(W*) < Var(W) or Var(W*) = Var(W). If the former were true, then W would Template:Colored em be an UMVUE of τ(θ) by definition, since we could find another unbiased estimator, namely W*, with smaller variance. Hence, we must have the latter, i.e., Var(W*) = Var(W). This implies that when we apply the covariance inequality, equality holds, i.e., <math>\operatorname{Cov}(W, W') = \sqrt{\operatorname{Var}(W)\operatorname{Var}(W')} \implies \rho(W, W') = 1</math>, which means W′ increases linearly with W, i.e., we can write W′ = aW + b for some constants a > 0 and b.

Now, we consider the covariance Cov(W, W′). On the one hand, <math>\operatorname{Cov}(W, W') \overset{\text{above}}{=} \operatorname{Cov}(W, aW + b) \overset{\text{properties of covariance}}{=} a\operatorname{Cov}(W, W) = a\operatorname{Var}(W)</math>. On the other hand, since equality holds in the covariance inequality, and Var(W) = Var(W′) (since they are both UMVUEs), <math>\operatorname{Cov}(W, W') = \sqrt{\operatorname{Var}(W)\operatorname{Var}(W')} = \sqrt{(\operatorname{Var}(W))^2} = \operatorname{Var}(W)</math>. Thus, we have a = 1.

It remains to show that b = 0, which proves that W′ = W, and therefore allows us to conclude that W is Template:Colored em.

From the above, we currently have <math>W' = W + b \implies \mathbb{E}[W'] = \mathbb{E}[W] + b \implies \tau(\theta) = \tau(\theta) + b \implies b = 0</math>, as desired.

Template:Colored remark

Cramer-Rao lower bound

Without some further results, it is quite difficult to determine the UMVUE, since there are many (perhaps even infinitely many) possible unbiased estimators, so it is quite hard to ensure that one particular unbiased estimator is relatively more efficient than every other possible unbiased estimator.

Therefore, we will introduce some approaches that help us find the UMVUE. For the first approach, we find a Template:Colored em [6] on the variances of all possible unbiased estimators. After getting such a lower bound, if we can find an unbiased estimator whose variance is exactly equal to the lower bound, then the lower bound is the minimum value of the variances, and hence such an unbiased estimator is an UMVUE by definition. Template:Colored remark A common way to find such a lower bound is to use the Template:Colored em (CRLB), and we get the CRLB through the Template:Colored em. Before stating the inequality, let us define some related terms. Template:Colored definition Template:Colored remark The regularity conditions, which allow the interchange of derivative and integral, include the following:

  1. the partial derivatives involved should exist, i.e., the (natural log of the) functions involved should be differentiable
  2. the integrals involved should be differentiable
  3. the support does not depend on the parameter(s) involved

We have some results that help us compute the Fisher information. Template:Colored proposition

Proof. โ„n(θ)=๐”ผ[(lnโ„’(θ;๐ฑ)θ)2]=Var(lnโ„’(θ;๐ฑ)θ)by above remark=Var(θ(lni=1nf(Xi;θ)))(โ„’(θ;๐ฑ)=i=1nf(xi;θ))=Var(θ(i=1nlnf(Xi;θ)))=Var(i=1nθlnf(Xi;θ))by linearity of differentiation=i=1nVar(θlnf(Xi;θ))by independence=nVar(θlnf(Xi;θ))by identically distributed property=n๐”ผ[(lnf(X;θ)θ)2]by above remark=nโ„(θ).

Template:Colored proposition

Proof. ๐”ผ[2lnf(X;θ)θ2]=๐”ผ[θ(lnf(X;θ)θ)]=๐”ผ[θ(1f(X;θ)f(X;θ)θ)]=๐”ผ[1f(X;θ)2f(X;θ)θ2f(X;θ)θ1(f(X;θ))2f(X;θ)θ]=๐”ผ[1f(X;θ)2f(X;θ)θ2(f(X;θ)θ)21(f(X;θ))2]=๐”ผ[1f(X;θ)2f(X;θ)θ2]๐”ผ[(lnf(X;θ)θ)2]=๐”ผ[1f(X;θ)2f(X;θ)θ2]โ„(θ) Now, it suffices to prove that ๐”ผ[1f(X;θ)2f(X;θ)θ2]=0, which is true since ๐”ผ[1f(X;θ)2f(X;θ)θ2]=1f(x;θ)2f(x;θ)θ2f(x;θ)dx=2f(x;θ)θ2dx=2θ2f(x;θ)dx=2θ2(1)=0.

Template:Colored remark Template:Colored theorem

Proof. Since W is an unbiased estimator of τ(θ), we have by definition 𝔼[W] = τ(θ). By the definition of expectation, we have <math>\mathbb{E}[W] = \int\cdots\int w\,\mathcal{L}(\theta;\mathbf{x})\,dx_n\cdots dx_1</math>, where ℒ(θ;𝐱) is the likelihood function. Thus,
<math display="block">\begin{align}
&\int\cdots\int w\,\mathcal{L}(\theta;\mathbf{x})\,dx_n\cdots dx_1 = \tau(\theta) \\
\implies{}& \frac{\partial}{\partial\theta}\int\cdots\int w\,\mathcal{L}(\theta;\mathbf{x})\,dx_n\cdots dx_1 = \frac{\partial}{\partial\theta}\tau(\theta) \\
\implies{}& \int\cdots\int \frac{\partial}{\partial\theta}\big(w\,\mathcal{L}(\theta;\mathbf{x})\big)\,dx_n\cdots dx_1 = \tau'(\theta) \\
\implies{}& \int\cdots\int w\,\frac{\partial\mathcal{L}(\theta;\mathbf{x})}{\partial\theta}\cdot\frac{1}{\mathcal{L}(\theta;\mathbf{x})}\cdot\mathcal{L}(\theta;\mathbf{x})\,dx_n\cdots dx_1 = \tau'(\theta) \\
\implies{}& \int\cdots\int w\,\frac{\partial \ln\mathcal{L}(\theta;\mathbf{x})}{\partial\theta}\,\mathcal{L}(\theta;\mathbf{x})\,dx_n\cdots dx_1 = \tau'(\theta) \\
\implies{}& \mathbb{E}\left[W\,\frac{\partial \ln\mathcal{L}(\theta;\mathbf{X})}{\partial\theta}\right] = \tau'(\theta) \\
\implies{}& \mathbb{E}[W\,S(\theta;\mathbf{X})] = \tau'(\theta) && \left(S(\theta;\mathbf{X}) = \frac{\partial \ln\mathcal{L}(\theta;\mathbf{X})}{\partial\theta}\right) \\
\implies{}& \mathbb{E}[W\,S(\theta;\mathbf{X})] - \underbrace{\mathbb{E}[W]\,\mathbb{E}[S(\theta;\mathbf{X})]}_{=\,0} = \tau'(\theta) && (\mathbb{E}[S(\theta;\mathbf{X})] = 0\text{ by the remark about Fisher information}) \\
\implies{}& \operatorname{Cov}(W, S(\theta;\mathbf{X})) = \tau'(\theta).
\end{align}</math>
Consider the covariance inequality: (Cov(X, Y))² ≤ Var(X)Var(Y). We have
<math display="block">(\operatorname{Cov}(W, S(\theta;\mathbf{X})))^2 \le \operatorname{Var}(W)\operatorname{Var}(S(\theta;\mathbf{X}))
\implies (\tau'(\theta))^2 \le \operatorname{Var}(W)\operatorname{Var}(S(\theta;\mathbf{X}))
\implies \operatorname{Var}(W) \ge \frac{(\tau'(\theta))^2}{\operatorname{Var}(S(\theta;\mathbf{X}))} = \frac{(\tau'(\theta))^2}{\mathcal{I}_n(\theta)}</math>
(<math>\mathcal{I}_n(\theta) = \operatorname{Var}(S(\theta;\mathbf{X}))</math> by the remark about Fisher information).
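
As a quick worked instance of the inequality (a standard example, not from the original text): let X_1, …, X_n be a random sample from 𝒩(μ, σ²) with σ² known, and take τ(μ) = μ. Then
<math display="block">\frac{\partial \ln f(x;\mu)}{\partial\mu} = \frac{x - \mu}{\sigma^2} \implies \mathcal{I}(\mu) = \mathbb{E}\left[\left(\frac{X - \mu}{\sigma^2}\right)^2\right] = \frac{1}{\sigma^2}, \qquad\text{so}\qquad \operatorname{Var}(W) \ge \frac{(\tau'(\mu))^2}{n\,\mathcal{I}(\mu)} = \frac{\sigma^2}{n}.</math>
Since Var(X̄) = σ²/n attains this bound and X̄ is unbiased for μ, X̄ is an UMVUE of μ.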

Template:Colored remark Template:Colored example Sometimes, we cannot use the CRLB method for finding the UMVUE, because

  • the regularity conditions may not be satisfied, and thus we cannot use the Cramer-Rao inequality, and
  • the variance of the unbiased estimator may not equal the CRLB, but we cannot then conclude that it is not an UMVUE: it may be that the CRLB is not attainable at all, and the smallest variance among all unbiased estimators is actually the variance of that estimator, which is larger than the CRLB.

We will illustrate these two cases with some examples in the following. Template:Colored example Template:Colored example Template:Colored remark Since the CRLB is sometimes attainable and sometimes not, it is natural to ask Template:Colored em the CRLB can be attained. In other words, we would like to know the Template:Colored em for the CRLB, which are stated in the following corollary. Template:Colored corollary

Proof. Considering the proof of the Cramer-Rao inequality, we have
<math display="block">\operatorname{Var}(W) = \frac{(\tau'(\theta))^2}{\mathcal{I}_n(\theta)} \iff (\operatorname{Cov}(W, S(\theta;\mathbf{X})))^2 = \operatorname{Var}(W)\operatorname{Var}(S(\theta;\mathbf{X})).</math>
We can write Cov(W, S(θ;𝐗)) as <math>\operatorname{Cov}(W - \underbrace{\tau(\theta)}_{\text{constant}}, S(\theta;\mathbf{X}))</math> (by a result about covariance). Also, <math>\operatorname{Var}(W) = \operatorname{Var}(W - \underbrace{\tau(\theta)}_{\text{constant}})</math> (by a result about variance). Thus, we have
<math display="block">\begin{align}
&(\operatorname{Cov}(W - \tau(\theta), S(\theta;\mathbf{X})))^2 = \operatorname{Var}(W - \tau(\theta))\operatorname{Var}(S(\theta;\mathbf{X})) \\
\implies{}& \frac{(\operatorname{Cov}(W - \tau(\theta), S(\theta;\mathbf{X})))^2}{\operatorname{Var}(W - \tau(\theta))\operatorname{Var}(S(\theta;\mathbf{X}))} = 1 \\
\implies{}& (\rho(S(\theta;\mathbf{X}), W - \tau(\theta)))^2 = 1 \\
\implies{}& \rho(S(\theta;\mathbf{X}), W - \tau(\theta)) = \pm 1,
\end{align}</math>
where ρ(⋅,⋅) is the correlation coefficient between two random variables. This means S(θ;𝐗) increases or decreases linearly with W − τ(θ), i.e., S(θ;𝐗) = k(W − τ(θ)) + c for some constants k and c (constant in 𝐗; they may depend on θ). Now, it suffices to show that the constant c is actually zero.

We know that ๐”ผ[W]=τ(θ) (since W is an unbiased estimator of τ(θ)), and ๐”ผ[S(θ;๐—)]=0 (from remark about Fisher information). Thus, applying expectations on both side gives ๐”ผ[S(θ;๐—)]=k๐”ผ[Wτ(θ)]+c๐”ผ[S(θ;๐—)]=k(๐”ผ[W]τ(θ)=0)+c0=0+cc=0. Then, the result follows.

Template:Colored remark Template:Colored example Template:Colored remark Template:Colored example Template:Colored remark We have discussed the MLE previously, and the MLE is actually a "best choice" asymptotically (i.e., as the sample size n → ∞), according to the following theorem. Template:Colored theorem

Proof. Template:Colored em: we consider the Taylor series of order 2 for <math>\frac{d}{d\theta}\ln\mathcal{L}(\theta)</math>, which gives
<math display="block">\frac{d}{d\theta}\ln\mathcal{L}(\hat\theta) = \frac{d}{d\theta}\ln\mathcal{L}(\theta) + (\hat\theta - \theta)\frac{d^2}{d\theta^2}\ln\mathcal{L}(\theta) + \frac{1}{2}(\hat\theta - \theta)^2\left.\frac{d^3}{d\theta^3}\ln\mathcal{L}(\theta)\right|_{\theta = \theta^*},</math>
where θ* is between θ and θ̂. Since θ̂ is the MLE of θ, from the derivative test we know that <math>\frac{d}{d\theta}\ln\mathcal{L}(\hat\theta) = 0</math> (we apply a regularity condition to ensure the existence of this derivative). Hence, we have
<math display="block">\begin{align}
&\frac{d}{d\theta}\ln\mathcal{L}(\theta) + (\hat\theta - \theta)\frac{d^2}{d\theta^2}\ln\mathcal{L}(\theta) + \frac{1}{2}(\hat\theta - \theta)^2\left.\frac{d^3}{d\theta^3}\ln\mathcal{L}(\theta)\right|_{\theta = \theta^*} = 0 \\
\implies{}& -\sqrt{n}(\hat\theta - \theta)\frac{d^2}{d\theta^2}\ln\mathcal{L}(\theta) - \frac{\sqrt{n}}{2}(\hat\theta - \theta)^2\left.\frac{d^3}{d\theta^3}\ln\mathcal{L}(\theta)\right|_{\theta = \theta^*} = \sqrt{n}\,\frac{d}{d\theta}\ln\mathcal{L}(\theta) \\
\implies{}& \sqrt{n}(\hat\theta - \theta) = \frac{\frac{d}{d\theta}\ln\mathcal{L}(\theta)/\sqrt{n}}{-n^{-1}\frac{d^2}{d\theta^2}\ln\mathcal{L}(\theta) - (2n)^{-1}(\hat\theta - \theta)\left.\frac{d^3}{d\theta^3}\ln\mathcal{L}(\theta)\right|_{\theta = \theta^*}}.
\end{align}</math>
Since
<math display="block">\operatorname{Var}\left(\sum_{i=1}^n \frac{\partial \ln f(X_i;\theta)}{\partial\theta}\right) = \sum_{i=1}^n \operatorname{Var}\left(\frac{\partial \ln f(X_i;\theta)}{\partial\theta}\right) = \sum_{i=1}^n \mathbb{E}\left[\left(\frac{\partial \ln f(X_i;\theta)}{\partial\theta}\right)^2\right] = n\,\mathcal{I}(\theta) \qquad (1),</math>
by the central limit theorem,
<math display="block">\frac{\frac{d}{d\theta}\ln\mathcal{L}(\theta)}{\sqrt{n}} = \frac{1}{\sqrt{n}}\sum_{i=1}^n \frac{\partial \ln f(X_i;\theta)}{\partial\theta} \xrightarrow{d} \mathcal{N}\big(0, (1/n)\cdot n\,\mathcal{I}(\theta)\big) \sim \mathcal{N}(0, \mathcal{I}(\theta)).</math>
Furthermore, we apply the weak law of large numbers to show that
<math display="block">-n^{-1}\frac{d^2}{d\theta^2}\ln\mathcal{L}(\theta) = -\frac{1}{n}\sum_{i=1}^n \frac{\partial^2 \ln f(X_i;\theta)}{\partial\theta^2} \xrightarrow{p} -\mathbb{E}\left[\frac{\partial^2 \ln f(X_i;\theta)}{\partial\theta^2}\right] = \mathcal{I}(\theta) \qquad (2).</math>
It can be shown in a quite complicated way (and using regularity conditions) that
<math display="block">-(2n)^{-1}(\hat\theta - \theta)\left.\frac{d^3}{d\theta^3}\ln\mathcal{L}(\theta)\right|_{\theta = \theta^*} \xrightarrow{p} 0 \qquad (3).</math>
Considering (2) and (3), and using a property of convergence in probability, we have
<math display="block">-n^{-1}\frac{d^2}{d\theta^2}\ln\mathcal{L}(\theta) - (2n)^{-1}(\hat\theta - \theta)\left.\frac{d^3}{d\theta^3}\ln\mathcal{L}(\theta)\right|_{\theta = \theta^*} \xrightarrow{p} \mathcal{I}(\theta) + 0 = \mathcal{I}(\theta) \qquad (4).</math>
Considering (1) and (4), and using Slutsky's theorem, we have
<math display="block">\sqrt{n}(\hat\theta - \theta) = \frac{\frac{d}{d\theta}\ln\mathcal{L}(\theta)/\sqrt{n}}{-n^{-1}\frac{d^2}{d\theta^2}\ln\mathcal{L}(\theta) - (2n)^{-1}(\hat\theta - \theta)\left.\frac{d^3}{d\theta^3}\ln\mathcal{L}(\theta)\right|_{\theta = \theta^*}} \xrightarrow{d} \frac{Y}{\mathcal{I}(\theta)},</math>
where <math>Y \sim \mathcal{N}(0, \mathcal{I}(\theta))</math>, and hence <math>\frac{Y}{\mathcal{I}(\theta)} \sim \mathcal{N}\left(0, \frac{\mathcal{I}(\theta)}{[\mathcal{I}(\theta)]^2}\right) \sim \mathcal{N}(0, 1/\mathcal{I}(\theta))</math>. It follows that <math>\sqrt{n}(\hat\theta - \theta) \xrightarrow{d} \mathcal{N}(0, 1/\mathcal{I}(\theta))</math>. This means <math>\hat\theta - \theta \xrightarrow{d} \mathcal{N}\big(0, 1/(n\,\mathcal{I}(\theta))\big) \sim \mathcal{N}(0, 1/\mathcal{I}_n(\theta))</math>, and thus
<math display="block">\frac{\hat\theta - \theta}{\sqrt{1/\mathcal{I}_n(\theta)}} \xrightarrow{d} \mathcal{N}(0, 1),</math>
as desired.
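
The asymptotic normality can be illustrated by simulation. The following Python sketch (our own illustrative setup; the exponential distribution with rate λ, for which λ̂ = 1/X̄ and ℐ(λ) = 1/λ², is a standard example) checks that the standardized MLE is approximately standard normal:

```python
import numpy as np

rng = np.random.default_rng(4)
lam, n, reps = 2.0, 500, 20_000  # assumed true rate, sample size, replications

# MLE of the rate of an Exponential(lam) distribution: lam_hat = 1 / X-bar.
x = rng.exponential(scale=1.0 / lam, size=(reps, n))
lam_hat = 1.0 / x.mean(axis=1)

# Fisher information of Exponential(lam) is I(lam) = 1/lam^2, so
# (lam_hat - lam) * sqrt(n * I(lam)) should be approximately N(0, 1).
z = (lam_hat - lam) * np.sqrt(n) / lam
print(z.mean(), z.std())  # should be close to 0 and 1, respectively
```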

Template:Colored remark Since we are not able to use the CRLB to find the UMVUE in some situations, we will introduce another method to find the UMVUE in the following, which uses the concepts of Template:Colored em and Template:Colored em.

Sufficiency

Intuitively, a Template:Colored em T(X_1, …, X_n), which is a function of a given random sample X_1, …, X_n, contains all the information needed for estimating the unknown parameter (vector) θ. Thus, the statistic T(X_1, …, X_n) itself is "sufficient" for estimating the unknown parameter (vector) θ.

Formally, we can define and describe Template:Colored em as follows: Template:Colored definition Template:Colored remark Template:Colored example Template:Colored remark Let us state the above remark about transformation of sufficient statistic formally below. Template:Colored proposition Now, we discuss a theorem that helps us to check the sufficiency of a statistic, namely (Fisher-Neyman) Template:Colored em. Template:Colored theorem

Proof. Since the proof for the continuous case is quite complicated, we will only give a proof for the discrete case. For simplicity of presentation, let 𝐗 = (X_1, …, X_n), T = T(X_1, …, X_n), 𝐱 = (x_1, …, x_n), and t = T(x_1, …, x_n); from these we obtain notations for the different types of pmfs involved. By definition, <math>f_{\mathbf{X}|T}(\mathbf{x}|t;\theta) = \frac{f_{\mathbf{X},T}(\mathbf{x},t;\theta)}{f_T(t;\theta)}</math>. Also, we have <math>\{\mathbf{X} = \mathbf{x}\} \iff \{\mathbf{X} = \mathbf{x}\text{ and }T(\mathbf{X}) = T(\mathbf{x})\} \iff \{\mathbf{X} = \mathbf{x}\text{ and }T = t\}</math>. Thus, we can write <math>f_{\mathbf{X},T}(\mathbf{x},t;\theta) = f_{\mathbf{X}}(\mathbf{x};\theta)</math> (*).

"only if" (<math>\Rightarrow</math>) direction: Assume T is a sufficient statistic. Then, we choose <math>g(t;\theta) = f_T(t;\theta)</math> and <math>h(\mathbf{x}) = f_{\mathbf{X}|T}(\mathbf{x}|t)</math>, which does not depend on θ by the definition of a sufficient statistic. It remains to verify that the equation actually holds for this choice.

Hence, <math>f_{\mathbf{X}}(\mathbf{x};\theta) \overset{(*)}{=} f_{\mathbf{X},T}(\mathbf{x},t;\theta) \overset{\text{def}}{=} f_{\mathbf{X}|T}(\mathbf{x}|t;\theta)\,f_T(t;\theta) \overset{\text{sufficiency}}{=} f_{\mathbf{X}|T}(\mathbf{x}|t)\,f_T(t;\theta) = h(\mathbf{x})\,g(t;\theta)</math>.

"if" (<math>\Leftarrow</math>) direction: Assume we can write <math>f_{\mathbf{X}}(\mathbf{x};\theta) = g(t;\theta)h(\mathbf{x})</math>. Then,
<math display="block">f_T(t;\theta) \overset{\text{marginal pmf}}{=} \sum_{\mathbf{x}:\,T(\mathbf{x}) = t} f_{\mathbf{X},T}(\mathbf{x},t;\theta) \overset{(*)}{=} \sum_{\mathbf{x}:\,T(\mathbf{x}) = t} f_{\mathbf{X}}(\mathbf{x};\theta) \overset{\text{assumption}}{=} \sum_{\mathbf{x}:\,T(\mathbf{x}) = t} g(t;\theta)h(\mathbf{x}) = \underbrace{g(t;\theta)}_{\text{independent of }\mathbf{x}} \sum_{\mathbf{x}:\,T(\mathbf{x}) = t} h(\mathbf{x}).</math>
Now, we aim to show that <math>f_{\mathbf{X}|T}(\mathbf{x}|t)</math> does not depend on θ, which means T is a sufficient statistic for θ. We have
<math display="block">f_{\mathbf{X}|T}(\mathbf{x}|t) \overset{\text{def}}{=} \frac{f_{\mathbf{X},T}(\mathbf{x},t;\theta)}{f_T(t;\theta)} \overset{(*)}{=} \frac{f_{\mathbf{X}}(\mathbf{x};\theta)}{f_T(t;\theta)} = \frac{\overbrace{g(t;\theta)h(\mathbf{x})}^{\text{assumption}}}{\underbrace{g(t;\theta)\sum_{\mathbf{x}:\,T(\mathbf{x}) = t} h(\mathbf{x})}_{\text{above}}} = \frac{h(\mathbf{x})}{\sum_{\mathbf{x}:\,T(\mathbf{x}) = t} h(\mathbf{x})},</math>
which does not depend on θ, as desired.

Template:Colored remark Template:Colored example For some "nice" distributions, namely those belonging to the Template:Colored em, sufficient statistics can be found easily and more conveniently using an alternative method. This method works because of the "nice" form of the pdf or pmf of those distributions, which can be characterized as follows: Template:Colored definition Template:Colored remark Template:Colored example Template:Colored theorem

Proof. Since the distribution belongs to the exponential family, the joint pdf or pmf of X_1, …, X_n can be expressed as
<math display="block">\begin{align}
f(x_1, \ldots, x_n; \theta) &= \prod_{j=1}^n \left[h(x_j)\,g(\theta)\exp\left(\sum_{i=1}^s \eta_i(\theta) T_i(x_j)\right)\right] \\
&= \left[\prod_{j=1}^n h(x_j)\right](g(\theta))^n \exp\left(\sum_{j=1}^n \sum_{i=1}^s \eta_i(\theta) T_i(x_j)\right) \\
&= \left[\prod_{j=1}^n h(x_j)\right](g(\theta))^n \exp\left(\sum_{i=1}^s \sum_{j=1}^n \eta_i(\theta) T_i(x_j)\right) && \text{(changing the summation order; the upper bounds are constants)} \\
&= \left[\prod_{j=1}^n h(x_j)\right](g(\theta))^n \exp\left(\sum_{i=1}^s \underbrace{\eta_i(\theta)}_{\text{independent of }j} \sum_{j=1}^n T_i(x_j)\right) \\
&= \left[\prod_{j=1}^n h(x_j)\right](g(\theta))^n \exp\left(\eta_1(\theta)\sum_{j=1}^n T_1(x_j) + \cdots + \eta_s(\theta)\sum_{j=1}^n T_s(x_j)\right).
\end{align}</math>
From here, for applying the factorization theorem, we can identify the Template:Color part of the function as "<math>h(x_1, \ldots, x_n)</math>", and the Template:Color part of the function as "<math>g(T(x_1, \ldots, x_n); \theta)</math>". We can notice that the Template:Color part of the function depends on x_1, …, x_n only through <math>\left(\sum_{j=1}^n T_1(x_j), \ldots, \sum_{j=1}^n T_s(x_j)\right)</math>. The result follows.
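
As a hedged worked instance of this theorem (a standard example, not from the original text): the Bernoulli(p) pmf, 0 < p < 1, belongs to the exponential family with s = 1, since for x ∈ {0, 1},
<math display="block">f(x;p) = p^x (1-p)^{1-x} = \underbrace{1}_{h(x)} \cdot \underbrace{(1-p)}_{g(p)} \cdot \exp\Big(\underbrace{\ln\tfrac{p}{1-p}}_{\eta_1(p)} \cdot \underbrace{x}_{T_1(x)}\Big),</math>
so by the theorem, <math>\sum_{j=1}^n X_j</math> is a sufficient statistic for p.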

Template:Colored example Now, we will start discussing how sufficient statistics are related to the UMVUE. We begin our discussion with the Template:Colored em. Template:Colored theorem

Proof. Assume W is an arbitrary unbiased estimator of τ(θ), and T is a sufficient statistic for θ.

First, we prove that φ(T) is an unbiased estimator of τ(θ). Before proving the unbiasedness, we should ensure that φ(T) is actually an estimator, i.e., a statistic: it must be a function of the random sample and must not depend on θ (so that it is computable). Since T is a sufficient statistic, the conditional distribution of W given T is Template:Colored em of θ, and hence φ(T) = 𝔼[W|T] does not depend on θ. Also, since W and T are functions of the random sample, φ(T) is also a function of the random sample.

Now, we prove that φ(T) is an unbiased estimator of τ(θ): since <math>\mathbb{E}[\varphi(T)] = \mathbb{E}[\mathbb{E}[W|T]] \overset{\text{law of total expectation}}{=} \mathbb{E}[W] \overset{\text{unbiasedness}}{=} \tau(\theta)</math>, φ(T) is an unbiased estimator of τ(θ).

Next, we prove that Var(φ(T)) ≤ Var(W): by the law of total variance, we have
<math display="block">\operatorname{Var}(W) = \operatorname{Var}(\mathbb{E}[W|T]) + \mathbb{E}[\operatorname{Var}(W|T)] \overset{\text{def}}{=} \operatorname{Var}(\varphi(T)) + \mathbb{E}\big[\underbrace{\operatorname{Var}(W|T)}_{\ge 0}\big] \ge \operatorname{Var}(\varphi(T)),</math>
as desired.
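
To make the construction concrete (a standard textbook instance, not from the original text): let X_1, …, X_n be a random sample from Bernoulli(p), take the crude unbiased estimator W = X_1, and condition on the sufficient statistic T = ∑_{i=1}^n X_i. By the symmetry of the X_i's,
<math display="block">\varphi(T) = \mathbb{E}[X_1 \mid T] = \frac{T}{n} = \bar X,</math>
so Rao-Blackwellization turns X_1 into the sample mean, whose variance p(1 − p)/n is (for n > 1) smaller than Var(X_1) = p(1 − p).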

Template:Colored remark To actually determine the UMVUE, we need another theorem, called the Template:Colored em, which is based on the Rao-Blackwell theorem and requires the concept of Template:Colored em.

Completeness

Template:Colored definition When a random sample X_1, …, X_n is from a distribution in the exponential family, a complete statistic can also be found easily, similar to the case for sufficient statistics. Template:Colored theorem

Proof. Omitted.

Template:Colored remark Template:Colored theorem

Proof. Assume T is a Template:Colored em for θ and 𝔼[φ(T)] = τ(θ).

Since T is a sufficient statistic for θ, we can apply the Rao-Blackwell theorem. From the Rao-Blackwell theorem, if W is an arbitrary unbiased estimator of τ(θ), then φ(T) is another unbiased estimator with Var(φ(T)) ≤ Var(W).

To prove that φ(T) is the unique UMVUE of τ(θ), we proceed to show that Template:Colored em of the choice of the unbiased estimator W of τ(θ), we get the Template:Colored em φ(T) from the Rao-Blackwell theorem (with probability 1). Then, we will have, Template:Colored em possible unbiased estimator W of τ(θ), Var(φ(T)) ≤ Var(W) (with probability 1) [7], which means φ(T) is the UMVUE, and is also the Template:Colored em UMVUE since we always get the same φ(T) [8].

Assume that W′ is Template:Colored em unbiased estimator of τ(θ) (W′ ≠ W). By the Rao-Blackwell theorem again, there is an unbiased estimator ψ(T) = 𝔼[W′|T] (possibly with ψ(T) ≠ φ(T)) such that Var(ψ(T)) ≤ Var(W′). Since both φ(T) and ψ(T) are unbiased estimators of τ(θ), we have for each θ ∈ Θ,
<math display="block">\mathbb{E}[\varphi(T)] = \mathbb{E}[\psi(T)] \implies \mathbb{E}[\varphi(T) - \psi(T)] = 0.</math>
Since T is a complete statistic, we have
<math display="block">\mathbb{P}(\varphi(T) - \psi(T) = 0) = 1 \iff \mathbb{P}(\varphi(T) = \psi(T)) = 1,</math>
which means φ(T) = ψ(T) (with probability 1), i.e., we get the same φ(T) from the Rao-Blackwell theorem in this case (with probability 1).

Template:Colored remark Template:Colored example Template:Colored remark Template:Colored example Template:Colored exercise

Consistency

In the previous sections, we have discussed Template:Colored em and Template:Colored em. In this section, we will discuss another property called Template:Colored em. Template:Colored definition Template:Colored remark Template:Colored proposition

Proof. Assume θ̂ is an (asymptotically) unbiased estimator of an unknown parameter θ and Var(θ̂) → 0 as n → ∞. Since θ̂ is an (asymptotically) unbiased estimator of θ, we have <math>\lim_{n\to\infty}\operatorname{Bias}(\hat\theta) = 0</math> (this is true both for an asymptotically unbiased estimator and for an unbiased estimator of θ). In addition, we have by assumption <math>\lim_{n\to\infty}\operatorname{Var}(\hat\theta) = 0</math>. By the definition of mean squared error, these imply that <math>\lim_{n\to\infty}\operatorname{MSE}(\hat\theta) = 0 \iff \lim_{n\to\infty}\mathbb{E}[(\hat\theta - \theta)^2] = 0</math>. Thus, as n → ∞, we have by Chebyshev's inequality (notice that MSE(θ̂) = 𝔼[(θ̂ − θ)²] exists from the above), for each ε > 0,
<math display="block">\mathbb{P}(|\hat\theta - \theta| > \varepsilon) \le \frac{\mathbb{E}[(\hat\theta - \theta)^2]}{\varepsilon^2} \to \frac{0}{\varepsilon^2} = 0.</math>
Since probability is nonnegative (≥ 0), and this probability is bounded above by an expression that tends to 0 as n → ∞, we conclude that this probability tends to zero as n → ∞. That is, θ̂ is a Template:Colored em of θ.
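
The shrinking probability in this proof can be observed directly by simulation. Below is a minimal Python sketch (our own illustrative setup: the sample mean of normal data, with ε = 0.1) showing ℙ(|X̄ − μ| > ε) decreasing as n grows:

```python
import numpy as np

rng = np.random.default_rng(5)
mu, sigma, eps, reps = 1.0, 2.0, 0.1, 10_000  # assumed illustrative values

# Consistency of the sample mean: P(|X-bar - mu| > eps) shrinks as n grows.
for n in (10, 100, 1_000):
    xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
    print(n, np.mean(np.abs(xbar - mu) > eps))
```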

Template:Colored remark Template:Colored example Template:Nav Template:BookCat

  1. ↑ The parameter vector contains all parameters governing the distribution.
  2. ↑ We will simply use "θ" when we do not know whether it is a parameter vector or just a single parameter. We may use the boldface θ instead if we know it is indeed a parameter vector.
  3. ↑ We will discuss some criteria for "good" in the #Properties of estimator section.
  4. ↑ For each positive integer r, m_r always exists, unlike μ_r.
  5. ↑ "Uniformly" means that the variance is minimum compared to other unbiased estimators, Template:Colored em (i.e., for each possible value of θ ∈ Θ). That is, the variance is not just minimum for a particular value of θ, but for all possible values of θ.
  6. ↑ This is different from the minimum value. A Template:Colored em only needs to be smaller than or equal to all the variances involved, and there may not be any variance that actually achieves this lower bound. However, the minimum value has to be one of the values of the variances.
  7. ↑ Notice that this is a stronger result than the result in the Rao-Blackwell theorem, where the latter only states that Var(φ(T)) ≤ Var(W), Template:Colored em
  8. ↑ Indeed, we know that the UMVUE must be unique from the previous proposition. However, in this argument, when we show that φ(T) is the UMVUE, we also automatically show that it is unique.