Statistics/Multivariate Data Analysis

Distributions

Multivariate Normal

The multivariate normal is just an extension of the normal distribution to the multivariate case. The simplest definition of the multivariate normal distribution can be given as follows:

Template:Definition

At first glance, the definition seems rather abstract and esoteric. After all, the univariate normal distribution has a specific density and a specific characteristic function, either of which is a mathematically valid characterisation of the distribution. However, this kind of definition is necessary to deal with the case where $\Sigma$ is not strictly positive definite. When $\Sigma$ is positive definite, it can be shown, by a change of variables from a vector of independent standard normal variables, that the density function of $\mathbf{X}$ is $f_{\mathbf{X}}(\mathbf{x}) = (2\pi)^{-k/2} |\Sigma|^{-1/2} \exp\left(-\tfrac{1}{2}(\mathbf{x}-\mu)^T \Sigma^{-1} (\mathbf{x}-\mu)\right)$, where $k$ is the dimension of $\mathbf{X}$. This is no longer true when $\Sigma$ is singular, because in that case $|\Sigma| = 0$, $\Sigma^{-1}$ does not exist, and the distribution has no density on $\mathbb{R}^k$. A definition based on the characteristic function still works. A density-like function can still be derived from the non-zero eigenvalues of $\Sigma$, but it is not a true density on $\mathbb{R}^k$.
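To make the density formula above concrete, the following is a minimal sketch that evaluates it for a positive definite $\Sigma$ using only the Python standard library. The function names `cholesky` and `mvn_pdf`, and the symbol `k` for the dimension, are illustrative choices and not part of the text. The sketch uses a Cholesky factorisation $\Sigma = LL^T$ in place of an explicit inverse, so the quadratic form becomes the squared length of $L^{-1}(\mathbf{x}-\mu)$ and $|\Sigma|$ is the squared product of the diagonal of $L$. When $\Sigma$ is singular the factorisation fails, which mirrors the fact that no density exists.

```python
import math

def cholesky(sigma):
    """Return lower-triangular L with L L^T = sigma.

    Raises ValueError if sigma is not (numerically) positive definite.
    """
    k = len(sigma)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                d = sigma[i][i] - s
                if d <= 0.0:
                    raise ValueError("sigma is not positive definite; no density exists")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L

def mvn_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at x, for positive definite sigma."""
    k = len(mu)
    L = cholesky(sigma)
    diff = [xi - mi for xi, mi in zip(x, mu)]
    # Forward substitution: solve L z = (x - mu), so that
    # z^T z = (x - mu)^T sigma^{-1} (x - mu).
    z = []
    for i in range(k):
        s = sum(L[i][m] * z[m] for m in range(i))
        z.append((diff[i] - s) / L[i][i])
    quad = sum(zi * zi for zi in z)
    # log|sigma| = 2 * sum(log L_ii)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(k))
    return math.exp(-0.5 * (k * math.log(2.0 * math.pi) + log_det + quad))

# Standard bivariate normal at the origin: the value is 1 / (2 * pi).
print(mvn_pdf([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

Calling `mvn_pdf` with a singular matrix such as `[[1.0, 1.0], [1.0, 1.0]]` raises `ValueError` in this sketch rather than returning a number.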

Matrix-variate Normal

We will first need to develop some notation. Let $X_{m \times n}$ be a matrix with columns $c_{(1)}, c_{(2)}, \ldots, c_{(n)}$. Then we define the column vector $\operatorname{vec}(X) := \begin{bmatrix} c_{(1)} \\ c_{(2)} \\ \vdots \\ c_{(n)} \end{bmatrix}$, which has length $mn$, and we call it the vectorisation of $X$.

Template:Definition

The reader should notice that this simply imposes a normal distribution on the vectorisation of $X$. Thus, many of the results that are true for a multivariate normal random vector will also be true for the vectorisation of a matrix-variate normal random variable.
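The vectorisation described above stacks the columns of the matrix on top of one another. A minimal sketch in plain Python, assuming the matrix is stored as a list of rows (the helper name `vec` simply mirrors the notation):

```python
def vec(X):
    """Stack the columns of a matrix (given as a list of rows) into one flat list."""
    m = len(X)                      # number of rows
    n = len(X[0]) if m else 0       # number of columns
    # Outer loop over columns, inner loop over rows: column-major order.
    return [X[i][j] for j in range(n) for i in range(m)]

# A 3 x 2 matrix with columns (1, 3, 5) and (2, 4, 6).
X = [[1, 2],
     [3, 4],
     [5, 6]]
print(vec(X))  # [1, 3, 5, 2, 4, 6]
```

With NumPy arrays the same ordering corresponds to column-major flattening, for example `X.flatten(order="F")`.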

Now that we have definitions of the multivariate and matrix-variate normal distributions, our next aim is to find analogues of the univariate $\chi^2_{(p)}$ distribution with $p$ degrees of freedom and of Student's $t$ distribution, both of which are closely related to the univariate normal distribution. We know that if $X_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$ independently for $i \in \{1, 2, \ldots, n\}$, then $\sum_{i=1}^{n} \frac{(X_i - \mu_i)^2}{\sigma_i^2} \sim \chi^2_{(n)}$. What would be an analogue of this for the multivariate case?
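The univariate fact quoted above can be checked empirically. The sketch below, which assumes the $X_i$ are independent and uses arbitrary illustrative values for the means and standard deviations, draws the sum of squared standardised normals many times and compares the sample mean and variance with the theoretical values $n$ and $2n$ of a $\chi^2_{(n)}$ distribution. It is a sanity check by simulation, not a proof.

```python
import random

def chi_square_draw(rng, mus, sigmas):
    """One draw of sum_i ((X_i - mu_i) / sigma_i)^2 with independent X_i ~ N(mu_i, sigma_i^2)."""
    return sum(((rng.gauss(m, s) - m) / s) ** 2 for m, s in zip(mus, sigmas))

rng = random.Random(42)                  # fixed seed for reproducibility
mus = [0.0, 1.0, -2.0, 5.0, 10.0]        # illustrative means (assumed values)
sigmas = [1.0, 0.5, 2.0, 3.0, 0.1]       # illustrative standard deviations (assumed values)
n = len(mus)

draws = [chi_square_draw(rng, mus, sigmas) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)

# For a chi-square distribution with n degrees of freedom: mean n, variance 2n.
print("sample mean:", sample_mean, "theory:", n)
print("sample variance:", sample_var, "theory:", 2 * n)
```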

Wishart Distribution

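Template:Definition

Although a density does exist for the Wishart distribution, it is not needed to prove most of the results we will require. An important thing to note, however, is that if $S$ follows a Wishart distribution with $n$ degrees of freedom and scale matrix $\Sigma$, then for any fixed vector $\mathbf{a}$ with $\mathbf{a}^T \Sigma \mathbf{a} > 0$ we have $\frac{\mathbf{a}^T S \mathbf{a}}{\mathbf{a}^T \Sigma \mathbf{a}} \sim \chi^2_{(n)}$. This result can be easily proved by multiplying $S$ on the left and right by $\mathbf{a}^T$ and $\mathbf{a}$, and then using the fact that $\mathbf{a}^T \mathbf{X} \sim \mathcal{N}(\mathbf{a}^T \mu, \mathbf{a}^T \Sigma \mathbf{a})$.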
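The quadratic-form result above can also be illustrated by simulation. The sketch below makes an assumption that the text does not spell out: it takes $S$ to be the central Wishart matrix built as $S = \sum_{i=1}^{n} \mathbf{X}_i \mathbf{X}_i^T$ from independent $\mathbf{X}_i \sim \mathcal{N}(\mathbf{0}, \Sigma)$. Under that assumption $\mathbf{a}^T S \mathbf{a} = \sum_i (\mathbf{a}^T \mathbf{X}_i)^2$, a sum of $n$ squared independent $\mathcal{N}(0, \mathbf{a}^T \Sigma \mathbf{a})$ variables, so the ratio should have mean close to $n$ and variance close to $2n$. The helper names, the $2 \times 2$ matrix, and the vector $\mathbf{a}$ are illustrative only.

```python
import math
import random

def sample_mvn2(rng, sigma):
    """One draw from N(0, sigma) in two dimensions, via a hand-written 2x2 Cholesky factor."""
    l11 = math.sqrt(sigma[0][0])
    l21 = sigma[1][0] / l11
    l22 = math.sqrt(sigma[1][1] - l21 * l21)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return [l11 * z1, l21 * z1 + l22 * z2]

def sample_wishart2(rng, n, sigma):
    """S = sum of n outer products x x^T with x ~ N(0, sigma) (central Wishart, assumed form)."""
    S = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n):
        x = sample_mvn2(rng, sigma)
        for i in range(2):
            for j in range(2):
                S[i][j] += x[i] * x[j]
    return S

def quad_form(a, M):
    """Compute a^T M a for a 2-vector a and a 2x2 matrix M."""
    return sum(a[i] * M[i][j] * a[j] for i in range(2) for j in range(2))

rng = random.Random(0)                   # fixed seed for reproducibility
sigma = [[2.0, 1.0], [1.0, 2.0]]         # illustrative positive definite scale matrix
a = [1.0, -3.0]                          # illustrative fixed vector
n = 4                                    # degrees of freedom

ratios = [quad_form(a, sample_wishart2(rng, n, sigma)) / quad_form(a, sigma)
          for _ in range(20000)]
ratio_mean = sum(ratios) / len(ratios)
ratio_var = sum((r - ratio_mean) ** 2 for r in ratios) / (len(ratios) - 1)

# Chi-square with n degrees of freedom has mean n and variance 2n.
print("mean of ratio:", ratio_mean, "theory:", n)
print("variance of ratio:", ratio_var, "theory:", 2 * n)
```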

Methodology

  1. Principal Component Analysis
  2. Canonical Correlation Analysis
