Linear Algebra/Singular Value Decomposition


Singular Value Decomposition

Given any m×n matrix A, the singular value decomposition (SVD) is $A = U\Sigma V^H$, where U is an m×m unitary matrix, V is an n×n unitary matrix, and Σ is an m×n matrix whose off-diagonal entries are all 0 and whose diagonal entries are non-negative real numbers. The diagonal entries of Σ are referred to as the "singular values" of A.
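
These defining properties can be checked numerically. The following is a minimal sketch using NumPy's np.linalg.svd (an assumed tool, not part of the definition) on an arbitrary random complex test matrix; it confirms that U and V are unitary, that the singular values are non-negative reals, and that the factors multiply back to A.

import numpy as np

# Check the defining properties of the SVD on an arbitrary complex test matrix.
rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

U, s, Vh = np.linalg.svd(A)        # full SVD: U is m x m, Vh is V^H and is n x n
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)         # place the singular values on the diagonal of an m x n matrix

print(np.allclose(U.conj().T @ U, np.eye(m)))    # True: U is unitary
print(np.allclose(Vh @ Vh.conj().T, np.eye(n)))  # True: V is unitary
print(np.all(s >= 0))                            # True: singular values are non-negative reals
print(np.allclose(U @ Sigma @ Vh, A))            # True: A = U Sigma V^H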

As an example, consider the shear transformation $A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$. The singular value decomposition of A is:

$$\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0.9239 & -0.3827 \\ 0.3827 & 0.9239 \end{pmatrix} \begin{pmatrix} 2.4142 & 0 \\ 0 & 0.4142 \end{pmatrix} \begin{pmatrix} 0.3827 & 0.9239 \\ -0.9239 & 0.3827 \end{pmatrix}$$
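
This example can be verified with a short NumPy sketch (np.linalg.svd is assumed; the signs of the columns it returns may differ from those shown above, since matching columns of U and V can be negated in pairs without changing the product):

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
U, s, Vh = np.linalg.svd(A)

print(s)                                    # approximately [2.4142, 0.4142], i.e. 1 + sqrt(2) and sqrt(2) - 1
print(np.allclose(U @ np.diag(s) @ Vh, A))  # True: the factors reproduce the shear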

The set of all vectors v with |v| = 1 forms a sphere of radius 1 around the origin. When A is applied to this sphere, the image is an ellipsoid. The lengths of the principal radii of this ellipsoid are the singular values, and their directions are the columns of U.
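
The sketch below (NumPy, reusing the shear example; the 2000-point sampling of the circle is an arbitrary discretization) traces the unit circle through A and checks that the longest and shortest radii of the resulting ellipse match the singular values, with the longest radius pointing along the first column of U.

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
U, s, Vh = np.linalg.svd(A)

theta = np.linspace(0.0, 2.0 * np.pi, 2000)
circle = np.vstack([np.cos(theta), np.sin(theta)])   # unit vectors v with |v| = 1
image = A @ circle                                   # the ellipse: all points A v
radii = np.linalg.norm(image, axis=0)

print(radii.max(), s[0])    # both approximately 2.4142
print(radii.min(), s[1])    # both approximately 0.4142

longest = image[:, np.argmax(radii)] / radii.max()
print(abs(longest @ U[:, 0]))   # approximately 1: the longest radius is parallel to the first column of U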


Existence of the singular value decomposition

One fact that is not immediately obvious is that the singular value decomposition always exists:

Theorem (existence of the singular value decomposition): Every m×n matrix A, real or complex, can be factored as $A = U\Sigma V^H$, where U is an m×m unitary matrix, Σ is an m×n matrix whose only non-zero entries are non-negative real numbers on its diagonal, and V is an n×n unitary matrix.

In essence, any linear transformation is a rotation, followed by stretching or shrinking parallel to each axis (with some dimensions added or zeroed out of existence), followed by another rotation.

The following proof will demonstrate that the singular value decomposition always exists. An outline of the proof will be given first:

Proof outline

We need to prove that an arbitrary linear transformation A is a rotation, $V^H$, followed by a scaling parallel to each axis, $\Sigma$, followed by another rotation, $U$, giving $A = U\Sigma V^H$.

If the columns of A are already mutually orthogonal, then the first rotation is not necessary: V = I. The diagonal entries of Σ are the lengths of the columns of A, and U is a rotation that rotates the standard basis vectors of $\mathbb{R}^m$ to be parallel with the columns of A.
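
As a concrete sketch of this special case (NumPy; the particular 3×2 matrix is an arbitrary choice with orthogonal columns, and the cross product is just one convenient way to complete U to a full orthogonal matrix):

import numpy as np

# A 3 x 2 matrix whose two columns are mutually orthogonal.
A = np.array([[ 2.0, -1.0],
              [ 2.0,  2.0],
              [ 1.0, -2.0]])
print(A[:, 0] @ A[:, 1])                 # 0: the columns are orthogonal

lengths = np.linalg.norm(A, axis=0)      # diagonal of Sigma: the column lengths (both 3 here)
U_cols = A / lengths                     # first two columns of U: the columns of A, normalized

Sigma = np.zeros((3, 2))
Sigma[:2, :2] = np.diag(lengths)
third = np.cross(U_cols[:, 0], U_cols[:, 1])   # completes U to a 3 x 3 orthogonal matrix
U = np.column_stack([U_cols, third])

print(np.allclose(U.T @ U, np.eye(3)))   # True: U is orthogonal
print(np.allclose(U @ Sigma, A))         # True: A = U Sigma V^H with V = I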

In most cases, however, the columns of A are not mutually orthogonal, and the rotation $V^H$ is non-trivial. Since $A = (AV)V^H$, V must be chosen so that the columns of AV are mutually orthogonal. Let $V = (\mathbf{v}_1\ \mathbf{v}_2\ \cdots\ \mathbf{v}_n)$. We need to choose orthonormal vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ so that $A\mathbf{v}_1, A\mathbf{v}_2, \ldots, A\mathbf{v}_n$ are mutually orthogonal. This can be done iteratively. Suppose we have chosen $\mathbf{v}_1$ so that $A\mathbf{v}_1$ is orthogonal to $A\mathbf{v}$ for every vector $\mathbf{v}$ that is orthogonal to $\mathbf{v}_1$. The effort then switches to finding an orthonormal set of vectors $\mathbf{v}_2, \mathbf{v}_3, \ldots, \mathbf{v}_n$, confined to the space of vectors perpendicular to $\mathbf{v}_1$, such that $A\mathbf{v}_2, A\mathbf{v}_3, \ldots, A\mathbf{v}_n$ are mutually orthogonal.
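
This orthogonality goal can be checked against a library SVD (a NumPy sketch on an arbitrary random matrix): taking the $\mathbf{v}_i$ to be the columns of V returned by np.linalg.svd does make the vectors $A\mathbf{v}_i$ mutually orthogonal.

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 4))
_, _, Vh = np.linalg.svd(A)
V = Vh.T

AV = A @ V                                  # columns are A v_1, ..., A v_n
G = AV.T @ AV                               # Gram matrix of those columns
print(np.allclose(G, np.diag(np.diag(G))))  # True: all off-diagonal inner products are 0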

Let $V_1$ be a unitary matrix with $\mathbf{v}_1$ as its first column. Factoring $V_1$ out of the left side of V, so that $V = V_1 V_{\text{new}}$, results in a new set of orthonormal vectors: the columns of $V_{\text{new}}$. The goal of having the columns of AV be mutually orthogonal becomes having the columns of $(AV_1)V_{\text{new}}$ be mutually orthogonal, with $AV_1$ effectively replacing A. $\mathbf{v}_1$ transforms to $\mathbf{e}_1$, and the space of vectors orthogonal to $\mathbf{v}_1$ transforms to the span of the standard basis vectors $\mathbf{e}_2, \mathbf{e}_3, \ldots, \mathbf{e}_n$. The first column of $AV_1$ is $A\mathbf{v}_1$; the remaining columns are A applied to vectors orthogonal to $\mathbf{v}_1$, so by the assumed property of $\mathbf{v}_1$ the first column is orthogonal to all of the other columns.

If $U_1$ is a unitary matrix whose first column is $A\mathbf{v}_1$ normalized to unit length, then factoring $U_1$ out of the left side of $AV_1$, giving $A_1 = U_1^H A V_1$, results in a matrix whose first column is parallel to the standard basis vector $\mathbf{e}_1$. The first column of $AV_1$ is orthogonal to all of the other columns, so the first column of $A_1$ is orthogonal to all of the other columns, and hence the first row of $A_1$ is all 0s except possibly for the first entry.
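
This single step can be illustrated numerically. In the sketch below (NumPy; the helper complete_to_orthogonal and its QR-based construction are conveniences invented for this illustration, and $\mathbf{v}_1$ is computed as a top eigenvector of $A^T A$, one way of realizing the choice discussed at the end of this outline), the matrix $A_1 = U_1^H A V_1$ comes out with zeros in its first row and first column apart from the (1,1) entry.

import numpy as np

def complete_to_orthogonal(x):
    """Orthogonal matrix whose first column is the unit vector x (helper for this sketch)."""
    n = x.shape[0]
    M = np.eye(n)
    M[:, 0] = x
    Q, _ = np.linalg.qr(M)
    if Q[:, 0] @ x < 0:          # QR may negate the first column; undo that
        Q[:, 0] = -Q[:, 0]
    return Q

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))

_, vecs = np.linalg.eigh(A.T @ A)        # v1: unit vector maximizing |A v|
v1 = vecs[:, -1]
u1 = A @ v1 / np.linalg.norm(A @ v1)     # A v1, normalized to unit length

V1 = complete_to_orthogonal(v1)
U1 = complete_to_orthogonal(u1)
A1 = U1.T @ A @ V1

print(np.allclose(A1[0, 1:], 0))         # True: first row is 0 past the first entry
print(np.allclose(A1[1:, 0], 0))         # True: first column is 0 past the first entry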

$\mathbf{v}_2, \mathbf{v}_3, \ldots, \mathbf{v}_n$ can now be determined recursively, with the dimension reduced to $n-1$ and A replaced by $A_1$ with its first row and column removed. This forms the inductive component of the coming proof.
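
Carrying that recursion all the way down gives a purely illustrative construction of the whole factorization. The sketch below assumes real matrices, reuses the same kind of QR-based completion helper, and again takes $\mathbf{v}_1$ as a top eigenvector of $A^T A$; np.linalg.svd remains the practical tool, and this code exists only to mirror the shape of the induction.

import numpy as np

def complete_to_orthogonal(x):
    """Orthogonal matrix whose first column is the unit vector x (helper for this sketch)."""
    n = x.shape[0]
    M = np.eye(n)
    M[:, 0] = x
    Q, _ = np.linalg.qr(M)
    if Q[:, 0] @ x < 0:
        Q[:, 0] = -Q[:, 0]
    return Q

def svd_by_deflation(A):
    """Build U, Sigma, Vh with A = U @ Sigma @ Vh by repeating the deflation step."""
    m, n = A.shape
    if m == 0 or n == 0 or np.allclose(A, 0):
        return np.eye(m), np.zeros((m, n)), np.eye(n)     # nothing left to do
    _, vecs = np.linalg.eigh(A.T @ A)
    v1 = vecs[:, -1]                                      # unit vector maximizing |A v|
    sigma1 = np.linalg.norm(A @ v1)
    u1 = A @ v1 / sigma1
    V1 = complete_to_orthogonal(v1)
    U1 = complete_to_orthogonal(u1)
    A1 = U1.T @ A @ V1                                    # zero first row/column except (1, 1)
    U_sub, S_sub, Vh_sub = svd_by_deflation(A1[1:, 1:])   # recurse on the smaller block
    U_big, Vh_big = np.eye(m), np.eye(n)                  # embed the smaller factors
    U_big[1:, 1:] = U_sub
    Vh_big[1:, 1:] = Vh_sub
    Sigma = np.zeros((m, n))
    Sigma[0, 0] = sigma1
    Sigma[1:, 1:] = S_sub
    return U1 @ U_big, Sigma, Vh_big @ V1.T

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))
U, Sigma, Vh = svd_by_deflation(A)
print(np.allclose(U @ Sigma @ Vh, A))                                        # True
print(np.allclose(U.T @ U, np.eye(5)), np.allclose(Vh @ Vh.T, np.eye(3)))    # True True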

Lastly, how do we know that there exists a $\mathbf{v}_1$ such that $A\mathbf{v}_1$ is orthogonal to $A\mathbf{v}$ for every vector $\mathbf{v}$ that is orthogonal to $\mathbf{v}_1$? The answer will be that any unit-length $\mathbf{v}$ that maximizes $|A\mathbf{v}|$ is a valid $\mathbf{v}_1$.
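
A quick numerical sanity check of this claim (a NumPy sketch on an arbitrary random matrix; the maximizer is computed here as a top eigenvector of $A^T A$, which is one standard way to realize the maximization, and the random sampling is only a spot check):

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 4))

_, vecs = np.linalg.eigh(A.T @ A)
v1 = vecs[:, -1]                                 # unit vector maximizing |A v|

# No sampled unit vector is stretched more than v1 is.
samples = rng.standard_normal((4, 10000))
samples /= np.linalg.norm(samples, axis=0)
print(np.linalg.norm(A @ samples, axis=0).max() <= np.linalg.norm(A @ v1) + 1e-12)   # True

# For v orthogonal to v1, A v comes out orthogonal to A v1.
w = rng.standard_normal(4)
v = w - (w @ v1) * v1                            # remove the component of w along v1
print(np.isclose((A @ v) @ (A @ v1), 0.0))       # True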

We are now ready to give the proof in full detail:

Template:TextBox
