Linear Algebra/Topic: The Method of Powers


In practice, calculating eigenvalues and eigenvectors is a difficult problem. Finding, and solving, the characteristic polynomial of the large matrices often encountered in applications is too slow and too hard. Other techniques, indirect ones that avoid the characteristic polynomial, are used. Here we shall see such a method that is suitable for large matrices that are "sparse" (the great majority of the entries are zero).

Suppose that the $n \times n$ matrix $T$ has the $n$ distinct eigenvalues $λ_{1}$ , $λ_{2}$ , ..., $λ_{n}$ . Then $ℝ^{n}$ has a basis that is composed of the associated eigenvectors $⟨ {\vec{ζ}}_{1}, \dots, {\vec{ζ}}_{n} ⟩$ . For any $\vec{v} \in ℝ^{n}$ , where $\vec{v} = c_{1} {\vec{ζ}}_{1} + \dots + c_{n} {\vec{ζ}}_{n}$ , iterating $T$ on $\vec{v}$ gives these.

$$\begin{aligned}
T \vec{v} &= c_{1} λ_{1} {\vec{ζ}}_{1} + c_{2} λ_{2} {\vec{ζ}}_{2} + \dots + c_{n} λ_{n} {\vec{ζ}}_{n} \\
T^{2} \vec{v} &= c_{1} λ_{1}^{2} {\vec{ζ}}_{1} + c_{2} λ_{2}^{2} {\vec{ζ}}_{2} + \dots + c_{n} λ_{n}^{2} {\vec{ζ}}_{n} \\
T^{3} \vec{v} &= c_{1} λ_{1}^{3} {\vec{ζ}}_{1} + c_{2} λ_{2}^{3} {\vec{ζ}}_{2} + \dots + c_{n} λ_{n}^{3} {\vec{ζ}}_{n} \\
&\;\;⋮ \\
T^{k} \vec{v} &= c_{1} λ_{1}^{k} {\vec{ζ}}_{1} + c_{2} λ_{2}^{k} {\vec{ζ}}_{2} + \dots + c_{n} λ_{n}^{k} {\vec{ζ}}_{n}
\end{aligned}$$

If one of the eigenvalues, say, $λ_{1}$ , has a larger absolute value than any of the other eigenvalues then its term will dominate the above expression. Put another way, dividing through by $λ_{1}^{k}$ gives this,

$$\frac{T^{k} \vec{v}}{λ_{1}^{k}} = c_{1} {\vec{ζ}}_{1} + c_{2} \frac{λ_{2}^{k}}{λ_{1}^{k}} {\vec{ζ}}_{2} + \dots + c_{n} \frac{λ_{n}^{k}}{λ_{1}^{k}} {\vec{ζ}}_{n}$$

and, because $λ_{1}$ is assumed to have the largest absolute value, as $k$ gets larger the fractions go to zero. Thus, the entire expression goes to $c_{1} {\vec{ζ}}_{1}$ .

That is (as long as $c_{1}$ is not zero), as $k$ increases, the vectors $T^{k} \vec{v}$ will tend toward the direction of the eigenvectors associated with the dominant eigenvalue, and, consequently, the ratios of the lengths $| T^{k} \vec{v} | / | T^{k - 1} \vec{v} |$ will tend toward the absolute value of that dominant eigenvalue.
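The limit can be checked numerically. The following is a minimal sketch, not part of the original text: the matrix `T` is a hypothetical example chosen because its eigenvalues ($3$ and $1$) and eigenvectors ($(1, 1)$ and $(1, -1)$) are easy to verify by hand, and the helper name `mat_vec` is likewise an assumption. Exact fractions are used so that no rounding enters.

```python
from fractions import Fraction


def mat_vec(T, v):
    """Return the product of the square matrix T (a list of rows) with the vector v."""
    return [sum(t_ij * v_j for t_ij, v_j in zip(row, v)) for row in T]


# Hypothetical example matrix (an assumption, not taken from the text):
# eigenvalues 3 and 1, with eigenvectors (1, 1) and (1, -1).
T = [[2, 1], [1, 2]]
lam1 = 3  # the dominant eigenvalue

# v = (1/2)(1, 1) + (1/2)(1, -1), so c_1 = 1/2 and c_1 * zeta_1 = (1/2, 1/2).
v = [Fraction(1), Fraction(0)]

scaled = []  # scaled[k] holds T^k v divided by lam1^k
current = v
for k in range(21):
    scaled.append([x / Fraction(lam1) ** k for x in current])
    current = mat_vec(T, current)

for k in (0, 1, 5, 20):
    print(k, [float(x) for x in scaled[k]])
```

The printed vectors should approach $c_{1} {\vec{ζ}}_{1} = (1/2, 1/2)$, with the leftover term shrinking by a factor of $λ_{2} / λ_{1} = 1/3$ at each step.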

For example (sample computer code for this calculation appears at the end of this section), because the matrix

$$T = \begin{pmatrix} 3 & 0 \\ 8 & -1 \end{pmatrix}$$

is triangular, its eigenvalues are just the entries on the diagonal, $3$ and $- 1$ . Arbitrarily taking $\vec{v}$ to have the components $1$ and $1$ gives

$$\begin{matrix} \vec{v} & T \vec{v} & T^{2} \vec{v} & \dots & T^{9} \vec{v} & T^{10} \vec{v} \\ \begin{pmatrix} 1 \\ 1 \end{pmatrix} & \begin{pmatrix} 3 \\ 7 \end{pmatrix} & \begin{pmatrix} 9 \\ 17 \end{pmatrix} & \dots & \begin{pmatrix} 19\,683 \\ 39\,367 \end{pmatrix} & \begin{pmatrix} 59\,049 \\ 118\,097 \end{pmatrix} \end{matrix}$$

and the ratio between the lengths of the last two is $2.9999$ .
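This table can be reproduced with a few lines of Python. The sketch below is an illustration only (the helper name `mat_vec` is an assumption). It also checks each column against the closed form that follows from writing $\vec{v} = (1, 2) - (0, 1)$, where $(1, 2)$ is an eigenvector for $3$ and $(0, 1)$ is an eigenvector for $-1$, so that $T^{k} \vec{v} = 3^{k} (1, 2) - (-1)^{k} (0, 1)$.

```python
import math


def mat_vec(T, v):
    """Return the product of the square matrix T (a list of rows) with the vector v."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]


T = [[3, 0], [8, -1]]
v = [1, 1]

# iterates[k] is T^k v, built by repeated multiplication (integer arithmetic is exact).
iterates = [v]
for _ in range(10):
    iterates.append(mat_vec(T, iterates[-1]))

# Compare with the closed form 3^k (1, 2) - (-1)^k (0, 1).
for k, (x, y) in enumerate(iterates):
    assert (x, y) == (3 ** k, 2 * 3 ** k - (-1) ** k)

# Ratio of the lengths of the last two vectors.
ratio = math.hypot(*iterates[10]) / math.hypot(*iterates[9])
print(iterates[9], iterates[10], ratio)
```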

Two implementation issues must be addressed. The first issue is that, instead of finding the powers of $T$ and applying them to $\vec{v}$ , we will compute ${\vec{v}}_{1}$ as $T \vec{v}$ and then compute ${\vec{v}}_{2}$ as $T {\vec{v}}_{1}$ , etc. (i.e., we never separately calculate $T^{2}$ , $T^{3}$ , etc.). These matrix-vector products can be done quickly even if $T$ is large, provided that it is sparse. The second issue is that, to avoid generating numbers that are so large that they overflow our computer's capability, we can normalize the ${\vec{v}}_{i}$ 's at each step. For instance, we can divide each ${\vec{v}}_{i}$ by its length (other possibilities are to divide it by its largest component, or simply by its first component). We thus implement this method by generating

$$\begin{aligned}
{\vec{w}}_{0} &= {\vec{v}}_{0} / | {\vec{v}}_{0} | \\
{\vec{v}}_{1} &= T {\vec{w}}_{0} \\
{\vec{w}}_{1} &= {\vec{v}}_{1} / | {\vec{v}}_{1} | \\
{\vec{v}}_{2} &= T {\vec{w}}_{1} \\
&\;\;⋮ \\
{\vec{w}}_{k - 1} &= {\vec{v}}_{k - 1} / | {\vec{v}}_{k - 1} | \\
{\vec{v}}_{k} &= T {\vec{w}}_{k - 1}
\end{aligned}$$

until we are satisfied. Then the vector ${\vec{v}}_{k}$ is an approximation of an eigenvector, and the approximation of the dominant eigenvalue (in absolute value) is the ratio $| {\vec{v}}_{k} | / | {\vec{w}}_{k - 1} | = | {\vec{v}}_{k} |$ , since ${\vec{w}}_{k - 1}$ has length one.
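The normalized iteration might be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, and the sparse matrix is stored as a dictionary holding only its nonzero entries, which is one simple choice among many. Each product then costs time proportional to the number of nonzero entries rather than to $n^{2}$.

```python
import math


def sparse_mat_vec(entries, v):
    """Multiply a square sparse matrix by the vector v.

    `entries` maps (row, column) pairs to nonzero values; absent pairs are zero.
    """
    result = [0.0] * len(v)
    for (i, j), value in entries.items():
        result[i] += value * v[j]
    return result


def length(v):
    """Euclidean length of the vector v."""
    return math.sqrt(sum(x * x for x in v))


def power_method(entries, v0, steps):
    """Run `steps` normalized iterations and return (|v_k|, v_k)."""
    v = list(v0)
    for _ in range(steps):
        norm = length(v)
        w = [x / norm for x in v]       # w_{i} = v_{i} / |v_{i}|
        v = sparse_mat_vec(entries, w)  # v_{i+1} = T w_{i}
    return length(v), v


# The matrix from the example above, storing only its three nonzero entries.
T = {(0, 0): 3.0, (1, 0): 8.0, (1, 1): -1.0}
estimate, vector = power_method(T, [1.0, 1.0], 10)
print(estimate, vector)
```

Because every ${\vec{w}}_{i}$ has length one, the entries stay of moderate size no matter how many steps are taken.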

One way we could be "satisfied" is to iterate until our approximation of the eigenvalue settles down. We could decide, for instance, to stop the iteration process not after some fixed number of steps, but instead when $| {\vec{v}}_{k} |$ differs from $| {\vec{v}}_{k - 1} |$ by less than one percent, or when they agree up to the second significant digit.
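A stopping rule of this kind could look like the sketch below. The function name, the default tolerance `rel_tol=0.01` (the "one percent" above), and the safety cap `max_steps` are all assumptions made for illustration.

```python
import math


def mat_vec(T, v):
    """Return the product of the square matrix T (a list of rows) with the vector v."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]


def length(v):
    """Euclidean length of the vector v."""
    return math.sqrt(sum(x * x for x in v))


def power_method_until_settled(T, v0, rel_tol=0.01, max_steps=1000):
    """Iterate until successive estimates |v_k| differ by less than rel_tol (relative).

    Returns (estimate, approximate eigenvector, number of steps taken).
    """
    v = list(v0)
    previous = None
    for step in range(1, max_steps + 1):
        norm = length(v)
        w = [x / norm for x in v]
        v = mat_vec(T, w)
        estimate = length(v)
        if previous is not None and abs(estimate - previous) < rel_tol * abs(estimate):
            return estimate, v, step
        previous = estimate
    raise RuntimeError("eigenvalue estimate did not settle within max_steps")


T = [[3, 0], [8, -1]]
estimate, vector, steps = power_method_until_settled(T, [1.0, 1.0])
print(estimate, vector, steps)
```

Note that agreement between successive estimates shows only that the sequence has slowed down; it is not a guarantee on the error of the estimate itself.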

The rate of convergence is determined by the rate at which the powers of $| λ_{2} / λ_{1} |$ go to zero, where $λ_{2}$ is the eigenvalue of second largest absolute value. If that ratio is much less than one then convergence is fast, but if it is only slightly less than one then convergence can be quite slow. Consequently, the method of powers is not the most commonly used way of finding eigenvalues (although it is the simplest one, which is why it is here as the illustration of the possibility of computing eigenvalues without solving the characteristic polynomial). Instead, there are a variety of methods that generally work by first replacing the given matrix $T$ with another that is similar to it and so has the same eigenvalues, but is in some reduced form such as tridiagonal form: the only nonzero entries are on the diagonal, or just above or below it. Then special techniques can be used to find the eigenvalues. Once the eigenvalues are known, the eigenvectors of $T$ can be easily computed. These other methods are outside of our scope. A good reference is Template:Harv.
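The effect of the ratio $| λ_{2} / λ_{1} |$ can be seen by counting iterations. The sketch below is an illustration only: the second matrix `slow_T` is a hypothetical example with eigenvalues $10$ and $9$ (ratio $0.9$), the function name is an assumption, and the known dominant eigenvalue is passed in purely so that the error can be measured.

```python
import math


def mat_vec(T, v):
    """Return the product of the square matrix T (a list of rows) with the vector v."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]


def length(v):
    """Euclidean length of the vector v."""
    return math.sqrt(sum(x * x for x in v))


def steps_to_accuracy(T, v0, dominant, tol=1e-6, max_steps=10000):
    """Count normalized iterations until | |v_k| - dominant | < tol."""
    v = list(v0)
    for step in range(1, max_steps + 1):
        norm = length(v)
        w = [x / norm for x in v]
        v = mat_vec(T, w)
        if abs(length(v) - dominant) < tol:
            return step
    raise RuntimeError("requested accuracy was not reached within max_steps")


fast_T = [[3, 0], [8, -1]]   # the example from the text: |lambda_2 / lambda_1| = 1/3
slow_T = [[10, 0], [8, 9]]   # hypothetical matrix: |lambda_2 / lambda_1| = 9/10

fast = steps_to_accuracy(fast_T, [1.0, 1.0], 3.0)
slow = steps_to_accuracy(slow_T, [1.0, 1.0], 10.0)
print(fast, slow)
```

The matrix with the ratio close to one should need many times more iterations to reach the same accuracy.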


This is the code for the computer algebra system Octave that was used to do the calculation above. (It has been lightly edited to remove blank lines, etc.)

Computer Code

    >T=[3, 0; 8, -1]
    T=
      3   0
      8  -1
    >v0=[1; 1]
    v0=
      1
      1
    >v1=T*v0
    v1=
      3
      7
    >v2=T*v1
    v2=
      9
      17
    >T9=T**9
    T9=
      19683   0
      39368  -1
    >T10=T**10
    T10=
      59049    0
      118096   1
    >v9=T9*v0
    v9=
      19683
      39367
    >v10=T10*v0
    v10=
      59049
      118097
    >norm(v10)/norm(v9)
    ans=2.9999

Remark: we are ignoring the power of Octave here; there are built-in functions that automatically apply quite sophisticated methods to find eigenvalues and eigenvectors. Instead, we are using the system just as a calculator.

References

Template:Citation.
