Introduction to Numerical Methods/System of Linear Equations

From testwiki
Revision as of 14:50, 27 April 2021 by imported>ZI Jony (clean up, typo(s) fixed: For example → For example, (2), a invertible → an invertible using AWB)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

Objectives

  • define vector and matrix
  • add, subtract, and multiply matrices
  • find the transpose of a square matrix
  • define the inverse of a matrix
  • setup simultaneous linear equations in matrix form

Resources

Vectors and Matrices

A matrix is a rectangular array of things, such as numbers, symbols, or expressions. Matrices are commonly used to express linear transformations and system of linear equations.

A triangular matrix is a special type of square matrices. If all entries of A below the main diagonal are zero, A is called an upper triangular matrix.

aij=0 for all i>j

Similarly if all entries of A above the main diagonal are zero, A is called a lower triangular matrix.

aij=0 for all i<j

If all entries outside the main diagonal are zero, A is called a diagonal matrix.

aij=0 for all ij

This matrix

[142034001]

is upper triangular and this matrix

[100280497]

is lower triangular.

The identity matrix of size n is the n-by-n matrix in which all the elements on the main diagonal are equal to 1 and all other elements are equal to 0.

Matrix Multiplication

Matrix multiplication takes two matrices and produces another matrix. The rule for the operation is illustrated as follows:

[a11a12a31a32]4×2 matrix[b12b13b22b23]2×3 matrix=[x12x13x32x33]4×3 matrix

The values at the intersections marked with circles are:

x12=a11b12+a12b22x13=a11b13+a12b23x32=a31b12+a32b22x33=a31b13+a32b23

Transpose of a Matrix

The transpose of a matrix is defined as follows: The transpose of a matrix A is another matrix AT created by any one of the following equivalent actions:

  • reflect A over its main diagonal (which runs from top-left to bottom-right) to obtain AT
  • write the rows of A as the columns of AT
  • write the columns of A as the rows of AT:

The following figure illustrates the idea:

The transpose AT of a matrix A can be obtained by reflecting the elements along its main diagonal. Repeating the process on the transposed matrix returns the elements to their original position.

Inverse of a Matrix

An n-by-n square matrix A is called invertible if there exists an n-by-n square matrix B such that

𝐀𝐁=𝐁𝐀=𝐈n 

where In denotes the n-by-n identity matrix and the multiplication used is ordinary matrix multiplication. B is called the inverse of A, denoted by A−1.

Represent System of Linear Equations in Matrix Form

2x+yz=8(L1)3xy+2z=11(L2)2x+y+2z=3(L3)

We can represent this system of linear equations in matrix form as shown below:

[2x+yz3xy+2z2x+y+2z]=[8113]

Then we can use matrix multiplication to separate the variables.

[211312212][xyz]=[8113]
[25516481144121][a1a2a3]=[106.8177.2279.2]

The general matrix form for system of linear equations is as follows:

An×nXn×1=Cn×1

Matrix A is the coefficient matrix. X is the solution vector (matrix) and C is the right hand side vector. If we multiply the inverse of A on bother sides we can see the solution is closely related to the inverse of A.

A1AX=A1CIX=A1CX=A1C

Gaussian Elimination

Sources:

Gaussian elimination is an algorithm for solving a system of linear equations, which is similar to finding the inverse of an invertible square matrix. The algorithm consists of a sequence of row reduction operations performed on the associated matrix of coefficients. There are three types of elementary row operations:

  • swapping two rows
  • multiplying a row by a non-zero constant
  • adding a multiple of one row to another row.

For example, the first linear equation (L1, pivot equation) can be used to eliminate x from the next two equations: :

2x+yz=8(L1)3xy+2z=11(L2)2x+y+2z=3(L3)

Then L2 (pivot equation) can be used to eliminate y from L3. This process is called forward elimination. Now L3 can be solved with one unknown z, which can be used to substitute the z in L2 to solve y. This process is called backward substitution.

The algorithm for the Gaussian Elimination method can be implemented as follows:

''' 
x = gauss_elimin(a, b)
Solves [a][x] = [b] by Gauss elimination.
'''
from numpy import dot, array

def gauss_elimin(a, b):
  (rows, cols) = a.shape
  # elimination phase
  for row in range(0, rows-1): # pivot equation/row
    for i in range(row+1, rows):
      if a[i, row] != 0.0:
        factor = a[i, row]/a[row, row]
        a[i, row+1:rows] = a[i, row+1:rows] - factor*a[row, row+1:rows]
        b[i] = b[i] - factor*b[row]
  # back substitution 
  for k in range(rows-1,-1,-1):
    b[k] = (b[k] - dot(a[k, k+1:rows],b[k+1:rows]))/a[k, k]
  return b

a = array([[3, 2.0], [-6, 6]])
b = array([7, 6])
print gauss_elimin(a, b)

Gaussian Elimination with Partial Pivoting

Gaussian elimination method is prone to round-off errors especially when a large number equations allows the error to accumulate. Another problem is the division by zero error. For example, the following matrix will break the method because the second pivot element is zero and the method requires the pivot elements to be non-zero.

[1128001110213]

Gaussian elimination with partial pivoting is similar to the Gaussian elimination except that in the ith forward elimination step the maximum of |ai,i|,|ai+1,i|,...,|an,i| is found and the row that contains the maximum will be swapped with the pivot row. The previous coefficient matrix would look as follows after the swaps.

[1128021300111]

Gauss-Seidel Method

Gaussian elimination is a direct (straightforward) method that transforms the original equations to equivalent ones that are easier to solve. Some systems of equations have no solution because for example the number of equations is less than the number of unknowns or one equation contradicts another equation.

Gauss-Seidel method is an iterative (or indirect) method that starts with a guess at the solution and repeatedly refine the guess till it converges (the convergence criterion is met). This method is advantageous because it is more computationally efficient when the coefficient matrix is large and sparse and the round-off errors can be control by adjusting the error tolerance.

The algorithm consists the following steps:

  1. rewrite each equation for the corresponding unknowns, for example, the first equation is written with x1 on the left-hand side and the second equation with x2 on the left-hand side:
xi=cij=1,jinaijxjaii,i=1,2,3,...,n.
  1. find initial guess for the xi's.
  2. use the rewritten equations to calculate the new estimates - always using the most recent estimates.
  3. repeat the previous step till the largest absolute relative approximate error among all xi's is less than the specified tolerance.

The drawback of most iterative method is that it may not converge when the system of equations has a coefficient matrix that is not diagonally dominant. A system of equations where the coefficient matrix is diagonally dominant does always converge. The matrix A is diagonally dominant if

|aii|ji|aij|for all i,

and

|aii|>ji|aij|for at least one i.

An implementation in Python using numpy simply iterates to produce the solution vector.

LU Decomposition

LU decomposition factors the coefficient matrix A to the product of a lower triangular matrix and an upper triangular matrix: A = LU. The process of deriving L and U from A is called LU decomposition or LU factorization, which is similar to Gaussian elimination method.

Once LU decomposition is done, we can use the system of linear equations represented by A using forward and back substitutions:

AX=CLUX=CL1LUX=L1CUX=L1C

Let L1C=Z, which means LZ=C and Z can be solved by forward elimination. From UX=Z we can solve X using backward elimination.

One method of LU decomposition is similar to Gaussian elimination. For a 3×3:

A=[a11a12a13a21a22a23a31a32a33]=[100l2110l31l321][u11u12u130u22u2300u33].
A=[u11u12u13u11l21u12l21+u22u13l21+u23u11l31u12l31+u22l32u13l31+u23l32+u33]

From the last matrix we can see to eliminate a21 we need to multiple the first row of A by l21 and similarly to eliminate a31 by multiplying the first row by l31. To calculate L we can simply record the multipliers used in the forward elimination steps and the matrix U is the same as the resultant upper triangular matrix from forward elimination.

Sample code in Python

Find the Inverse of a Matrix

Finding the inverse of a matrix is a process related to solving the system of linear equations. If the inverse of a coefficient matrix is found then the system of equations represented by the coefficient matrix is solved.

AX=CA1AX=A1CX=A1C

One way to find the inverse of a matrix is to find each column of the inverse matrix separately by solving a system of linear equations. If AB=I and

A=[a11a12a13a21a22a23a31a32a33] B=[b11b12b13b21b22b23b31b32b33]

then

A[b11b21b31]=LU[b11b21b31]=[100].

We can calculate the first column of B by solving a system of equations using the LU decomposition method. The rest of the columns can be calculated in a similar fashion. As you can see the LU decomposition method is advantageous the LU decomposition is performed once and the result is used many times, which lowers the computational complexity of the whole solution.

Template:BookCat