Calculus/Derivatives of multivariate functions


The matrix of a linear transformation

Theorem

A linear transformation $L:\mathbb{R}^n\to\mathbb{R}^m$ amounts to multiplication by a uniquely defined matrix; that is, there exists a unique matrix $A\in\mathbb{R}^{m\times n}$ such that

$$\forall v\in\mathbb{R}^n:\quad L(v)=Av$$
Proof

We set the column vectors

$$\begin{pmatrix}a_{1,j}\\a_{2,j}\\\vdots\\a_{m,j}\end{pmatrix}:=L(e_j)$$

where $\{e_1,\ldots,e_n\}$ is the standard basis of $\mathbb{R}^n$. Then we define from this

$$A:=\begin{pmatrix}a_{1,1}&\cdots&a_{1,n}\\\vdots&&\vdots\\a_{m,1}&\cdots&a_{m,n}\end{pmatrix}$$

and note that for any vector $v=(v_1,\ldots,v_n)^t$ of $\mathbb{R}^n$ we obtain

$$Av=A\left(\sum_{j=1}^n v_je_j\right)=\sum_{j=1}^n v_jAe_j=\sum_{j=1}^n v_jL(e_j)=L\left(\sum_{j=1}^n v_je_j\right)=L(v)$$

Thus, we have shown existence. To prove uniqueness, suppose there were any other matrix $B\in\mathbb{R}^{m\times n}$ with the property that $\forall v\in\mathbb{R}^n:L(v)=Bv$. Then in particular,

$$Be_j=L(e_j)=Ae_j$$

which already implies that $A=B$ (since all the columns of both matrices are identical).
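The construction in this proof is easy to carry out numerically: apply $L$ to each standard basis vector and use the results as columns. A minimal sketch in Python with NumPy, where the particular map `L` (a rotation of the plane) is an illustrative assumption, not part of the text:

```python
import numpy as np

# An example linear map L : R^2 -> R^2 (rotation by 90 degrees),
# chosen purely for illustration.
def L(v):
    return np.array([-v[1], v[0]])

n = 2
# Build A column by column: the j-th column is L(e_j), as in the proof.
A = np.column_stack([L(e) for e in np.eye(n)])

v = np.array([3.0, 4.0])
assert np.allclose(A @ v, L(v))  # L(v) = Av
print(A)                         # [[ 0. -1.]
                                 #  [ 1.  0.]]
```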

How to generalise the derivative

It is not immediately straightforward how one would generalize the derivative to higher dimensions. For, if we take the definition of the derivative at a point $x_0$,

$$\lim_{h\to 0}\frac{f(x_0+h)-f(x_0)}{h}$$

and insert vectors for $h$ and $x_0$, we would have to divide by a vector, which is not defined.

Hence, we shall rephrase the definition of the derivative a bit and cast it into a form where it can be generalized to higher dimensions.

Theorem

Let $f:\mathbb{R}\to\mathbb{R}$ be a one-dimensional function and let $x_0\in\mathbb{R}$. Then $f$ is differentiable at $x_0$ if and only if there exists a linear function $l:\mathbb{R}\to\mathbb{R}$ such that

$$\lim_{h\to 0}\frac{|f(x_0+h)-(f(x_0)+l(h))|}{|h|}=0$$

We note that according to the above, linear functions $l:\mathbb{R}\to\mathbb{R}$ are given by multiplication by a $1\times 1$ matrix, that is, by a scalar.

Proof

First assume that $f$ is differentiable at $x_0$. We set $l(h):=f'(x_0)h$ and obtain

$$\frac{|f(x_0+h)-(f(x_0)+l(h))|}{|h|}=\left|\frac{f(x_0+h)-f(x_0)}{h}-f'(x_0)\right|$$

which converges to $0$ due to the definition of $f'(x_0)$.

Assume now that we are given an $l:\mathbb{R}\to\mathbb{R}$ such that

$$\lim_{h\to 0}\frac{|f(x_0+h)-(f(x_0)+l(h))|}{|h|}=0$$

Let $c$ be the scalar associated to $l$. Then an analogous computation shows that $f$ is differentiable at $x_0$ with $f'(x_0)=c$.
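This criterion is easy to test numerically. A short sketch, assuming $f=\sin$ and $x_0=1$ as an arbitrary example, shows the error quotient shrinking as $h\to 0$:

```python
import math

# Error quotient |f(x0+h) - (f(x0) + l(h))| / |h| with l(h) = f'(x0)*h.
# For f = sin and x0 = 1 the associated scalar is c = cos(1).
f, x0 = math.sin, 1.0
c = math.cos(x0)

for h in [1e-1, 1e-2, 1e-3, 1e-4]:
    err = abs(f(x0 + h) - (f(x0) + c * h)) / abs(h)
    print(f"h = {h:.0e}   error quotient = {err:.3e}")
# The quotient decreases roughly proportionally to h.
```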

With the latter formulation of differentiability from the above theorem, we may readily generalize to higher dimensions, since division by the Euclidean norm of a vector is defined, and linear mappings are also defined in higher dimensions.

Definition

A function $f:\mathbb{R}^m\to\mathbb{R}^n$ is called differentiable or totally differentiable at a point $x_0\in\mathbb{R}^m$ if and only if there exists a linear function $L:\mathbb{R}^m\to\mathbb{R}^n$ such that

$$\lim_{h\to 0}\frac{\|f(x_0+h)-(f(x_0)+L(h))\|}{\|h\|}=0$$

We have already proven that this definition coincides with the usual one in the one-dimensional case (that is, $m=n=1$).
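The same numerical check works in higher dimensions, with norms in place of absolute values. A small sketch, assuming the example $f(x,y)=x^2+y^2$ together with the candidate differential $L(h)=2x_0\cdot h$ (which one can verify by hand):

```python
import numpy as np

# f : R^2 -> R with candidate differential L(h) = <2*x0, h> at x0.
f = lambda x: x[0]**2 + x[1]**2
x0 = np.array([1.0, 2.0])
L = lambda h: 2 * x0 @ h          # linear in h

d = np.array([3.0, -4.0]) / 5.0   # a fixed unit direction

for t in [1e-1, 1e-2, 1e-3]:
    h = t * d
    err = abs(f(x0 + h) - (f(x0) + L(h))) / np.linalg.norm(h)
    print(f"|h| = {t:.0e}   error quotient = {err:.3e}")
# Here f(x0+h) - f(x0) - L(h) = |h|^2, so the quotient equals |h| -> 0.
```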

We have the following theorem:

Theorem

Let $S\subseteq\mathbb{R}^m$ be a set, let $x_0\in S$ be an interior point of $S$, and let $f:S\to\mathbb{R}^n$ be a function differentiable at $x_0$. Then the linear map $L$ such that

$$\lim_{h\to 0}\frac{\|f(x_0+h)-(f(x_0)+L(h))\|}{\|h\|}=0$$

is unique; that is, there exists only one such map $L$.

Proof

Since $x_0$ is an interior point of $S$, we find $r>0$ such that $B_r(x_0)\subseteq S$. Let now $K:\mathbb{R}^m\to\mathbb{R}^n$ be any other linear mapping with the property that

$$\lim_{h\to 0}\frac{\|f(x_0+h)-(f(x_0)+K(h))\|}{\|h\|}=0$$

We note that for all vectors of the standard basis $\{e_1,\ldots,e_m\}$ of $\mathbb{R}^m$, the points $x_0+\lambda e_j$ for $0\le\lambda<r$ are contained within $S$. Hence, for $0<\lambda<r$ we obtain by the triangle inequality

$$\|L(e_j)-K(e_j)\|=\frac{\|L(\lambda e_j)-K(\lambda e_j)\|}{\|\lambda e_j\|}\le\frac{\|f(x_0+\lambda e_j)-(f(x_0)+L(\lambda e_j))\|}{\|\lambda e_j\|}+\frac{\|f(x_0+\lambda e_j)-(f(x_0)+K(\lambda e_j))\|}{\|\lambda e_j\|}$$

Taking $\lambda\to 0$, we see that $L(e_j)=K(e_j)$. Thus, $L$ and $K$ coincide on all basis vectors, and since every other vector can be expressed as a linear combination of those, by linearity of $L$ and $K$ we obtain $L=K$.

Thus, the following definition is justified:

Definition

Let $f:S\to\mathbb{R}^n$ be a function (where $S\subseteq\mathbb{R}^m$ is a subset of $\mathbb{R}^m$), and let $x_0$ be an interior point of $S$ such that $f$ is differentiable at $x_0$. Then the unique linear function $L$ such that

$$\lim_{h\to 0}\frac{\|f(x_0+h)-(f(x_0)+L(h))\|}{\|h\|}=0$$

is called the differential of $f$ at $x_0$ and is denoted $f'(x_0):=L$.

Directional and partial derivatives

We shall first define directional derivatives.

Definition

Let $f:\mathbb{R}^m\to\mathbb{R}^n$ be a function, let $x_0\in\mathbb{R}^m$ and let $v\in\mathbb{R}^m$ be a vector. If the limit

$$\lim_{h\to 0}\frac{f(x_0+hv)-f(x_0)}{h}$$

exists, it is called the directional derivative of $f$ at $x_0$ in direction $v$. We denote it by $D_vf(x_0)$.
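This limit can be approximated directly by difference quotients. A minimal sketch, with function, point and direction chosen arbitrarily for illustration:

```python
import numpy as np

# Approximate D_v f(x0) via (f(x0 + h*v) - f(x0)) / h for small h.
f = lambda x: np.sin(x[0]) * x[1]     # example f : R^2 -> R
x0 = np.array([0.5, 2.0])
v = np.array([1.0, 1.0])

for h in [1e-2, 1e-4, 1e-6]:
    print((f(x0 + h * v) - f(x0)) / h)
# The quotients converge to 2*cos(0.5) + sin(0.5), approx. 2.2346.
```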

The following theorem relates directional derivatives and the differential of a totally differentiable function:

Theorem

Let $f:\mathbb{R}^m\to\mathbb{R}^n$ be a function that is totally differentiable at $x_0$, and let $v\in\mathbb{R}^m\setminus\{0\}$ be a nonzero vector. Then $D_vf(x_0)$ exists and is equal to $f'(x_0)v$.

Proof

According to the very definition of total differentiability (inserting $hv$ in place of $h$, and using $\|hv\|=|h|\,\|v\|$),

$$\lim_{h\to 0}\frac{\|f(x_0+hv)-f(x_0)-f'(x_0)(hv)\|}{|h|\,\|v\|}=0$$

Hence,

$$\lim_{h\to 0}\left\|\frac{f(x_0+hv)-f(x_0)}{|h|}-\frac{h\,f'(x_0)v}{|h|}\right\|=0$$

by multiplying the above equation by $\|v\|$ and using $f'(x_0)(hv)=h\,f'(x_0)v$. Noting that

$$\left\|\frac{f(x_0+hv)-f(x_0)}{|h|}-\frac{h\,f'(x_0)v}{|h|}\right\|=\left\|\frac{f(x_0+hv)-f(x_0)}{h}-f'(x_0)v\right\|$$

the theorem follows.

Partial derivatives are a special case of directional derivatives:

Definition

Let $\{e_1,\ldots,e_m\}$ be the standard basis of $\mathbb{R}^m$, let $x_0\in\mathbb{R}^m$ and let $f:\mathbb{R}^m\to\mathbb{R}^n$ be a function such that the directional derivatives $D_{e_j}f(x_0)$ all exist. Then we set

$$\frac{\partial f}{\partial x_j}(x_0):=D_{e_j}f(x_0)$$

and call it the partial derivative in the direction of $x_j$.

In fact, by writing down the definition of $D_{e_j}f(x_0)$, we see that the partial derivative in the direction of $x_j$ is nothing else than the derivative of the function

$$y\mapsto f(x_{0,1},\ldots,x_{0,j-1},y,x_{0,j+1},\ldots,x_{0,m})$$

in the variable $y$ at the place $x_{0,j}$. That is, for instance, if

$$f(x,y,z)=x^2+4z^3+3xy$$

then

$$\frac{\partial f}{\partial x}=2x+3y,\quad\frac{\partial f}{\partial y}=3x,\quad\frac{\partial f}{\partial z}=12z^2$$

that is, when forming a partial derivative, we regard the other variables as constant and differentiate only with respect to the variable we are considering.
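Computer algebra systems implement exactly this rule. A short check of the example above, using SymPy (an illustrative tool choice, not part of the text):

```python
from sympy import symbols, diff

x, y, z = symbols('x y z')
f = x**2 + 4*z**3 + 3*x*y

# Each partial derivative treats the other variables as constants.
print(diff(f, x))   # 2*x + 3*y
print(diff(f, y))   # 3*x
print(diff(f, z))   # 12*z**2
```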

The Jacobian matrix

From the above, we know that the differential $f'(x_0)$ of a function $f$ has an associated matrix representing the linear map thus defined. Under a suitable condition, we can determine this matrix from the partial derivatives of the component functions.

Theorem

Let $f:\mathbb{R}^m\to\mathbb{R}^n$ be a function such that all partial derivatives exist at $x_0$ and are continuous in each component on $B_r(x_0)$ for a possibly very small, but positive $r>0$. Then $f$ is totally differentiable at $x_0$, and the differential of $f$ is given by left multiplication by the matrix

$$J_f(x_0):=\begin{pmatrix}\frac{\partial f_1}{\partial x_1}(x_0)&\cdots&\frac{\partial f_1}{\partial x_m}(x_0)\\\vdots&\ddots&\vdots\\\frac{\partial f_n}{\partial x_1}(x_0)&\cdots&\frac{\partial f_n}{\partial x_m}(x_0)\end{pmatrix}$$

where $f=(f_1,\ldots,f_n)$.

The matrix $J_f(x_0)$ is called the Jacobian matrix of $f$ at $x_0$.

Proof
$$\frac{\|f(x_0+h)-(f(x_0)+J_f(x_0)h)\|}{\|h\|}=\frac{\left\|\sum_{j=1}^n f_j(x_0+h)e_j-\sum_{j=1}^n\left(f_j(x_0)+\sum_{k=1}^m h_k\frac{\partial f_j}{\partial x_k}(x_0)\right)e_j\right\|}{\|h\|}\le\sum_{j=1}^n\frac{\left|f_j(x_0+h)-\left(f_j(x_0)+\sum_{k=1}^m h_k\frac{\partial f_j}{\partial x_k}(x_0)\right)\right|}{\|h\|}$$

where $\{e_1,\ldots,e_n\}$ here denotes the standard basis of $\mathbb{R}^n$ and the last step is the triangle inequality.

We shall now prove that all summands of the last sum go to 0.

Indeed, let $j\in\{1,\ldots,n\}$. Writing again $h=(h_1,\ldots,h_m)$, we obtain by the one-dimensional mean value theorem, first applied in the first variable, then in the second and so on, the succession of equations

$$f_j(x_0+h_1e_1)-f_j(x_0)=h_1\frac{\partial f_j}{\partial x_1}(x_0+t_1e_1)$$

$$f_j(x_0+h_1e_1+h_2e_2)-f_j(x_0+h_1e_1)=h_2\frac{\partial f_j}{\partial x_2}(x_0+h_1e_1+t_2e_2)$$

$$\vdots$$

$$f_j(x_0+h_1e_1+\cdots+h_me_m)-f_j(x_0+h_1e_1+\cdots+h_{m-1}e_{m-1})=h_m\frac{\partial f_j}{\partial x_m}(x_0+h_1e_1+\cdots+h_{m-1}e_{m-1}+t_me_m)$$

for suitably chosen $t_k$ between $0$ and $h_k$. Summing all these equations (the left-hand sides telescope), we obtain

$$f_j(x_0+h)-f_j(x_0)=\sum_{k=1}^m h_k\frac{\partial f_j}{\partial x_k}\left(x_0+\sum_{l=1}^{k-1}h_le_l+t_ke_k\right)$$

Let now $\epsilon>0$. Using the continuity of the $\frac{\partial f_j}{\partial x_k}$ on $B_r(x_0)$, we may choose $\delta_k>0$ such that

$$\left|\frac{\partial f_j}{\partial x_k}\left(x_0+\sum_{l=1}^{k-1}h_le_l+t_ke_k\right)-\frac{\partial f_j}{\partial x_k}(x_0)\right|<\frac{\epsilon}{m}$$

for $|h_k|<\delta_k$, given that $h\in B_r(0)$ (which we may assume, since $h\to 0$). Hence, using $|h_k|\le\|h\|$, we obtain

$$\frac{\left|f_j(x_0+h)-\left(f_j(x_0)+\sum_{k=1}^m h_k\frac{\partial f_j}{\partial x_k}(x_0)\right)\right|}{\|h\|}\le\frac{\|h\|\,m\,\frac{\epsilon}{m}}{\|h\|}=\epsilon$$

and thus the theorem.
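In numerical work the Jacobian is often approximated column by column, one partial derivative per basis direction, mirroring the structure of the matrix above. A minimal sketch for an example map $f:\mathbb{R}^2\to\mathbb{R}^2$, with the analytic Jacobian written out by hand for comparison:

```python
import numpy as np

# Example map f : R^2 -> R^2, f(x, y) = (x*y, x + y^2).
f = lambda x: np.array([x[0] * x[1], x[0] + x[1]**2])

def jacobian_fd(f, x0, eps=1e-6):
    """Approximate J_f(x0); the k-th column is the partial derivative
    of f in direction e_k, estimated by a central difference."""
    cols = [(f(x0 + eps * e) - f(x0 - eps * e)) / (2 * eps)
            for e in np.eye(len(x0))]
    return np.column_stack(cols)

x0 = np.array([2.0, 3.0])
J_exact = np.array([[x0[1], x0[0]],       # partials of f_1 = x*y
                    [1.0, 2 * x0[1]]])    # partials of f_2 = x + y^2
print(np.allclose(jacobian_fd(f, x0), J_exact))  # True
```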

Corollary

If $f:\mathbb{R}^m\to\mathbb{R}^n$ is continuously differentiable at $x_0\in\mathbb{R}^m$ and $v\in\mathbb{R}^m\setminus\{0\}$, then

$$D_vf(x_0)=\sum_{j=1}^m v_j\frac{\partial f}{\partial x_j}(x_0)$$

Proof

$$D_vf(x_0)=f'(x_0)(v)=J_f(x_0)v=\sum_{j=1}^m v_j\frac{\partial f}{\partial x_j}(x_0)$$
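The corollary also gives the usual recipe for computing directional derivatives by hand: weight each partial derivative by the corresponding component of $v$. A brief symbolic check with SymPy, reusing the example function and direction from the numerical limit above:

```python
from sympy import symbols, diff, sin

x, y = symbols('x y')
f = sin(x) * y            # example f : R^2 -> R
v = (1, 1)                # example direction

# D_v f = v_1 * df/dx + v_2 * df/dy, per the corollary.
D_v = v[0] * diff(f, x) + v[1] * diff(f, y)
print(D_v)                          # y*cos(x) + sin(x)
print(D_v.subs({x: 0.5, y: 2.0}))   # approx. 2.2346, matching the limit above
```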
