Calculus/Inverse function theorem, implicit function theorem

Revision as of 20:27, 13 April 2020 by imported>Svennik (The inverse function theorem)


In this chapter, we want to prove the inverse function theorem (which asserts that if a function has invertible differential at a point, then it is locally invertible itself) and the implicit function theorem (which asserts that certain sets are the graphs of functions).

Banach's fixed point theorem

Theorem:

Let $(M,d)$ be a non-empty complete metric space, and let $f: M \to M$ be a strict contraction; that is, there exists a constant $0 \le \lambda < 1$ such that

$$\forall m, n \in M: \quad d(f(m), f(n)) \le \lambda \, d(m,n).$$

Then $f$ has a unique fixed point, which means that there is a unique $x \in M$ such that $f(x) = x$. Furthermore, if we start with a completely arbitrary point $y \in M$, then the sequence

$$y, \; f(y), \; f(f(y)), \; f(f(f(y))), \; \ldots$$

converges to $x$.

Proof:

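First, we prove uniqueness of the fixed point. Assume $x, y$ are both fixed points. Then

$$d(x,y) = d(f(x), f(y)) \le \lambda \, d(x,y) \quad \Rightarrow \quad (1 - \lambda) \, d(x,y) \le 0.$$

Since $0 \le \lambda < 1$ and $d(x,y) \ge 0$, this implies $d(x,y) = 0$ and hence $x = y$.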

Now we prove existence and simultaneously the claim about the convergence of the sequence $y, f(y), f(f(y)), f(f(f(y))), \ldots$. For notation, we set $z_0 := y$ and, if $z_n$ is already defined, $z_{n+1} := f(z_n)$. Then the sequence $(z_n)_{n \in \mathbb{N}}$ is nothing else but the sequence $y, f(y), f(f(y)), f(f(f(y))), \ldots$.

Let $n \ge 0$. We claim that

$$d(z_{n+1}, z_n) \le \lambda^n \, d(z_1, z_0).$$

Indeed, this follows by induction on $n$. The case $n = 0$ is trivial, and if the claim is true for $n$, then $d(z_{n+2}, z_{n+1}) = d(f(z_{n+1}), f(z_n)) \le \lambda \, d(z_{n+1}, z_n) \le \lambda \cdot \lambda^n \, d(z_1, z_0)$.

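Hence, by the triangle inequality,

$$d(z_{n+m}, z_n) \le \sum_{j=n+1}^{n+m} d(z_j, z_{j-1}) \le \sum_{j=n+1}^{n+m} \lambda^{j-1} d(z_1, z_0) \le \sum_{j=n+1}^{\infty} \lambda^{j-1} d(z_1, z_0) = d(z_1, z_0) \, \lambda^n \, \frac{1}{1 - \lambda}.$$

The latter expression goes to zero as $n \to \infty$, independently of $m$, and hence we are dealing with a Cauchy sequence. As we are in a complete metric space, it converges to a limit $x$. This limit is furthermore a fixed point, as the continuity of $f$ ($f$ is Lipschitz continuous with constant $\lambda$) implies

$$x = \lim_{n \to \infty} z_n = \lim_{n \to \infty} f(z_{n-1}) = f\left(\lim_{n \to \infty} z_{n-1}\right) = f(x).$$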
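The iteration in the proof can be tried out numerically. The following is a minimal sketch, assuming the illustrative choice $f = \cos$ on $M = [0,1]$ with $\lambda = \sin(1)$ (this example and the helper name `fixed_point_iterates` are not from the text). Letting $m \to \infty$ in the estimate above gives $d(x, z_n) \le d(z_1, z_0) \lambda^n / (1 - \lambda)$, and the loop prints the observed error next to this bound.

```python
import math

def fixed_point_iterates(f, y, steps):
    """Return [z_0, ..., z_steps] with z_0 = y and z_{n+1} = f(z_n)."""
    zs = [y]
    for _ in range(steps):
        zs.append(f(zs[-1]))
    return zs

# Assumed example: cos maps [0, 1] into [cos(1), 1], which lies in [0, 1],
# and |cos'(t)| = |sin(t)| <= sin(1) < 1 there, so lam = sin(1) works.
lam = math.sin(1.0)
zs = fixed_point_iterates(math.cos, 0.0, 60)
x = zs[-1]  # numerical stand-in for the fixed point

d10 = abs(zs[1] - zs[0])  # d(z_1, z_0)
for n in (1, 5, 10, 20):
    bound = d10 * lam ** n / (1.0 - lam)
    print(n, abs(zs[n] - x), bound)
```

In this example the observed error shrinks faster than the bound, which is expected: the bound only uses the worst-case constant $\lambda$ on the whole interval.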

A corollary to this important result is the following lemma, which shall be the main ingredient for the proof of the inverse function theorem:

Lemma:

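Let $g: \overline{B_r(0)} \to \overline{B_r(0)}$ ($\overline{B_r(0)} \subset \mathbb{R}^n$ denoting the closed ball of radius $r$) be a function which is Lipschitz continuous with Lipschitz constant at most $1/2$ and satisfies $g(0) = 0$. Then the function

$$f: \overline{B_r(0)} \to \mathbb{R}^n, \quad f(x) := g(x) + x$$

is injective and $\overline{B_{r/2}(0)} \subseteq f\left(\overline{B_r(0)}\right)$.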

Proof:

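First, we note that for $y \in \overline{B_{r/2}(0)}$ the function

$$h: \overline{B_r(0)} \to \mathbb{R}^n, \quad h(z) := y - g(z)$$

is a strict contraction; this is due to

$$\|y - g(z) - (y - g(z'))\| = \|g(z) - g(z')\| \le \frac{1}{2} \|z - z'\|.$$

Furthermore, it maps $\overline{B_r(0)}$ to itself, since for $z \in \overline{B_r(0)}$

$$\|y - g(z)\| \le \|y\| + \|g(z) - g(0)\| \le \frac{r}{2} + \frac{1}{2} \|z\| \le r.$$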

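Hence, the Banach fixed-point theorem is applicable to $h$. Now $x$ being a fixed point of $h$, that is $x = y - g(x)$, is equivalent to

$$f(x) = y,$$

and thus $\overline{B_{r/2}(0)} \subseteq f\left(\overline{B_r(0)}\right)$ follows from the existence of fixed points. Furthermore, if $f(x) = f(x')$, then

$$\frac{1}{2} \|x - x'\| \ge \|g(x) - g(x')\| = \|f(x) - x - (f(x') - x')\| = \|x' - x\|,$$

which forces $\|x - x'\| = 0$ and hence $x = x'$. This proves injectivity.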
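The proof is constructive: to solve $f(x) = y$ one iterates $h(z) = y - g(z)$. A minimal sketch of this, assuming the illustrative map $g(x_1, x_2) = \left(\tfrac14 \sin x_2, \tfrac14 \sin x_1\right)$ on $\mathbb{R}^2$ with $r = 1$ (this $g$ and the helper name `solve` are assumptions made for the example, not part of the text; $g(0) = 0$ and its Lipschitz constant is at most $1/4$):

```python
import math

def g(x):
    """Assumed example on R^2: g(0) = 0, Lipschitz constant <= 1/4."""
    return (0.25 * math.sin(x[1]), 0.25 * math.sin(x[0]))

def f(x):
    """f(x) = g(x) + x, as in the lemma."""
    gx = g(x)
    return (gx[0] + x[0], gx[1] + x[1])

def solve(y, steps=60):
    """Iterate h(z) = y - g(z) from z = 0; the fixed point x satisfies f(x) = y."""
    z = (0.0, 0.0)
    for _ in range(steps):
        gz = g(z)
        z = (y[0] - gz[0], y[1] - gz[1])
    return z

r = 1.0
y = (0.3, -0.4)  # norm 0.5 = r/2, so y lies in the closed ball of radius r/2
x = solve(y)
fx = f(x)
print(x, math.hypot(*x) <= r)                   # x stays in the ball of radius r
print(math.hypot(fx[0] - y[0], fx[1] - y[1]))   # residual of f(x) = y
```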

The inverse function theorem

Theorem:

Let $f: \mathbb{R}^n \to \mathbb{R}^n$ be a function which is continuously differentiable in a neighbourhood of $x_0 \in \mathbb{R}^n$ such that $f'(x_0)$ is invertible. Then there exists an open set $U \subseteq \mathbb{R}^n$ with $x_0 \in U$ such that $f|_U$ is a bijection onto $f(U)$, with an inverse $f^{-1}: f(U) \to U$ which is differentiable at $f(x_0)$ and satisfies

$$(f^{-1})'(f(x_0)) = (f'(x_0))^{-1}.$$

Proof:

We first reduce to the case $x_0 = 0$, $f(x_0) = 0$ and $f'(x_0) = \mathrm{Id}$. Indeed, suppose the theorem holds for all such functions, and let $h$ be an arbitrary function satisfying the requirements of the theorem (where the differentiability is given at $x_0$). We set

$$\tilde h(x) := h'(x_0)^{-1} \bigl( h(x_0 + x) - h(x_0) \bigr)$$

and obtain that $\tilde h$ is differentiable at $0$ with differential $\mathrm{Id}$ and $\tilde h(0) = 0$; the first property follows since we multiply both the function and its affine-linear approximation by $h'(x_0)^{-1}$ and only shift the function, and the second one is seen by inserting $x = 0$. Hence, we obtain an inverse of $\tilde h$ together with its differential at $\tilde h(0) = 0$, and if we now set

$$h^{-1}(y) := \tilde h^{-1} \bigl( h'(x_0)^{-1} (y - h(x_0)) \bigr) + x_0,$$

it can be seen that $h^{-1}$ is an inverse of $h$ with all the required properties (which is a somewhat tedious exercise, but involves nothing more than the definitions).

Thus let $f$ be a function which is continuously differentiable in a neighbourhood of $0$ such that $f(0) = 0$ and $f'(0) = \mathrm{Id}$. We define

$$g(x) := f(x) - x.$$

The differential of this function at $0$ is zero (since taking the differential is linear and the differential of the function $x \mapsto x$ is the identity). Since the function $g$ is also continuously differentiable in a small neighbourhood of $0$, we find $\delta > 0$ such that

$$\left\| \frac{\partial g}{\partial x_j}(x) \right\| < \frac{1}{2n^2}$$

for all j{1,,n} and xBδ(0). Since further g(0)=f(0)0=0, the general mean-value theorem and Cauchy's inequality imply that for k{1,,n} and xBδ(0),

|gk(x)|=|x,gxj(tkx)|xn12n2

for suitable tk[0,1]. Hence,

g(x)|g1(x)|++|gn(x)|12x (triangle inequality),

and thus, we obtain that our preparatory lemma is applicable, and f is a bijection on Bδ(0), whose image is contained within the open set Bδ/2(0); thus we may pick U:=f1(Bδ/2(0)), which is open due to the continuity of f.

Thus, the most important part of the theorem is already done. All that is left to do is to prove differentiability of $f^{-1}$ at $0$. In fact we prove the slightly stronger claim that the differential of $f^{-1}$ at $0 = f(0)$ is given by the identity, although this would also follow from the chain rule once differentiability is proven.

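Note now that the estimate $\|g(x)\| \le \frac{1}{2} \|x\|$ for $g$ implies the following bounds on $f$:

$$\frac{1}{2} \|x\| \le \|f(x)\| \le \frac{3}{2} \|x\|.$$

The second bound follows from

$$\|f(x)\| \le \|f(x) - x\| + \|x\| = \|g(x)\| + \|x\| \le \frac{3}{2} \|x\|,$$

and the first bound follows from the reverse triangle inequality:

$$\|f(x)\| \ge \bigl| \|f(x) - x\| - \|x\| \bigr| = \bigl| \|g(x)\| - \|x\| \bigr| \ge \frac{1}{2} \|x\|.$$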

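Now for the differentiability at $0$. We have, by substitution of limits (as $f$ is continuous and $f(0) = 0$):

$$\lim_{\mathbf{h} \to 0} \frac{\|f^{-1}(\mathbf{h}) - f^{-1}(0) - \mathrm{Id}(\mathbf{h} - 0)\|}{\|\mathbf{h}\|} = \lim_{\mathbf{h} \to 0} \frac{\|f^{-1}(f(\mathbf{h})) - f(\mathbf{h})\|}{\|f(\mathbf{h})\|} = \lim_{\mathbf{h} \to 0} \frac{\|\mathbf{h} - f(\mathbf{h})\|}{\|f(\mathbf{h})\|},$$

where the last expression converges to zero due to the differentiability of $f$ at $0$ with differential the identity, and the sandwich criterion applied to the expressions

$$\frac{\|\mathbf{h} - f(\mathbf{h})\|}{\frac{3}{2} \|\mathbf{h}\|}$$

and

$$\frac{\|\mathbf{h} - f(\mathbf{h})\|}{\frac{1}{2} \|\mathbf{h}\|}.$$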
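The formula $(f^{-1})'(f(x_0)) = (f'(x_0))^{-1}$ can be checked numerically. The sketch below assumes the illustrative map $f(x_1, x_2) = (x_1 + x_2^2, \; x_2 + \sin x_1)$ and the point $x_0 = (0.2, 0.1)$; neither is from the text. The local inverse is computed with Newton's method, which is a substitute chosen for convenience and not the contraction argument of the proof, and its Jacobian is approximated by central finite differences.

```python
import math

def f(x):
    """Assumed example map on R^2."""
    return (x[0] + x[1] ** 2, x[1] + math.sin(x[0]))

def jac(x):
    """Jacobian f'(x) as a 2x2 nested tuple (rows first)."""
    return ((1.0, 2.0 * x[1]), (math.cos(x[0]), 1.0))

def inv2(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((a[1][1] / det, -a[0][1] / det), (-a[1][0] / det, a[0][0] / det))

def local_inverse(y, start, steps=30):
    """Solve f(x) = y near `start` by Newton's method (stand-in for f^{-1})."""
    x = start
    for _ in range(steps):
        fx = f(x)
        res = (fx[0] - y[0], fx[1] - y[1])
        b = inv2(jac(x))
        x = (x[0] - (b[0][0] * res[0] + b[0][1] * res[1]),
             x[1] - (b[1][0] * res[0] + b[1][1] * res[1]))
    return x

x0 = (0.2, 0.1)
y0 = f(x0)
eps = 1e-6

# Central differences: cols[j] approximates the j-th column of (f^{-1})'(y0).
cols = []
for j in range(2):
    yp = tuple(y0[i] + (eps if i == j else 0.0) for i in range(2))
    ym = tuple(y0[i] - (eps if i == j else 0.0) for i in range(2))
    xp, xm = local_inverse(yp, x0), local_inverse(ym, x0)
    cols.append(((xp[0] - xm[0]) / (2 * eps), (xp[1] - xm[1]) / (2 * eps)))

numeric = ((cols[0][0], cols[1][0]), (cols[0][1], cols[1][1]))
exact = inv2(jac(x0))
print(numeric)
print(exact)
```

The two printed matrices should agree up to the finite-difference error.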

The implicit function theorem

Theorem:

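Let $f: \mathbb{R}^n \to \mathbb{R}$ be a continuously differentiable function, and consider the set

$$S := \{(x_1, \ldots, x_n) \in \mathbb{R}^n \mid f(x_1, \ldots, x_n) = 0\}.$$

If we are given some $y \in S$ such that $\partial_n f(y) \neq 0$, then we find $U \subseteq \mathbb{R}^{n-1}$ open with $(y_1, \ldots, y_{n-1}) \in U$ and $g: U \to \mathbb{R}$ such that

$$y_n = g(y_1, \ldots, y_{n-1}) \quad \text{and} \quad \{(z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) \mid (z_1, \ldots, z_{n-1}) \in U\} \subseteq S,$$

where $\{(z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) \mid (z_1, \ldots, z_{n-1}) \in U\}$ is open with respect to the subspace topology of $S$.

Furthermore, $g$ is a differentiable function.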

Proof:

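We define a new function

$$F: \mathbb{R}^n \to \mathbb{R}^n, \quad F(x_1, \ldots, x_n) := (x_1, \ldots, x_{n-1}, f(x_1, \ldots, x_n)).$$

The differential of this function looks like this:

$$F'(x) = \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \ddots & \vdots & \vdots \\ \vdots & \ddots & \ddots & 0 & 0 \\ 0 & \cdots & 0 & 1 & 0 \\ \partial_1 f(x) & \partial_2 f(x) & \cdots & \partial_{n-1} f(x) & \partial_n f(x) \end{pmatrix}$$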

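Since we assumed that $\partial_n f(y) \neq 0$, $F'(y)$ is invertible (the matrix is lower triangular, so its determinant is $\partial_n f(y)$), and hence the inverse function theorem implies the existence of a small open neighbourhood $\tilde V \subseteq \mathbb{R}^n$ containing $y$ such that, restricted to that neighbourhood, $F$ is itself invertible, with a differentiable inverse $F^{-1}$, which is itself defined on an open set $\tilde U$ containing $F(y)$. Now set first

$$U := \{(x_1, \ldots, x_{n-1}) \in \mathbb{R}^{n-1} \mid (x_1, \ldots, x_{n-1}, 0) \in \tilde U\},$$

which is open in $\mathbb{R}^{n-1}$, and then

$$g: U \to \mathbb{R}, \quad g(x_1, \ldots, x_{n-1}) := \pi_n(F^{-1}(x_1, \ldots, x_{n-1}, 0)),$$

the $n$-th component of $F^{-1}(x_1, \ldots, x_{n-1}, 0)$. We claim that $g$ has the desired properties.

Indeed, we first note that $F^{-1}(x_1, \ldots, x_{n-1}, 0) = (x_1, \ldots, x_{n-1}, g(x_1, \ldots, x_{n-1}))$, since applying $F$ leaves the first $n-1$ components unchanged, and thus we get the identity by observing $F(F^{-1}(x)) = x$. Let thus $(z_1, \ldots, z_{n-1}) \in U$. Then

$$f(z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) = (\pi_n \circ F)(F^{-1}(z_1, \ldots, z_{n-1}, 0)) = \pi_n((F \circ F^{-1})(z_1, \ldots, z_{n-1}, 0)) = 0.$$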

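Furthermore, the set

$$\{(z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) \mid (z_1, \ldots, z_{n-1}) \in U\}$$

is open with respect to the subspace topology on $S$. Indeed, we show

$$\{(z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) \mid (z_1, \ldots, z_{n-1}) \in U\} = S \cap \tilde V.$$

For $\subseteq$, we first note that the set on the left hand side is in $S$, since all points in it are mapped to zero by $f$. Further,

$$F(z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) = (z_1, \ldots, z_{n-1}, 0) \in \tilde U$$

and hence $\subseteq$ is completed when applying $F^{-1}$, which maps $\tilde U$ into $\tilde V$. For the other direction, let a point $(x_1, \ldots, x_n)$ in $S \cap \tilde V$ be given, and apply $F$ to get

$$F((x_1, \ldots, x_n)) = (x_1, \ldots, x_{n-1}, 0) \in \tilde U$$

and hence $(x_1, \ldots, x_{n-1}) \in U$; further

$$(x_1, \ldots, x_{n-1}, g(x_1, \ldots, x_{n-1})) = (x_1, \ldots, x_n)$$

by applying $F$ to both sides of the equation: both sides lie in $\tilde V$ and have the same image $(x_1, \ldots, x_{n-1}, 0)$ under $F$, which is injective on $\tilde V$.

Now $g$ is automatically differentiable as the component of a differentiable function.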


Informally, the above theorem states that given a set $\{x \in \mathbb{R}^n \mid f(x) = 0\}$, one can choose the first $n-1$ coordinates as a "base" for a function, whose graph is precisely a local bit of that set.
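As a concrete illustration, consider the unit circle in $\mathbb{R}^2$. The sketch below assumes $f(x_1, x_2) = x_1^2 + x_2^2 - 1$ and the point $y = (0, 1)$, where $\partial_2 f(y) = 2 \neq 0$; the example and the helper names are not from the text. It computes $g$ numerically by Newton's method in the last coordinate (a convenience choice, not the construction via $F^{-1}$ used in the proof) and compares it with the closed form $\sqrt{1 - z^2}$ of the upper half circle.

```python
import math

def f(x1, x2):
    """Assumed example on R^2: the zero set S is the unit circle."""
    return x1 ** 2 + x2 ** 2 - 1.0

def d2f(x1, x2):
    """Partial derivative of f with respect to the last coordinate."""
    return 2.0 * x2

def g(z, start=1.0, steps=50):
    """Solve f(z, t) = 0 for t near `start` by Newton's method in t."""
    t = start
    for _ in range(steps):
        t -= f(z, t) / d2f(z, t)
    return t

# y = (0, 1) lies on S and d2f(0, 1) = 2 != 0, so the theorem applies near y.
for z in (-0.5, -0.1, 0.0, 0.1, 0.5):
    t = g(z)
    print(z, t, f(z, t), math.sqrt(1.0 - z * z))
```

Near the points $(\pm 1, 0)$ the derivative $\partial_2 f$ vanishes, and there the circle is indeed not locally the graph of a function of $x_1$.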
