Calculus/Inverse function theorem, implicit function theorem


In this chapter, we want to prove the inverse function theorem (which asserts that if a function has invertible differential at a point, then it is locally invertible itself) and the implicit function theorem (which asserts that certain sets are the graphs of functions).

Banach's fixed point theorem

Theorem:

Let $(M, d)$ be a complete metric space, and let $f : M \to M$ be a strict contraction; that is, there exists a constant $0 \leq λ < 1$ such that

\forall m, n \in M : d (f (m), f (n)) \leq λ d (m, n)

.

Then $f$ has a unique fixed point, which means that there is a unique $x \in M$ such that $f (x) = x$ . Furthermore, if we start with a completely arbitrary point $y \in M$ , then the sequence

y, f (y), f (f (y)), f (f (f (y))), \dots

converges to $x$ .

Proof:

First, we prove uniqueness of the fixed point. Assume $x, y$ are both fixed points. Then

d (x, y) = d (f (x), f (y)) \leq λ d (x, y) \Rightarrow (1 - λ) d (x, y) \leq 0

.

Since $0 \leq λ < 1$ and $d (x, y) \geq 0$ , this implies $d (x, y) = 0$ and hence $x = y$ .

Now we prove existence and simultaneously the claim about the convergence of the sequence $y, f (y), f (f (y)), f (f (f (y))), \dots$ . For notation, we thus set $z_{0} : = y$ and if $z_{n}$ is already defined, we set $z_{n + 1} = f (z_{n})$ . Then the sequence $(z_{n})_{n \in ℕ}$ is nothing else but the sequence $y, f (y), f (f (y)), f (f (f (y))), \dots$ .

Let $n \geq 0$ . We claim that

d (z_{n + 1}, z_{n}) \leq λ^{n} d (z_{1}, z_{0})

.

Indeed, this follows by induction on $n$ . The case $n = 0$ is trivial, and if the claim is true for $n$ , then $d (z_{n + 2}, z_{n + 1}) = d (f (z_{n + 1}), f (z_{n})) \leq λ d (z_{n + 1}, z_{n}) \leq λ \cdot λ^{n} d (z_{1}, z_{0})$ .

Hence, by the triangle inequality,

\begin{aligned} d (z_{n + m}, z_{n}) & \leq \sum_{j = n + 1}^{n + m} d (z_{j}, z_{j - 1}) \\ & \leq \sum_{j = n + 1}^{n + m} λ^{j - 1} d (z_{1}, z_{0}) \\ & \leq \sum_{j = n + 1}^{\infty} λ^{j - 1} d (z_{1}, z_{0}) \\ & = d (z_{1}, z_{0}) λ^{n} \frac{1}{1 - λ} \end{aligned}

.

The latter expression goes to zero as $n \to \infty$ and hence we are dealing with a Cauchy sequence. As we are in a complete metric space, it converges to a limit $x$ . This limit further is a fixed point, as the continuity of $f$ ( $f$ is Lipschitz continuous with constant $λ$ ) implies

x = \lim_{n \to \infty} z_{n} = \lim_{n \to \infty} f (z_{n - 1}) = f (\lim_{n \to \infty} z_{n - 1}) = f (x)

.

◻
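The iteration in the theorem can be tried out numerically. The following is a minimal sketch, not part of the proof: it assumes the example $M = [0, 1]$ with $f = \cos$ , which maps $[0, 1]$ into itself and has $| \cos^{'} | \leq \sin (1) < 1$ there, so $λ = \sin (1)$ works as a contraction constant. The helper name `fixed_point_iteration` is made up for this illustration. Letting $m \to \infty$ in the estimate from the proof gives $d (z_{n}, x) \leq \frac{λ^{n}}{1 - λ} d (z_{1}, z_{0})$ , and the sketch compares the observed distance with this bound.

```python
import math

def fixed_point_iteration(f, y, steps):
    """Return the iterates z_0 = y, z_1 = f(z_0), ..., z_steps as a list."""
    z = [y]
    for _ in range(steps):
        z.append(f(z[-1]))
    return z

# Assumption for this example: cos maps [0, 1] into itself and
# |cos'| = |sin| <= sin(1) < 1 on [0, 1], so lam is a contraction constant.
lam = math.sin(1.0)
z = fixed_point_iteration(math.cos, 1.0, 100)
x = z[-1]  # numerical stand-in for the fixed point

# Compare the observed distance with the bound lam**n / (1 - lam) * d(z_1, z_0).
d10 = abs(z[1] - z[0])
for n in (0, 5, 10, 20):
    bound = lam ** n / (1.0 - lam) * d10
    print(n, abs(z[n] - x), bound)
```

The observed distances shrink faster than the bound suggests, which is expected: the bound uses the worst-case constant on all of $[0, 1]$ .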

A corollary to this important result is the following lemma, which shall be the main ingredient for the proof of the inverse function theorem:

Lemma:

Let $g : \overline{B_{r} (0)} \to \overline{B_{r} (0)}$ ( $\overline{B_{r} (0)} \subset ℝ^{n}$ denoting the closed ball of radius $r$ ) be a function which is Lipschitz continuous with Lipschitz constant less than or equal to $1 / 2$ such that $g (0) = 0$ . Then the function

f : \overline{B_{r} (0)} \to ℝ^{n}, f (x) : = g (x) + x

is injective and $B_{r / 2} (0) \subseteq f (B_{r} (0))$ .

Proof:

First, we note that for $y \in B_{r / 2} (0)$ the function

h : \overline{B_{r} (0)} \to ℝ^{n}, h (z) : = y - g (z)

is a strict contraction; this is due to

‖ y - g (z) - (y - g (z^{'})) ‖ = ‖ g (z^{'}) - g (z) ‖ \leq \frac{1}{2} ‖ z - z^{'} ‖

.

Furthermore, it maps $\overline{B_{r} (0)}$ to itself, since for $z \in \overline{B_{r} (0)}$

‖ y - g (z) ‖ \leq ‖ y ‖ + ‖ g (z) - g (0) ‖ \leq \frac{r}{2} + \frac{1}{2} ‖ z ‖ \leq r

.

Hence, the Banach fixed-point theorem is applicable to $h$ . Now $x$ being a fixed point of $h$ is equivalent to

f (x) = y

,

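and thus $B_{r / 2} (0) \subseteq f (B_{r} (0))$ follows from the existence of fixed points: the fixed point $x$ lies in the open ball $B_{r} (0)$ , since $‖ x ‖ = ‖ y - g (x) ‖ \leq ‖ y ‖ + \frac{1}{2} ‖ x ‖$ and $‖ y ‖ < \frac{r}{2}$ give $‖ x ‖ < r$ . Furthermore, if $f (x) = f (x^{'})$ , then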

\frac{1}{2} ‖ x - x^{'} ‖ \geq ‖ g (x) - g (x^{'}) ‖ = ‖ f (x) - x - (f (x^{'}) - x^{'}) ‖ = ‖ x - x^{'} ‖

and hence $x = x^{'}$ . This proves injectivity. $◻$
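The proof of the lemma is constructive: to solve $f (x) = y$ , one iterates $h (z) = y - g (z)$ . Below is a small sketch of this in $ℝ^{2}$ . The particular $g$ is a hypothetical example chosen for illustration (it satisfies $g (0) = 0$ and its Jacobian has operator norm at most $1 / 4$ , so the Lipschitz condition holds), and the function name `solve` is likewise made up.

```python
import math

def g(x):
    # Hypothetical example: g(0) = 0 and the Jacobian has operator norm <= 1/4,
    # so g is Lipschitz with constant <= 1/2 on every ball around 0.
    return (0.25 * math.sin(x[1]), 0.25 * math.sin(x[0]))

def f(x):
    # f(x) = g(x) + x, as in the lemma.
    gx = g(x)
    return (gx[0] + x[0], gx[1] + x[1])

def solve(y, steps=80):
    """Iterate h(z) = y - g(z) from z = 0; the limit x satisfies f(x) = y."""
    z = (0.0, 0.0)
    for _ in range(steps):
        gz = g(z)
        z = (y[0] - gz[0], y[1] - gz[1])
    return z

r = 1.0
y = (0.3, -0.35)  # a point of norm < r / 2
x = solve(y)
print(x, f(x), math.hypot(*x) < r)
```

The printed value of `f(x)` should agree with `y` up to rounding, and the solution stays inside the open ball of radius $r$ .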

The inverse function theorem

Theorem:

Let $f : ℝ^{n} \to ℝ^{n}$ be a function which is continuously differentiable in a neighbourhood of $x_{0} \in ℝ^{n}$ such that $f^{'} (x_{0})$ is invertible. Then there exists an open set $U \subseteq ℝ^{n}$ with $x_{0} \in U$ such that $f |_{U}$ is a bijective function with an inverse $f^{- 1} : f (U) \to U$ which is differentiable at $f (x_{0})$ and satisfies

(f^{- 1})^{'} (f (x_{0})) = (f^{'} (x_{0}))^{- 1}

.

Proof:

We first reduce to the case $f (x_{0}) = 0$ , $x_{0} = 0$ and $f^{'} (x_{0}) = Id$ . Indeed, suppose for all those functions the theorem holds, and let now $h$ be an arbitrary function satisfying the requirements of the theorem (where the differentiability is given at $x_{0}$ ). We set

\tilde{h} (x) : = h^{'} (x_{0})^{- 1} (h (x_{0} + x) - h (x_{0}))

and obtain that $\tilde{h}$ is differentiable at $0$ with differential $Id$ and $\tilde{h} (0) = 0$ ; the first property follows since we multiply both the function and its affine-linear approximation by $h^{'} (x_{0})^{- 1}$ and only shift the function, and the second one is seen from inserting $x = 0$ . Hence, we obtain an inverse of $\tilde{h}$ together with its differential at $\tilde{h} (0) = 0$ , and if we now set

h^{- 1} (y) : = {\tilde{h}}^{- 1} (h^{'} (x_{0})^{- 1} (y - h (x_{0}))) + x_{0}

,

it can be seen that $h^{- 1}$ is an inverse of $h$ with all the required properties (which is a bit of a tedious exercise, but involves nothing more than the definitions).

Thus let $f$ be a function such that $f (0) = 0$ , $f$ is continuously differentiable in a neighbourhood of $0$ and $f^{'} (0) = Id$ . We define

g (x) : = f (x) - x

.

The differential of this function at $0$ is zero (since taking the differential is linear and the differential of the function $x \mapsto x$ is the identity). Since the function $g$ is also continuously differentiable in a small neighbourhood of $0$ , we find $δ > 0$ such that

| \frac{\partial g_{k}}{\partial x_{j}} (x) | < \frac{1}{2 n^{2}}

for all $j, k \in \{1, \dots, n\}$ and $x \in B_{δ} (0)$ . Since further $g (0) = f (0) - 0 = 0$ , the general mean-value theorem and Cauchy's inequality imply that for $k \in \{1, \dots, n\}$ and $x \in B_{δ} (0)$ ,

| g_{k} (x) | = | ⟨ x, \nabla g_{k} (t_{k} x) ⟩ | \leq ‖ x ‖ ‖ \nabla g_{k} (t_{k} x) ‖ \leq ‖ x ‖ n \frac{1}{2 n^{2}}

for suitable $t_{k} \in [0, 1]$ . Hence,

‖ g (x) ‖ \leq | g_{1} (x) | + \dots + | g_{n} (x) | \leq \frac{1}{2} ‖ x ‖

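(triangle inequality). Applying the same argument to $g_{k} (x) - g_{k} (x^{'})$ along the line segment from $x^{'}$ to $x$ , which lies in the convex ball $B_{δ} (0)$ , gives $‖ g (x) - g (x^{'}) ‖ \leq \frac{1}{2} ‖ x - x^{'} ‖$ for all $x, x^{'} \in B_{δ} (0)$ . After replacing $δ$ by a slightly smaller radius, these estimates hold on the closed ball $\overline{B_{δ} (0)}$ , so our preparatory lemma is applicable: $f$ is injective on $\overline{B_{δ} (0)}$ , and its image contains the open ball $B_{δ / 2} (0)$ . Thus we may pick $U : = B_{δ} (0) \cap f^{- 1} (B_{δ / 2} (0))$ , which is open due to the continuity of $f$ and contains $0$ since $f (0) = 0$ .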

Thus, the most important part of the theorem is already done. All that is left to do is to prove differentiability of $f^{- 1}$ at $0$ . In fact, we prove the slightly stronger claim that the differential of $f^{- 1}$ at $0$ is given by the identity, although this would also follow from the chain rule once differentiability is proven.

Note now that the bound $‖ g (x) ‖ \leq \frac{1}{2} ‖ x ‖$ implies the following bounds on $f$ :

\frac{1}{2} ‖ x ‖ \leq ‖ f (x) ‖ \leq \frac{3}{2} ‖ x ‖

.

The second bound follows from

‖ f (x) ‖ \leq ‖ f (x) - x ‖ + ‖ x ‖ = ‖ g (x) ‖ + ‖ x ‖ \leq \frac{3}{2} ‖ x ‖

,

and the first bound follows from

‖ f (x) ‖ \geq | ‖ f (x) - x ‖ - ‖ x ‖ | = | ‖ g (x) ‖ - ‖ x ‖ | \geq \frac{1}{2} ‖ x ‖

.

Now for the differentiability at $0$ . We have, by substitution of limits (as $f$ is continuous and $f (0) = 0$ ):

\begin{aligned} \lim_{𝐡 \to 0} \frac{‖ f^{- 1} (𝐡) - f^{- 1} (0) - Id (𝐡 - 0) ‖}{‖ 𝐡 ‖} & = \lim_{𝐡 \to 0} \frac{‖ f^{- 1} (f (𝐡)) - f (𝐡) ‖}{‖ f (𝐡) ‖} \\ & = \lim_{𝐡 \to 0} \frac{‖ 𝐡 - f (𝐡) ‖}{‖ f (𝐡) ‖}, \end{aligned}

where the last expression converges to zero due to the differentiability of $f$ at $0$ with differential the identity, and the sandwich criterion applied to the expressions

\frac{‖ 𝐡 - f (𝐡) ‖}{\frac{3}{2} ‖ 𝐡 ‖}

and

\frac{‖ 𝐡 - f (𝐡) ‖}{\frac{1}{2} ‖ 𝐡 ‖}

.

◻
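The formula $(f^{- 1})^{'} (f (x_{0})) = (f^{'} (x_{0}))^{- 1}$ can be checked numerically in one dimension. The sketch below is only an illustration under assumptions: the example function $f (x) = x^{3} + x$ and the point $x_{0} = 0.7$ are chosen arbitrarily, and the inverse is computed by bisection (a convenient numerical stand-in, not the construction used in the proof). The derivative of the inverse is approximated by a central difference quotient.

```python
def f(x):
    # Hypothetical example function; f'(x) = 3 x^2 + 1 > 0, so f is invertible on R.
    return x ** 3 + x

def f_prime(x):
    return 3.0 * x ** 2 + 1.0

def f_inverse(y, lo=-10.0, hi=10.0):
    """Invert the strictly increasing f by bisection on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x0 = 0.7
y0 = f(x0)
eps = 1e-5
# Central difference quotient of the inverse at y0 = f(x0).
numeric = (f_inverse(y0 + eps) - f_inverse(y0 - eps)) / (2.0 * eps)
exact = 1.0 / f_prime(x0)
print(numeric, exact)
```

The two printed numbers should agree to several digits, with the remaining difference coming from the finite step `eps` and rounding.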

The implicit function theorem

Theorem:

Let $f : ℝ^{n} \to ℝ$ be a continuously differentiable function, and consider the set

S : = \{(x_{1}, \dots, x_{n}) \in ℝ^{n} | f (x_{1}, \dots, x_{n}) = 0\}

.

If we are given some $y \in S$ such that $\partial_{n} f (y) \neq 0$ , then we find $U \subseteq ℝ^{n - 1}$ open with $(y_{1}, \dots, y_{n - 1}) \in U$ and $g : U \to ℝ$ such that

y_{n} = g (y_{1}, \dots, y_{n - 1})

and

\{(z_{1}, \dots, z_{n - 1}, g (z_{1}, \dots, z_{n - 1})) | (z_{1}, \dots, z_{n - 1}) \in U\} \subseteq S

,

where $\{(z_{1}, \dots, z_{n - 1}, g (z_{1}, \dots, z_{n - 1})) | (z_{1}, \dots, z_{n - 1}) \in U\}$ is open with respect to the subspace topology of $S$ .

Furthermore, $g$ is a differentiable function.

Proof:

We define a new function

F : ℝ^{n} \to ℝ^{n}, F (x_{1}, \dots, x_{n}) : = (x_{1}, \dots, x_{n - 1}, f (x_{1}, \dots, x_{n}))

.

The differential of this function looks like this:

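F^{'} (x) = \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \ddots & \vdots & \vdots \\ \vdots & \ddots & \ddots & 0 & 0 \\ 0 & \cdots & 0 & 1 & 0 \\ \partial_{1} f (x) & \partial_{2} f (x) & \cdots & \partial_{n - 1} f (x) & \partial_{n} f (x) \end{pmatrix}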

Since we assumed that $\partial_{n} f (y) \neq 0$ , $F^{'} (y)$ is invertible (it is a lower triangular matrix with determinant $\partial_{n} f (y)$ ), and hence the inverse function theorem implies the existence of a small open neighbourhood $\tilde{V} \subseteq ℝ^{n}$ containing $y$ such that restricted to that neighbourhood $F$ is itself invertible, with a differentiable inverse $F^{- 1}$ , which is itself defined on an open set $\tilde{U}$ containing $F (y)$ . Now set first

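U : = \{(x_{1}, \dots, x_{n - 1}) | (x_{1}, \dots, x_{n - 1}, 0) \in \tilde{U}\}

,

which is open in $ℝ^{n - 1}$ (as the preimage of the open set $\tilde{U}$ under the continuous map $(x_{1}, \dots, x_{n - 1}) \mapsto (x_{1}, \dots, x_{n - 1}, 0)$ ), and then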

g : U \to ℝ, g (x_{1}, \dots, x_{n - 1}) : = π_{n} (F^{- 1} (x_{1}, \dots, x_{n - 1}, 0))

,

the $n$ -th component of $F^{- 1} (x_{1}, \dots, x_{n - 1}, 0)$ . We claim that $g$ has the desired properties.

Indeed, we first note that $F^{- 1} (x_{1}, \dots, x_{n - 1}, 0) = (x_{1}, \dots, x_{n - 1}, g (x_{1}, \dots, x_{n - 1}))$ , since applying $F$ leaves the first $n - 1$ components unchanged, and thus we get the identity by observing $F (F^{- 1} (x)) = x$ . Let thus $(z_{1}, \dots, z_{n - 1}) \in U$ . Then

\begin{aligned} f (z_{1}, \dots, z_{n - 1}, g (z_{1}, \dots, z_{n - 1})) & = (π_{n} \circ F) (F^{- 1} (z_{1}, \dots, z_{n - 1}, 0)) \\ & = π_{n} ((F \circ F^{- 1}) (z_{1}, \dots, z_{n - 1}, 0)) = 0 \end{aligned}

.

Furthermore, the set

\{(z_{1}, \dots, z_{n - 1}, g (z_{1}, \dots, z_{n - 1})) | (z_{1}, \dots, z_{n - 1}) \in U\}

is open with respect to the subspace topology on $S$ . Indeed, we show

\{(z_{1}, \dots, z_{n - 1}, g (z_{1}, \dots, z_{n - 1})) | (z_{1}, \dots, z_{n - 1}) \in U\} = S \cap \tilde{V}

.

For $\subseteq$ , we first note that the set on the left hand side is in $S$ , since all points in it are mapped to zero by $f$ . Further,

F (z_{1}, \dots, z_{n - 1}, g (z_{1}, \dots, z_{n - 1})) = (z_{1}, \dots, z_{n - 1}, 0) \in \tilde{U}

and hence $\subseteq$ is completed when applying $F^{- 1}$ . For the other direction, let a point $(x_{1}, \dots, x_{n})$ in $S \cap \tilde{V}$ be given, apply $F$ to get

F ((x_{1}, \dots, x_{n})) = (x_{1}, \dots, x_{n - 1}, 0) \in \tilde{U}

and hence $(x_{1}, \dots, x_{n - 1}) \in U$ ; further

(x_{1}, \dots, x_{n - 1}, g (x_{1}, \dots, x_{n - 1})) = (x_{1}, \dots, x_{n})

since both sides lie in $\tilde{V}$ , on which $F$ is injective, and applying $F$ to either side yields $(x_{1}, \dots, x_{n - 1}, 0)$ .

Now $g$ is automatically differentiable as the component of a differentiable function. $◻$

Informally, the above theorem states that given a set $\{x \in ℝ^{n} | f (x) = 0\}$ , one can choose the first $n - 1$ coordinates as a "base" for a function, whose graph is precisely a local bit of that set.
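As a concrete illustration, consider the unit circle in $ℝ^{2}$ , given by $f (x, y) = x^{2} + y^{2} - 1$ , near the point $(0, 1)$ , where $\partial_{2} f = 2 y \neq 0$ . Here the implicit function is known in closed form, $g (x) = \sqrt{1 - x^{2}}$ . The sketch below is an assumption-laden illustration rather than the construction from the proof: it recovers $g$ numerically by applying Newton's method in the last coordinate, starting from the last coordinate of the base point, and compares the result with the closed form. The names `d2f` and `y_start` are made up for this example.

```python
import math

def f(x, y):
    # Example: the unit circle S = {f = 0}; d_2 f(x, y) = 2 y is nonzero at (0, 1).
    return x * x + y * y - 1.0

def d2f(x, y):
    # Partial derivative of f with respect to the last coordinate.
    return 2.0 * y

def g(x, y_start=1.0, steps=50):
    """Solve f(x, y) = 0 for y by Newton's method in the last coordinate,
    starting from the last coordinate y_start of the base point."""
    y = y_start
    for _ in range(steps):
        y -= f(x, y) / d2f(x, y)
    return y

for x in (-0.5, 0.0, 0.3, 0.6):
    print(x, g(x), math.sqrt(1.0 - x * x), f(x, g(x)))
```

For each sampled $x$ the numerically computed value should match $\sqrt{1 - x^{2}}$ up to rounding, and the point $(x, g (x))$ lies on the circle. Near $(1, 0)$ , where $\partial_{2} f$ vanishes, the same approach breaks down, which matches the hypothesis of the theorem.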
