Calculus/Multivariate optimisation: The Lagrangian

From testwiki
Jump to navigation Jump to search

The problem

In previous sections, we've discussed how, using calculus, we can find the optimal solution of a single-variate y=f(x) by finding all points where f(x)=0. But what if we're given a bivariate function - for example, z=f(x,y)? More importantly, what if we're given constraints to follow? The single-variate model does not scale at all.

One variable, one constraint

Consider the optimisation problem minf(x) given a constraint g(x)=h.

First write the constraint in such a way that it's equal to 0 - so g(x)h=0. Then the Lagrangian of the system is defined by L(x,λ)=f(x)+λ(g(x)h). We have two variables to optimise - x and λ. Then find the derivatives with respect to the variables:

Lx=f(x)+λg(x)

Lλ=g(x)h

Set them to 0. Then the optimal set {x,λ} is the solution to f(x)+λg(x)=0 and g(x)=h.

Template:Editor note

A simple univariate example

Template:Calculus/Def

Then the Lagrangian system is L(x,λ)=5x+3+λ(x225). Take the respective derivatives:

Lx=5+λ(2x)

Lλ=x225

Set the second to zero - we get x=±5. Substitute in first: we get 5+10λ=0, which is λ=12. Substitute in second: we get λ=12. In this case, the optimal minimum is the set {x,λ}={5,12} (which is what we're looking for) and the optimal maximum is the set {x,λ}={5,12}.

It is important to realise that the Lagrangian does not guarantee that a particular solution is a minimum - we need to test the solution ourselves - as in one case the solution was actually the maximum.

This is actually a pretty crappy example as you may have seen - it would have been perfectly appropriate in this case to simply test the optimisation problem with the only two valid values given the constraint! It gets more useful when we have multiple variables and constraints to consider.

Two variables, one constraint

Consider the optimisation problem minf(x,y) given a constraint g(x,y)=h.

The Lagrangian system is almost identical to the single-variable case discussed above, except that we have a system of three partial derivatives to consider (2 variables + 1 constraint): L(x,y,λ)=f(x,y)+λ(g(x,y)h). Now take the respective partial derivatives:

Lx=f'x(x,y)+λg'x(x,y) (the first variable x)

Ly=f'y(x,y)+λg'y(x,y) (the second variable y)

Lλ=g(x,y)h (the constraint)

Set them to 0 - and the optimal triplet {x,y,λ} is the solution to that.

A bivariate example

The problem, graphed. Notice the constraint "encircling" the optimisation problem - the solution must lie in that outer circle.

Template:Calculus/Def Template:Hide

Did we have to use the Lagrangian? Actually no. The problem could have been reduced to a univariate form by writing one variable of the constraint in terms of the other: y=±25x2, and substituting it into the optimisation problem:

max(5x+3y)=max(5x±3(25x2))

and use single-variable calculus techniques to solve the problem! But could you do this when there are three variables instead? No, because you will most likely only be able to reduce the problem to two variables.

The general form

In this section, consider a vector x of size n: x=(x1x2x3...xn).

Template:Calculus/Def

Then take the partial derivatives with respect to the vectors x and λ: findLx and Lλ.

Notice that this system has m + n variables, and you'll need to take m + n partial derivatives as well. This can get quite messy. A solution is to use matrix calculus.

This may scare you, and you shouldn't be concerned. The average Calculus 3 course will only consider 2 to three variables.

The regularity condition

Knowledge of linear algebra is expected for this section; this is unlikely to be covered in an average Calculus 3 course as a result.

The regularity condition applies when considering the Lagrange FONC (first order necessary condition) Template:Calculus/DefRemember that it's a necessary condition. This means that

  • just because a point does satisfies the Lagrange FONC does not mean that it is a minimiser or a maximiser.
  • a point that does not satisfy the Lagrange FONC cannot be a minimiser or maximiser.

Template:Calculus/Def

If this condition is not satisfied, the Lagrange FONC does not apply at that point. The reason this does not matter with one constraint is because a single vector is linearly independent by definition. Template:Calculus/Def

Template:Hide

Template:BookCat