Definition

Lagrange Multipliers ^definition

Let and be (smoothness) functions. is the objective function and is real-valued/scalar. is the constraint function, which should satisfy . Let be a point in the domain of . Assuming

Alternatively notated:

In some cases, the entire system is defined as the Lagrangian function (or simply, the Lagrangian). Then, the problem becomes one of finding the stationary points for

This provides one way to find extrema for constrained to , however it may not find all extrema!

Geometrically:

Parallel

  • is tangential to a level set of at :
    • This is because the gradient vector of is always perpendicular to the contour.

Explanation

(Assuming is a scalar function, and we are finding the maximum.)

Lagrange multipliers work on some basic assumptions:

Okay, let’s get started:

  1. We know that the highest local extrema are when . However, it could be that these extrema do not satisfy the constraint function .

  2. One way of ensuring we always satisfy the constraint is to treat as a level set. As we travel along the level curve of , we need to find the point that maximises the value of .

  3. As we travel along this level curve, one of three things can happen:

    • We could be travelling in a way such that we are cutting through the level sets of . That is, we are travelling perpendicularly to these level sets.
    • We could be travelling in a way such that we are going along with some level set of . That is, we are travelling parallelly to these level sets.
    • We could could be doing neither.
  4. If we are travelling by cutting through level sets, then we are travelling parallel to the gradient vector of (see assumption), i.e. we are parallel to . These means we are going towards the extrema, but we haven’t found it.

    • If we are at some point and are travelling along this way, then we know cannot be an extrema because we can just go to a level set that is ‘higher’ or ‘lower’
  5. If we travel parallel to the level set, then all the values are the same, so even if one of these points is an extrema, they all become extrema.

  6. Then what we need to do is to first travel as parallel to the gradient vector of as possible (to ensure most increase) then at some point we should be parallel to a level curve (so we know there is no higher point).

  7. That is, our level curve of should be tangential to a level set of . Since by definition, a tangent touches a curve at exactly one point.

  8. We can then use the second assumption. If we make and parallel at some point , then that makes and tangential at that point . The equation for parallel vectors results in:

We also add the restriction $g(\vec{a}) \neq \vec{0}$, because the zero vector is parallel to *every vector*, which doesn't really help us.

9. This also captures any local extrema of because if we set , we find these extrema.

Visualisation

#todo |#HB

Theorems

T1: Superposition of Lagrange Multipliers ^t1

Let be smooth functions. is the objective function. Let the constraint functions be . If we have, for and . Let be a point . If we have:

  • and are linear independent.
  • and are both non-zero vectors ().

Then:

That is, if the gradient vector at the point can be written as the linear combination of the gradient vectors at the point , then is an extremum etc. etc. This is known as the superposition of Lagrange multipliers and can be used to solve linear programs for multiple constraints.

Examples

E1: Paraboloid constrained to an ellipse

Find the maximal and minimal values of (i.e. the extrema) subject to the constraint where:

E2: Volume-Area Optimisation

An open rectangular box needs to be constructed with a volume of . What dimensions should the box be to minimised the material used?

#todo composition of lagrange multipliers

Resources