Lagrange Multiplier

Lagrange multipliers are a method of finding the constrained extrema of a function, i.e. the maxima and minima subject to certain constraints.

Definition

Lagrange Multipliers ^definition

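Let $f$ and $g$ be $C^{1}$ (continuously differentiable) functions. $f : R^{n} \to R$ is the objective function and is real-valued (scalar). $g : R^{n} \to R^{c}$ is the constraint function, and the constraint is $g (x) = 0$ . Let $a = (a_{1}, a_{2}, \dots, a_{n})$ be a point in the domain of $f$ that satisfies $g (a) = 0$ . For a single constraint ( $c = 1$ ), assuming $\nabla g (a) \neq 0$ :
$a \text{ is a local extremum of } f \text{ subject to the constraint } g (x) = 0 ⟹ \exists λ \in R, \nabla f (a) = λ \cdot \nabla g (a)$
The implication only runs in this direction: a point that satisfies the equation is a candidate, and still has to be checked to see whether it is a maximum, a minimum, or neither.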
Alternatively notated:
$\nabla f (a) - λ \cdot \nabla g (a) = 0$
or, equivalently (because we can flip the sign of $λ$ ):
$\nabla f (a) + λ \cdot \nabla g (a) = 0$
In some cases, the entire system is defined as the Lagrangian function (or simply, the Lagrangian). Then, the problem becomes one of finding the stationary points for $L$
$L (x, λ) = f (x) + λ \cdot g (x)$
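This provides one way to find candidate extrema for $f$ constrained to $g$ . However, it may not find all extrema: points where $\nabla g (a) = 0$ are not covered by the condition and need to be checked separately.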
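To see why the stationary points of $L$ encode both conditions, it may help to expand the partial derivatives. This is a sketch for a single scalar constraint, using the $+ λ$ sign convention of the Lagrangian above; the index $i$ is introduced here only for illustration.

$\frac{\partial L}{\partial x_{i}} = \frac{\partial f}{\partial x_{i}} + λ \frac{\partial g}{\partial x_{i}} = 0 \quad (i = 1, \dots, n) ⟹ \nabla f (x) + λ \nabla g (x) = 0$
$\frac{\partial L}{\partial λ} = g (x) = 0$

So setting every partial derivative of $L$ to zero gives the parallel-gradient condition and the constraint itself as one system of $n + 1$ equations in $n + 1$ unknowns.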

Geometrically:

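- $\nabla f (a)$ is parallel to $\nabla g (a)$ .
- Equivalently, the constraint set $g (x) = 0$ is tangent to a level set of $f$ at $a$ . This is because the gradient vector of a function is always perpendicular to its contours, so if the two gradients are parallel, the two contours share the same tangent direction at $a$ .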
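The parallel-gradient condition can be checked numerically at a given point. The sketch below is only an illustration: the problem ( $f (x, y) = x + y$ on the unit circle) and the helper names `grad` and `multiplier_and_residual` are assumptions made for this example, not part of the notes. It estimates both gradients by central differences, fits the best $λ$ , and reports how far $\nabla f$ is from $λ \nabla g$ .

```python
import math

def grad(func, point, h=1e-6):
    """Central-difference approximation of the gradient of func at point."""
    result = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        result.append((func(up) - func(down)) / (2 * h))
    return result

def multiplier_and_residual(f, g, a):
    """Best-fit lambda for grad f(a) = lambda * grad g(a), plus the leftover norm."""
    gf, gg = grad(f, a), grad(g, a)
    gg_sq = sum(v * v for v in gg)
    if gg_sq < 1e-12:
        raise ValueError("grad g(a) is (numerically) zero, so the condition does not apply")
    lam = sum(u * v for u, v in zip(gf, gg)) / gg_sq
    residual = math.sqrt(sum((u - lam * v) ** 2 for u, v in zip(gf, gg)))
    return lam, residual

# Assumed illustrative problem: f(x, y) = x + y on the unit circle g(x, y) = 0.
f = lambda p: p[0] + p[1]
g = lambda p: p[0] ** 2 + p[1] ** 2 - 1

s = 1 / math.sqrt(2)
lam_tangent, res_tangent = multiplier_and_residual(f, g, (s, s))    # gradients parallel
lam_cross, res_cross = multiplier_and_residual(f, g, (1.0, 0.0))    # gradients not parallel
print(lam_tangent, res_tangent)
print(lam_cross, res_cross)
```

A residual close to zero means the gradients are parallel at that point, so it is a candidate; a clearly non-zero residual, as at $(1, 0)$ , means the constraint curve is still cutting through the contours of $f$ there.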

Explanation

(Assuming $g : R^{n} \to R$ is a scalar function, and we are finding the maximum.)

Lagrange multipliers work on some basic assumptions:

The gradient vector points in the direction of steepest ascent. That is, $\nabla f (x)$ points in the direction in which $f$ increases fastest from $x$ , unless $\nabla f (x) = 0$
The level set of a function is always perpendicular to its gradient vector.

Okay, let’s get started:

We know that the unconstrained local extrema of $f$ occur where $\nabla f = 0$ . However, it could be that these extrema do not satisfy the constraint $g (x) = 0$ .
One way of ensuring we always satisfy the $g$ constraint is to treat $g (x) = 0$ as a level set. As we travel along the level curve of $g$ , we need to find the point $x$ that maximises the value of $f$ .
As we travel along this level curve, one of three things can happen:
- We could be travelling in a way such that we are cutting through the level sets of $f$ . That is, we are travelling perpendicular to these level sets.
- We could be travelling in a way such that we are going along with some level set of $f$ . That is, we are travelling parallel to these level sets.
- We could be doing neither.
If we are travelling by cutting through level sets, then we are travelling parallel to the gradient vector of $f$ (see assumption), i.e. we are parallel to $\nabla f (x)$ . This means we are going towards an extremum, but we haven’t found it.
- If we are at some point $b$ and are travelling this way, then we know $b$ cannot be an extremum because we can just go to a level set that is ‘higher’ or ‘lower’
If we travel parallel to a level set, then all the values $f (x)$ are the same, so if one of these points is an extremum, they all are.
Then what we need to do is first travel as parallel to the gradient vector of $f$ as possible (to increase $f$ as quickly as possible), until at some point we are travelling parallel to a level curve of $f$ (so we know there is no higher point nearby on the constraint).
That is, our level curve of $g$ should be tangent to a level set of $f$ : at that point the two curves touch without cutting through each other.
We can then use the second assumption. If we make $\nabla g$ and $\nabla f$ parallel at some point $a$ , then the level sets of $f$ and $g$ through $a$ are tangent at that point. The equation for parallel vectors results in:

$\nabla f (a) = λ \nabla g (a)$

We also add the restriction $\nabla g(\vec{a}) \neq \vec{0}$, because the zero vector is parallel to *every vector*, which doesn't really help us.

This also captures any unconstrained local extrema of $f$ that lie on the constraint: if we set $λ = 0$ , the equation reduces to $\nabla f (a) = 0$ .
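The walk along the constraint curve described above can be simulated directly. This is a minimal sketch under assumed choices: the objective $f (x, y) = x + y$ , the unit circle as the constraint, and the helper names are all made up for illustration. It steps around the circle, picks the point where $f$ is largest, and checks that the rate of change of $f$ along the curve vanishes there.

```python
import math

def f(x, y):
    return x + y

def point_on_circle(t):
    """Parametrise the constraint x^2 + y^2 - 1 = 0 by the angle t."""
    return math.cos(t), math.sin(t)

def rate_of_change_along_curve(t):
    """Directional derivative of f along the curve's tangent (-sin t, cos t)."""
    grad_f = (1.0, 1.0)  # gradient of f(x, y) = x + y
    tangent = (-math.sin(t), math.cos(t))
    return grad_f[0] * tangent[0] + grad_f[1] * tangent[1]

# Walk once around the constraint curve in small steps.
steps = 3600
angles = [2 * math.pi * k / steps for k in range(steps)]
best_t = max(angles, key=lambda t: f(*point_on_circle(t)))

best_point = point_on_circle(best_t)
print(best_point, f(*best_point))
print(rate_of_change_along_curve(best_t))  # close to 0: moving along a level set of f
print(rate_of_change_along_curve(0.0))     # 1.0: still cutting through level sets of f
```

At the best point the walk is momentarily parallel to a level set of $f$ , which is the tangency situation; anywhere the rate of change is non-zero, stepping a little further along the curve still changes $f$ .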

Visualisation

#todo |#HB

Theorems

T1: Superposition of Lagrange Multipliers ^t1

Let $f, g_{1}, g_{2}$ be $C^{1}$ smooth functions. $f : R^{n} \to R$ is the objective function. Let the constraint functions be $g_{1}, g_{2} : R^{n} \to R$ , with constraints $g_{1} (x) = 0$ and $g_{2} (x) = 0$ . Let $a = (a_{1}, \dots, a_{n}) \in R^{n}$ be a point that satisfies both constraints. If we have:

$\nabla g_{1} (a)$ and $\nabla g_{2} (a)$ are linearly independent.

$\nabla g_{1} (a)$ and $\nabla g_{2} (a)$ are both non-zero vectors ( $\neq 0$ ), which already follows from linear independence.

Then:
$a \text{ is a local extremum of } f \text{ subject to the constraints } g_{1}, g_{2} ⟹ \exists λ_{1}, λ_{2} \in R, \nabla f (a) = λ_{1} \nabla g_{1} (a) + λ_{2} \nabla g_{2} (a)$
That is, if $a$ is an extremum of $f$ subject to both constraints, then the gradient vector $\nabla f$ at the point $a$ can be written as a linear combination of the gradient vectors $\nabla g_{1}, \nabla g_{2}$ at the point $a$ . Solving this equation together with the two constraints gives candidate points, which still have to be checked. This is known as the superposition of Lagrange multipliers and can be used to solve optimisation problems with multiple constraints.
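A small numerical sketch of the two-constraint condition follows. The problem is an assumed illustration, not taken from the notes: extremise $f (x, y, z) = z$ on the intersection of the cylinder $g_{1} = x^{2} + y^{2} - 1$ and the plane $g_{2} = x + y + z - 1$ . The helper `solve_two_multipliers` is likewise hypothetical; it fits $λ_{1}, λ_{2}$ by least squares and reports the leftover norm.

```python
import math

def solve_two_multipliers(grad_f, grad_g1, grad_g2):
    """Least-squares fit of grad_f = l1 * grad_g1 + l2 * grad_g2 (2x2 normal equations)."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    a11, a12, a22 = dot(grad_g1, grad_g1), dot(grad_g1, grad_g2), dot(grad_g2, grad_g2)
    b1, b2 = dot(grad_g1, grad_f), dot(grad_g2, grad_f)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("constraint gradients are linearly dependent")
    l1 = (b1 * a22 - a12 * b2) / det
    l2 = (a11 * b2 - a12 * b1) / det
    residual = math.sqrt(sum((w - l1 * u - l2 * v) ** 2
                             for w, u, v in zip(grad_f, grad_g1, grad_g2)))
    return l1, l2, residual

def gradients(x, y, z):
    """Analytic gradients of the assumed f, g1, g2 at (x, y, z)."""
    return (0.0, 0.0, 1.0), (2 * x, 2 * y, 0.0), (1.0, 1.0, 1.0)

s = 1 / math.sqrt(2)
candidate = (-s, -s, 1 + 2 * s)   # highest point of the intersection curve
other = (1.0, 0.0, 0.0)           # also satisfies both constraints, but is not an extremum

l1, l2, res = solve_two_multipliers(*gradients(*candidate))
m1, m2, res_other = solve_two_multipliers(*gradients(*other))
print(l1, l2, res)
print(m1, m2, res_other)
```

At the candidate the residual is numerically zero, so $\nabla f$ really is a linear combination of the two constraint gradients; at the other feasible point it is not.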

Examples

E1: Paraboloid constrained to an ellipse

Find the maximal and minimal values of $f$ (i.e. the extrema) subject to the constraint $g (x, y) = 0$ where:
$f (x, y) = x^{2} + y^{2}$
$g (x, y) = \frac{x ^{2}}{4} + y^{2} - 1$

Solution: Note that $g (x, y) = 0$ defines an ellipse. We start by finding the gradient vectors for both $f$ and $g$ :

$\nabla f = (\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}) = (2 x, 2 y)$
$\nabla g = (\frac{\partial g}{\partial x}, \frac{\partial g}{\partial y}) = (\frac{x}{2}, 2 y)$
We use the Lagrange multipliers equation, combined with the restriction that $g (x, y) = 0$
$\nabla f = λ \nabla g \text{ and } \frac{x ^{2}}{4} + y^{2} - 1 = 0$
Solving results in a system of equations:
$\begin{cases} 2 x = \frac{λ x}{2} ⟹ λ = 4 \text{ (if } x \neq 0 \text{)} \\ 2 y = 2 λ y ⟹ λ = 1 \text{ (if } y \neq 0 \text{)} \end{cases}$
#todo
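One possible way to finish from here: $λ$ cannot be both $4$ and $1$ , so one of $x$ or $y$ must be zero. With $x = 0$ the ellipse gives $y = \pm 1$ (and $λ = 1$ ); with $y = 0$ it gives $x = \pm 2$ (and $λ = 4$ ). The sketch below checks these four candidates against the system and compares the values of $f$ ; the candidate list is derived under this case split rather than taken from the notes.

```python
def f(x, y):
    return x ** 2 + y ** 2

def g(x, y):
    return x ** 2 / 4 + y ** 2 - 1

# Candidates (x, y, lambda) from the case split: x = 0 or y = 0.
candidates = [(0.0, 1.0, 1.0), (0.0, -1.0, 1.0), (2.0, 0.0, 4.0), (-2.0, 0.0, 4.0)]

values = {}
for x, y, lam in candidates:
    assert g(x, y) == 0                                   # the point lies on the ellipse
    assert (2 * x, 2 * y) == (lam * x / 2, lam * 2 * y)   # grad f = lambda * grad g
    values[(x, y)] = f(x, y)

maximum = max(values.values())
minimum = min(values.values())
print(maximum, minimum)  # 4.0 1.0
```

Under this reading, the maximal value of $f$ on the ellipse is $4$ at $(\pm 2, 0)$ and the minimal value is $1$ at $(0, \pm 1)$ , i.e. the squared lengths of the semi-major and semi-minor axes.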

E2: Volume-Area Optimisation

An open rectangular box needs to be constructed with a volume of $4 m^{3}$ . What dimensions should the box be to minimise the material used?

Solution: Firstly, note that minimising the material used is the same as minimising the surface area of the box. Also, an open rectangular box implies it is a cuboid without one face. Hence, we have volume and surface area given by:

$V (x, y, z) = x yz$
$A (x, y, z) = 2 x y + 2 yz + 2 x z - x y = 2 z (x + y) + x y$
We are minimising $f (x, y, z) = A (x, y, z)$ subject to the constraint function $g (x, y, z) = V (x, y, z) - 4$ . Let’s start with the gradient vectors:
$\nabla f = (\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}) = (y + 2 z, x + 2 z, 2 x + 2 y)$
$\nabla g = (\frac{\partial g}{\partial x}, \frac{\partial g}{\partial y}, \frac{\partial g}{\partial z}) = (yz, x z, x y)$
We then use the Lagrange multipliers equation, combined with the equation $g (x, y, z) = 0$ :
$\nabla f = λ \nabla g \text{ and } x yz - 4 = 0$
Solving the system of equations:
$\begin{cases} y + 2 z = λ yz & (1) \\ x + 2 z = λ x z & (2) \\ 2 x + 2 y = λ x y & (3) \\ x yz = 4 & (4) \end{cases}$

We assume all the dimensions to be positive, $x, y, z > 0$

We can express $z$ using $x, y$ in the 4th equation: $x yz = 4 ⟹ z = \frac{4}{x y}$ .

Subtracting equation (2) from equation (1): $(1) - (2) : y - x = λ z (y - x)$ . So either $x = y$ or $λ z = 1$ . If $λ z = 1$ , equation (1) becomes $y + 2 z = y$ , which forces $z = 0$ and contradicts $z > 0$ . Hence $x = y$ , which matches the intuition that the box should be symmetrical in the x and y axes (since the only ‘modification’ is that the box has an open z-face), and so $z = \frac{4}{x ^{2}}$ .

Using $x = y$ in equation (3): $4 x = λ x^{2} ⟹ x \cdot (4 - λ x) = 0$ . The solutions are $x = 0$ or $x = \frac{4}{λ}$ . But since $x > 0$ , the only solution is $x = \frac{4}{λ}$

Then, by symmetry, $y = \frac{4}{λ}$ . And by equation (4): $z = \frac{4}{( \frac{4}{λ} ) ^{2}} = \frac{λ ^{2}}{4}$

Finally, we can use equation (2): $x + 2 z = λ x z ⟹ \frac{4}{λ} + \frac{λ ^{2}}{2} = λ^{2} ⟹ 8 = λ^{3}$

Hence $λ = 2$

So our solution is $(x, y, z) = (2, 2, 1)$ , i.e. a $2 m \times 2 m$ base with a height of $1 m$ .
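As a sanity check on this result, the sketch below plugs $(2, 2, 1)$ and $λ = 2$ back into the four equations, then compares the area against a coarse grid of other boxes with the same volume. It assumes the open face is the top, so the area is $x y + 2 yz + 2 x z$ (the base counted once); the grid bounds and step are arbitrary choices for illustration.

```python
def area(x, y, z):
    """Surface area of a box with base x*y and height z, missing its top face."""
    return x * y + 2 * y * z + 2 * x * z

def volume(x, y, z):
    return x * y * z

x, y, z, lam = 2.0, 2.0, 1.0, 2.0

# The four equations of the system: grad A = lambda * grad V, and V = 4.
assert y + 2 * z == lam * y * z
assert x + 2 * z == lam * x * z
assert 2 * x + 2 * y == lam * x * y
assert volume(x, y, z) == 4

best = area(x, y, z)

# Other boxes with the same volume: the height is forced by the base, z = 4 / (a * b).
grid = [0.5 + 0.1 * k for k in range(36)]  # base lengths from 0.5 to 4.0
smallest_on_grid = min(area(a, b, 4 / (a * b)) for a in grid for b in grid)
print(best, smallest_on_grid)
```

No box on the grid uses less material than the candidate, which is consistent with $(2, 2, 1)$ being the minimum rather than a maximum or a saddle. A grid search is only evidence, not a proof.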

#todo composition of lagrange multipliers

Resources

https://www.youtube.com/watch?v=8mjcnxGMwFo

Questionably Accurate Notes
