Lagrange Duality#

Primal Problem#

Consider a general optimization problem (called the primal problem):

(1)#\[\begin{split}\underset{x}{\text{min}} & f(x) \\ \text { s.t. } & h_i(x) \leq 0, i=1, \cdots, m \\ & \ell_j(x)=0, j=1, \cdots, r\end{split}\]

We define its Lagrangian as:

(2)#\[L(x, u, v)=f(x)+\sum_{i=1}^m u_i h_i(x)+\sum_{j=1}^r v_j \ell_j(x)\]

Lagrange multipliers \(u \in \mathbb{R}^m, v \in \mathbb{R}^r\).

Note

This expression may be so complex that you won’t immediately understand what it means. Don’t worry; we’ll explain how it can be used to solve the constrained optimization problem in Problem Eq.1.
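To make the definition concrete, here is a minimal sketch of Eq.2 on a toy problem. The problem instance (minimize \(f(x)=x^2\) subject to \(h(x)=1-x\le0\), with no equality constraints) is an illustrative choice, not one from the text:

```python
# Toy instance of Problem (1): minimize f(x) = x^2
# subject to a single inequality constraint h(x) = 1 - x <= 0 (i.e. x >= 1).
def f(x):
    return x ** 2

def h(x):
    return 1.0 - x

def lagrangian(x, u):
    # L(x, u) = f(x) + u * h(x); there is no equality constraint here,
    # so the v-term of Eq. (2) is absent.
    return f(x) + u * h(x)

print(lagrangian(2.0, 3.0))  # f(2) + 3 * h(2) = 4 + 3 * (-1) = 1.0
```

Note that when the constraint is active, i.e. \(h(x)=0\), the Lagrangian reduces to \(f(x)\) for every multiplier \(u\).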

Lemma 1

At each feasible \(x\), \(f(x)=\underset{u \geq 0, v}{\max} L(x, u, v)\), and the maximum is attained iff \(u \geq 0\) satisfies \(u_i h_i(x)=0, i=1, \cdots, m\).

Lemma 2

The optimal value of the primal problem, denoted \(f^*\), satisfies:

(3)#\[f^*=\underset{x}{\text{min}}\quad \theta_p(x)=\underset{x}{\text{min}}\underset{u \geq 0, v}{\max} \quad L(x, u, v)\]

Proof of Lemma 1

Define \(\theta_p(x)=\underset{u \geq 0, v}{\max} L(x, u, v)\). If \(x\) is feasible, the constraints in Problem Eq.1 are satisfied, so \(h_i(x)\le0\) and \(\ell_j(x)=0\). Hence, for any \(u\ge0\) and any \(v\), \(L(x, u, v)=f(x)+\sum_{i=1}^m u_i h_i(x)+\sum_{j=1}^r v_j \ell_j(x)\le f(x)\). The inequality becomes an equality iff \(u_ih_i(x)=0, i=1,\cdots,m\), which is always achievable (e.g., \(u=0\)). So, if \(x\) is feasible, we obtain \(f(x)=\theta_p(x)\), where the subscript \(p\) denotes the primal problem.

Proof of Lemma 2

If \(x\) is infeasible, then \(h_i(x)>0\) for some \(i\) or \(\ell_j(x)\neq0\) for some \(j\). In the first case, letting \(u_i\rightarrow +\infty\) drives \(L(x,u,v)\rightarrow +\infty\); in the second, choosing \(v_j\) with the same sign as \(\ell_j(x)\) and letting \(|v_j|\rightarrow +\infty\) does the same. So \(\theta_p(x)=+\infty\) at every infeasible \(x\), and no infeasible point can attain the minimum of \(\theta_p\). Combined with Lemma 1, which gives \(\theta_p(x)=f(x)\) at every feasible \(x\), we obtain \(f^*=\underset{x}{\text{min}}\quad \theta_p(x)\).
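The two cases in the proof can be checked numerically on the toy problem \(\min x^2\) s.t. \(1-x\le0\) (an illustrative instance, not from the text): at a feasible point the maximum over \(u\ge0\) recovers \(f(x)\), while at an infeasible point \(L\) grows without bound in \(u\):

```python
def f(x):
    return x ** 2

def h(x):
    return 1.0 - x  # feasible iff h(x) <= 0

def L(x, u):
    return f(x) + u * h(x)

# Feasible point x = 2: h(2) = -1 < 0, so L(2, u) = 4 - u is
# maximized over u >= 0 at u = 0, recovering f(2) = 4.
theta_p_feasible = max(L(2.0, u) for u in [0.0, 1.0, 10.0, 100.0])
print(theta_p_feasible)  # 4.0

# Infeasible point x = 0.5: h(0.5) = 0.5 > 0, so L(0.5, u) = 0.25 + 0.5u
# grows without bound as u -> infinity, i.e. theta_p(0.5) = +inf.
print([L(0.5, u) for u in [0.0, 10.0, 1000.0]])  # [0.25, 5.25, 500.25]
```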

Dual Problem#

Given a Lagrangian, we define its Lagrange dual function as:

(4)#\[\theta_d(u,v)=\underset{x}{\text{min}}\quad L(x,u,v)\]

where the subscript \(d\) denotes the dual problem. It is worth mentioning that the minimum here does not require \(x\) to lie in the feasible set.

Given the primal problem Eq.1, we define its Lagrange dual problem as:

(5)#\[\begin{split}\begin{array}{rl} \underset{u,v}{\max}& \theta_d(u, v) \\ \text { s.t. } & u \geq 0 \end{array}\nonumber\end{split}\]

From the definitions we immediately obtain that the optimal value of the dual problem, denoted \(g^*\), satisfies: \(g^*=\underset{u\ge0,v}{\text{max}}\underset{x}{\text{min}}\quad L(x,u,v)\).
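For the toy problem \(\min x^2\) s.t. \(1-x\le0\) (an illustrative instance, not from the text), the dual function of Eq.4 can be computed in closed form, since the inner minimization over \(x\) is unconstrained:

```python
def theta_d(u):
    # For L(x, u) = x^2 + u*(1 - x), the unconstrained minimizer over x
    # solves dL/dx = 2x - u = 0, i.e. x = u/2, giving
    # theta_d(u) = (u/2)^2 + u*(1 - u/2) = u - u^2/4.
    x_star = u / 2.0
    return x_star ** 2 + u * (1.0 - x_star)

# theta_d is concave in u; setting d theta_d / du = 1 - u/2 = 0
# gives u* = 2 and the dual optimal value g* = theta_d(2) = 1.
print(theta_d(2.0))  # 1.0
```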

Lemma 3

The dual problem is a convex optimization problem.

Proof of Lemma 3

By definition, \(\theta_d(u,v)=\underset{x}{\text{min}}\quad L(x,u,v)\) is a point-wise minimum of functions that are affine in \(u\) and \(v\), and is therefore concave. The constraint \(u \geq 0\) is affine. Hence the dual problem is a concave maximization problem over an affine constraint set, which is a convex optimization problem.

Strong and Weak Duality#

In the above introduction, we learned the definitions of the primal and dual problems. You may have noticed that the dual problem has a convenient property: it is always convex.

Note

A natural idea follows: since the dual problem is convex and therefore convenient to solve, can solving the primal problem be converted into solving the dual problem?

We will discuss the weak and strong duality to show you the connection between the primal and dual problems.

Introduction to Weak Duality

The Lagrange dual problem yields a lower bound for the primal problem: it always holds that \(f^*\ge g^*\). This property is called weak duality. Proof. Let \(x^*\) be primal optimal and \((u^*, v^*)\) be dual optimal. By definition:

(6)#\[f^*=\underset{x}{\text{min}}\underset{u \geq 0, v}{\max} \quad L(x, u, v) \quad g^*=\underset{u\ge0,v}{\text{max}}\underset{x}{\text{min}}\quad L(x,u,v)\]

Then:

(7)#\[\begin{split}\begin{aligned} g^*&=\underset{u\ge0,v}{\text{max}}\underset{x}{\text{min}}\quad L(x,u,v)=\underset{x}{\text{min}}\quad L(x,u^*,v^*)\nonumber\\ &\le L(x^*,u^*,v^*)\le \underset{u\ge 0,v}{\text{max}}\quad L(x^*,u,v)\nonumber\\ &=\underset{x}{\text{min}}\underset{u \geq 0, v}{\max} \quad L(x, u, v)=f^*\nonumber \end{aligned}\end{split}\]

Weak duality is intuitive because it follows in a few small steps from the definitions. However, on its own it is of little help in solving Problem Eq.1, because in general \(f^*\neq g^*\). So we will introduce strong duality; luckily, when it holds, we obtain \(f^*=g^*\).

Introduction to Strong Duality

In some problems, we actually have \(f^*=g^*\), which is called strong duality. In fact, for convex optimization problems, strong duality nearly always holds, subject only to some mild additional conditions. The most common such condition is Slater's condition.

If the primal problem is convex, and there exists at least one strictly feasible \(\tilde{x}\in \mathbb{R}^n\), i.e., Slater's condition holds:

(8)#\[\exists \tilde{x}, h_i(\tilde{x})<0, i=1, \ldots, m, \ell_j(\tilde{x})=0, j=1, \ldots r\]

Then strong duality holds.
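The toy problem \(\min x^2\) s.t. \(1-x\le0\) (an illustrative instance, not from the text) satisfies Slater's condition, since, e.g., \(\tilde{x}=2\) gives \(h(2)=-1<0\), so strong duality should hold. A coarse grid search confirms \(f^*=g^*\):

```python
# Primal: min x^2 s.t. 1 - x <= 0. The problem is convex, and x~ = 2 is
# strictly feasible (h(2) = -1 < 0), so Slater's condition holds.
# Grid search over the feasible set x >= 1 for the primal optimum:
f_star = min(x ** 2 for x in [1.0 + 0.001 * k for k in range(5000)])

# Dual: maximize theta_d(u) = u - u^2/4 over u >= 0 (see Eq. 5):
g_star = max(u - u ** 2 / 4.0 for u in [0.001 * k for k in range(5000)])

print(f_star, g_star)  # both approximately 1.0, so f* = g* (strong duality)
```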

Summary#

In this section we introduced the Lagrange method, which converts a constrained optimization problem into an unconstrained one. We also showed that, under certain conditions, solving a complex primal problem can be converted into solving a relatively simple dual problem. SafeRL algorithms are essentially solutions to constrained problems, so the Lagrange method is an important foundation for many of them.