On the global convergence of interior-point nonlinear programming algorithms

Haeser, Gabriel

doi:10.1590/S1807-03022010000200003

Abstract

Carathéodory's lemma states that if we have a linear combination of vectors in ℝⁿ, we can rewrite this combination using a linearly independent subset. This lemma has been successfully applied in nonlinear optimization in many contexts. In this work we present a new version of this celebrated result, in which we obtain new bounds for the size of the coefficients in the linear combination, and we provide examples where these bounds are useful. We show how these new bounds can be used to prove that the internal penalty method converges to KKT points, and we prove that the hypothesis to obtain this result cannot be weakened. The new bounds also provide some new convergence results for the quasi-feasible interior point ℓ₂-penalty method of Chen and Goldfarb [7]. Mathematical subject classification: 90C30, 49K99, 65K05.

nonlinear programming; constraint qualifications; interior point methods


Gabriel Haeser^* * This work was supported by FAPESP Grant 05/02163-8.

Department of Applied Mathematics, Institute of Mathematics, Statistics and Scientific Computing, University of Campinas, Campinas, SP, Brazil. E-mail: ghaeser@gmail.com


1 Introduction

In 1911 Carathéodory proved that if a point x ∈ ℝⁿ lies in the convex hull of a compact set P, then x lies in the convex hull of a subset P' of P with no more than n+1 points [6]. In 1914 Steinitz generalized this result to an arbitrary set P [18].

Here we will see a different version of Carathéodory's result, which appears in [5] as "Carathéodory's theorem for cones", but is better known as "Carathéodory's lemma". We will provide bounds on the size of the multipliers given by Carathéodory's lemma and we will apply this result to internal penalty methods. We address the following nonlinear optimization problem:

$$\text{Minimize } f(x) \quad \text{subject to } h(x) = 0,\ g(x) \le 0, \tag{1}$$

where f: ℝⁿ → ℝ, h: ℝⁿ → ℝ^m and g: ℝⁿ → ℝ^p are continuously differentiable functions. Under a given constraint qualification, a solution x^* satisfies the KKT condition, that is, x^* is feasible with respect to equality and inequality constraints and there exist λ ∈ ℝ^m and µ_j ≥ 0 for every j ∈ A(x^*) = {i ∈ {1,...,p} | g_i(x^*) = 0} such that

$$\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla h_i(x^*) + \sum_{j \in A(x^*)} \mu_j \nabla g_j(x^*) = 0.$$

A common constraint qualification usually employed is the Linear Independence constraint qualification, which states that

$$\{\nabla h_i(x^*)\}_{i=1}^{m} \cup \{\nabla g_j(x^*)\}_{j \in A(x^*)}$$

is linearly independent. We refer to this multi-set as the active set of gradients at x^*. The weaker Mangasarian-Fromovitz constraint qualification (MFCQ) [14, 16] states that the active set of gradients is positive-linearly independent, which means that there are no α ∈ ℝ^m, β_j ≥ 0 for every j ∈ A(x^*) such that

$$\sum_{i=1}^{m} \alpha_i \nabla h_i(x^*) + \sum_{j \in A(x^*)} \beta_j \nabla g_j(x^*) = 0,$$

except if we take all α_i and β_j equal to zero.

Recently, a weaker constraint qualification appeared in the literature: the Constant Positive Linear Dependence constraint qualification (CPLD) [15, 4], which has been successfully applied to obtain new practical algorithms [1, 2, 10]. We say that the CPLD condition holds for a feasible x^* if for every I ⊂ {1,...,m}, J ⊂ A(x^*) such that the set of gradients {∇h_i(x^*)}_{i∈I} ∪ {∇g_j(x^*)}_{j∈J} is positive-linearly dependent, there exists a neighborhood V(x^*) of x^* such that the set of gradients {∇h_i(y)}_{i∈I} ∪ {∇g_j(y)}_{j∈J} remains positive-linearly dependent for every y ∈ V(x^*). The CPLD condition is a natural generalization of the Constant Rank constraint qualification of Janin [13], which states the same as above, replacing "positive-linearly dependent" by "linearly dependent". The CPLD condition is weaker than the Constant Rank condition [17].

In the analysis of practical algorithms, weaker constraint qualifications are preferred, since a convergence result that holds under a weaker hypothesis is a stronger result.

In Section 2 we will state Carathéodory's lemma and obtain new bounds on the size of the multipliers. Examples of possible applications of the new result will be given. In Section 3 we will illustrate the usefulness of the new bounds by proving that the internal penalty method converges to KKT points under the CPLD constraint qualification and the sufficient interior property. We conclude Section 3 by proving that the MFCQ hypothesis for convergence of the pure internal penalty method cannot, in a certain sense, be weakened. In Section 4 we address the interior point method of Chen and Goldfarb [7]. Using the new bounds for Carathéodory's lemma, we obtain stronger convergence results.

2 Generalized Carathéodory's lemma

The main tool which enables us to prove convergence results under the CPLD condition is Carathéodory's lemma. A simple modification of the classical proof provides us new bounds given by item (4) in Theorem 2.1, which can be very useful in applications of this result.

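Theorem 2.1. If $x = \sum_{i=1}^{m} \alpha_i v_i$ with $v_i \in \mathbb{R}^n$ and $\alpha_i \neq 0$ for every $i$, then there exist $I \subset \{1,\dots,m\}$ and scalars $\bar{\alpha}_i$ for every $i \in I$ such that

(1) $x = \sum_{i \in I} \bar{\alpha}_i v_i$;

(2) $\alpha_i \bar{\alpha}_i > 0$ for every $i \in I$;

(3) $\{v_i\}_{i \in I}$ is linearly independent;

(4) $|\bar{\alpha}_i| \le 2^{m-1} |\alpha_i|$ for every $i \in I$.

Proof. We assume that $\{v_i\}_{i=1}^{m}$ is linearly dependent, otherwise the result follows trivially. Then, there exists $\beta \in \mathbb{R}^m$, $\beta \neq 0$ such that $\sum_{i=1}^{m} \beta_i v_i = 0$. Thus, we may write

$$x = \sum_{i=1}^{m} (\alpha_i - \gamma \beta_i) v_i$$

for every $\gamma \in \mathbb{R}$. Let $i^* = \operatorname{argmin}_{i:\, \beta_i \neq 0} |\alpha_i / \beta_i|$ and $\bar{\gamma} = \alpha_{i^*} / \beta_{i^*}$; then $\bar{\gamma}$ is the least modulus coefficient $\alpha_i / \beta_i$. Note that $\bar{\gamma}$ is such that $\bar{\gamma} \beta_i - \alpha_i = 0$ for at least one index $i = i^*$. If $\alpha_i (\alpha_i - \bar{\gamma} \beta_i) < 0$, then $\alpha_i^2 < \bar{\gamma} \alpha_i \beta_i$ with $\alpha_i \neq 0$, $\beta_i \neq 0$, thus $|\bar{\gamma}| > |\alpha_i / \beta_i|$, which contradicts the definition of $\bar{\gamma}$. Therefore we conclude that $\alpha_i (\alpha_i - \bar{\gamma} \beta_i) \ge 0$. Also, $|\alpha_i - \bar{\gamma} \beta_i| \le 2 |\alpha_i|$, since $|\bar{\gamma} \beta_i| \le |\alpha_i|$ for every $i$. Including in the sum only the indexes such that $\alpha_i - \bar{\gamma} \beta_i \neq 0$, we are able to write the linear combination $x$ with at least one less vector. We can repeat this procedure until $\{v_i\}_{i \in I}$ is linearly independent, with $\alpha_i \bar{\alpha}_i > 0$ and $|\bar{\alpha}_i| \le 2^{m-1} |\alpha_i|$ for every $i \in I$.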

□
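The reduction used in the proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `null_vector` and `caratheodory` are hypothetical, exact rational arithmetic (`fractions.Fraction`) is used so that zero coefficients are detected without tolerances, and the small example in ℝ² is made up for the sketch. Each pass finds a dependency β, picks the ratio α_i/β_i of least modulus, and drops the coefficients that vanish.

```python
from fractions import Fraction


def null_vector(vectors):
    """Return beta != 0 with sum(beta[i] * vectors[i]) == 0, or None if the
    vectors are linearly independent (exact Gauss-Jordan elimination)."""
    m, n = len(vectors), len(vectors[0])
    # Column i of the n x m matrix is vectors[i].
    rows = [[Fraction(vectors[i][r]) for i in range(m)] for r in range(n)]
    pivots = []  # pivots[r] is the pivot column of reduced row r
    rank = 0
    for col in range(m):
        pr = next((r for r in range(rank, n) if rows[r][col] != 0), None)
        if pr is None:
            continue  # no pivot available: this column is free
        rows[rank], rows[pr] = rows[pr], rows[rank]
        piv = rows[rank][col]
        rows[rank] = [a / piv for a in rows[rank]]
        for r in range(n):
            if r != rank and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rank])]
        pivots.append(col)
        rank += 1
    free = next((c for c in range(m) if c not in pivots), None)
    if free is None:
        return None
    beta = [Fraction(0)] * m
    beta[free] = Fraction(1)
    for r, c in enumerate(pivots):
        beta[c] = -rows[r][free]
    return beta


def caratheodory(vectors, alpha):
    """Rewrite sum(alpha[i] * vectors[i]) over a linearly independent subset.
    Returns a dict {index: new coefficient}."""
    coef = {i: Fraction(a) for i, a in enumerate(alpha) if a != 0}
    while coef:
        idx = sorted(coef)
        beta = null_vector([vectors[i] for i in idx])
        if beta is None:
            break  # remaining vectors are linearly independent
        # Ratio alpha_i / beta_i of least modulus over beta_i != 0.
        gamma = min((coef[i] / b for i, b in zip(idx, beta) if b != 0), key=abs)
        coef = {i: coef[i] - gamma * b for i, b in zip(idx, beta)}
        coef = {i: c for i, c in coef.items() if c != 0}
    return coef


# Made-up example: x = 1*(1,0) + 2*(0,1) + 3*(1,1) = (4,5).
vectors = [(1, 0), (0, 1), (1, 1)]
alpha = [1, 2, 3]
reduced = caratheodory(vectors, alpha)
```

In this example the first vector is dropped and the surviving coefficients keep the signs of the original ones while staying below 2^(m-1) = 4 times their original modulus, in line with the bounds of Theorem 2.1.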

The new bounds $|\bar{\lambda}_i| \le 2^{m+p-1} |\lambda_i|$ for every i ∈ I and $\bar{\mu}_j \le 2^{m+p-1} \mu_j$ for every j ∈ J may be useful in many ways. For example, if we have that {(λ^k, µ^k)} is bounded, then the same is true for the sequence of new multipliers {($\bar{\lambda}^k$, $\bar{\mu}^k$)}. The converse is not always true. Consider for instance with for We have for every k, then and for every k.

Another situation in which bounds may be useful is when µ_j^k → 0 for some j. This appears for example in the internal penalty method, in which quasi-KKT points are defined as

with µ_j^k → 0 when g_j(x^*) < 0. With the new bounds, we have that $\bar{\mu}_j^k$ → 0 whenever µ_j^k → 0 (we point out that the converse is also not true; this can be observed by taking the previous counter-example with and divided by 10^k). This result is crucial to obtain the complementarity condition of the KKT condition. We will give the details in the next section, where we also show that the hypotheses that guarantee convergence of the pure internal penalty method to KKT points cannot be weakened.

3 Internal penalty method

In this section we will consider problem (1) with only inequality constraints:

$$\text{Minimize } f(x) \quad \text{subject to } g(x) \le 0. \tag{3}$$

The internal penalty method consists of solving the following subproblem:

for a sequence of positive scalars r_k→ 0. If there are additional constraints x ∈ Ω, they are added to the constraints of the subproblems.

It is a well known fact that if x^* is a limit point of the sequence {x^k} generated by the internal penalty method, such that x^* satisfies the sufficient interior property, that is, x^* can be approximated by a sequence of strictly feasible points y^k → x^* (g(y^k) < 0), then x^* is a solution to problem (3) [8, 5, 11].

We assume that x^* is a local solution of problem (3) such that the sufficient interior property holds, and we apply the internal penalty method to:

for a sufficiently small δ (note that x^* is the unique global solution of this problem). The corresponding subproblem is:

It is a classical result of internal penalty methods that the subproblems (6) admit a global solution x^k [8, 11]. Since every limit point of the sequence of solutions {x^k} of (6) is a global solution of (5), we have that x^k → x^*; thus, for sufficiently large k, we have ∇φ(x^k) = 0, that is,

We can then repeat standard arguments (see [15, 1, 2, 3, 11, 17]) to prove that under the CPLD constraint qualification, there exist J ⊂ {1,...,p} and new non-negative multipliers $\bar{\mu}_j^k$, j ∈ J, given by Carathéodory's lemma, such that we can take a subsequence in which $\bar{\mu}_j^k$ converges to some non-negative µ_j for every j ∈ J and

$$\nabla f(x^*) + \sum_{j \in J} \mu_j \nabla g_j(x^*) = 0.$$

To obtain that x^* is a KKT point, we note that if g_j(x^*) < 0, then µ_j^k → 0, thus, by the new bounds $\bar{\mu}_j^k \le 2^{p-1} \mu_j^k$, we have $\bar{\mu}_j^k$ → 0, that is, µ_j = 0, and thus complementarity holds. So, under the CPLD constraint qualification and the sufficient interior property, limit points of the internal penalty method are KKT points. We will prove next that these hypotheses are equivalent to the Mangasarian-Fromovitz condition when only inequality constraints are present.
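The behaviour of the multiplier estimates can be observed on a toy problem. The sketch below is an illustration only and rests on assumptions that are not taken from the text: it uses the logarithmic barrier as the internal penalty (one common choice), so that the estimates are µ_j^k = r_k / (-g_j(x^k)), and it uses the made-up one-dimensional problem of minimizing (x + 1)² subject to g_1(x) = -x ≤ 0 and g_2(x) = x - 5 ≤ 0, whose solution x^* = 0 has g_1 active and g_2 inactive. The function names are hypothetical.

```python
def barrier_minimizer(r, lo=0.0, hi=5.0, iters=200):
    """Minimize phi(x) = (x + 1)**2 - r*log(x) - r*log(5 - x) on (lo, hi).

    phi is strictly convex on the interval, so its derivative is increasing
    and the minimizer is the unique root of phi', found here by bisection.
    """
    def dphi(x):
        return 2.0 * (x + 1.0) - r / x + r / (5.0 - x)

    a, b = lo, hi
    for _ in range(iters):
        mid = 0.5 * (a + b)
        if dphi(mid) > 0.0:
            b = mid
        else:
            a = mid
    return 0.5 * (a + b)


def multipliers(r):
    """Return (x, mu_1, mu_2) with mu_j = r / (-g_j(x)) at the minimizer."""
    x = barrier_minimizer(r)
    return x, r / x, r / (5.0 - x)


# Penalty parameters r_k = 10^(-k) -> 0.
history = [multipliers(10.0 ** (-k)) for k in range(7)]
x_last, mu1_last, mu2_last = history[-1]
```

As r_k decreases, the iterates approach x^* = 0, the estimate for the active constraint approaches the KKT multiplier 2, and the estimate for the inactive constraint tends to zero, which is the complementarity mechanism described above.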

For this purpose we shall define the quasi-normality constraint qualification [12, 5].

Definition 3.1. We say that a feasible point x^* to problem (3) satisfies the quasi-normality constraint qualification if x^* satisfies MFCQ or if, whenever there exist µ_j ≥ 0 for every j ∈ A(x^*), not all zero, with Σ_{j∈A(x^*)} µ_j∇g_j(x^*) = 0, there does not exist a sequence z^k → x^* such that µ_j > 0 ⇒ g_j(z^k) > 0 for every j ∈ A(x^*).

We will use the result proved in [4] that CPLD implies quasi-normality.

Theorem 3.2. A feasible point x^* satisfies CPLD and the sufficient interior property if, and only if, x^* satisfies MFCQ.

Proof. Suppose a feasible point x^* satisfies the CPLD condition and the sufficient interior property. Then x^* satisfies the CPLD condition for the problem:

therefore x^* satisfies the quasi-normality condition for problem (7). If MFCQ does not hold, then there exist not all zero scalars µ_j ≥ 0 such that

$$\sum_{j \in A(x^*)} \mu_j \nabla g_j(x^*) = 0;$$

multiplying by -1 we get that MFCQ does not hold for problem (7). Thus, by the quasi-normality for this problem we get that there is no sequence z^k → x^* such that µ_j > 0 ⇒ -g_j(z^k) > 0 for every j ∈ A(x^*). Since there is at least one index j ∈ A(x^*) such that µ_j > 0, we conclude that there is no sequence z^k → x^* such that g_j(z^k) < 0, which contradicts the sufficient interior property.

The converse holds trivially since one can easily prove that the sufficient interior property holds using the direction given by the original MFCQ definition, see details in [9, 11]. Clearly, MFCQ also implies the CPLD condition.

□

This shows that the internal penalty method converges to a KKT point under MFCQ, and relaxing this condition to CPLD does not provide a stronger result. This is clear since we cannot expect convergence of the internal penalty method if the sufficient interior property does not hold.

We conclude this section with a counter-example showing that a stronger form of Theorem 3.2, in which CPLD is replaced by quasi-normality, does not hold. Consider the problem:

Minimize x subject to -x² ≤ 0,

at the point x^* = 0. MFCQ does not hold, since the gradient of the only constraint vanishes at x^*, and the sufficient interior property holds, since every y ≠ 0 is strictly feasible. Also, the quasi-normality condition holds since there are no infeasible points.
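The three claims about this counter-example can be checked numerically. The following is a small sketch, with hypothetical names, that evaluates g(x) = -x² and its derivative at x^* = 0, tests strict feasibility along the sequence y^k = 1/k, and samples a grid to confirm that no infeasible point shows up; the grid check is of course only a sanity test, not a proof.

```python
def g(x):
    """Single inequality constraint g(x) = -x**2 <= 0."""
    return -x * x


def grad_g(x):
    """Derivative of g."""
    return -2.0 * x


x_star = 0.0

# MFCQ fails: the constraint is active and its gradient vanishes, so any
# mu > 0 gives mu * grad_g(x_star) = 0 (positive-linear dependence).
mfcq_fails = g(x_star) == 0.0 and grad_g(x_star) == 0.0

# Sufficient interior property: y^k = 1/k -> x_star with g(y^k) < 0.
interior_ok = all(g(1.0 / k) < 0.0 for k in range(1, 50))

# Quasi-normality holds vacuously: no sampled point has g(z) > 0.
no_infeasible = all(g(-1.0 + 0.01 * i) <= 0.0 for i in range(201))
```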

In the next section we will use the new bounds obtained in Carathéodory's lemma to prove some stronger convergence results for Chen and Goldfarb's interior point method [7].

4 Chen and Goldfarb's interior point method

Consider the following nonlinear optimization problem:

$$\text{Minimize } f(x) \quad \text{subject to } h(x) = 0,\ c(x) \ge 0, \tag{8}$$

where f: ℝⁿ → ℝ, h: ℝⁿ → ℝ^m and c: ℝⁿ → ℝ^p are twice continuously differentiable functions and F⁰ = {x ∈ ℝⁿ | c(x) > 0} is non-empty.

Chen and Goldfarb's quasi-feasible interior point method consists of two parts: the first part is to apply the log-barrier method to problem (8), obtaining subproblems (FP_µ) below:

$$\text{Minimize } f(x) - \mu \sum_{i=1}^{p} \log c_i(x) \quad \text{subject to } h(x) = 0,$$

for a sequence of positive parameters µ → 0. The second part consists of applying, for every µ, an ℓ₂-penalty method to solve (FP_µ), yielding subproblems (ℓ₂FP_µ) below:

$$\text{Minimize } f(x) - \mu \sum_{i=1}^{p} \log c_i(x) + r \|h(x)\|_2,$$

for a sequence of parameters r → +∞. The idea of the method is to solve (ℓ₂FP_µ) by a Newton-like approach. The details of the algorithm to solve (FP_µ), for a fixed µ > 0, according to [7], are as follows.
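To make the two-level structure concrete, the sketch below evaluates a penalty-barrier objective for a toy problem. It is an illustration under assumptions: the objective is taken to be f(x) - µ Σ log c_i(x) + r ||h(x)||₂ (a log-barrier on c plus an unsquared ℓ₂ penalty on h), the function name `merit` is hypothetical, and the toy functions f, c and h are made up for the example.

```python
import math


def merit(f, c, h, x, mu, r):
    """Evaluate f(x) - mu * sum(log c_i(x)) + r * ||h(x)||_2.

    Returns +inf outside the strict interior F0 = {x : c(x) > 0}, so that a
    line search comparing merit values rejects such points automatically.
    """
    cx = c(x)
    if any(ci <= 0.0 for ci in cx):
        return math.inf
    barrier = -mu * sum(math.log(ci) for ci in cx)
    penalty = r * math.sqrt(sum(hi * hi for hi in h(x)))
    return f(x) + barrier + penalty


# Made-up toy problem in R^2.
def f(x):
    return x[0] ** 2 + x[1] ** 2


def c(x):
    return (x[0], x[1])  # c(x) >= 0 means x >= 0 componentwise


def h(x):
    return (x[0] + x[1] - 2.0,)  # single equality constraint


value_feasible = merit(f, c, h, (1.0, 1.0), mu=0.1, r=10.0)
value_infeasible_h = merit(f, c, h, (1.0, 2.0), mu=0.1, r=10.0)
value_outside = merit(f, c, h, (-1.0, 1.0), mu=0.1, r=10.0)
```

At the point (1, 1) both the barrier and the penalty terms vanish, at (1, 2) the violated equality is charged r times its norm, and any point with a non-positive component of c is assigned an infinite value.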

Algorithm 4.1 (Chen and Goldfarb). Parameters: ε_µ > 0, σ ∈ (0,), χ > 1, κ₁ ∈ (0,1), κ₂ > 1, π_µ = max{µ, 0.1}, ν > 0, 0 < γ_min < 1 < γ_max, k := 0. Given initial interior points x⁰ ∈ F⁰, u⁰ > 0, an initial penalty parameter r₀ > 0 and an initial approximation H⁰ ∈ ℝ^{n×n} for the Hessian of the Lagrangian L(x,λ,y) = f(x) - λ^T c(x) + y^T h(x).

Step 1: Search direction

Modify H^k, if necessary, such that condition C-5 below, holds:

where

with

Calculate (Δx^k,l^k,y^k), solution of the KKT system

where

I is the identity matrix and e = (1,...,1) of appropriate dimensions.

Step 2: Termination

Step 3: Penalty parameter update

If the following conditions hold

then

r_{k+1} := χ r_k,

(x^{k+1}, u^{k+1}, H^{k+1}) = (x^k, u^k, H^k),

k := k+1,

and go back to Step 1.

Step 4: Line search

Initialize t_k = 1 and successively divide it by 2, if necessary, until the following conditions hold

Step 5: Update

Define to be the projection of on the interval

for each i.

x^k⁺¹ = x^k+t_kΔx^k,

r_k₊₁ = r_k.

Calculate the new estimate H^{k+1} for the Hessian of the Lagrangian.

k: = k+1

go back to Step 1.
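The control flow of Steps 4 and 5 can be sketched as follows. This is not the algorithm of [7]: the acceptance conditions of the line search and the projection interval are not reproduced here, so the sketch takes the acceptance test as a user-supplied predicate and the interval endpoints as arguments. The names `halving_line_search`, `project` and `accept`, and the one-dimensional example, are hypothetical.

```python
def halving_line_search(accept, t0=1.0, max_halvings=60):
    """Start from t0 and halve the step until accept(t) holds."""
    t = t0
    for _ in range(max_halvings):
        if accept(t):
            return t
        t *= 0.5
    raise RuntimeError("no acceptable step found")


def project(value, lower, upper):
    """Projection of a scalar onto the interval [lower, upper]."""
    return min(max(value, lower), upper)


# Made-up example: current point x = 1, direction dx = -3, interior x > 0.
x, dx = 1.0, -3.0


def accept(t):
    """Placeholder test: stay strictly interior and decrease (y - 0.5)**2."""
    y = x + t * dx
    return y > 0.0 and (y - 0.5) ** 2 < (x - 0.5) ** 2


t_k = halving_line_search(accept)
x_next = x + t_k * dx
```

In this example the full step and the half step leave the interior, and the step t_k = 0.25 is the first one accepted. Passing the acceptance test as a predicate keeps the halving loop independent of the particular conditions used.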

In [7], the authors prove that if the primal iterate sequence {x^k} lies in a bounded set and the modified Hessian sequence {H^k} is bounded, then, under MFCQ, the limit points of {x^k} are stationary for an infeasibility measure problem, and, if the limit point is feasible, the KKT condition holds for (FP_µ).

We will prove, using the new bounds for Carathéodory's lemma, that if the penalty parameter r_k→ +∞ and x^* is infeasible with respect to the equality constraints, then we can weaken the constraint qualification hypothesis and assume only the CPLD condition to obtain that x^* is stationary for an infeasibility measure problem.

Proposition 4.2. If the penalty parameter r_k→ +∞ and x^* is a limit point of the sequence {x^k} generated by Algorithm 4.1 such that ||h(x^*)||₂ > 0, and x^* satisfies the CPLD constraint qualification for problem

then x^* is a KKT point for this problem.

Proof. Let us consider a subsequence {x^k} such that x^k → x^* and r_k is increased for every k; thus, conditions C-1 to C-4 are fulfilled. From (9), we can write

By C-3, 0 < κ₁µ < c_i(x^k) , thus > 0, since c_i(x^k) > 0. By Carathéodory's lemma, there exist a subset I_k ⊂ {1,...,m} and scalars > 0 such that

and {∇c_i(x^k)}_i_∈_Ik is linearly independent.

Let's take a subsequence such that I_k = I. Since < , from the new bounds on Carathéodory's lemma, we have hence 0 <

If admits a bounded subsequence, we may consider a subsequence such that → λ'. Dividing (11) by r_k, taking limits for k and observing that {Δx^k} and are bounded sequences since C-2 and C-4 hold, we obtain

and

thus x^* is a KKT point of problem (10).

In the case → +∞, dividing (11) by ^k and taking limits for a subsequence such that we have

and

Excluding from the set I all indexes such that = 0, we have I ⊂ A(x^*) and CPLD is not fulfilled.

□

Chen and Goldfarb's algorithm to solve (8) consists of defining positive sequences µ_k → 0, ε_k → 0 and using Algorithm 4.1 to approximately solve (FP_{µ_k}), that is, obtaining iterates satisfying the stopping criterion of Step 2. In this case, they prove that under MFCQ, limit points are stationary for an infeasibility measure problem, and in case the limit point is feasible, the KKT condition holds for (8). We will prove that, under CPLD, if the limit point is feasible, then the KKT condition holds.

Proposition 4.3. Assume x^* is a limit point of the sequence {x^k} generated by Chen and Goldfarb's algorithm to solve (8), such that x^* satisfies the CPLD constraint qualification for problem (8). Assume also that Algorithm 4.1 is well-defined. Then x^* is a KKT point of problem (8).

Proof. Let us take a subsequence such that x^k → x^*. By the stopping criterion of Step 2, we have

such that

By Carathéodory's lemma, there are scalars and subsets

I_k⊂ {1,...,m},J_k ⊂ {1,...,p}

(we will take a subsequence that satisfies I_k = I and J_k = J for every k) such that

is linearly independent and thus . Define

If {a_k} admits a bounded subsequence, let us consider a subsequence such that Since ε_k → 0, taking limits in (15) we obtain

Since

> -2^m+p-1ε_k, we have

, and from (13) we get

which implies

_ic_i(x^*) = 0. By (14) we get h(x^*) = 0, thus x^* is a KKT point of problem (8).

If a_k→ +∞ consider a subsequence such that and since

we have

> 0.

Dividing (15) by a_k and taking limits we get

with

thus, multiplying this inequality by and taking limits we get c_i(x^*)_i = 0, therefore, removing from I all indexes such that _i = 0, we get I ⊂ A(x^*), which contradicts CPLD.

□

We point out that since problem (8) includes also equality constraints, the result of Theorem 3.2 does not apply.

Acknowledgement. The author is indebted to an anonymous referee for insightful comments and suggestions.

Received: 30/VI/09.

Accepted: 17/III/10.

#CAM-111/09.

[1] R. Andreani, E.G. Birgin, J.M. Martínez and M.L. Schuverdt, On augmented Lagrangian methods with general lower-level constraints. SIAM Journal on Optimization, 18 (2007), 1286-1309.
[2] R. Andreani, E.G. Birgin, J.M. Martínez and M.L. Schuverdt, Augmented Lagrangian methods under the constant positive linear dependence constraint qualification. Mathematical Programming, 112 (2008), 5-32.
[3] R. Andreani, G. Haeser and J.M. Martínez, On sequential optimality conditions for smooth constrained optimization. To appear in Optimization, 2010. Available at Optimization Online http://www.optimization-online.org/DB_HTML/2009/06/2326.html
[4] R. Andreani, J.M. Martínez and M.L. Schuverdt, On the relation between constant positive linear dependence condition and quasinormality constraint qualification. Journal of Optimization Theory and Applications, 125 (2005), 473-485.
[5] D.P. Bertsekas, Nonlinear Programming, 2nd Edition. Athena Scientific (1999).
[6] C. Carathéodory, Über den Variabilitätsbereich der Fourierschen Konstanten von positiven harmonischen Funktionen. Rend. Circ. Mat. Palermo, 32 (1911), 193-217.
[7] L. Chen and D. Goldfarb, Interior-point l2-penalty methods for nonlinear programming with strong global convergence. Mathematical Programming, 108(1) (2006), 1-36.
[8] A.V. Fiacco and G.P. McCormick, Nonlinear Programming: Sequential Unconstrained Minimization Techniques. Wiley, 1968 (reprinted, SIAM, 1990).
[9] A. Forsgren, P.E. Gill and M.H. Wright, Interior methods for nonlinear optimization. SIAM Review, 44(4) (2002), 525-597.
[10] M.A. Gomes-Ruggiero, J.M. Martínez and S.A. Santos, Spectral projected gradient method with inexact restoration for minimization with nonconvex constraints. SIAM Journal on Scientific Computing, 31(3) (2009), 1628-1652.
[11] G. Haeser, Condições sequenciais de otimalidade. PhD thesis, IMECC-UNICAMP, Departamento de Matemática Aplicada, Campinas-SP, Brazil, 2009. http://libdigi.unicamp.br/document/?code=000466897
[12] M.R. Hestenes, Optimization theory: the finite dimensional case. John Wiley (1975).
[13] R. Janin, Directional derivative of the marginal function in nonlinear programming. Mathematical Programming Study, 21 (1984), 110-126.
[14] O.L. Mangasarian and S. Fromovitz, The Fritz John optimality conditions in the presence of equality and inequality constraints. Journal of Mathematical Analysis and Applications, 17 (1967), 37-47.
[15] L. Qi and Z. Wei, On the constant positive linear dependence condition and its applications to SQP methods. SIAM Journal on Optimization, 10(4) (2000), 963-981.
[16] R.T. Rockafellar, Lagrange multipliers and optimality. SIAM Review, 35(2) (1993), 183-238.
[17] M.L. Schuverdt, Métodos de Lagrangiano aumentado com convergência utilizando a condição de dependência linear positiva constante. PhD thesis, IMECC-UNICAMP, Departamento de Matemática Aplicada, Campinas-SP, Brazil (2006). http://libdigi.unicamp.br/document/?code=vtls000375801
[18] E. Steinitz, Bedingt konvergente Reihen und konvexe Systeme i-iv. J. Reine Angew. Math., 143 (1913), 128-175.


Publication Dates

Publication in this collection
13 Sept 2010
Date of issue
June 2010


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
