Abstract
Carathéodory's lemma states that if we have a linear combination of vectors in $\mathbb{R}^n$, we can rewrite this combination using a linearly independent subset. This lemma has been successfully applied in nonlinear optimization in many contexts. In this work we present a new version of this celebrated result, in which we obtain new bounds on the size of the coefficients in the linear combination, and we provide examples where these bounds are useful. We show how these new bounds can be used to prove that the internal penalty method converges to KKT points, and we prove that the hypothesis used to obtain this result cannot be weakened. The new bounds also provide new convergence results for the quasi-feasible interior point ℓ2-penalty method of Chen and Goldfarb [7]. Mathematical subject classification: 90C30, 49K99, 65K05.
Key words: nonlinear programming, constraint qualifications, interior point methods.
On the global convergence of interior-point nonlinear programming algorithms
Gabriel Haeser*
Department of Applied Mathematics, Institute of Mathematics, Statistics and Scientific Computing, University of Campinas, Campinas, SP, Brazil. E-mail: ghaeser@gmail.com
* This work was supported by FAPESP Grant 05/02163-8.
1 Introduction
In 1911 Carathéodory proved that if a point $x \in \mathbb{R}^n$ lies in the convex hull of a compact set $P$, then $x$ lies in the convex hull of a subset $P'$ of $P$ with no more than $n+1$ points [6]. In 1914 Steinitz generalized this result to a general set $P$ [18].
Here we will see a different version of Carathéodory's result, which appears in [5] as "Carathéodory's theorem for cones", but is better known as "Carathéodory's lemma". We will provide bounds on the size of the multipliers given by Carathéodory's lemma and we will apply this result to internal penalty methods. We address the following nonlinear optimization problem:
$$\text{Minimize } f(x) \quad \text{subject to } h(x) = 0,\ g(x) \le 0, \tag{1}$$
where $f:\mathbb{R}^n\to\mathbb{R}$, $h:\mathbb{R}^n\to\mathbb{R}^m$ and $g:\mathbb{R}^n\to\mathbb{R}^p$ are continuously differentiable functions. Under a given constraint qualification, a solution $x^*$ satisfies the KKT condition, that is, $x^*$ is feasible with respect to the equality and inequality constraints and there exist $\lambda\in\mathbb{R}^m$ and $\mu_j \ge 0$ for every $j \in A(x^*) = \{i \in \{1,\dots,p\} \mid g_i(x^*) = 0\}$ such that
$$\nabla f(x^*) + \sum_{i=1}^m \lambda_i \nabla h_i(x^*) + \sum_{j\in A(x^*)} \mu_j \nabla g_j(x^*) = 0.$$
A common constraint qualification is the Linear Independence constraint qualification, which states that
$$\{\nabla h_i(x^*)\}_{i=1}^m \cup \{\nabla g_j(x^*)\}_{j\in A(x^*)}$$
is linearly independent. We refer to this multi-set as the active set of gradients at $x^*$. The weaker Mangasarian-Fromovitz constraint qualification (MFCQ) [14, 16] states that the active set of gradients is positive-linearly independent, which means that there are no $\alpha\in\mathbb{R}^m$ and $\beta_j \ge 0$ for every $j\in A(x^*)$ such that
$$\sum_{i=1}^m \alpha_i \nabla h_i(x^*) + \sum_{j\in A(x^*)} \beta_j \nabla g_j(x^*) = 0,$$
except if we take all $\alpha_i$ and $\beta_j$ equal to zero.
Recently, a weaker constraint qualification appeared in the literature: the Constant Positive Linear Dependence constraint qualification (CPLD) [15, 4], which has been successfully applied to obtain new practical algorithms [1, 2, 10]. We say that the CPLD condition holds for a feasible $x^*$ if for every $I \subset \{1,\dots,m\}$ and $J \subset A(x^*)$ such that the set of gradients $\{\nabla h_i(x^*)\}_{i\in I} \cup \{\nabla g_j(x^*)\}_{j\in J}$ is positive-linearly dependent, there exists a neighborhood $V(x^*)$ of $x^*$ such that the set of gradients $\{\nabla h_i(y)\}_{i\in I} \cup \{\nabla g_j(y)\}_{j\in J}$ remains positive-linearly dependent for every $y \in V(x^*)$. The CPLD condition is a natural generalization of the Constant Rank constraint qualification of Janin [13], which states the same as above, replacing "positive-linearly dependent" by "linearly dependent". The CPLD condition is weaker than the Constant Rank condition [17].
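To make the gap between MFCQ and CPLD concrete, the following worked example may help. It is a hypothetical illustration, not taken from the paper: two inequality constraints in $\mathbb{R}^2$ whose gradients are constant and opposite.

```latex
\begin{gather*}
g_1(x) = x_1 \le 0, \qquad g_2(x) = -x_1 \le 0, \qquad x^* = (0,0),\\
\nabla g_1(y) = (1,0)^T, \qquad \nabla g_2(y) = (-1,0)^T \quad \text{for every } y \in \mathbb{R}^2,\\
1\cdot\nabla g_1(x^*) + 1\cdot\nabla g_2(x^*) = 0
  \ \Longrightarrow\ \text{MFCQ fails at } x^*,\\
1\cdot\nabla g_1(y) + 1\cdot\nabla g_2(y) = 0 \ \text{ for every } y
  \ \Longrightarrow\ \text{CPLD holds at } x^*.
\end{gather*}
```

The only positive-linearly dependent subset of gradients is the full pair, and it stays dependent at every nearby point, which is exactly what CPLD asks for. Each single gradient is nonzero, hence positive-linearly independent, so the singletons impose no requirement.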
In practical algorithms, weaker constraint qualifications are preferred, since convergence results are stronger.
In Section 2 we will state Carathéodory's lemma and obtain new bounds on the size of the multipliers. Examples of possible applications of the new result will be given. In Section 3 we will illustrate the usefulness of the new bounds by proving that the internal penalty method converges to KKT points under the CPLD constraint qualification and the sufficient interior property. We conclude this section by proving that, in fact, convergence of the pure internal penalty method under MFCQ cannot be weakened in some sense. In Section 4 we address the interior point method of Chen and Goldfarb [7]. Using the new bounds for Carathéodory's lemma, we obtain stronger convergence results.
2 Generalized Carathéodory's lemma
The main tool which enables us to prove convergence results under the CPLD condition is Carathéodory's lemma. A simple modification of the classical proof provides us new bounds given by item (4) in Theorem 2.1, which can be very useful in applications of this result.
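Theorem 2.1. If $x = \sum_{i=1}^m \alpha_i v_i$ with $v_i \in \mathbb{R}^n$ and $\alpha_i \ne 0$ for every $i$, then there exist $I \subset \{1,\dots,m\}$ and scalars $\bar\alpha_i$ for every $i \in I$ such that
(1) $x = \sum_{i\in I} \bar\alpha_i v_i$;
(2) $\alpha_i\bar\alpha_i > 0$ for every $i \in I$;
(3) $\{v_i\}_{i\in I}$ is linearly independent;
(4) $|\bar\alpha_i| \le 2^{m-1}|\alpha_i|$ for every $i \in I$.
Proof. We assume that $\{v_i\}_{i=1}^m$ is linearly dependent, otherwise the result follows trivially. Then, there exists $\beta \in \mathbb{R}^m$, $\beta \ne 0$, such that $\sum_{i=1}^m \beta_i v_i = 0$. Thus, we may write
$$x = \sum_{i=1}^m (\alpha_i - \gamma\beta_i) v_i$$
for every $\gamma \in \mathbb{R}$. Let $i^* = \operatorname{argmin}_{i:\,\beta_i \ne 0} |\alpha_i/\beta_i|$ and $\bar\gamma = \alpha_{i^*}/\beta_{i^*}$; then $\bar\gamma$ is the least modulus coefficient $\alpha_i/\beta_i$. Note that $\bar\gamma$ is such that $\alpha_i - \bar\gamma\beta_i = 0$ for at least one index $i = i^*$. If $\alpha_i(\alpha_i - \bar\gamma\beta_i) < 0$, then $\alpha_i^2 < \bar\gamma\alpha_i\beta_i$ with $\alpha_i \ne 0$, $\beta_i \ne 0$, thus $|\bar\gamma| > |\alpha_i/\beta_i|$, which contradicts the definition of $\bar\gamma$. Therefore we conclude that $\alpha_i(\alpha_i - \bar\gamma\beta_i) \ge 0$. Also, $|\alpha_i - \bar\gamma\beta_i| \le 2|\alpha_i|$, since $|\bar\gamma\beta_i| \le |\alpha_i|$ for every $i$. Including in the sum only the indexes such that $\alpha_i - \bar\gamma\beta_i \ne 0$, we are able to write the linear combination $x$ with at least one less vector. We can repeat this procedure until $\{v_i\}_{i\in I}$ is linearly independent, with $\alpha_i\bar\alpha_i > 0$ and $|\bar\alpha_i| \le 2^{m-1}|\alpha_i|$ for every $i \in I$, since each repetition removes at least one vector and at most doubles the modulus of each remaining coefficient.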
□
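The reduction used in the proof can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `null_vector` and `caratheodory` are hypothetical, exact rational arithmetic (`fractions.Fraction`) is used so that "coefficient equal to zero" is unambiguous, and the dependency vector is found by plain Gauss-Jordan elimination rather than by any method prescribed in the text.

```python
from fractions import Fraction


def null_vector(vectors):
    """Return beta != 0 with sum(beta[i] * vectors[i]) == 0, or None if independent."""
    m, n = len(vectors), len(vectors[0])
    # Each row is one coordinate; each column is one of the vectors.
    rows = [[Fraction(vectors[j][r]) for j in range(m)] for r in range(n)]
    pivots = []  # pivots[r] is the pivot column of reduced row r
    row = 0
    for col in range(m):
        pivot = next((r for r in range(row, n) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot available: this column is a free variable
        rows[row], rows[pivot] = rows[pivot], rows[row]
        rows[row] = [entry / rows[row][col] for entry in rows[row]]
        for r in range(n):
            if r != row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[row])]
        pivots.append(col)
        row += 1
    free = [c for c in range(m) if c not in pivots]
    if not free:
        return None
    beta = [Fraction(0)] * m
    beta[free[0]] = Fraction(1)
    for r, col in enumerate(pivots):
        beta[col] = -rows[r][free[0]]
    return beta


def caratheodory(alpha, vectors):
    """Rewrite sum(alpha[i] * vectors[i]) over a linearly independent subset."""
    coeffs = {i: Fraction(a) for i, a in enumerate(alpha)}
    while coeffs:
        idx = sorted(coeffs)
        beta = null_vector([vectors[i] for i in idx])
        if beta is None:
            break  # remaining vectors are linearly independent
        # gamma is the ratio alpha_i / beta_i of least modulus (beta_i != 0).
        gamma = min((coeffs[i] / b for i, b in zip(idx, beta) if b != 0), key=abs)
        updated = {i: coeffs[i] - gamma * b for i, b in zip(idx, beta)}
        coeffs = {i: c for i, c in updated.items() if c != 0}
    return coeffs


vectors = [(1, 0), (0, 1), (1, 1)]
alpha = [1, 2, 3]
reduced = caratheodory(alpha, vectors)
print(reduced)
```

Choosing the least-modulus ratio is what keeps every surviving coefficient on the same side of zero and at most doubles it per pass, so the printed coefficients can be checked directly against the sign and size statements of the theorem.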
The new bounds $|\bar\lambda_i| \le 2^{m+p-1}|\lambda_i|$ for every $i \in I$ and $\bar\mu_j \le 2^{m+p-1}\mu_j$ for every $j \in J$ may be useful in many ways. For example, if $\{(\lambda^k, \mu^k)\}$ is bounded, then the same is true for the sequence of new multipliers $\{(\bar\lambda^k, \bar\mu^k)\}$. The converse is not always true. Consider for instance with for We have for every k, then and for every k.
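A small worked instance may help to see why the converse can fail. The example below is a hypothetical illustration (its data are assumptions chosen for simplicity, not necessarily the ones used in the paper): two opposite vectors in $\mathbb{R}$ with unbounded coefficients, where $\beta$ denotes a vector with $\beta_1 v_1 + \beta_2 v_2 = 0$ and $\bar\gamma$ the least-modulus ratio $\alpha_i^k/\beta_i$.

```latex
\begin{gather*}
v_1 = 1,\quad v_2 = -1,\quad \alpha^k = (k+1,\ k),\quad
x^k = \alpha_1^k v_1 + \alpha_2^k v_2 = 1,\\
\beta = (1,1)\ \text{ since } v_1 + v_2 = 0,\qquad
\bar\gamma = \min\{k+1,\ k\} = k,\\
\alpha^k - \bar\gamma\beta = (1,\ 0)
\ \Longrightarrow\ I = \{1\},\quad \bar\alpha_1^k = 1 \le 2\,|\alpha_1^k| .
\end{gather*}
```

Here the reduced coefficient is constant, hence bounded, although the original coefficients $\alpha^k$ grow without bound.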
Another situation in which bounds may be useful is when $\mu_j^k \to 0$ for some $j$. This appears for example in the internal penalty method, in which quasi-KKT points are defined as
with $\mu_j^k \to 0$ when $g_j(x^*) < 0$. With the new bounds, we have that $\bar\mu_j^k \to 0$ whenever $\mu_j^k \to 0$ (we point out that the converse is also not true; this can be observed by taking the previous counter-example with and divided by 10k). This result is crucial to obtain the complementarity condition of the KKT condition. We will give the details in the next section, where we also show that the hypotheses that guarantee convergence of the pure internal penalty method to KKT points cannot be weakened.
3 Internal penalty method
In this section we will consider problem (1) with only inequality constraints:
$$\text{Minimize } f(x) \quad \text{subject to } g(x) \le 0. \tag{3}$$
The internal penalty method consists of solving the following subproblem:
for a sequence of positive scalars rk→ 0. If there are additional constraints x ∈ Ω, they are added to the constraints of the subproblems.
It is a well known fact that if x* is a limit point of the sequence {xk} generated by the internal penalty method, such that x* satisfies the sufficient interior property, that is, x* can be approximated by a sequence of strictly feasible points yk → x* (g(yk) < 0), then x* is a solution to problem (3) [8, 5, 11].
We assume that x* is a local solution of problem (3) such that the sufficient interior property holds, and we apply the internal penalty method to:
for a sufficiently small δ (note that x* is the unique global solution of this problem). The corresponding subproblem is:
It is a classical result of internal penalty methods that the subproblems (6) admit a global solution $x^k$ [8, 11]. Since every limit point of the sequence of solutions $\{x^k\}$ of (6) is a global solution of (5), we have that $x^k \to x^*$; thus, for sufficiently large $k$, we have $\nabla\varphi(x^k) = 0$, that is,
We can then repeat standard arguments (see [15, 1, 2, 3, 11, 17]) to prove that, under the CPLD constraint qualification, there exist $J \subset \{1,\dots,p\}$ and new non-negative multipliers $\bar\mu_j^k$, $j \in J$, given by Carathéodory's lemma, such that we can take a subsequence in which $\bar\mu_j^k$ converges to some non-negative $\mu_j$ for every $j \in J$ and
$$\nabla f(x^*) + \sum_{j\in J} \mu_j \nabla g_j(x^*) = 0.$$
To obtain that $x^*$ is a KKT point, we note that if $g_j(x^*) < 0$, then $\mu_j^k \to 0$; thus, by the new bounds $\bar\mu_j^k \le 2^{p-1}\mu_j^k$, we have $\bar\mu_j^k \to 0$, that is, $\mu_j = 0$, and thus complementarity holds. So, under the CPLD constraint qualification and the sufficient interior property, limit points of the internal penalty method are KKT points. We will prove next that these hypotheses are equivalent to the Mangasarian-Fromovitz condition when only inequality constraints are present.
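The complementarity argument can be observed numerically on a toy problem. The sketch below rests on several assumptions that are not in the text: it uses a logarithmic barrier (the barrier function of the paper is not reproduced here), the hypothetical problem "minimize $x$ subject to $g_1(x) = -x \le 0$, $g_2(x) = x - 1 \le 0$", and the illustrative function names `barrier_stationary_point` and `barrier_multipliers`. The solution is $x^* = 0$, where $g_1$ is active and $g_2$ is inactive.

```python
def barrier_stationary_point(r, tol=1e-14):
    """Solve phi'(x) = 1 - r/x + r/(1 - x) = 0 on (0, 1) by bisection.

    phi(x) = x - r*log(x) - r*log(1 - x) is the assumed log-barrier function;
    phi' is increasing on (0, 1), negative near 0 and positive near 1.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        slope = 1.0 - r / mid + r / (1.0 - mid)
        if slope < 0.0:
            lo = mid  # stationary point lies to the right of mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def barrier_multipliers(r):
    """Return (x, mu1, mu2) with mu_j = r / (-g_j(x)) at the stationary point."""
    x = barrier_stationary_point(r)
    mu1 = r / x          # multiplier estimate for g1(x) = -x (active at x* = 0)
    mu2 = r / (1.0 - x)  # multiplier estimate for g2(x) = x - 1 (inactive at x* = 0)
    return x, mu1, mu2


for r in (1e-1, 1e-2, 1e-4, 1e-6):
    x, mu1, mu2 = barrier_multipliers(r)
    print(f"r={r:g}  x={x:.3e}  mu1={mu1:.6f}  mu2={mu2:.3e}")
```

As $r_k \to 0$ the estimate attached to the inactive constraint tends to zero while the one attached to the active constraint approaches the KKT multiplier, which is the behaviour the bound is meant to preserve after the reduction.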
For this purpose we shall define the quasi-normality constraint qualification [12, 5].
Definition 3.1. We say that a feasible point $x^*$ of problem (3) satisfies the quasi-normality constraint qualification if $x^*$ satisfies MFCQ, or if, whenever there exist $\mu_j \ge 0$ for every $j \in A(x^*)$, not all zero, with $\sum_{j\in A(x^*)} \mu_j \nabla g_j(x^*) = 0$, there does not exist a sequence $z^k \to x^*$ such that $\mu_j > 0 \Rightarrow g_j(z^k) > 0$ for every $j \in A(x^*)$.
We will use the result proved in [4] that CPLD implies quasi-normality.
Theorem 3.2. A feasible point x* satisfies CPLD and the sufficient interior property if, and only if, x* satisfies MFCQ.
Proof. Suppose a feasible point x* satisfies the CPLD condition and the sufficient interior property. Then x* satisfies the CPLD condition for the problem:
therefore $x^*$ satisfies the quasi-normality condition for problem (7). If MFCQ does not hold, then there exist scalars $\mu_j \ge 0$, $j \in A(x^*)$, not all zero, such that
$$\sum_{j\in A(x^*)} \mu_j \nabla g_j(x^*) = 0;$$
multiplying by $-1$ we get that MFCQ does not hold for problem (7). Thus, by the quasi-normality for this problem, there is no sequence $z^k \to x^*$ such that $\mu_j > 0 \Rightarrow -g_j(z^k) > 0$ for every $j \in A(x^*)$. Since there is at least one index $j \in A(x^*)$ such that $\mu_j > 0$, we conclude that there is no sequence $z^k \to x^*$ such that $g_j(z^k) < 0$, which contradicts the sufficient interior property.
The converse holds trivially since one can easily prove that the sufficient interior property holds using the direction given by the original MFCQ definition, see details in [9, 11]. Clearly, MFCQ also implies the CPLD condition.
□
This shows that the internal penalty method converges to a KKT point under MFCQ, and relaxing this condition to CPLD does not provide a stronger result. This is clear since we cannot expect convergence of the internal penalty method if the sufficient interior property does not hold.
We conclude this section with a counter-example showing that a stronger form of Theorem 3.2, in which CPLD is replaced by quasi-normality, does not hold. Consider the problem:
Minimize $x$ subject to $-x^2 \le 0$,
at the point x* = 0. It is clear that MFCQ does not hold and the sufficient interior property holds. Also, the quasi-normality condition holds since there are no infeasible points.
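The two claims about this counter-example can be checked by direct computation. The following lines are a short verification written for illustration, with $g(x) = -x^2$ and the interior sequence $y^k = 1/k$ chosen here as an assumption (any sequence of nonzero points converging to $0$ would do).

```latex
\begin{gather*}
g(x) = -x^2, \qquad \nabla g(x) = -2x, \qquad \nabla g(0) = 0,\\
\mu\,\nabla g(0) = 0 \ \text{ with } \mu = 1 > 0
  \ \Longrightarrow\ \text{MFCQ fails at } x^* = 0,\\
y^k = \tfrac{1}{k} \to 0, \qquad g(y^k) = -\tfrac{1}{k^2} < 0
  \ \Longrightarrow\ \text{the sufficient interior property holds at } x^* = 0.
\end{gather*}
```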
In the next section we will use the new bounds obtained in Carathéodory's lemma to prove some stronger convergence results for Chen and Goldfarb's interior point method [7].
4 Chen and Goldfarb's interior point method
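Consider the following nonlinear optimization problem:
$$\text{Minimize } f(x) \quad \text{subject to } h(x) = 0,\ c(x) \ge 0, \tag{8}$$
where $f:\mathbb{R}^n\to\mathbb{R}$, $h:\mathbb{R}^n\to\mathbb{R}^m$ and $c:\mathbb{R}^n\to\mathbb{R}^p$ are twice continuously differentiable functions and F0 = $\{x\in\mathbb{R}^n \mid c(x) > 0\}$ is non-empty. Chen and Goldfarb's quasi-feasible interior point method consists of two parts: the first part is to apply the log-barrier method to problem (8), obtaining subproblems (FPµ) below:
$$\text{Minimize } f(x) - \mu\sum_{i=1}^p \log c_i(x) \quad \text{subject to } h(x) = 0, \tag{FP$_\mu$}$$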
for a sequence of positive parameters $\mu \to 0$. The second part consists of applying, for every $\mu$, an ℓ2-penalty method to solve (FPµ), yielding subproblems (ℓ2FPµ) below:
$$\text{Minimize } f(x) - \mu\sum_{i=1}^p \log c_i(x) + r\,\|h(x)\|_2 \tag{$\ell_2$FP$_\mu$}$$
for a sequence of parameters $r \to +\infty$. The idea of the method is to solve (ℓ2FPµ) by a Newton-like approach. The details of the algorithm to solve (FPµ), for a fixed $\mu > 0$, follow [7].
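The two-stage construction can be pictured as a single merit function that combines a log-barrier term for $c(x) > 0$ with an ℓ2 penalty on $h(x)$. The sketch below is an assumption-laden illustration: it takes the merit function to be $f(x) - \mu\sum_i \log c_i(x) + r\|h(x)\|_2$, which is a common form for such methods but is not confirmed by the text here, and the function name `l2_barrier_merit` and the toy problem data are hypothetical.

```python
import math


def l2_barrier_merit(f, c, h, x, mu, r):
    """Evaluate f(x) - mu * sum(log c_i(x)) + r * ||h(x)||_2 (assumed form).

    Returns +inf outside the strict interior c(x) > 0, where the barrier
    term is undefined.
    """
    c_vals = c(x)
    if any(ci <= 0.0 for ci in c_vals):
        return math.inf
    barrier = -mu * sum(math.log(ci) for ci in c_vals)
    penalty = r * math.sqrt(sum(hj * hj for hj in h(x)))
    return f(x) + barrier + penalty


# Hypothetical toy data: two bound constraints and one equality constraint.
def f(x):
    return x[0] ** 2 + x[1] ** 2


def c(x):
    return (x[0], x[1])          # c(x) >= 0 means x >= 0 componentwise


def h(x):
    return (x[0] + x[1] - 2.0,)  # h(x) = 0 means x0 + x1 = 2


feasible_value = l2_barrier_merit(f, c, h, (1.0, 1.0), mu=0.1, r=10.0)
infeasible_value = l2_barrier_merit(f, c, h, (1.0, 2.0), mu=0.1, r=10.0)
print(feasible_value, infeasible_value)
```

Points that violate the equality constraint stay admissible (hence "quasi-feasible") but are charged in proportion to $r$, while points with some $c_i(x) \le 0$ are excluded outright.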
Algorithm 4.1 (Chen and Goldfarb). Parameters: $\varepsilon_\mu > 0$, σ ∈ (0,), $\chi > 1$, $\kappa_1 \in (0,1)$, $\kappa_2 > 1$, $\pi_\mu = \max\{\mu, 0.1\}$, $\nu > 0$, $0 < \gamma_{\min} < 1 < \gamma_{\max}$, $k := 0$. Given initial interior points $x^0 \in$ F0, $u^0 > 0$, an initial penalty parameter $r_0 > 0$ and an initial approximation $H_0 \in \mathbb{R}^{n\times n}$ for the Hessian of the Lagrangian $L(x,\lambda,y) = f(x) - \lambda^T c(x) + y^T h(x)$.
Step 1: Search direction
Modify $H_k$, if necessary, such that condition C-5 below holds:
where
with
Calculate $(\Delta x^k, \lambda^k, y^k)$, the solution of the KKT system
where
I is the identity matrix and e = (1,...,1) of appropriate dimensions.
Step 2: Termination
Step 3: Penalty parameter update
If the following conditions hold
then
rk+1: = χrk,
(xk+1, uk+1, Hk+1) = (xk,uk,Hk),
k: = k+1,
and go back to Step 1.
Step 4: Line search
Initialize tk = 1 and successively divide it by 2, if necessary, until the following conditions hold
Step 5: Update
Define to be the projection of on the interval
for each i.
xk+1 = xk+tkΔxk,
rk+1 = rk.
Calculate the new estimate $H_{k+1}$ for the Hessian of the Lagrangian.
k: = k+1
go back to Step 1.
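The control flow of Steps 4 and 5 (halve the step until it is accepted, then project the updated quantity onto an interval) can be sketched generically. The acceptance conditions of Step 4 and the interval of Step 5 are not reproduced in the text above, so the predicate and the bounds used below are purely hypothetical placeholders, and `halve_until` and `project` are illustrative names rather than part of the algorithm in [7].

```python
from typing import Callable


def halve_until(accept: Callable[[float], bool], t0: float = 1.0,
                max_halvings: int = 60) -> float:
    """Return the first t in {t0, t0/2, t0/4, ...} for which accept(t) is True."""
    t = t0
    for _ in range(max_halvings + 1):
        if accept(t):
            return t
        t *= 0.5
    raise RuntimeError("no acceptable step length found")


def project(value: float, lower: float, upper: float) -> float:
    """Project a scalar onto the closed interval [lower, upper]."""
    return min(max(value, lower), upper)


# Hypothetical acceptance test: keep the trial point strictly interior for
# the toy constraint c(x) = 1 - x**2 > 0, starting at x = 0.5 with dx = 2.0.
x, dx = 0.5, 2.0
t = halve_until(lambda step: 1.0 - (x + step * dx) ** 2 > 0.0)
x_next = x + t * dx

# Hypothetical safeguard interval for an updated scalar quantity.
u_next = project(3.0, lower=0.5, upper=2.0)
print(t, x_next, u_next)
```

In an actual implementation the predicate would encode the conditions listed in Step 4, and the interval endpoints would be those defined in Step 5.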
In [7], the authors prove that if the primal iterate sequence $\{x^k\}$ lies in a bounded set and the modified Hessian sequence $\{H_k\}$ is bounded, then, under MFCQ, the limit points of $\{x^k\}$ are stationary for an infeasibility measure problem and, if the limit point is feasible, the KKT condition holds for (FPµ).
We will prove, using the new bounds for Carathéodory's lemma, that if the penalty parameter rk→ +∞ and x* is infeasible with respect to the equality constraints, then we can weaken the constraint qualification hypothesis and assume only the CPLD condition to obtain that x* is stationary for an infeasibility measure problem.
Proposition 4.2. If the penalty parameter rk → +∞ and x* is a limit point of the sequence {xk} generated by Algorithm 4.1 such that ||h(x*)||2 > 0, and x* satisfies the CPLD constraint qualification for problem
then x* is a KKT point for this problem.
Proof. Let's consider a subsequence {xk} such that xk → x*and rk is increased for every k, thus, conditions C-1 to C-4 are fulfilled. From (9), we can write
By C-3, 0 < κ1µ < ci(xk) , thus > 0, since ci(xk) > 0. By Carathéodory's lemma, there exist a subset Ik ⊂ {1,...,m} and scalars > 0 such that
and {∇ci(xk)}i∈Ik is linearly independent.
Let's take a subsequence such that Ik = I. Since < , from the new bounds on Carathéodory's lemma, we have hence 0 <
If admits a bounded subsequence, we may consider a subsequence such that → λ'. Dividing (11) by rk, taking limits for k and observing that {Δxk} and are bounded sequences since C-2 and C-4 hold, we obtain
and
thus x* is a KKT point of problem (10).
In the case → +∞, dividing (11) by k and taking limits for a subsequence such that we have
and
Excluding from the set I all indexes such that = 0, we have I ⊂ A(x*) and CPLD is not fulfilled.
□
Chen and Goldfarb's algorithm to solve (8) consists of defining positive sequences $\mu_k \to 0$, $\varepsilon_k \to 0$ and using Algorithm 4.1 to approximately solve (FPµk), that is, obtaining iterates satisfying the stopping criterion of Step 2. In this case, they prove that, under MFCQ, limit points are stationary for an infeasibility measure problem and, in case the limit point is feasible, the KKT condition holds for (8). We will prove that, under CPLD, if the limit point is feasible, then the KKT condition holds.
Proposition 4.3. Assume $x^*$ is a limit point of the sequence $\{x^k\}$ generated by Chen and Goldfarb's algorithm to solve (8), such that $x^*$ satisfies the CPLD constraint qualification for problem (8). Assume also that Algorithm 4.1 is well-defined. Then $x^*$ is a KKT point of problem (8).
Proof. Let's take a subsequence such that $x^k \to x^*$. By the stopping criterion of Step 2, we have
such that
By Carathéodory's lemma, there are scalars and subsets
Ik⊂ {1,...,m},Jk ⊂ {1,...,p}
(we will take a subsequence that satisfies Ik = I and Jk = J for every k) such that
is linearly independent and thus . Define
If {ak} admits a bounded subsequence, let's consider a subsequence such that Since εk → 0, taking limits in (15) we obtain
Since
> -2m+p-1εk, we have , and from (13) we get
which implies
ici(x*) = 0. By (14) we get h(x*) = 0, thus x* is a KKT point of problem (8).If ak→ +∞ consider a subsequence such that and since
we have
> 0.Dividing (15) by αk and taking limits we get
with
thus, multiplying this inequality by and taking limits we get ci(x*)i = 0, therefore, removing from I all indexes such that i = 0, we get I ⊂ A(x*), which contradicts CPLD.
□
We point out that since problem (8) includes also equality constraints, the result of Theorem 3.2 does not apply.
Acknowledgement. The author is indebted to an anonymous referee for insightful comments and suggestions.
Received: 30/VI/09.
Accepted: 17/III/10.
#CAM-111/09.
- [1] R. Andreani, E.G. Birgin, J.M. Martínez and M.L. Schuverdt, On augmented Lagrangian methods with general lower-level constraints. SIAM Journal on Optimization, 18 (2007), 1286-1309.
- [2] R. Andreani, E.G. Birgin, J.M. Martínez and M.L. Schuverdt, Augmented Lagrangian methods under the constant positive linear dependence constraint qualification. Mathematical Programming, 112 (2008), 5-32.
- [3] R. Andreani, G. Haeser and J.M. Martínez, On sequential optimality conditions for smooth constrained optimization. To appear in Optimization, 2010. Available at Optimization Online http://www.optimization-online.org/DB_HTML/2009/06/2326.html
- [4] R. Andreani, J.M. Martínez and M.L. Schuverdt, On the relation between constant positive linear dependence condition and quasinormality constraint qualification. Journal of Optimization Theory and Applications, 125 (2005), 473-485.
- [5] D.P. Bertsekas, Nonlinear Programming, 2nd edition. Athena Scientific (1999).
- [6] C. Carathéodory, Über den Variabilitätsbereich der Fourierschen Konstanten von positiven harmonischen Funktionen. Rend. Circ. Mat. Palermo, 32 (1911), 193-217.
- [7] L. Chen and D. Goldfarb, Interior-point l2-penalty methods for nonlinear programming with strong global convergence. Mathematical Programming, 108(1) (2006), 1-36.
- [8] A.V. Fiacco and G.P. McCormick, Nonlinear Programming: Sequential Unconstrained Minimization Techniques. Wiley (1968). Reprinted by SIAM (1990).
- [9] A. Forsgren, P.E. Gill and M.H. Wright, Interior methods for nonlinear optimization. SIAM Review, 44(4) (2002), 525-597.
- [10] M.A. Gomes-Ruggiero, J.M. Martínez and S.A. Santos, Spectral projected gradient method with inexact restoration for minimization with nonconvex constraints. SIAM Journal on Scientific Computing, 31(3) (2009), 1628-1652.
- [11] G. Haeser, Condições sequenciais de otimalidade. PhD thesis, IMECC-UNICAMP, Departamento de Matemática Aplicada, Campinas-SP, Brazil, 2009. http://libdigi.unicamp.br/document/?code=000466897
- [12] M.R. Hestenes, Optimization Theory: The Finite Dimensional Case. John Wiley (1975).
- [13] R. Janin, Directional derivative of the marginal function in nonlinear programming. Mathematical Programming Study, 21 (1984), 110-126.
- [14] O.L. Mangasarian and S. Fromovitz, The Fritz John optimality conditions in the presence of equality and inequality constraints. Journal of Mathematical Analysis and Applications, 17 (1967), 37-47.
- [15] L. Qi and Z. Wei, On the constant positive linear dependence condition and its applications to SQP methods. SIAM Journal on Optimization, 10(4) (2000), 963-981.
- [16] R.T. Rockafellar, Lagrange multipliers and optimality. SIAM Review, 35(2) (1993), 183-238.
- [17] M.L. Schuverdt, Métodos de Lagrangiano aumentado com convergência utilizando a condição de dependência linear positiva constante. PhD thesis, IMECC-UNICAMP, Departamento de Matemática Aplicada, Campinas-SP, Brazil (2006). http://libdigi.unicamp.br/document/?code=vtls000375801
- [18] E. Steinitz, Bedingt konvergente Reihen und konvexe Systeme I-IV. J. Reine Angew. Math., 143 (1913), 128-175.
Publication Dates
- Publication in this collection: 13 Sept 2010
- Date of issue: June 2010
History
- Received: 30 June 2009
- Accepted: 17 Mar 2010