Pesquisa Operacional

Print version ISSN 0101-7438

Pesqui. Oper. vol.34 no.3 Rio de Janeiro Sept./Dec. 2014

http://dx.doi.org/10.1590/0101-7438.2014.034.03.0463 


STABILIZED SEQUENTIAL QUADRATIC PROGRAMMING: A SURVEY

Damián Fernández1 

Mikhail Solodov2  * 

1FaMAF, Universidad Nacional de Córdoba, Medina Allende s/n, 5000 Córdoba, Argentina. E-mail: dfernandez@famaf.unc.edu.ar

2IMPA - Instituto de Matemática Pura e Aplicada, Estrada Dona Castorina 110, Jardim Botânico, 22460-320 Rio de Janeiro, RJ, Brazil. E-mail: solodov@impa.br

ABSTRACT

We review the motivation for, the current state of the art in convergence results on, and some open questions concerning the stabilized version of the sequential quadratic programming algorithm for constrained optimization. We also discuss the tools required for its local convergence analysis, globalization challenges, and extensions of the method to more general variational problems.

Key words: Newton method; (stabilized) sequential quadratic programming; constrained optimization; variational problem; second-order sufficiency; (non)critical Lagrange multipliers

1 INTRODUCTION

Consider the optimization problem

minimize f(x)  subject to  h(x) = 0,  g(x) ≤ 0,  (1)

where f: Rn → R, h: Rn → Rl and g: Rn → Rm are twice continuously differentiable. Let L: Rn × Rl × Rm → R,

L(x, λ, µ) = f(x) + ‹λ, h(x)› + ‹µ, g(x)›,

be the Lagrangian of problem (1), where ‹·,·› stands for the inner product (the space is always clear from the context). Stationary points of problem (1) and the associated Lagrange multipliers are characterized by the Karush-Kuhn-Tucker (KKT) optimality system

∂L/∂x (x, λ, µ) = 0,  h(x) = 0,  µ ≥ 0,  g(x) ≤ 0,  ‹µ, g(x)› = 0.  (2)

We denote by M(x̄) the set of Lagrange multipliers associated with x̄ ∈ Rn, that is, the pairs (λ, µ) ∈ Rl × Rm satisfying (2) for x = x̄.

The fundamental Newtonian approach to solving (1) is the sequential quadratic programming (SQP) algorithm [2, 40, 18]; see also [28, Chapter 4]. Given the current primal-dual iterate (xk, λk, µk) ∈ Rn × Rl × Rm, an iteration of SQP generates the next iterate (xk+1, λk+1, µk+1) as a stationary point and associated Lagrange multipliers of the quadratic programming (QP) subproblem

minimize ‹f'(xk), x − xk› + (1/2)‹Hk(x − xk), x − xk›
subject to h(xk) + h'(xk)(x − xk) = 0,  g(xk) + g'(xk)(x − xk) ≤ 0,  (3)

where Hk is a symmetric n × n matrix. The basic Newtonian scheme corresponds to taking

Hk = (∂²L/∂x²)(xk, λk, µk).  (4)

In fact, if there are no inequality constraints, it can be seen that computing (xk+1, λk+1) by solving (3) with the choice (4) is equivalent to the usual Newton iteration from the point (xk, λk), applied to the equation given by the first two equalities in the KKT system (2).
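This equivalence is easy to see numerically. The sketch below (our own toy example, not from the paper) takes one SQP step with the choice (4) on an equality-constrained problem by solving the Newton system for the first two KKT equations; since the objective is quadratic and the constraint linear, a single step reaches the exact primal-dual solution:

```python
import numpy as np

# Toy example (ours): minimize x1^2 + x2^2 subject to x1 + x2 - 1 = 0.
# With Hk as in (4), the SQP step solves the Newton system for the
# KKT equations  f'(x) + (h'(x))^T lam = 0,  h(x) = 0.

def sqp_newton_step(x, lam):
    H = 2.0 * np.eye(2)               # Hessian of the Lagrangian (constant here)
    a = np.array([1.0, 1.0])          # gradient of h
    K = np.block([[H, a[:, None]], [a[None, :], np.zeros((1, 1))]])
    rhs = -np.concatenate([2.0 * x + lam * a, [x[0] + x[1] - 1.0]])
    d = np.linalg.solve(K, rhs)
    return x + d[:2], lam + d[2]

x1, lam1 = sqp_newton_step(np.array([2.0, 0.0]), 0.0)
# Since the problem is a QP with a linear constraint, one step lands on the
# exact primal-dual solution x* = (0.5, 0.5), lam* = -1.
```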

To motivate the stabilized modification of SQP, we start with some comments about convergence properties of SQP itself. The first relevant observation is that without constraint qualification (CQ) assumptions [41], the QP (3) can simply be infeasible, and the method is then not well-defined. Indeed, it can be informally stated that one of the roles of CQs is precisely to ensure that the first-order approximation of the constraints, as in (3), is consistent (and adequately approximates the local structure of the feasible set of the original problem (1) around the given point).

For a point x̄ feasible in (1), denote by

A(x̄) = {i ∈ {1, ..., m} | gi(x̄) = 0}

the set of inequality constraints active at x̄. If x̄ is a stationary point of (1) and (λ̄, µ̄) ∈ M(x̄), denote further by

A+(x̄, µ̄) = {i ∈ A(x̄) | µ̄i > 0}  and  A0(x̄, µ̄) = {i ∈ A(x̄) | µ̄i = 0}

the sets of strongly and weakly active constraints, respectively.

The linear independence constraint qualification (LICQ) is said to hold at x̄ if

rank [ h'(x̄) ; g'A(x̄)(x̄) ] = l + |A(x̄)|,  (5)

where, from now on, the notation MJ refers to the submatrix of the matrix M comprised of the rows of M indexed by the set J. In particular, the LICQ condition (5) says that the gradients of all of the equality constraints together with the gradients of all of the active inequality constraints form a linearly independent set in Rn. The Mangasarian-Fromovitz constraint qualification (MFCQ) is said to hold at x̄ if

rank h'(x̄) = l  and  there exists ξ ∈ ker h'(x̄) such that g'A(x̄)(x̄)ξ < 0,  (6)

where for a matrix M we denote its null space by ker M = {ξ | Mξ = 0}. Both LICQ and MFCQ imply that for a local solution x̄ of (1) the multiplier set M(x̄) is nonempty (for this specific property, weaker or different conditions can be used as well; see [41]). Note that MFCQ is equivalent to the requirement that M(x̄) be nonempty and bounded. The so-called strict MFCQ (SMFCQ) consists of saying that, in addition to (6), the multiplier associated to x̄ is unique (M(x̄) is a singleton). In the absence of (active) inequality constraints, MFCQ, SMFCQ and LICQ are all equivalent (to the regularity condition rank h'(x̄) = l), but otherwise MFCQ is a weaker assumption than SMFCQ which, in turn, is weaker than LICQ.
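Numerically, LICQ (5) is a full-row-rank condition on the matrix obtained by stacking the gradients of the equality constraints and of the active inequality constraints. A small illustration (our own helper with made-up gradient data, not from the paper):

```python
import numpy as np

def licq_holds(J_active):
    """J_active: rows are the gradients of all equality constraints and of the
    inequality constraints active at the point in question.
    LICQ (5) asks these rows to be linearly independent."""
    return np.linalg.matrix_rank(J_active) == J_active.shape[0]

# Two independent gradients in R^2: LICQ holds.
ok = licq_holds(np.array([[1.0, 1.0], [1.0, 0.0]]))
# A duplicated active gradient destroys independence: LICQ fails.
bad = licq_holds(np.array([[1.0, 1.0], [1.0, 0.0], [1.0, 0.0]]))
```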

We say that for a given stationary point x̄ of problem (1) and for an associated multiplier (λ̄, µ̄) ∈ M(x̄) the second-order sufficient optimality condition (SOSC) holds if

‹(∂²L/∂x²)(x̄, λ̄, µ̄)ξ, ξ› > 0  for all ξ ∈ C(x̄) \ {0},  (7)

where

C(x̄) = {ξ ∈ Rn | h'(x̄)ξ = 0, g'A(x̄)(x̄)ξ ≤ 0, ‹f'(x̄), ξ› ≤ 0}

is the critical cone of problem (1) at x̄. We note that SOSC implies that x̄ is a strict local minimizer in (1).

The sharpest local superlinear convergence result for SQP is provided by the analysis in [3]; see also [28, Chapter 4]. It assumes SMFCQ (uniqueness of the multiplier (λ̄, µ̄) associated to x̄) and SOSC (7). Earlier results all required, in addition to SOSC, the stronger LICQ and the strict complementarity condition µ̄A(x̄) > 0 (such statements are standard; see, e.g., [1, 36]). In particular, we emphasize that convergence of SQP requires certain regularity of constraints (a CQ).

The stabilized version of SQP (sSQP) was developed with the goal of guaranteeing a fast convergence rate despite possible degeneracy of constraints (i.e., when the usual CQs may not hold), and in particular when the Lagrange multipliers associated to a solution are not unique. The method was introduced in [42] for the case of inequality constraints, in the form of iteratively solving the min-max subproblems

min over x  max over µ ≥ 0 of  ‹f'(xk), x − xk› + (1/2)‹Hk(x − xk), x − xk› + ‹µ, g(xk) + g'(xk)(x − xk)› − (σk/2)║µ − µk║²,

where (xk, µk) ∈ Rn × Rm+ is the current approximation to a primal-dual solution of (2), and σk > 0 is the dual stabilization parameter. Adding also equality constraints, it can be seen [31] that the corresponding min-max problem is equivalent to the following QP in the primal-dual space:

minimize over (x, λ, µ):  ‹f'(xk), x − xk› + (1/2)‹Hk(x − xk), x − xk› + (σk/2)(║λ║² + ║µ║²)
subject to  h(xk) + h'(xk)(x − xk) − σk(λ − λk) = 0,  g(xk) + g'(xk)(x − xk) − σk(µ − µk) ≤ 0.  (8)

The dual stabilization parameter is usually based on computing the violation of the KKT optimality conditions (2) by the point (xk, λk, µk). For example, for fast local convergence one chooses in (8) σk = σ(xk, λk, µk), where σ: Rn × Rl × Rm → R+ is the natural residual of the KKT system (2), i.e.,

σ(x, λ, µ) = ║( (∂L/∂x)(x, λ, µ), h(x), min{µ, −g(x)} )║,  (9)

with the minimum applied componentwise.
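The natural residual (9) is straightforward to evaluate in code. The sketch below (our own helper, shown for a problem without equality constraints) returns zero exactly at KKT points:

```python
import numpy as np

def natural_residual(grad_L, h_val, mu, g_val):
    """Norm of the natural residual of the KKT system (2): the Lagrangian
    gradient, the equality-constraint values, and the componentwise
    min{mu, -g(x)} measuring complementarity and feasibility."""
    return np.linalg.norm(np.concatenate([grad_L, h_val, np.minimum(mu, -g_val)]))

# Toy check (ours): minimize x^2 subject to -x <= 0, so dL/dx = 2x - mu.
# At the solution x = 0 with mu = 0 the residual vanishes...
x, mu = 0.0, 0.0
r_solution = natural_residual(np.array([2*x - mu]), np.array([]),
                              np.array([mu]), np.array([-x]))
# ...while at a non-KKT point it is positive.
x = 1.0
r_off = natural_residual(np.array([2*x - mu]), np.array([]),
                         np.array([mu]), np.array([-x]))
```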

One immediate observation is that for σk > 0, the constraints in (8) have the so-called "elastic mode" feature, and the subproblem is therefore automatically feasible, regardless of any CQs or convexity assumptions. For example, fixing any x ∈ Rn and taking the suitable λ (uniquely defined for each x by the first constraint in (8)) and µ > 0 with all components large enough yields points (x, λ, µ) feasible in (8). This is the first major difference from standard SQP.
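This feasibility mechanism can be verified numerically: given any x and σk > 0, the equality constraint of (8) determines λ, and taking each component of µ at least as large as the corresponding lower bound satisfies the linearized inequalities. A sketch with made-up data (all problem quantities below are arbitrary placeholders, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary placeholder data standing in for h(xk), h'(xk), g(xk), g'(xk).
n, l, m = 4, 2, 3
h_k, Jh_k = rng.normal(size=l), rng.normal(size=(l, n))
g_k, Jg_k = rng.normal(size=m), rng.normal(size=(m, n))
lam_k, mu_k = rng.normal(size=l), np.abs(rng.normal(size=m))
x_k = rng.normal(size=n)
sigma = 0.1

x = rng.normal(size=n)                    # any primal point whatsoever
dx = x - x_k
# Equality constraint of (8), solved for lambda:
lam = lam_k + (h_k + Jh_k @ dx) / sigma
# Inequality constraint of (8): any mu at or above this bound is feasible.
mu = np.maximum(0.0, mu_k + (g_k + Jg_k @ dx) / sigma) + 1.0

eq_violation = h_k + Jh_k @ dx - sigma * (lam - lam_k)   # = 0 by construction
ineq_slack = g_k + Jg_k @ dx - sigma * (mu - mu_k)       # <= 0 componentwise
```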

Another consideration is the following. It has been observed (e.g., in [44, Section 6], and in [23,24,26]) that the difficulties with convergence of SQP in degenerate cases are often not because of degeneracy as such, but are due to some undesirable behaviour of the dual sequence. The dual regularization/stabilization term in the objective function of (8) can be regarded as an attempt to modify this behaviour. As discussed in Section 2 below, in the sense of local convergence it indeed does the job. The situation is more complicated in the global sense; see Section 3.

2 LOCAL CONVERGENCE THEORY

In this section, we first survey the history of local convergence analyses of sSQP. We then state the current state-of-the-art results, and finally briefly describe the (relatively recent) variational tools required to establish those properties.

In [42], local superlinear convergence of sSQP is established under MFCQ (6), SOSC (7) assumed to hold for all multipliers, the existence of a multiplier µ̄ satisfying the strict complementarity condition µ̄A(x̄) > 0, and the assumption that the initial dual iterate is close enough to such a multiplier. Also, [42] gives an analysis in the presence of round-off errors. In [44], the assumption of strict complementarity has been removed. Also, [43] suggests a certain inexact SQP framework which includes sSQP as a particular case. The assumptions, however, still contain MFCQ. In [19], CQs are not used, at the expense of employing, instead of the weaker SOSC (7), the strong SOSC (SSOSC)

‹(∂²L/∂x²)(x̄, λ̄, µ̄)ξ, ξ› > 0  for all ξ ∈ C+(x̄, µ̄) \ {0},  (10)

where

C+(x̄, µ̄) = {ξ ∈ Rn | h'(x̄)ξ = 0, g'A+(x̄, µ̄)(x̄)ξ = 0},

and assuming that the dual starting point is close to a multiplier satisfying this SSOSC. (SSOSC (10) is stronger than SOSC (7) because C(x̄) ⊂ C+(x̄, µ̄).) In [12], the result of [19] was recovered from more general principles (some details will be discussed below), and somewhat sharpened. The iterative framework of [12] was further used in [10] to prove local superlinear convergence using SOSC (7) only, with no CQs or other assumptions. Moreover, the method was extended to variational problems (see Section 4 below). Quasi-Newton versions of sSQP are analyzed under SOSC in [7]. In [27] it was shown that the SOSC cannot be relaxed when inequality constraints are present, but for equality-constrained problems the weaker condition of noncriticality of the relevant Lagrange multiplier (see the definition immediately below) is sufficient for convergence.

A Lagrange multiplier (λ̄, µ̄) ∈ M(x̄) is said to be critical if there exists a triple (ξ, η, ζ) ∈ Rn × Rl × Rm, with ξ ≠ 0, satisfying the system

(∂²L/∂x²)(x̄, λ̄, µ̄)ξ + (h'(x̄))T η + (g'(x̄))T ζ = 0,  h'(x̄)ξ = 0,  g'A+(x̄, µ̄)(x̄)ξ = 0,
ζi ≥ 0, ‹g'i(x̄), ξ› ≤ 0, ζi ‹g'i(x̄), ξ› = 0 for i ∈ A0(x̄, µ̄),  ζi = 0 for i ∉ A(x̄),  (11)
and noncritical otherwise. We refer the reader to [23, 24, 26, 27, 21, 22] for the role this notion plays in convergence properties of algorithms, stability, error bounds, and other issues; see also [28, Chapter 7]. Some comments will also be given below. When there are no inequality constraints, it can be seen from (11) that a multiplier λ̄ ∈ M(x̄) being critical means that

there exists ξ ∈ ker h'(x̄) \ {0} such that (∂²L/∂x²)(x̄, λ̄)ξ ∈ im (h'(x̄))T.  (12)

It can be easily seen, essentially observing that im (h'(x̄))T = (ker h'(x̄))⊥, that this noncriticality property for equality-constrained problems is implied by the corresponding version of SOSC (7), but not vice versa. The same conclusion holds for the general case: if (11) has a solution with ξ ≠ 0, multiplying the first equality in (11) by this ξ and using the other relations in (11), one arrives at a contradiction with SOSC (7). It should be emphasized that SOSC is a much stronger assumption than noncriticality. Noncritical multipliers, when they exist, form a relatively open and dense subset of the multiplier set M(x̄), which is of course not so for multipliers satisfying SOSC.
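For the equality-constrained case, criticality in the sense of (12) can be checked by linear algebra: λ̄ is critical exactly when the Hessian of the Lagrangian, restricted to ker h'(x̄), is singular. A small sketch (our own helper with made-up matrices; `H` plays the role of ∂²L/∂x²(x̄, λ̄) and `A` that of h'(x̄)):

```python
import numpy as np

def is_critical(H, A, tol=1e-10):
    """Equality-constrained case of (11)-(12): the multiplier is critical iff
    some nonzero xi in ker A satisfies H @ xi in im A^T, i.e. iff the reduced
    Hessian Z^T H Z is singular, Z being a basis of ker A."""
    _, s, vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    Z = vt[rank:].T                      # orthonormal basis of ker A
    if Z.shape[1] == 0:
        return False                     # ker A = {0}: no candidate xi
    reduced = Z.T @ H @ Z
    return bool(np.min(np.abs(np.linalg.eigvalsh(reduced))) < tol)

# Made-up data: with ker A spanned by e2, H = diag(1, 0) gives a singular
# reduced Hessian (critical); H = diag(1, 2) gives a nonsingular one.
critical = is_critical(np.diag([1.0, 0.0]), np.array([[1.0, 0.0]]))
noncritical = not is_critical(np.diag([1.0, 2.0]), np.array([[1.0, 0.0]]))
```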

We are now in position to formally state local convergence properties of sSQP [10, 27]; see also [28, Chapter 7]. Note that in Theorem 1 below, if there are equality constraints only, everything that involves the multiplier µ disappears from the statement. Note also that in the equality-constrained case, finding a stationary point of the sSQP subproblem (8) is equivalent to solving the linear system of equations

f'(xk) + Hk(x − xk) + (h'(xk))T λ = 0,  h(xk) + h'(xk)(x − xk) − σk(λ − λk) = 0,  (13)

in the variables (x, λ), where Hk = (∂²L/∂x²)(xk, λk).
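To see (13) at work, consider the degenerate toy problem (our own, not the paper's Example 1) of minimizing x² subject to x² = 0: at x̄ = 0 the constraint gradient vanishes, so no CQ holds and every real number is a Lagrange multiplier, yet the sSQP iteration below converges rapidly, in line with Theorem 1. A minimal sketch:

```python
import numpy as np

# Degenerate toy problem (ours): minimize x^2 subject to x^2 = 0.
# At xbar = 0 we have h'(0) = 0: M(xbar) is the whole real line, and
# lam = -1 is the (only) critical multiplier.

def ssqp(x, lam, iters=8):
    for _ in range(iters):
        grad_f = 2.0 * x                       # f'(x)
        Jh = 2.0 * x                           # h'(x)
        H = 2.0 * (1.0 + lam)                  # Hessian of the Lagrangian in x
        sigma = np.hypot(grad_f + lam * Jh, x * x)   # natural residual (9)
        if sigma < 1e-14:                      # numerically a KKT point
            break
        # Linear system (13), written in the unknowns (dx, lam_new):
        K = np.array([[H, Jh], [Jh, -sigma]])
        rhs = np.array([-grad_f, -(x * x) - sigma * lam])
        dx, lam = np.linalg.solve(K, rhs)
        x = x + dx
    return x, lam

x_end, lam_end = ssqp(0.1, 0.0)
# The primal iterates collapse to 0 very fast, and the dual iterates stay
# far away from the critical multiplier -1.
```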

Theorem 1. Let f: Rn → R, h: Rn → Rl and g: Rn → Rm be twice differentiable in a neighborhood of x̄, with their second derivatives being continuous at x̄. Let σk = σ(xk, λk, µk), where σ is given by (9). Let (x̄, λ̄, µ̄) be a solution of the KKT system (2) satisfying SOSC (7). If there are equality constraints only, let instead λ̄ be a noncritical multiplier (i.e., the relation in (12) does not hold for any ξ ≠ 0).

Then for any c > 0 large enough and any starting point (x0, λ0, µ0) ∈ Rn × Rl × Rm+ close enough to (x̄, λ̄, µ̄), there exists a sequence {(xk, λk, µk)} ⊂ Rn × Rl × Rm such that for each k = 0, 1, ..., xk+1 is a stationary point of the sSQP subproblem (8) with associated Lagrange multipliers (λk+1, µk+1) which satisfies

║(xk+1 − xk, λk+1 − λk, µk+1 − µk)║ ≤ c σ(xk, λk, µk);

any such sequence converges to (x̄, λ*, µ*) with some (λ*, µ*) ∈ M(x̄), and the rates of convergence of {(xk, λk, µk)} to (x̄, λ*, µ*) and of {dist((xk, λk, µk), {x̄} × M(x̄))} to zero are superlinear. Moreover, the rates of convergence are quadratic provided the second derivatives of f, h and g are locally Lipschitz-continuous with respect to x̄.

Some comments are in order. Under SSOSC (10), solutions of the sSQP subproblems (8) can in addition be shown to be unique in some neighbourhood [10]. Furthermore, in the equality-constrained case, locally and under the noncriticality assumption, the linear system (13) has a unique solution, i.e., the sSQP subproblem has a unique stationary point [27]. For the equality-constrained case, sSQP is the only currently known method that solves one linear system or QP per iteration (i.e., an explicitly Newtonian method), requires for convergence something weaker than SOSC (namely, noncriticality of the multiplier), and does not need any CQs. In [27] it is shown that when there are inequality constraints, SOSC cannot be replaced by noncriticality. Whether convergence of the primal (rather than primal-dual) sequence generated by sSQP is also superlinear is an open question [8]. Recall that in general, superlinear convergence of the primal-dual sequence does not imply any rate for the primal (or dual) sequence separately [4, Exercise 14.8]. For SQP, the primal rate is superlinear [8]. For sSQP, only a kind of "two-step" superlinear estimate for the primal sequence is available [7].

We illustrate the convergence result in Theorem 1 with the following example.

Example 1. Consider the optimization problem

It can be seen that x̄ = (0, 0) is the unique solution of this problem, and that the associated set of Lagrange multipliers is given by

In particular, MFCQ (6) does not hold and M(x̄) is unbounded. Furthermore, SOSC (7) holds at (x̄, µ̄) for any µ̄ ∈ M(x̄) with µ̄1 > 0, but SSOSC (10) is not satisfied for any multiplier.

Experiments were performed with a Matlab implementation of sSQP, using the built-in subroutine quadprog for solving the QP subproblems (8), and choosing random starting points x0i ∈ [−1/2, 1/2], i = 1, 2, and µ0j ∈ [0, 1], j = 1, 2, 3. The stopping criterion is σ(xk, µk) < 10−15.

In about 10% of the cases, the sequence converged linearly to (x, μ) with μ1 = 0 (SOSC is not valid at this solution). Such cases appear to correspond to the choices of starting points that are not close enough to a solution satisfying SOSC (so that Theorem 1 does not apply). About 3% of the starting points produced unsolvable subproblems at the first iteration (for the same reason as above - starting points not being close enough to a solution). All the remaining runs converged superlinearly to a primal-dual solution satisfying SOSC. Table 1 shows the average values of ║xk - x║+ dist (µk, M(x)) for the last 5 iterations in the cases of convergence to a primal-dual solution satisfying SOSC.

Table 1 Distance to solution on last 5 iterations in Example 1. 

║xk − x̄║ + dist(µk, M(x̄))
1.5891e-001
1.0599e-002
7.3688e-005
6.3242e-009
8.4865e-017

The analysis that leads to Theorem 1 relies on the variational Newtonian framework for generalized equations (GEs) with nonisolated solutions, developed in [12]; see also [28, Chapter 7] (in our context, in the absence of CQs the dual solutions of the KKT system (2) are not isolated). To that end, consider the generalized equation (GE)

Φ(u) + N(u) ∋ 0,  (15)

where Φ: Rν → Rν is a smooth (single-valued) mapping, and N(·) is a set-valued mapping from Rν to the subsets of Rν. As is well known, the KKT system (2) corresponds to the GE (15) with ν = n + l + m, the mapping Φ: Rn × Rl × Rm → Rn × Rl × Rm given by

Φ(x, λ, µ) = ( (∂L/∂x)(x, λ, µ), h(x), −g(x) ),  (16)

and with N being the normal cone to the set

Q = Rn × Rl × Rm+.  (17)
Consider the class of methods for solving (15) that, given the current iterate uk ∈ Rν, generate the next iterate uk+1 as a solution of a subproblem of the form

A(uk, u) + N(u) ∋ 0,  (18)

where for ũ ∈ Rν the mapping A(ũ, ·) is some kind of approximation of Φ around ũ. For example, if

A(ũ, u) = Φ(ũ) + Φ'(ũ)(u − ũ),

the iteration subproblem (18) becomes that of the Josephy-Newton method for GEs (see [28, Chapter 3]), and when applied to the KKT system (2) (i.e., the special case of GE described above), it corresponds to the SQP subproblem (3). For each ũ ∈ Rν, define the set

U(ũ) = {u ∈ Rν | A(ũ, u) + N(u) ∋ 0},  (19)

so that U(uk) is the solution set of the iteration subproblem (18). As is usual and natural in local convergence considerations, one has to specify which of the solutions of (18) are allowed to be the next iterate (solutions "far away" must clearly be discarded from local analysis; note that the sSQP subproblem (8) need not be strongly convex and thus may have such "far away" solutions). In other words, we have to restrict the distance from the current iterate uk to the next one, i.e., to an element of U(uk) that can be declared to be uk+1. To that end, for an arbitrary but fixed c > 0, define the subset of the solution set of the subproblem (18) by

Uc(ũ) = {u ∈ U(ũ) | ║u − ũ║ ≤ c dist(ũ, Ū)},  (20)

where Ū is the solution set of the GE (15), and consider the iterative scheme

uk+1 ∈ Uc(uk),  k = 0, 1, ....  (21)
Superlinear convergence of this scheme is established under the following three conditions:

(i) Upper Lipschitzian behavior of solutions of the GE under canonical perturbations: for every r ∈ Rν close enough to 0, any solution u(r) of the perturbed GE

Φ(u) + N(u) ∋ r,  (22)

close enough to ū satisfies the estimate

dist(u(r), Ū) = O(║r║).  (23)

(ii) Precision of approximation of Φ in the subproblems: there exists a function ω: R+ → R+ such that ω(t) = o(t) as t → 0 and the estimate

║Φ(u) − A(ũ, u)║ ≤ ω(║ũ − ū║)

holds for all ũ ∈ Rν close enough to ū.

(iii) Solvability of subproblems with the localization condition: for any ũ ∈ Rν close enough to ū, the set Uc(ũ) defined by (19), (20) is nonempty.

Some comments are in order. For the GE corresponding to the KKT system (2), the canonically perturbed problem has the form

(∂L/∂x)(x, λ, µ) = a,  h(x) = b,  µ ≥ 0,  g(x) ≤ c,  ‹µ, g(x) − c› = 0,

for r = (a, b, c) ∈ Rn × Rl × Rm. For KKT systems, the upper Lipschitzian behavior of solutions under canonical perturbations (the first assumption above) is equivalent to noncriticality of the Lagrange multiplier (under the smoothness assumptions of this survey) [21]. In particular, it is implied by the SOSC (7). The second assumption above naturally holds for Newton-type methods, and in particular for sSQP if the stabilization parameter is properly chosen (for example, based on the KKT natural residual (9)). The third assumption, on solvability of subproblems and the localization condition, is where the most work is required [10, 27]. And it is here that noncriticality of the multiplier needs to be strengthened to SOSC if inequality constraints are present.

We finally note that sSQP can also be interpreted within the perturbed Josephy-Newton framework of [25]; see also [28, Chapter 3]. However, the main convergence result in this framework requires SMFCQ. If the method is interpreted instead via [12, 10] as outlined above, no CQs are needed to prove local convergence. It is also interesting to mention that a modification of the Newtonian framework of [12] is used in [11] to derive local convergence and rate of convergence results for the augmented Lagrangian algorithm (method of multipliers) under SOSC (7) only, significantly improving on the classical results such as in [1] that assume in addition LICQ (5) and strict complementarity. Moreover, for the equality-constrained case SOSC can be relaxed to noncriticality [22], as is the case for sSQP (the required analysis is very different though). It is interesting that even though the augmented Lagrangian method is not of Newton type, the Newtonian lines of analysis turned out to be very fruitful in this context as well.

3 GLOBALIZATION ISSUES

As any Newtonian method, sSQP is a local scheme, guaranteed to converge if initialized at a point close enough to a solution with the properties discussed in Section 2. To obtain a complete algorithm, some strategy to globalize convergence is needed (so that arbitrary starting points can be used). This proved to be a rather difficult task. Recall that to globalize SQP at least three different approaches are available; see [28, Chapter 6]: linesearch [4, Chapter 17] or trust-region [5, Chapter 15.4] strategies for a nonsmooth penalty function, and the filter technique [13, 14, 37]. For example, if a positive definite matrix Hk is employed in the QP (3), then the generated direction xk+1 − xk is a direction of descent for the penalty function

φc(x) = f(x) + c (║h(x)║1 + ║max{0, g(x)}║1),  (24)

provided one takes ck > ║(λk+1, µk+1)║∞. One can then perform a linesearch in the obtained direction to guarantee progress towards solving (1), via decreasing the penalty function with respect to its value at the previous iterate xk, and then re-defining xk+1 accordingly. Finding a suitable penalty function for which the direction computed from the sSQP subproblem (8) is a direction of descent proved a challenge. In particular, penalty functions like (24), and other "usual" candidates, do not do the job.

Some numerical results on the global behaviour of sSQP, without attempting to globalize the method itself, are reported in [33], but this experience is rather limited (just a few test problems are considered). More test problems have been employed in [26], but the globalization used there is a heuristic not supported by theory. As for any local algorithm, so-called "hybrid" strategies can certainly be used (see [28, Chapter 5] for one family of hybrid globalizations of local methods for variational problems). Something like this is certainly applicable to sSQP. In [20], this approach was implemented in conjunction with the augmented Lagrangian as the globally convergent method. We next survey the few more direct approaches to globalize sSQP that have been proposed so far.

In [39] the globalization technique is based on a linesearch for the so-called primal-dual augmented Lagrangian [17]. This work deals with optimization problems in the format

minimize f(x)  subject to  h(x) = 0,  x ≥ 0.  (25)

(The more general problem (1) can be reformulated into this setting using slack variables.) For the optimization problem (25), taking µ = f'(x) + (h'(x))T λ, the natural residual (9) can be written as

σ(x, λ) = ║( h(x), min{x, f'(x) + (h'(x))T λ} )║.
In [39] the nonnegativity constraint is excluded from stabilization, and instead of (8), the sSQP subproblem is given by

where λ̄k is a reference Lagrange multiplier estimate and Hk is a symmetric matrix such that Hk + (1/σk)(h'(xk))T h'(xk) is positive definite. It should be noted that there does not seem to be any theory to justify that this "partial" stabilization in (26) (excluding the constraint x ≥ 0) actually inherits the local convergence properties of sSQP under some reasonable assumptions. In terms of local convergence, the idea of [39] is to use, in addition, identification of the active inequality constraints, so that the overall algorithm eventually becomes sSQP for the associated equality-constrained problem. From the analysis in [39], if (xk+1, λk+1) is the (unique) solution of (26) then (xk+1 − xk, λk+1 − λk) is a descent direction at (xk, λk) for the primal-dual penalty function

where c > 0 is a fixed parameter. This penalty function is minimized using a linesearch, thus re-defining (xk+1, λk+1). The reference Lagrange multiplier λ̄k+1 is updated to λk+1 if either the weighted natural residual for problem (25) is small or the natural residual for the problem of minimizing φc(x, λ; λ̄k, σk) subject to x ≥ 0 is small. The dual stabilization parameter σk and the other algorithmic parameters are updated by certain rules. According to [39, Theorem 4.2], if the generated sequence {xk} is bounded and the sequence {Hk} is chosen bounded with {Hk + (1/σk)(h'(xk))T h'(xk)} also uniformly positive definite, then either there exists an index set K such that limK∋k→∞ σ(xk, λk) = 0 (accumulation points of this subsequence solve the KKT system of the problem), or there exists an index set S such that limS∋k→∞ σk = 0, {λ̄k}k∈S is bounded and

Another globalization strategy is proposed in [9]. It is based on the inexact restoration ideas [32], and uses a linesearch for a primal-dual nondifferentiable penalty function. This work considers problems in the format

minimize f(x)  subject to  h(x) = 0,  a ≤ x ≤ b,  (28)
where a and b are finite bounds. For this problem, the natural residual (9) is given by

The corresponding sSQP subproblem again employs only partial stabilization (leaving out the bounds), and has the form

where λ̄k is a reference Lagrange multiplier approximation and Hk is a symmetric positive definite matrix. The penalty function used in [9] is

Note that this is a penalty function for the problem of minimizing f(x) + (σk/2)║λ║² subject to h(x) − σk(λ − λ̄k) = 0, and the latter problem is equivalent to minimizing f(x) + ‹λ̄k, h(x)› + (1/(2σk))║h(x)║². Thus, this penalty function is also related to the augmented Lagrangian.
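This equivalence is a one-line substitution: eliminating λ via the constraint λ = λ̄k + h(x)/σk shows that the two objectives differ only by the constant (σk/2)║λ̄k║². A quick numeric check with placeholder data (our own scalar f and h, not from the paper):

```python
import numpy as np

# Placeholder smooth data (ours): f(x) = x^2, h(x) = x - 1, scalar case.
f = lambda x: x**2
h = lambda x: x - 1.0

sigma, lam_bar = 0.5, 3.0
xs = np.linspace(-2.0, 2.0, 9)

# Objective of the constrained reformulation, with lambda eliminated
# through the constraint h(x) - sigma*(lam - lam_bar) = 0:
lam = lam_bar + h(xs) / sigma
lhs = f(xs) + 0.5 * sigma * lam**2
# Augmented-Lagrangian form of the same objective:
rhs = f(xs) + lam_bar * h(xs) + h(xs)**2 / (2.0 * sigma)

gap = lhs - rhs   # constant offset (sigma/2)*lam_bar^2, independent of x
```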

The inexact restoration strategy presented in [9] can be interpreted as a two-step linesearch for the function (30), where in the first step the penalty parameter ck is increased in order to achieve φck(xk, λ̂k; λ̄k, σk) < φck(xk, λk; λ̄k, σk) with λ̂k = λ̄k + (1/σk)h(xk), and in the second step a linesearch is performed along the direction (xk+1 − xk, λk+1 − λ̂k) to re-define (xk+1, λk+1) so that φck(xk+1, λk+1; λ̄k, σk) ≤ φck(xk, λk; λ̄k, σk). If the linesearch direction is small enough (with respect to the inexact restoration criteria), then (xk+1, λk+1) is accepted as the new primal-dual iterate and the reference Lagrange multiplier λ̄k+1 is updated to λk+1. The dual stabilization parameter σk is updated by a suitable rule. According to [9, Theorem 2], if the sequence of matrices {Hk} is chosen uniformly bounded and uniformly positive definite, and if x̄ is an accumulation point of the sequence {xk}, then x̄ is a stationary point of the problem (28) if {σk} is bounded away from zero, or it is a stationary point of the problem of minimizing the infeasibility measure ║h(x)║² subject to a ≤ x ≤ b if {σk} converges to zero. The algorithm of [9] solves sSQP subproblems, but in a sense they can be considered as "inner iterations" within the inexact restoration scheme (which drives global convergence of the method). In particular, it is not known whether locally the algorithm indeed behaves as sSQP under some assumptions (i.e., solves only one sSQP subproblem per inexact restoration iteration).

Another very recent and promising proposal [29] is based on a linesearch in the sSQP directions for the following two-parameter primal-dual exact penalty function:

where c1 > 0, c2 > 0. This function was originally introduced and studied in [6]; see also [1]. Provided the penalty parameters are updated by certain appropriate rules, very reasonable global convergence properties are established in [29]. Moreover, near qualified solutions (stationary point - noncritical multiplier pairs), the sSQP directions are always accepted by the algorithm, and then the unit stepsize in those directions is accepted by the Armijo linesearch rule. Thus, the globalized scheme inherits fast local convergence of sSQP, under weak assumptions.

We next comment on some other ideas for sSQP globalization, which led to partial developments of some promise, but which did not materialize into complete algorithms so far.

A globalization strategy can be attempted using the principle of the augmented Lagrangian method, i.e., decreasing the augmented Lagrangian function in the primal space and increasing it in the dual. Consider the classical augmented Lagrangian for problem (1):

L̄σ(x, λ, µ) = f(x) + ‹λ, h(x)› + (σ/2)║h(x)║² + (1/(2σ))(║max{0, µ + σg(x)}║² − ║µ║²),

σ > 0. It can be seen that if (xk+1, λk+1, µk+1) is a solution of the sSQP subproblem (8), then (xk+1 − xk, λk+1 − λk, µk+1 − µk) is a descent direction at (xk, λk, µk) for the "difference of two augmented Lagrangians" function

for any σ̄k ∈ [σk/2, σk]. Moreover, the directional derivative is less than −Δk, where

It can be shown that if {Δk} tends to zero, the sequence of matrices {Hk} is chosen uniformly bounded and uniformly positive definite, and (x̄, λ̄, µ̄) is a limit point of the sequence {(xk, λk, µk)}, then x̄ is a stationary point of problem (1) if {σk} is bounded away from zero, or it is a stationary point of the problem of minimizing the infeasibility measure ║h(x)║² + ║max{0, g(x)}║² if {σk} converges to zero. However, so far there are no reasonable hypotheses that guarantee Δk → 0 from a standard linesearch. From another point of view, this strategy is related to finding a solution of an equilibrium problem: if (x̄, λ̄, µ̄) is a solution of the optimization problem of minimizing ψ(x, λ, µ; x̄, λ̄, µ̄) over (x, λ, µ), then (x̄, λ̄, µ̄) solves the KKT system (2). Conversely, if (x̄, λ̄, µ̄) is a solution of the KKT system (2) satisfying SOSC (7), then (x̄, λ̄, µ̄) is a local minimizer of the latter problem.

Another issue concerned with global convergence of sSQP has to do with possible attraction of the iterates to critical Lagrange multipliers, and eventual slow convergence rate as a consequence; see [28, Chapter 7]. In [23, 24, 26, 30] this phenomenon was exhibited for various Newtonian and Newton-related methods, such as SQP and its quasi-Newton implementations, and the linearly constrained (augmented) Lagrangian methods [38, 34, 15]. Both theoretical considerations and numerical results for the SNOPT [16] and MINOS [35] solvers were presented, demonstrating that when critical multipliers exist, they serve as attractors of the dual sequences generated by the methods in question. Moreover, the reason for slow convergence in the degenerate cases is precisely attraction to critical multipliers, as convergence to noncritical ones would have given the primal superlinear rate. Numerical results in [26] show that the effect of attraction (globally, i.e., from "far away" points) to critical multipliers still exists for sSQP too (when evaluating the numbers reported therein, it is important to keep in mind that critical multipliers are typically few; the usual situation is that they form a set of measure zero within the set of all multipliers), but the attraction is much less persistent for sSQP than for the other algorithms. The runs clearly split into two groups. Sometimes the (heuristically globalized, in that reference) process manages to enter the "good" primal-dual region, where the stabilization term starts working properly (has the needed "size"), and then it converges superlinearly with the dual limit being noncritical. However, in a considerable number of cases this does not happen, and then the process still converges slowly to a critical multiplier.
Thus, although sSQP does help when compared to the alternatives, by itself it does not seem to be a fully reliable tool for avoiding the effect of attraction to critical multipliers and its negative consequences. It would seem that some special modifications would be needed in the "global" phase of the method to reliably avoid convergence to critical multipliers, without slowing down the overall process. Those are also some of the conclusions from the numerical results in [29].

Overall, building really satisfactory globalization techniques for sSQP is a challenging matter, which (for general problems) should still be considered an open question at this time.

4 EXTENSIONS TO VARIATIONAL PROBLEMS

Denote

D = {x ∈ Rn | h(x) = 0, g(x) ≤ 0},

and let ND(x) be the dual cone of the tangent (contingent) cone TD(x) to the set D at x ∈ Rn, i.e., ND(x) = ∅ for x ∉ D and otherwise ND(x) = (TD(x))°, where

TD(x) = {ξ ∈ Rn | there exist sequences tk → 0+ and ξk → ξ such that x + tk ξk ∈ D for all k}.
Consider the variational problem (VP)

x ∈ D,  F(x) + ND(x) ∋ 0,  (31)

where F: Rn → Rn; see [28, Chapters 1 and 3]. In particular, for the optimization problem (1) this VP represents the first-order necessary optimality condition

x ∈ D,  f'(x) + ND(x) ∋ 0,

if we take F(x) = f'(x). If the set D is convex, then (31) gives the usual variational inequality

x ∈ D,  ‹F(x), y − x› ≥ 0 for all y ∈ D.
Associated to solving VP (31) is the KKT system

F(x) + (h'(x))T λ + (g'(x))T µ = 0,  h(x) = 0,  µ ≥ 0,  g(x) ≤ 0,  ‹µ, g(x)› = 0.  (32)

Define the mapping G: Rn × Rl × Rm → Rn by

G(x, λ, µ) = F(x) + (h'(x))T λ + (g'(x))T µ.
Let (xk, λk, µk) ∈ Rn × Rl × Rm+ be the current primal-dual approximation to a solution of (32), and let σk > 0 be the dual stabilization parameter. Define the affine mapping Φk: Rn × Rl × Rm → Rn × Rl × Rm by

and consider the affine VI of the form

where

As can be easily seen, in the optimization case (1) the VI (33) is precisely the first-order (primal) necessary optimality condition for the sSQP subproblem (8), if one takes F(x) = f'(x). Thus this scheme contains sSQP for optimization as a special case. Note that the method makes good sense also in the variational setting, as solving the fully nonlinear VP (31) is replaced by solving a sequence of affine VIs (33) (the mapping Φk is affine and the set Qk is polyhedral). If the set D is defined by equality constraints only, then it can be seen that (33) is just a system of linear equations.

Convergence analysis of this stabilized Newton method for variational problems can be found in [10]; see also [28, Chapter 7].

5 CONCLUDING REMARKS

We have presented a survey of the literature and some discussion of the stabilized version of the fundamental sequential quadratic programming method for constrained optimization. Further material, in particular a comprehensive local convergence analysis, can be found in the book [28].

ACKNOWLEDGMENTS

The second author is supported in part by CNPq Grant 302637/2011-7, by PRONEX-Optimization, and by FAPERJ.

REFERENCES

BERTSEKAS DP. 1982. Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York. [ Links ]

BOGGS PT & TOLLE JW. 1995. Sequential quadratic programming. Acta Numerica, 4:1-51. [ Links ]

BONNANS JF. 1994. Local analysis of Newton-type methods for variational inequalities and nonlinear programming. Applied Mathematics and Optimization, 29:161-186. [ Links ]

BONNANS JF, GILBERT JCH, LEMARÉCHAL C & SAGASTIZÁBAL C. 2006. Numerical Optimization: Theoretical and Practical Aspects. Springer-Verlag, Berlin, Germany. Second Edition. [ Links ]

CONN AR, GOULD NIM & TOINT PHL. 2000. Trust-Region Methods. SIAM, Philadelphia. [ Links ]

DI PILLO G & GRIPPO L. 1979. A new class of augmented Lagrangians in nonlinear programming. SIAM Journal on Control and Optimization, 17:618-628. [ Links ]

FERNÁNDEZ D. 2013. A quasi-Newton strategy for the sSQP method for variational inequality and optimization problems. Mathematical Programming, 137:199-223. [ Links ]

FERNÁNDEZ D, IZMAILOV AF & SOLODOV MV. 2010. Sharp primal superlinear convergence results for some Newtonian methods for constrained optimization. SIAM Journal on Optimization, 20:3312-3334. [ Links ]

FERNÁNDEZ D, PILOTTA EA & TORRES GA. 2013. An inexact restoration strategy for the globalization of the sSQP method. Computational Optimization and Applications, 54:595-617. [ Links ]

FERNÁNDEZ D & SOLODOV M. 2010. Stabilized sequential quadratic programming for optimization and a stabilized Newton-type method for variational problems. Mathematical Programming, 125:47-73. [ Links ]

FERNÁNDEZ D & SOLODOV M. 2012. Local convergence of exact and inexact augmented Lagrangian methods under the second-order sufficient optimality condition. SIAM Journal on Optimization, 22:384-407. [ Links ]

FISCHER A. 2002. Local behavior of an iterative framework for generalized equations with nonisolated solutions. Mathematical Programming, 94:91-124. [ Links ]

FLETCHER R, GOULD N, LEYFFER S, TOINT P & WÄCHTER A. 2002. Global convergence of trust-region and SQP-filter algorithms for general nonlinear programming. SIAM Journal on Optimization, 13:635-659. [ Links ]

FLETCHER R, LEYFFER S & TOINT PL. 2002. On the global convergence of a filter-SQP algorithm. SIAM Journal on Optimization, 13:44-59. [ Links ]

FRIEDLANDER MP & SAUNDERS MA. 2005. A globally convergent linearly constrained Lagrangian method for nonlinear optimization. SIAM Journal on Optimization, 15:863-897. [ Links ]

GILL PE, MURRAY W & SAUNDERS MA. 2002. SNOPT: An SQP algorithm for large-scale constrained optimization. SIAM Journal on Optimization, 12:979-1006. [ Links ]

GILL PE & ROBINSON DP. 2012. A primal-dual augmented Lagrangian. Computational Optimization and Applications, 51:1-25. [ Links ]

GILL PE & WONG E. 2012. Sequential quadratic programming methods. In: Lee J & Leyffer S (eds.), Mixed Integer Nonlinear Programming, The IMA Volumes in Mathematics and its Applications, Volume 154, Springer-Verlag, pp. 147-224. [ Links ]

HAGER WW. 1999. Stabilized sequential quadratic programming. Computational Optimization and Applications, 12:253-273. [ Links ]

IZMAILOV AF, KRYLOVA AM & USKOV EI. 2011. Hybrid globalization of stabilized sequential quadratic programming method. In Russian. In: Bereznyov VA (ed.), Theoretical and Applied Problems of Nonlinear Analysis, pp. 47-66. Computing Center RAS, Moscow. [ Links ]

IZMAILOV AF, KURENNOY AS & SOLODOV MV. 2013. A note on upper Lipschitz stability, error bounds, and critical multipliers for Lipschitz-continuous KKT systems. Mathematical Programming, 142:591-604. [ Links ]

IZMAILOV AF, KURENNOY AS & SOLODOV MV. 2014. Local convergence of the method of multipliers for variational and optimization problems under the sole noncriticality assumption. Preprint, August 2013 (revised January 2014). Available at http://pages.cs.wisc.edu/~solodov/solodov.html [ Links ]

IZMAILOV AF & SOLODOV MV. 2009. On attraction of Newton-type iterates to multipliers violating second-order sufficiency conditions. Mathematical Programming, 117:271-304. [ Links ]

IZMAILOV AF & SOLODOV MV. 2009. Examples of dual behaviour of Newton-type methods on optimization problems with degenerate constraints. Computational Optimization and Applications, 42:231-264. [ Links ]

IZMAILOV AF & SOLODOV MV. 2010. Inexact Josephy-Newton framework for generalized equations and its applications to local analysis of Newtonian methods for constrained optimization. Computational Optimization and Applications, 46:347-368. [ Links ]

IZMAILOV AF & SOLODOV MV. 2011. On attraction of linearly constrained Lagrangian methods and of stabilized and quasi-Newton SQP methods to critical multipliers. Mathematical Programming, 126:231-257. [ Links ]

IZMAILOV AF & SOLODOV MV. 2012. Stabilized SQP revisited. Mathematical Programming, 133:93-120. [ Links ]

IZMAILOV AF & SOLODOV MV. 2014. Newton-Type Methods for Optimization and Variational Problems. Springer Series in Operations Research and Financial Engineering. Springer. www.springer.com/mathematics/applications/book/978-3-319-04246-6. [ Links ]

IZMAILOV AF, SOLODOV MV & USKOV EI. 2014. Globalizing stabilized SQP by smooth primal-dual exact penalty function. IMPA preprint A5566, March 2014. [ Links ]

IZMAILOV AF & USKOV EI. 2012. Attraction of Newton method to critical Lagrange multipliers: fully quadratic case. Available at http://www.optimization-online.org/DBHTML/2013/02/3761.html. [ Links ]

LI D-H & QI L. 2000. Stabilized SQP method via linear equations. Applied Mathematics Technical Report AMR00/5. University of New South Wales, Sydney. [ Links ]

MARTÍNEZ JM & PILOTTA EA. 2005. Inexact restoration methods for nonlinear programming: advances and perspectives. In: Qi LQ, Teo KL & Yang XQ (eds.), Optimization and Control with Applications, pp. 271-292. Springer. [ Links ]

MOSTAFA EME, VICENTE LN & WRIGHT SJ. 2003. Numerical behavior of a stabilized SQP method for degenerate NLP problems. In: Bliek C, Jermann C & Neumaier A (eds.), Global Optimization and Constraint Satisfaction, Lecture Notes in Computer Science 2861, pp. 123-141. Springer-Verlag, Berlin. [ Links ]

MURTAGH BA & SAUNDERS MA. 1982. A projected Lagrangian algorithm and its implementation for sparse nonlinear constraints. Mathematical Programming Study, 16:84-117. [ Links ]

MURTAGH BA & SAUNDERS MA. 1983. MINOS 5.0 user's guide. Technical Report SOL 83.20, Stanford University, December 1983. [ Links ]

NOCEDAL J & WRIGHT SJ. 2006. Numerical Optimization. Springer, New York. Second Edition. [ Links ]

RIBEIRO AR, KARAS EW & GONZAGA CC. 2008. Global convergence of filter methods for nonlinear programming. SIAM Journal on Optimization, 14:646-669. [ Links ]

ROBINSON SM. 1972. A quadratically convergent algorithm for general nonlinear programming problems. Mathematical Programming, 3:145-156. [ Links ]

ROBINSON D & GILL P. 2013. A globally convergent stabilized SQP method. SIAM Journal on Optimization, 23:1983-2010. [ Links ]

SCHITTKOWSKI K & YUAN Y-X. 2010. Sequential quadratic programming methods. Wiley Encyclopedia of Operations Research and Management Science, J.J. Cochran, ed. John Wiley & Sons, Inc. [ Links ]

SOLODOV MV. 2010. Constraint qualifications. Wiley Encyclopedia of Operations Research and Management Science, J.J. Cochran, ed. John Wiley & Sons, Inc. [ Links ]

WRIGHT SJ. 1998. Superlinear convergence of a stabilized SQP method to a degenerate solution. Computational Optimization and Applications, 11:253-275. [ Links ]

WRIGHT SJ. 2002. Modifying SQP for degenerate problems. SIAM Journal on Optimization, 13:470-497. [ Links ]

WRIGHT SJ. 2003. Constraint identification and algorithm stabilization for degenerate nonlinear programs. Mathematical Programming, 95:137-160. [ Links ]

Received: September 29, 2013; Accepted: January 08, 2014

*Corresponding author.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.