1 INTRODUCTION

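Consider the optimization problem

$$
\min f(x) \quad \text{subject to} \quad h(x) = 0, \quad g(x) \le 0, \tag{1}
$$

where *f*: **R**^{n} → **R**, *h*: **R**^{n} → **R**^{l} and *g*: **R**^{n} → **R**^{m} are twice continuously differentiable. Let *L*: **R**^{n} × **R**^{l} × **R**^{m} → **R**,

$$
L(x, \lambda, \mu) = f(x) + \langle \lambda, h(x) \rangle + \langle \mu, g(x) \rangle,
$$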

be the Lagrangian of problem (1), where ‹·,·› stands for the inner product (the space is always clear from the context). Stationary points of problem (1) and the associated Lagrange multipliers are characterized by the Karush-Kuhn-Tucker (KKT) optimality system

$$
\frac{\partial L}{\partial x}(x, \lambda, \mu) = 0, \quad h(x) = 0, \quad \mu \ge 0, \quad g(x) \le 0, \quad \langle \mu, g(x) \rangle = 0. \tag{2}
$$

We denote by *M*(x̄) the set of Lagrange multipliers associated with x̄ ∈ **R**^{n}, that is, the pairs (*λ*, *µ*) ∈ **R**^{l} × **R**^{m} satisfying (2) for *x* = x̄.

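The fundamental Newtonian approach to solving (1) is the sequential quadratic programming (SQP) algorithm ^{[}^{2}^{,}^{40}^{,}^{18}^{]}; see also [^{28}, Chapter 4]. Given the current primal-dual iterate (*x*^{k}, *λ*^{k}, *µ*^{k}) ∈ **R**^{n} × **R**^{l} × **R**^{m}, an iteration of SQP generates the next iterate (*x*^{k+1}, *λ*^{k+1}, *µ*^{k+1}) as a stationary point and associated Lagrange multipliers of the quadratic programming (QP) subproblem

$$
\begin{array}{ll}
\min & \langle f'(x^k), x - x^k \rangle + \dfrac{1}{2} \langle H_k (x - x^k), x - x^k \rangle \\
\text{subject to} & h(x^k) + h'(x^k)(x - x^k) = 0, \quad g(x^k) + g'(x^k)(x - x^k) \le 0,
\end{array} \tag{3}
$$

where *H*_{k} is a symmetric *n* × *n* matrix. The basic Newtonian scheme corresponds to taking

$$
H_k = \frac{\partial^2 L}{\partial x^2}(x^k, \lambda^k, \mu^k). \tag{4}
$$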

In fact, if there are no inequality constraints, it can be seen that computing (*x*^{k+1}, *λ*^{k+1}) by solving (3) with the choice (4) is equivalent to the usual Newton iteration from the point (*x*^{k}, *λ*^{k}), applied to the equation given by the first two equalities in the KKT system (2).

To motivate the stabilized modification of SQP, we start with some comments about convergence properties of SQP itself. The first relevant observation is that without constraint qualification (CQ) ^{[}^{41}^{]} assumptions, the QP (3) can simply be infeasible, and thus the method may not be well defined. Indeed, it can be informally stated that one of the roles of CQs is precisely to ensure that the first-order approximation of the constraints, like in (3), is consistent (and adequately approximates the local structure of the feasible set of the original problem (1) around the given point).
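To make the equivalence with Newton's method concrete, here is a minimal Python sketch for the equality-constrained case. It is an illustration under stated assumptions rather than an implementation from the cited references: the test problem (minimize x_1 + x_2 subject to x_1^2 + x_2^2 = 2), the starting point, and the helper names `solve_linear` and `newton_sqp_step` are all hypothetical choices. Each step solves the linear system obtained by linearizing the first two equalities of the KKT system (2) at the current primal-dual iterate, using the exact Hessian of the Lagrangian.

```python
def solve_linear(a, b):
    """Solve a small dense system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def newton_sqp_step(x, lam):
    """One SQP step for: minimize x1 + x2 subject to x1^2 + x2^2 - 2 = 0."""
    x1, x2 = x
    grad_L = [1.0 + 2.0 * lam * x1, 1.0 + 2.0 * lam * x2]  # dL/dx at (x, lam)
    h = x1 * x1 + x2 * x2 - 2.0                              # constraint value
    jac_h = [2.0 * x1, 2.0 * x2]                             # h'(x)
    hess = 2.0 * lam                                         # Hessian of L equals hess * I
    kkt = [[hess, 0.0, jac_h[0]],
           [0.0, hess, jac_h[1]],
           [jac_h[0], jac_h[1], 0.0]]
    d1, d2, dlam = solve_linear(kkt, [-grad_L[0], -grad_L[1], -h])
    return [x1 + d1, x2 + d2], lam + dlam


# Illustrative starting point near the stationary point (-1, -1) with multiplier 1/2.
x, lam = [-1.2, -0.9], 0.6
for _ in range(8):
    x, lam = newton_sqp_step(x, lam)
print(x, lam)
```

For this toy problem the stationary point near the starting point is (-1, -1) with multiplier 1/2; the constraint gradient is nonzero there and the Hessian of the Lagrangian is the identity, so the regularity and second-order conditions discussed in this section hold and fast convergence is expected.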

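For a point x̄ feasible in (1), denote by

$$
A(\bar{x}) = \{ i = 1, \ldots, m \mid g_i(\bar{x}) = 0 \}
$$

the set of inequality constraints active at x̄. If x̄ is a stationary point of (1) and (λ̄, μ̄) ∈ *M*(x̄), denote further by

$$
A_+(\bar{x}, \bar{\mu}) = \{ i \in A(\bar{x}) \mid \bar{\mu}_i > 0 \}, \qquad A_0(\bar{x}, \bar{\mu}) = \{ i \in A(\bar{x}) \mid \bar{\mu}_i = 0 \}
$$

the sets of strongly and weakly active constraints, respectively.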

The linear independence constraint qualification (LICQ) is said to hold at x̄ if

$$
\operatorname{rank} \begin{pmatrix} h'(\bar{x}) \\ g'_{A(\bar{x})}(\bar{x}) \end{pmatrix} = l + |A(\bar{x})|, \tag{5}
$$

where from now on, the notation *M*_{J} refers to the submatrix of the matrix *M* comprised by the rows of *M* indexed by the set *J*. In particular, the LICQ condition (5) says that the gradients of all of the equality constraints together with the gradients of all of the active inequality constraints form a linearly independent set in **R**^{n}. The Mangasarian-Fromovitz constraint qualification (MFCQ) is said to hold at x̄ if

$$
\operatorname{rank} h'(\bar{x}) = l \quad \text{and} \quad \exists\, \xi \in \ker h'(\bar{x}) \ \text{such that} \ g'_{A(\bar{x})}(\bar{x}) \xi < 0, \tag{6}
$$

where for a matrix *M* we denote its null space by ker *M* = {ξ | *M*ξ = 0}. Both LICQ and MFCQ imply that for a local solution x̄ of (1) the multiplier set *M*(x̄) is nonempty (for this specific property, weaker or different conditions can be used as well; see ^{[}^{41}^{]}). Note that MFCQ is equivalent to the requirement that *M*(x̄) be nonempty and bounded. The so-called strict MFCQ (SMFCQ) consists of saying that, in addition to (6), the multiplier associated with x̄ is unique (*M*(x̄) is a singleton). In the absence of (active) inequality constraints, MFCQ, SMFCQ and LICQ are all equivalent (to the regularity condition rank *h'*(x̄) = *l*), but otherwise MFCQ is a weaker assumption than SMFCQ which, in turn, is weaker than LICQ.
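The LICQ requirement, namely linear independence of the gradients of the equality constraints and of the active inequality constraints, can be checked numerically by a rank computation. The following Python sketch is only an illustration: the function names `matrix_rank` and `licq_holds`, the tolerance, and the two small test cases are assumptions made here, not material from the cited references.

```python
def matrix_rank(rows, tol=1e-10):
    """Rank of a small dense matrix via Gaussian elimination with partial pivoting."""
    m = [list(map(float, r)) for r in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Largest remaining entry in this column serves as the pivot candidate.
        piv = max(range(rank, len(m)), key=lambda r: abs(m[r][col]), default=None)
        if piv is None or abs(m[piv][col]) <= tol:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


def licq_holds(jac_h, jac_g, g_values, tol=1e-10):
    """True if gradients of equalities and of active inequalities are linearly independent."""
    active = [i for i, gi in enumerate(g_values) if abs(gi) <= tol]
    rows = list(jac_h) + [jac_g[i] for i in active]
    return matrix_rank(rows) == len(rows)


# Case 1: g1(x) = -x1, g2(x) = -x2 at the point (0, 0); both constraints are active.
print(licq_holds([], [[-1.0, 0.0], [0.0, -1.0]], [0.0, 0.0]))
# Case 2: g1(x) = x1^2 - x2, g2(x) = -x2 at (0, 0); the two gradients coincide.
print(licq_holds([], [[0.0, -1.0], [0.0, -1.0]], [0.0, 0.0]))
```

In the second case LICQ fails, while the direction ξ = (0, 1) gives a strictly negative product with both active gradients, so MFCQ still holds there; this is a small instance of MFCQ being weaker than LICQ.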

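We say that for a given stationary point x̄ of problem (1) and for an associated multiplier (λ̄, μ̄) ∈ *M*(x̄) the second-order sufficient optimality condition (SOSC) holds if

$$
\left\langle \frac{\partial^2 L}{\partial x^2}(\bar{x}, \bar{\lambda}, \bar{\mu}) \xi, \xi \right\rangle > 0 \quad \forall\, \xi \in C(\bar{x}) \setminus \{0\}, \tag{7}
$$

where

$$
C(\bar{x}) = \{ \xi \in \mathbf{R}^n \mid h'(\bar{x}) \xi = 0, \ g'_{A(\bar{x})}(\bar{x}) \xi \le 0, \ \langle f'(\bar{x}), \xi \rangle \le 0 \}
$$

is the critical cone of problem (1) at x̄. We note that SOSC implies that x̄ is a strict local minimizer in (1).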

The sharpest local superlinear convergence result for SQP is provided by the analysis in ^{[}^{3}^{]}; see also [^{28}, Chapter 4]. It assumes SMFCQ (uniqueness of the multiplier (λ̄, μ̄) associated with x̄) and SOSC (7). Earlier results all required, in addition to SOSC, the stronger LICQ and the strict complementarity condition μ̄_{A(x̄)} > 0 (such statements are standard; see, e.g., ^{[}^{1}^{,}^{36}^{]}). In particular, we emphasize that convergence of SQP requires certain regularity of constraints (a CQ).

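The stabilized version of SQP (sSQP) was developed with the goal of guaranteeing a fast convergence rate despite possible degeneracy of constraints (i.e., when the usual CQs may not hold), and in particular when the Lagrange multipliers associated with a solution are not unique. The method was introduced in ^{[}^{42}^{]} for the case of inequality constraints, in the form of iteratively solving the min-max subproblems

$$
\min_{x} \max_{\mu \in \mathbf{R}^m_+} \left\{ \langle f'(x^k), x - x^k \rangle + \frac{1}{2} \langle H_k (x - x^k), x - x^k \rangle + \langle \mu, g(x^k) + g'(x^k)(x - x^k) \rangle - \frac{\sigma_k}{2} \| \mu - \mu^k \|^2 \right\},
$$

where (*x*^{k}, *µ*^{k}) ∈ **R**^{n} × **R**^{m}_{+} is the current approximation to a primal-dual solution of (2), and σ_{k} > 0 is the dual stabilization parameter. Adding also equality constraints, it can be seen ^{[}^{31}^{]} that the corresponding min-max problem is equivalent to the following QP in the primal-dual space:

$$
\begin{array}{ll}
\min\limits_{(x, \lambda, \mu)} & \langle f'(x^k), x - x^k \rangle + \dfrac{1}{2} \langle H_k (x - x^k), x - x^k \rangle + \dfrac{\sigma_k}{2} \left( \| \lambda \|^2 + \| \mu \|^2 \right) \\
\text{subject to} & h(x^k) + h'(x^k)(x - x^k) - \sigma_k (\lambda - \lambda^k) = 0, \\
& g(x^k) + g'(x^k)(x - x^k) - \sigma_k (\mu - \mu^k) \le 0.
\end{array} \tag{8}
$$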

The dual stabilization parameter is usually based on computing the violation of the KKT optimality conditions (2) by the point (*x*^{k}, *λ*^{k}, *µ*^{k}). For example, for fast local convergence one chooses in (8) σ_{k} = σ(*x*^{k}, *λ*^{k}, *µ*^{k}), where σ: **R**^{n} × **R**^{l} × **R**^{m} → **R**_{+} is the natural residual of the KKT system (2), i.e.,

$$
\sigma(x, \lambda, \mu) = \left\| \left( \frac{\partial L}{\partial x}(x, \lambda, \mu), \ h(x), \ \min\{ \mu, -g(x) \} \right) \right\|, \tag{9}
$$

with the minimum applied componentwise.
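As a small illustration of how such a residual can be evaluated, the following Python sketch computes the norm of the vector formed by the gradient of the Lagrangian, the equality constraint values, and the componentwise minimum of µ and -g(x). The function name `natural_residual`, the use of the Euclidean norm, and the sample numbers are assumptions made for this example only.

```python
import math


def natural_residual(grad_L, h, g, mu):
    """Euclidean norm of (dL/dx, h(x), min{mu, -g(x)}), minimum taken componentwise."""
    comp = [min(mu_i, -g_i) for mu_i, g_i in zip(mu, g)]
    return math.sqrt(sum(v * v for v in list(grad_L) + list(h) + comp))


# A point satisfying the KKT system: inactive constraint with zero multiplier,
# active constraint with positive multiplier.
print(natural_residual([0.0, 0.0], [0.0], [-1.0, 0.0], [0.0, 2.0]))
# Same constraint data, but the gradient of the Lagrangian does not vanish.
print(natural_residual([0.3, -0.4], [0.0], [-1.0, 0.0], [0.0, 2.0]))
# Complementarity violated: positive multiplier on an inactive constraint.
print(natural_residual([0.0, 0.0], [0.0], [-1.0, 0.0], [0.5, 2.0]))
```

The residual vanishes exactly when all relations of the KKT system hold, which is why it is a natural measure of how far the current iterate is from the primal-dual solution set.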

One immediate observation is that for σ_{k} > 0, the constraints in (8) have the so-called "elastic mode" feature, and the subproblem is therefore automatically feasible regardless of any CQs or convexity assumptions. For example, fixing any *x* ∈ **R**^{n} and taking a suitable *λ* (uniquely defined for each *x* by the first constraint in (8)) and *µ* > 0 with all the components large enough, gives points (*x*, *λ*, *µ*) feasible in (8). This is the first major difference from the standard SQP.

Another consideration is the following. It has been observed (e.g., in [^{44}, Section 6], and in ^{[}^{23}^{,}^{24}^{,}^{26}^{]}) that the
difficulties with convergence of SQP in degenerate cases are often not because of
degeneracy as such, but are due to some undesirable behaviour of the dual sequence. The
dual regularization/stabilization term in the objective function of (8) can be regarded
as an attempt to modify this behaviour. As discussed in Section 2 below, in the sense of
local convergence it indeed does the job. The situation is more complicated in the
global sense; see Section 3.

2 LOCAL CONVERGENCE THEORY

In this section, we first survey the history of local convergence analyses of sSQP. Then we state the current state-of-the-art results, and finally briefly describe the (relatively recent) variational tools required to establish those properties.

In ^{[}^{42}^{]}, local superlinear convergence of sSQP is established under MFCQ (6), SOSC (7) assumed to hold for all multipliers, the existence of a multiplier μ̄ satisfying the strict complementarity condition μ̄_{A(x̄)} > 0, and the assumption that the initial dual iterate is close enough to such a multiplier. Also, ^{[}^{42}^{]} gives an analysis in the presence of round-off errors. In ^{[}^{44}^{]}, the assumption of strict complementarity has been removed. Also, ^{[}^{43}^{]} suggests a certain inexact SQP framework which includes sSQP as a particular case. The assumptions, however, still contain MFCQ. In ^{[}^{19}^{]}, CQs are not used, at the expense of employing, instead of the weaker SOSC (7), the strong SOSC (SSOSC)

$$
\left\langle \frac{\partial^2 L}{\partial x^2}(\bar{x}, \bar{\lambda}, \bar{\mu}) \xi, \xi \right\rangle > 0 \quad \forall\, \xi \in C_+(\bar{x}, \bar{\mu}) \setminus \{0\}, \tag{10}
$$

where

$$
C_+(\bar{x}, \bar{\mu}) = \{ \xi \in \mathbf{R}^n \mid h'(\bar{x}) \xi = 0, \ g'_{A_+(\bar{x}, \bar{\mu})}(\bar{x}) \xi = 0 \},
$$

and assuming that the dual starting point is close to a multiplier satisfying this SSOSC. (SSOSC (10) is stronger than SOSC (7) because *C*(x̄) ⊂ *C*_{+}(x̄, μ̄).) In ^{[}^{12}^{]}, the result of ^{[}^{19}^{]} was recovered from more general principles (some details will be discussed below), and somewhat sharpened. The iterative framework of ^{[}^{12}^{]} was further used in ^{[}^{10}^{]} to prove local superlinear convergence using SOSC (7) only, with no CQs or other assumptions. Moreover, the method was extended to variational problems (see Section 4 below). Quasi-Newton versions of sSQP are analyzed under SOSC in ^{[}^{7}^{]}. In ^{[}^{27}^{]} it was shown that SOSC cannot be relaxed when inequality constraints are present, but for equality-constrained problems the weaker condition of noncriticality of the relevant Lagrange multiplier (see the definition immediately below) is sufficient for convergence.

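A Lagrange multiplier (λ̄, μ̄) ∈ *M*(x̄) is said to be *critical* if there exists a triple (ξ, η, ζ) ∈ **R**^{n} × **R**^{l} × **R**^{m}, with ξ ≠ 0, satisfying the system

$$
\begin{array}{c}
\dfrac{\partial^2 L}{\partial x^2}(\bar{x}, \bar{\lambda}, \bar{\mu}) \xi + (h'(\bar{x}))^T \eta + (g'(\bar{x}))^T \zeta = 0, \quad h'(\bar{x}) \xi = 0, \quad g'_{A_+(\bar{x}, \bar{\mu})}(\bar{x}) \xi = 0, \\
\zeta_{A_0(\bar{x}, \bar{\mu})} \ge 0, \quad g'_{A_0(\bar{x}, \bar{\mu})}(\bar{x}) \xi \le 0, \quad \zeta_i \langle g_i'(\bar{x}), \xi \rangle = 0, \ i \in A_0(\bar{x}, \bar{\mu}), \quad \zeta_{\{1, \ldots, m\} \setminus A(\bar{x})} = 0,
\end{array} \tag{11}
$$

and *noncritical* otherwise. We refer the reader to ^{[}^{23}^{,}^{24}^{,}^{26}^{,}^{27}^{,}^{21}^{,}^{22}^{]} for the role this notion plays in convergence properties of algorithms, stability, error bounds, and other issues; see also [^{28}, Chapter 7]. Some comments will also be given below. When there are no inequality constraints, it can be seen from (11) that a multiplier λ̄ ∈ *M*(x̄) being critical means that

$$
\exists\, \xi \in \ker h'(\bar{x}) \setminus \{0\} \quad \text{such that} \quad \frac{\partial^2 L}{\partial x^2}(\bar{x}, \bar{\lambda}) \xi \in \operatorname{im} (h'(\bar{x}))^T. \tag{12}
$$

It can be easily seen, essentially observing that im (*h'*(x̄))^{T} = (ker *h'*(x̄))^{⊥}, that this noncriticality property for equality-constrained problems is implied by the corresponding version of SOSC (7), but not vice versa. The same conclusion holds for the general case: if (11) has a solution with ξ ≠ 0, multiplying the first equality in (11) by this ξ and using the other relations in (11), one arrives at a contradiction with SOSC (7). It should be emphasized that SOSC is a much stronger assumption than noncriticality. Noncritical multipliers, when they exist, form a relatively open and dense subset of the multiplier set *M*(x̄), which is of course not so for multipliers satisfying SOSC.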

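We are now in position to formally state local convergence properties of sSQP ^{[}^{10}^{,}^{27}^{]}; see also [^{28}, Chapter 7]. Note that in Theorem 1 below, if there are equality constraints only, everything that involves the multiplier *µ* disappears from the statement. Note also that in the equality-constrained case, finding a stationary point of the sSQP subproblem (8) is equivalent to solving the linear system of equations

$$
\begin{pmatrix} H_k & (h'(x^k))^T \\ h'(x^k) & -\sigma_k I \end{pmatrix} \begin{pmatrix} x - x^k \\ \lambda - \lambda^k \end{pmatrix} = - \begin{pmatrix} \dfrac{\partial L}{\partial x}(x^k, \lambda^k) \\ h(x^k) \end{pmatrix} \tag{13}
$$

in the variables (*x*, *λ*).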
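The equality-constrained iteration can be sketched in a few lines of Python. The sketch below is an illustration under explicit assumptions, not code from the cited references: it assumes that the linear system has the block form with rows (H, h'(x)^T) and (h'(x), -σ I) acting on the step (x - x^k, λ - λ^k), with right-hand side -(∂L/∂x(x^k, λ^k), h(x^k)); it takes H to be the exact Hessian of the Lagrangian and σ to be the Euclidean natural residual; and it uses a hypothetical one-dimensional test problem, minimize x^2 subject to x^2 = 0, whose constraint gradient vanishes at the solution so that no CQ holds.

```python
import math


def ssqp_equality(x, lam, tol=1e-12, max_iter=50):
    """sSQP iterations for: minimize x^2 subject to x^2 = 0 (one variable, one constraint).

    The constraint gradient vanishes at the solution x = 0, so every real number
    is a Lagrange multiplier there and no constraint qualification holds.
    """
    history = []
    for _ in range(max_iter):
        grad_L = 2.0 * x * (1.0 + lam)   # dL/dx for L = x^2 + lam * x^2
        h = x * x                        # constraint value
        sigma = math.hypot(grad_L, h)    # natural residual, used as sigma_k
        history.append(sigma)
        if sigma < tol:
            break
        hess = 2.0 * (1.0 + lam)         # exact Hessian of the Lagrangian
        jac = 2.0 * x                    # h'(x)
        # Solve [[hess, jac], [jac, -sigma]] (dx, dlam) = -(grad_L, h) by Cramer's rule.
        det = -hess * sigma - jac * jac
        dx = (grad_L * sigma + h * jac) / det
        dlam = (-hess * h + jac * grad_L) / det
        x, lam = x + dx, lam + dlam
    return x, lam, history


x_final, lam_final, history = ssqp_equality(0.1, 0.5)
print(x_final, lam_final)
print(history)  # residuals per iteration
```

In this toy problem the Hessian of the Lagrangian at the solution is 2(1 + λ), so the second-order condition holds for multipliers λ > -1, and under the kernel-and-image characterization of criticality only λ = -1 would be critical. The dual stabilization term -σ keeps the 2 × 2 matrix nonsingular even though the constraint Jacobian degenerates as x approaches 0.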

**Theorem 1.** *Let f*: **R**^{n} → **R**, *h*: **R**^{n} → **R**^{l} *and g*: **R**^{n} → **R**^{m} *be twice differentiable in a neighborhood of x̄, with their second derivatives being continuous at x̄. Let σ_{k} = σ(x^{k}, λ^{k}, µ^{k}), where σ is given by (9). Let (x̄, λ̄, μ̄) be a solution of the KKT system (2), satisfying SOSC (7). If there are equality constraints only, let instead λ̄ be a noncritical multiplier (i.e., there is no ξ ≠ 0 satisfying (12)).*

*Then for any c > 0 large enough and any starting point (x^{0}, λ^{0}, µ^{0}) ∈* **R**^{n} × **R**^{l} × **R**^{m}_{+} *close enough to (x̄, λ̄, μ̄), there exists a sequence {(x^{k}, λ^{k}, µ^{k})} ⊂* **R**^{n} × **R**^{l} × **R**^{m} *such that for each k = 0, 1, ..., x^{k+1} is a stationary point of the sSQP subproblem (8) with associated Lagrange multipliers (λ^{k+1}, µ^{k+1}), which satisfies*

$$
\| (x^{k+1} - x^k, \lambda^{k+1} - \lambda^k, \mu^{k+1} - \mu^k) \| \le c \operatorname{dist}\left( (x^k, \lambda^k, \mu^k), \{\bar{x}\} \times M(\bar{x}) \right); \tag{14}
$$

*any such sequence converges to (x̄, λ^{∗}, µ^{∗}) with some (λ^{∗}, µ^{∗}) ∈ M(x̄), and the rates of convergence of {(x^{k}, λ^{k}, µ^{k})} to (x̄, λ^{∗}, µ^{∗}) and of {dist((x^{k}, λ^{k}, µ^{k}), {x̄} × M(x̄))} to zero are superlinear. Moreover, the rates of convergence are quadratic provided the second derivatives of f, h and g are locally Lipschitz-continuous with respect to x̄.*

Some comments are in order. Under SSOSC (10), solutions of sSQP subproblems (8) can in addition be shown to be unique in some neighbourhood ^{[}^{10}^{]}. Furthermore, in the equality-constrained case, locally and under the noncriticality assumption, the linear system (13) has a unique solution, i.e., the sSQP subproblem has a unique stationary point ^{[}^{27}^{]}. For the equality-constrained case, sSQP is the only currently known method that solves a linear system or a QP per iteration (i.e., an explicitly Newtonian method), requires for convergence something weaker than SOSC (in particular, noncriticality of the multiplier), and does not need any CQs. In ^{[}^{27}^{]} it is shown that when there are inequality constraints, SOSC cannot be replaced by noncriticality. Whether convergence of the primal (rather than primal-dual) sequence generated by sSQP is also superlinear is an open question ^{[}^{8}^{]}. Recall that in general, superlinear convergence of the primal-dual sequence does not imply any rate for the primal (or dual) sequence separately [^{4}, Exercise 14.8]. For SQP, the primal rate is superlinear ^{[}^{8}^{]}. For sSQP, only a kind of "two-step" superlinear estimate for the primal sequence is available ^{[}^{7}^{]}.

We illustrate the convergence result in Theorem 1 with the following example.

**Example 1.** Consider the optimization problem

It can be seen that x̄ = (0, 0) is the unique solution of this problem, and that the associated set of Lagrange multipliers is given by

In particular, MFCQ (6) does not hold and *M*(x̄) is unbounded. Furthermore, SOSC (7) holds at (x̄, μ̄) for any μ̄ ∈ *M*(x̄) with μ̄_{1} > 0, but SSOSC (10) is not satisfied for any multiplier.

Experiments were performed with a Matlab implementation of sSQP, using the built-in subroutine quadprog for solving QP subproblems (8) and choosing random starting points *x*_{i}^{0} ∈ [-1/2, 1/2], *i* = 1, 2, and *µ*_{j}^{0} ∈ [0, 1], *j* = 1, 2, 3. The stopping criterion is σ(*x*^{k}, *µ*^{k}) < 10^{-15}.

In about 10% of the cases, the sequence converged linearly to (x̄, μ̄) with μ̄_{1} = 0 (SOSC is not valid at this solution). Such cases appear to correspond to the choices of starting points that are not close enough to a solution satisfying SOSC (so that Theorem 1 does not apply). About 3% of the starting points produced unsolvable subproblems at the first iteration (for the same reason as above: starting points not being close enough to a solution). All the remaining runs converged superlinearly to a primal-dual solution satisfying SOSC. Table 1 shows the average values of ║*x*^{k} - x̄║ + dist(*µ*^{k}, *M*(x̄)) for the last 5 iterations in the cases of convergence to a primal-dual solution satisfying SOSC.

The analysis that leads to Theorem 1 relies on the variational Newtonian framework for generalized equations (GEs) with nonisolated solutions, developed in ^{[}^{12}^{]}; see also [^{28}, Chapter 7] (in our context, in the absence of CQs dual solutions of the KKT system (2) are not isolated). To that end, consider the generalized equation (GE)

$$
\Phi(u) + N(u) \ni 0, \tag{15}
$$

where Φ: **R**^{ν} → **R**^{ν} is a smooth (single-valued) mapping, and *N*(·) is a set-valued mapping from **R**^{ν} to the subsets of **R**^{ν}. As is well known, the KKT system (2) corresponds to the GE (15) with ν = *n* + *l* + *m*, the mapping Φ: **R**^{n} × **R**^{l} × **R**^{m} → **R**^{n} × **R**^{l} × **R**^{m} given by

$$
\Phi(u) = \left( \frac{\partial L}{\partial x}(x, \lambda, \mu), \ -h(x), \ -g(x) \right), \quad u = (x, \lambda, \mu), \tag{16}
$$

and with *N* being the normal cone to the set

$$
Q = \mathbf{R}^n \times \mathbf{R}^l \times \mathbf{R}^m_+. \tag{17}
$$

Consider the class of methods for solving (15) that, given the current iterate *u*^{k} ∈ **R**^{ν}, generate the next iterate *u*^{k+1} as a solution of the subproblem of the form

where for *ũ* ∈ **R**^{ν} the mapping A (ũ,·) is some
kind of approximation of Φ around ũ. For example, if

the iteration subproblem (18) becomes that of the Josephy-Newton method for GEs (see
[^{28}, Chapter 3]), and when applied to the KKT
system (2) (i.e., the special case of GE described above), it corresponds to the SQP
subproblem (3). For each ũ ∈ **R**^{ν}, define the set

so that *U*(*u*^{k}) is the solution set of the iteration subproblem (18). As is usual and natural in local convergence considerations, one has to specify which of the solutions of (18) are allowed to be the next iterate (solutions "far away" must clearly be discarded from local analysis; note that the sSQP subproblem (8) need not be strongly convex and thus may have such "far away" solutions). In other words, we have to restrict the distance from the current iterate *u*^{k} to the next one, i.e., to an element of *U*(*u*^{k}) that can be declared to be *u*^{k+1}. To that end, for an arbitrary but fixed *c* > 0 define the subset of the solution set of the subproblem (18) by

where Ū is the solution set of the GE (15), and consider the iterative scheme

Superlinear convergence of this scheme is established under the following three conditions:

(i) *Upper Lipschitzian behavior of solutions of GE under canonical
perturbations* - For every *r* ∈ **R**^{ν}
close enough to 0, any solution *u*(*r*) of the perturbed
GE

close enough to u satisfies the estimate

(ii) *Precision of approximation of* Φ *in subproblems* -
There exists a function *ω*: **R**_{+} →
**R**_{+} such that *ω*(*t*) =*
o *(*t*) as *t* → 0 and the estimate

holds for all ũ ∈**R**^{ν} close enough to u.

(iii) *Solvability of subproblems with the localization condition* - For
any ũ ∈**R**^{ν} close enough to u the set
*U _{c}* (ũ) defined by (19), (20) is nonempty.

Some comments are in order. For GE corresponding to the KKT system (2), the canonically perturbed problem has the form

for *r* = (*a,b,c*) ∈
**R**^{n} ×
**R**^{l} ×
**R**^{m}. For KKT systems, the upper
Lipschitzian behavior of solutions under canonical perturbations (the first assumption
above) is equivalent to noncriticality of the Lagrange multiplier (under the smoothness
assumptions in this survey) ^{[}^{21}^{]}. In particular, it is implied by the SOSC (7). The second
assumption above naturally holds for Newton-type methods, and in particular for sSQP if
the stabilization parameter is properly chosen (for example, based on the KKT natural
residual (9)). The third assumption on solvability of subproblems and localization
condition is where the most work is required ^{[}^{10}^{,}^{27}^{]}. And it
is here where noncriticality of the multiplier needs to be strengthened to SOSC if
inequality constraints are present.

We finally note that sSQP can also be interpreted within the perturbed Josephy-Newton
framework of ^{[}^{25}^{]}; see also
[^{28}, Chapter 3]. However, the main convergence
result in this framework requires SMFCQ. If the method is interpreted instead via
^{[}^{12}^{,}^{10}^{]} as outlined above, no CQs are needed
to prove local convergence. It is also interesting to mention that a modification of the
Newtonian framework of ^{[}^{12}^{]}
is used in ^{[}^{11}^{]} to derive
local convergence and rate of convergence results for the augmented Lagrangian algorithm
(method of multipliers) under SOSC (7) only, significantly improving on the classical
results such as in ^{[}^{1}^{]} that
assume in addition LICQ (5) and strict complementarity. Moreover, for the
equality-constrained case SOSC can be relaxed to noncriticality ^{[}^{22}^{]}, as is the case for sSQP (the
required analysis is very different though). It is interesting that even though the
augmented Lagrangian method is not of Newton type, the Newtonian lines of analysis
turned out to be very fruitful in this context as well.

3 GLOBALIZATION ISSUES

As any Newtonian method, sSQP is a local scheme, guaranteed to converge if initialized at a point close enough to a solution with the properties discussed in Section 2. To obtain a complete algorithm, some strategy to globalize convergence is needed (so that arbitrary starting points can be used). This proved to be a rather difficult task. Recall that to globalize SQP at least three different approaches are available; see [^{28}, Chapter 6]. Globalization can be organized using linesearch [^{4}, Chapter 17] or trust-region [^{5}, Chapter 15.4] strategies for a nonsmooth penalty function, or the filter technique ^{[}^{13}^{,}^{14}^{,}^{37}^{]}. For example, if a positive definite matrix *H*_{k} is employed in the QP (3), then the generated direction *x*^{k+1} - *x*^{k} is that of descent for the penalty function

$$
\varphi_{c_k}(x) = f(x) + c_k \left\| \left( h(x), \max\{0, g(x)\} \right) \right\|_1 \tag{24}
$$

provided one takes *c*_{k} > ║(*λ*^{k+1}, *µ*^{k+1})║_{∞}. One can then perform linesearch in the obtained direction to guarantee progress towards solving (1) via decreasing the penalty function with respect to its value at the previous iterate *x*^{k}, and then re-defining *x*^{k+1} accordingly. Finding a suitable penalty function for which the direction computed by the sSQP subproblem (8) is of descent proved a challenge. In particular, penalty functions like (24), or other "usual" candidates, do not do the job.
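The linesearch globalization of standard SQP described above can be sketched as follows. This Python fragment is a hedged illustration only: it assumes an l1-type penalty for an equality-constrained problem, an Armijo backtracking rule with illustrative parameters `eps` and `shrink`, and a hypothetical one-dimensional toy problem; the helper names `l1_penalty` and `armijo_backtracking` are not taken from the cited references.

```python
def l1_penalty(f, h, c):
    """Return phi_c(x) = f(x) + c * sum_i |h_i(x)| for an equality-constrained problem."""
    return lambda x: f(x) + c * sum(abs(v) for v in h(x))


def armijo_backtracking(phi, x, d, slope, eps=0.1, shrink=0.5, max_halvings=30):
    """Largest t in {1, shrink, shrink^2, ...} with phi(x + t d) <= phi(x) + eps * t * slope."""
    phi_x = phi(x)
    t = 1.0
    for _ in range(max_halvings):
        trial = [xi + t * di for xi, di in zip(x, d)]
        if phi(trial) <= phi_x + eps * t * slope:
            return t, trial
        t *= shrink
    raise RuntimeError("no acceptable stepsize found")


# Toy data (assumed for illustration): f(x) = x^2, h(x) = x - 1, current point x^k = 3.
f = lambda x: x[0] ** 2
h = lambda x: [x[0] - 1.0]
x_k = [3.0]
# SQP direction with H_k = 2: the linearized constraint gives d = -2, and the QP
# multiplier is -2, so any penalty parameter c > 2 is admissible.
d_k, c_k = [-2.0], 3.0
# Directional derivative of phi_c along d when h(x^k) + h'(x^k) d = 0:
# <f'(x^k), d> - c * ||h(x^k)||_1.
slope = 2.0 * x_k[0] * d_k[0] - c_k * sum(abs(v) for v in h(x_k))
t, x_next = armijo_backtracking(l1_penalty(f, h, c_k), x_k, d_k, slope)
print(t, x_next)
```

The point of the sketch is the role of the penalty parameter: the SQP direction is a descent direction for the penalty function only when the parameter dominates the multiplier estimate, which is the property that fails to carry over directly to sSQP directions.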

Some numerical results on global behaviour of sSQP, without attempting to globalize the
method itself, are reported in ^{[}^{33}^{]}, but this experience is rather limited (just a few test
problems are considered). More test problems have been employed in ^{[}^{26}^{]} but globalization used there is a
heuristic not supported by a proof. As for any local algorithm, so-called "hybrid"
strategies can certainly be used (see [^{28},
Chapter 5] for one family of hybrid globalizations of local methods for variational
problems). Something like this is certainly applicable to sSQP. In ^{[}^{20}^{]}, this approach was implemented in
conjunction with the augmented Lagrangian as the globally convergent method. We next
survey the few more direct approaches to globalize sSQP that have been proposed so
far.

In ^{[}^{39}^{]} the globalization
technique is based on a linesearch for the so-called primal-dual augmented Lagrangian
^{[}^{17}^{]}. This work deals
with optimization problems in the format

$$
\min f(x) \quad \text{subject to} \quad h(x) = 0, \quad x \ge 0. \tag{25}
$$

(The more general problem (1) can be reformulated into this setting using slack
variables.) For the optimization problem (25), taking *µ =
f'*(*x*) +
(*h'*(*x*))^{T} λ, the
natural residual (9) can be written as

In ^{[}^{39}^{]} the nonnegativity
constraint is excluded from stabilization, and instead of (8), the sSQP subproblem is
given by

where λ̂^{k} is a reference Lagrange multiplier estimate and *H*_{k} is a symmetric matrix such that *H*_{k} + (1/σ_{k})(*h'*(*x*^{k}))^{T} *h'*(*x*^{k}) is positive definite. It should be noted that there does not seem to be any theory to justify that this "partial" (excluding the constraint *x* ≥ 0) stabilization in (26) actually inherits local convergence properties of sSQP under some reasonable assumptions. In terms of local convergence, the idea of ^{[}^{39}^{]} is to use in addition identification of active inequality constraints, so that the overall algorithm eventually becomes sSQP for the associated equality-constrained problem. From the analysis in ^{[}^{39}^{]}, if (*x*^{k+1}, λ^{k+1}) is the (unique) solution of (26) then (*x*^{k+1} - *x*^{k}, λ^{k+1} - λ^{k}) is a descent direction at (*x*^{k}, λ^{k}) for the primal-dual penalty function

where *c* > 0 is a fixed parameter. This penalty function is minimized using linesearch, thus re-defining (*x*^{k+1}, λ^{k+1}). The reference Lagrange multiplier λ̂^{k+1} is updated to λ^{k+1} if either the weighted natural residual for problem (25) is small or if the natural residual for the problem of minimizing ϕ_{c}(*x*, λ; λ̂^{k}, σ_{k}) subject to *x* ≥ 0 is small. The dual stabilization parameter σ_{k} and other algorithmic parameters are updated by certain rules. According to [^{39}, Theorem 4.2], if the generated sequence {*x*^{k}} is bounded and the sequence {*H*_{k}} is chosen bounded with {*H*_{k} + (1/σ_{k})(*h'*(*x*^{k}))^{T} *h'*(*x*^{k})} being also uniformly positive definite, then either there exists an index set *K* such that lim_{K∋k→∞} σ(*x*^{k}, λ^{k}) = 0 (accumulation points of this subsequence solve the KKT system of the problem) or there exists an index set *S* such that lim_{S∋k→∞} σ_{k} = 0, {λ̂^{k}}_{k∈S} is bounded and

Another globalization strategy is proposed in ^{[}^{9}^{]}. It is based on the inexact restoration ideas
^{[}^{32}^{]}, and uses
linesearch for a primal-dual nondifferentiable penalty function. This work considers
problems in the format

$$
\min f(x) \quad \text{subject to} \quad h(x) = 0, \quad a \le x \le b, \tag{28}
$$

where *a* and *b* are finite bounds. For this problem, the
natural residual (9) is given by

The corresponding sSQP subproblem again employs only partial stabilization (leaving out the bounds), and has the form

where λ̂^{k} is a reference Lagrange multiplier approximation and *H*_{k} is a symmetric positive definite matrix. The penalty function used in ^{[}^{9}^{]} is

Note that this is a penalty function for the problem of minimizing *f*(*x*) + (σ_{k}/2)║λ║^{2} subject to *h*(*x*) - σ_{k}(λ - λ̂^{k}) = 0, and the latter problem is equivalent to minimizing *f*(*x*) + ‹λ̂^{k}, *h*(*x*)› + 1/(2σ_{k})║*h*(*x*)║^{2}. Thus, this penalty function is also related to the augmented Lagrangian.

The inexact restoration strategy presented in ^{[}^{9}^{]} can be interpreted as a two-step linesearch for the function (30), denoted here by ϕ_{c_k}, where in the first step the penalty parameter *c*_{k} is increased in order to achieve ϕ_{c_k}(*x*^{k}, λ̃^{k}; λ̂^{k}, σ_{k}) < ϕ_{c_k}(*x*^{k}, λ^{k}; λ̂^{k}, σ_{k}) with λ̃^{k} = λ̂^{k} + (1/σ_{k})*h*(*x*^{k}), and in the second step linesearch is performed along the direction (*x*^{k+1} - *x*^{k}, λ^{k+1} - λ̃^{k}) to re-define (*x*^{k+1}, λ^{k+1}) so that ϕ_{c_k}(*x*^{k+1}, λ^{k+1}; λ̂^{k}, σ_{k}) ≤ ϕ_{c_k}(*x*^{k}, λ^{k}; λ̂^{k}, σ_{k}). If the linesearch direction is small enough (with respect to the inexact restoration criteria), then (*x*^{k+1}, λ^{k+1}) is accepted as the new primal-dual iterate and the reference Lagrange multiplier λ̂^{k+1} is updated to λ^{k+1}. The dual stabilization parameter σ_{k} is updated by a suitable rule. According to [^{9}, Theorem 2], if the sequence of matrices {*H*_{k}} is chosen uniformly bounded and uniformly positive definite, and if x̄ is an accumulation point of the sequence {*x*^{k}}, then x̄ is a stationary point of the problem (28) if {σ_{k}} is bounded away from zero, or it is a stationary point of the problem of minimizing the infeasibility measure ║*h*(*x*)║^{2} subject to *a* ≤ *x* ≤ *b* if {σ_{k}} converges to zero. The algorithm of ^{[}^{9}^{]} solves sSQP subproblems, but in a sense they can be considered as "inner iterations" within the inexact restoration scheme (which drives global convergence of the method). In particular, it is not known whether locally the algorithm indeed behaves as sSQP under some assumptions (i.e., solves only one sSQP subproblem per inexact restoration iteration).

Another very recent and promising proposal ^{[}^{29}^{]} is based on linesearch in sSQP directions for the following
two-parameter primal-dual exact penalty function:

where *c*_{1} > 0, *c*_{2} > 0. This
function was originally introduced and studied in ^{[}^{6}^{]}; see also ^{[}^{1}^{]}. Provided the penalty parameters are updated by certain
appropriate rules, very reasonable global convergence properties are established in
^{[}^{29}^{]}. Moreover, near
qualified solutions (stationary point - noncritical multiplier pairs), the sSQP
directions are always accepted by the algorithm, and then the unit stepsize in those
directions is accepted by the Armijo linesearch rule. Thus, the globalized scheme
inherits fast local convergence of sSQP, under weak assumptions.

We next comment on some other ideas for sSQP globalization, which led to partial developments of some promise but have not materialized into complete algorithms so far.

A globalization strategy can be attempted using the principle of the augmented Lagrangian method, i.e., decrease the augmented Lagrangian function in the primal space and increase it in the dual. Consider the classical augmented Lagrangian for problem (1):

σ > 0. It can be seen that if (*x*^{k+1}, λ^{k+1}, *µ*^{k+1}) is a solution of the sSQP subproblem (8), then (*x*^{k+1} - *x*^{k}, λ^{k+1} - λ^{k}, *µ*^{k+1} - *µ*^{k}) is a descent direction at (*x*^{k}, λ^{k}, *µ*^{k}) for the "difference of two augmented Lagrangians" function

for any σ̃_{k} ∈ [σ_{k}/2, σ_{k}]. Moreover, the directional derivative is less than -Δ_{k}, where

It can be shown that if {Δ_{k}} tends to zero, the sequence of matrices {*H*_{k}} is chosen uniformly bounded and uniformly positive definite and (x̄, λ̄, μ̄) is a limit point of the sequence {(*x*^{k}, λ^{k}, *µ*^{k})}, then x̄ is a stationary point of the problem (1) if {σ_{k}} is bounded away from zero, or it is a stationary point of the problem of minimizing the infeasibility measure ║*h*(*x*)║^{2} + ║max{0, *g*(*x*)}║^{2} if {σ_{k}} converges to zero. However, so far there are no reasonable hypotheses to guarantee Δ_{k} → 0 from a standard linesearch. From another point of view, this strategy is related to finding a solution of an equilibrium problem. If for some σ, σ̃ > 0 it holds that (x̄, λ̄, μ̄) is a solution of the optimization problem min_{(x, λ, µ)} ψ_{σ,σ̃}(*x*, λ, *µ*; x̄, λ̄, μ̄), then (x̄, λ̄, μ̄) solves the KKT system (2). Conversely, if (x̄, λ̄, μ̄) is a solution of the KKT system (2) satisfying SOSC (7), then (x̄, λ̄, μ̄) is a local minimizer of the latter problem.

Another issue concerned with global convergence of sSQP has to do with possible
attraction of the iterates to critical Lagrange multipliers, and eventual slow
convergence rate as a consequence; see [^{28},
Chapter 7]. In ^{[}^{23}^{,}^{24}^{,}^{26}^{,}^{30}^{]} this
phenomenon was exhibited for various Newtonian and Newton-related methods, such as SQP
and its quasi-Newton implementations, and the linearly constrained (augmented)
Lagrangian methods ^{[}^{38}^{,}^{34}^{,}^{15}^{]}. Both theoretical considerations and
numerical results for SNOPT ^{[}^{16}^{]} and MINOS ^{[}^{35}^{]} solvers were presented, which put in evidence that when
critical multipliers exist, they serve as attractors of the dual sequence generated by
the type of methods in question. Moreover, the reason for slow convergence in the
degenerate cases is precisely attraction to critical multipliers, as convergence to
noncritical ones would have given the primal superlinear rate. Numerical results in
^{[}^{26}^{]} show that the
effect of attraction (globally, i.e., from "far away" points) to critical multipliers
still exists for sSQP too (when evaluating the numbers reported therein, it is important
to keep in mind that critical multipliers are typically few; the usual situation is that
they form a set of measure zero within the set of all multipliers), but the attraction
is much less persistent for sSQP than for the other algorithms. The runs clearly split
into two groups. Sometimes the (globalized, heuristically in that reference) process
manages to enter the "good" primal-dual region, where the stabilization term starts
working properly (has the needed "size"), and then it converges superlinearly with the
dual limit being noncritical. However, in a considerable number of cases this does not
happen, and then the process still converges slowly to a critical multiplier. Thus,
although sSQP does help when compared to the alternatives, by itself it does not seem to
be a fully reliable tool for avoiding the effect of attraction to critical multipliers
and its negative consequences. It would seem that some special modifications would be
needed in the "global" phase of the method to reliably avoid convergence to critical
multipliers, without slowing down the overall process. Those are also some of the
conclusions from the numerical results in ^{[}^{29}^{]}.

Overall, building really satisfactory globalization techniques for sSQP is a challenging matter, which (for general problems) should still be considered an open question at this time.

4 EXTENSIONS TO VARIATIONAL PROBLEMS

Denote

$$
D = \{ x \in \mathbf{R}^n \mid h(x) = 0, \ g(x) \le 0 \},
$$

and let *N*_{D}(*x*) be the dual cone of the tangent (contingent) cone *T*_{D}(*x*) to the set *D* at *x* ∈ **R**^{n}, i.e., *N*_{D}(*x*) = ∅ for *x* ∉ *D* and otherwise *N*_{D}(*x*) = (*T*_{D}(*x*))^{◦}, where

Consider the variational problem (VP)

where *F:*
**R**^{n} →
**R**^{n}; see [^{28}, Chapters 1 and 3]. In particular, for the optimization problem (1) this
VP represents the first-order necessary optimality condition

if we take *F*(*x*) =
*f'*(*x*). If the set *D* is convex,
then (31) gives the usual variational inequality

Associated to solving VP (31) is the KKT system

Define the mapping *G:*
**R**^{n} ×
**R**^{l} × **R**^{m}
→ **R**^{n} by

Let (*x*^{k}, *λ*^{k}, *µ*^{k}) ∈ **R**^{n} × **R**^{l} × **R**^{m}_{+} be the current primal-dual approximation to a solution of (32), and let σ_{k} > 0 be the dual stabilization parameter. Define the affine mapping Φ_{k}: **R**^{n} × **R**^{l} × **R**^{m} → **R**^{n} × **R**^{l} × **R**^{m} by

and consider the affine VI of the form

where

As can be easily seen, in the optimization case (1) the VI (33) is precisely the first-order (primal) necessary optimality condition for the sSQP subproblem (8), if one takes *F*(*x*) = *f'*(*x*). Thus this scheme contains sSQP for optimization as a special case. Note that the method makes good sense also in the variational setting, as solving the fully nonlinear VP (31) is replaced by solving a sequence of affine VIs (33) (the mapping Φ_{k} is affine and the set *Q*_{k} is polyhedral). If the set *D* is defined by equality constraints only, then it can be seen that (33) is just a system of linear equations.

Convergence analysis of this stabilized Newton method for variational problems can be
found in ^{[}^{10}^{]}; see also
[^{28}, Chapter 7].

5 CONCLUDING REMARKS

We presented a survey of literature and some discussion of the stabilized version of the
fundamental sequential quadratic programming method for constrained optimization.
Further material, in particular comprehensive local convergence analysis, can be found
in the book ^{[}^{28}^{]}.