SciELO - Scientific Electronic Library Online

 

Computational & Applied Mathematics

On-line version ISSN 1807-0302

Comput. Appl. Math. vol.28 no.2 São Carlos  2009

http://dx.doi.org/10.1590/S1807-03022009000200003 

A filter SQP algorithm without a feasibility restoration phase*

 

 

Chungen Shen (I, II); Wenjuan Xue (III); Dingguo Pu (I)

(I) Department of Mathematics, Tongji University, China, 200092
(II) Department of Applied Mathematics, Shanghai Finance University, China, 201209
(III) Department of Mathematics and Physics, Shanghai University of Electric Power, China, 200090. E-mail: shenchungen@gmail.com

 

 


ABSTRACT

In this paper we present a filter sequential quadratic programming (SQP) algorithm for solving constrained optimization problems. The algorithm is based on the modified quadratic programming (QP) subproblem proposed by Burke and Han, which avoids infeasibility of the QP subproblem at each iteration. Unlike other filter SQP algorithms, our algorithm does not require a restoration phase, which may demand a large amount of computation. We emphasize that global convergence is derived without assuming any constraint qualification. Preliminary numerical results are reported.
Mathematical subject classification: 65K05, 90C30.

Key words: filter, SQP, constrained optimization, restoration phase.


 

 

1 Introduction

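In this paper, we consider the constrained optimization problem:

$$(\mathrm{NLP}) \qquad \min_{x \in \mathbb{R}^n} f(x) \quad \text{s.t.} \quad c_{\mathcal{E}}(x) = 0, \quad c_{\mathcal{I}}(x) \le 0,$$

where $c_{\mathcal{E}}(x) = (c_1(x), c_2(x), \dots, c_{m_1}(x))^T$, $c_{\mathcal{I}}(x) = (c_{m_1+1}(x), c_{m_1+2}(x), \dots, c_m(x))^T$, $\mathcal{E} = \{1, 2, \dots, m_1\}$, $\mathcal{I} = \{m_1+1, m_1+2, \dots, m\}$, and $f: \mathbb{R}^n \to \mathbb{R}$ and $c_i: \mathbb{R}^n \to \mathbb{R}$, $i \in \mathcal{E} \cup \mathcal{I}$, are twice continuously differentiable functions. The feasible set of the problem (NLP) is denoted by $X = \{x \in \mathbb{R}^n \mid c_{\mathcal{E}}(x) = 0,\ c_{\mathcal{I}}(x) \le 0\}$.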

The sequential quadratic programming (SQP) method has been widely used for solving the problem (NLP). It generates a sequence $\{x_k\}$ converging to the desired solution by solving the quadratic programming (QP) subproblem:

$$\min_{d \in \mathbb{R}^n} \; q(d) = \nabla f(x_k)^T d + \tfrac{1}{2} d^T B_k d \quad \text{s.t.} \quad c_{\mathcal{E}}(x_k) + \nabla c_{\mathcal{E}}(x_k)^T d = 0, \quad c_{\mathcal{I}}(x_k) + \nabla c_{\mathcal{I}}(x_k)^T d \le 0, \quad \|d\|_\infty \le \rho, \tag{1.1}$$

where $\rho > 0$ is the trust region radius and $B_k \in \mathbb{R}^{n \times n}$ is a symmetric positive definite matrix. At each iterate $x_k$, the solution $d$ of the QP subproblem is regarded as a trial step and the next trial iterate has the form $x_k + d$. The acceptance of this trial step depends on whether the trial iterate $x_k + d$ decreases some merit function. Generally, this merit function is a type of penalty function with parameters whose adjustment can be problematic. As an alternative to traditional merit function SQP methods, Fletcher and Leyffer [7] proposed a trust region filter SQP method to solve the problem (NLP). The computational results presented in Fletcher and Leyffer [7] are also very encouraging.

This topic has received considerable attention in recent years (see [6, 17, 18, 22, 23]). Trust region filter SQP methods have been studied by Fletcher, Leyffer and Toint in [8] and by Fletcher, Gould, Leyffer, Toint and Wächter in [9]. In the latter paper, an approximate solution of the QP subproblem is computed and the trial step is decomposed into normal and tangential components. Gonzaga, Karas and Vanti [10] presented a general framework for filter methods where each step is composed of a feasibility phase and an optimality phase. A similar filter method was proposed by Ribeiro, Karas and Gonzaga in [21]. In all these papers only the global convergence of the proposed methods is analyzed. On the other hand, in [24], Ulbrich studied the local convergence of a trust region filter SQP method. However, the components of the filter adopted in [24] differ from those in [7, 8, 9]. It should be noted that the filter approach has also been used in conjunction with the line search strategy (see Wächter and Biegler [26, 27]), with interior point methods (see Benson, Shanno and Vanderbei [5]; Ulbrich, Ulbrich and Vicente [25]) and with the pattern search method (see Audet and Dennis [1]; Karas, Ribeiro, Sagastizábal and Solodov [15]).

However, all filter algorithms mentioned above include the provision for a feasibility restoration phase if the QP subproblem becomes inconsistent. Although any method (e.g. [3, 4, 16]) for solving a nonlinear algebraic system of equalities and inequalities can be used to implement this calculation, it may require a large amount of computation. In this paper, we combine the filter technique with the modified QP subproblem to solve general constrained optimization problems. The main feature of our approach is that there is no restoration phase procedure. Another feature is that global convergence of our algorithm is established without assuming any constraint qualification.

This paper is organized as follows. In section 2, we describe how the modified QP subproblem is embedded in the filter algorithm. In section 3, it is proved that the algorithm is well defined and globally convergent to a stationary point. If the Mangasarian-Fromowitz constraint qualification (MFCQ) holds at this stationary point, then it is a KKT point. In section 4, we report some numerical results.

 

2 Algorithm

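It is well known that the Karush-Kuhn-Tucker (KKT) conditions for the problem (NLP) are:

$$\nabla_x L(x, \mu, \lambda) = 0, \quad c_{\mathcal{E}}(x) = 0, \quad c_{\mathcal{I}}(x) \le 0, \quad \lambda \ge 0, \quad \lambda^T c_{\mathcal{I}}(x) = 0,$$

where

$$L(x, \mu, \lambda) = f(x) + \mu^T c_{\mathcal{E}}(x) + \lambda^T c_{\mathcal{I}}(x)$$

is the Lagrangian function, and $\mu \in \mathbb{R}^{m_1}$ and $\lambda \in \mathbb{R}^{m-m_1}$ are the multipliers corresponding to the constraints of the problem (NLP).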

The SQP method generates iterates by solving the subproblem (1.1). However, the subproblem (1.1) may be inconsistent if $x_k$ is not a feasible point for the problem (NLP). To avoid this situation, we solve a modified QP subproblem instead of the subproblem (1.1). Before introducing the modified QP subproblem, we first solve the problem

where the scalar σk > 0 is used to restrict d in norm. Let k denote the solution of (2.4). If σk < ρ and ql (k ) = 0, then the QP subproblem (1.1) is consistent. The problem (2.4) can be reformulated as a linear program

where the vectors $e_{m_1} = (1, 1, \dots, 1)^T \in \mathbb{R}^{m_1}$ and $e_{m-m_1} = (1, 1, \dots, 1)^T \in \mathbb{R}^{m-m_1}$.

Now we define that (xk, σk ) and (xk ,σk ) are equal to 2 and c(xk)Tk + c(xk), respectively, where (k , 1, 2) denotes the solution of (2.5). At each iteration, we generate the trial step by solving the modified QP subproblem

Here, at each iterate, we require that ρk be greater than σk . So, the choice of σk depends on ρk.

Define Φ(xk ,σk ) = , where i (xk ,σk ) and i (xk ,σk ) denote the ith component of (xk ,σk ) and (xk ,σk ) respectively. Actually, Φ(xk ,σk ) = ql (k ) with the scalar σk > 0. Obviously, d = 0 is feasible for the problem (2.4) and then ql (k ) < ql (0) = V (xk), where the constraint violation V (x) is defined as

Therefore, Φ(xk ,σk ) < V (xk).
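The inequality above can be checked numerically on a toy problem. The following is a minimal sketch only: it assumes the ℓ1 form of the constraint violation, V(x) = Σ|c_i(x)| over the equalities plus Σ max(0, c_i(x)) over the inequalities, and the same measure applied to the linearized constraints for q_l(d). The function names, the toy data and the grid search (a crude stand-in for the linear program (2.5)) are illustrative assumptions, not part of the algorithm as stated.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def violation(c_eq: Vector, c_ineq: Vector) -> float:
    """Assumed l1 violation: |c_i| for equalities, max(0, c_i) for c_i <= 0."""
    return sum(abs(v) for v in c_eq) + sum(max(0.0, v) for v in c_ineq)


def linearized_violation(c_eq, grads_eq, c_ineq, grads_ineq, d) -> float:
    """q_l(d): the violation measure applied to the constraints linearized
    at x_k (one gradient row per constraint)."""
    lin_eq = [c + dot(g, d) for c, g in zip(c_eq, grads_eq)]
    lin_ineq = [c + dot(g, d) for c, g in zip(c_ineq, grads_ineq)]
    return violation(lin_eq, lin_ineq)


# Toy data at x_k = (2, 0): c_1(x) = x_1 + x_2 - 1 = 0, c_2(x) = x_1^2 - 1 <= 0.
c_eq, grads_eq = [1.0], [[1.0, 1.0]]
c_ineq, grads_ineq = [3.0], [[4.0, 0.0]]
sigma = 0.5

V_k = violation(c_eq, c_ineq)
q_l_at_zero = linearized_violation(c_eq, grads_eq, c_ineq, grads_ineq, [0.0, 0.0])

# Grid search over the box ||d||_inf <= sigma (illustrative substitute for an LP solver).
grid = [sigma * i / 5 for i in range(-5, 6)]
phi_grid = min(
    linearized_violation(c_eq, grads_eq, c_ineq, grads_ineq, [d1, d2])
    for d1 in grid
    for d2 in grid
)

print(V_k, q_l_at_zero, phi_grid)
```

Since d = 0 lies in the box, the minimum over the box can never exceed q_l(0) = V(x_k); in this toy example the minimum is attained at a corner of the box, so the grid search happens to find it exactly.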

Let d denote the solution of QP(xk, Bk, σk, ρk ). Let the trial point be = xk + d. The KKT conditions for the subproblem QP(xk, Bk, σk, ρk ) are

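where $e_n = (1, 1, \dots, 1)^T \in \mathbb{R}^n$, $\mu_k \in \mathbb{R}^{m_1}$, $\lambda_k \in \mathbb{R}^{m-m_1}$, $\lambda_k^l \in \mathbb{R}^n$ and $\lambda_k^u \in \mathbb{R}^n$.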

In our filter, we consider pairs of values (V (x), ƒ(x)). Definitions 1-4 below are very similar to those in [7, 8].

Definition 1. The iterate $x_k$ dominates the iterate $x_l$ if and only if $V(x_k) \le V(x_l)$ and $f(x_k) \le f(x_l)$. This is denoted by $x_k \preceq x_l$.

Thus, if $x_k \preceq x_l$, the latter is of no real interest to us since $x_k$ is at least as good as $x_l$ with respect to both the objective function value and the constraint violation. Furthermore, if $x_k \preceq x_l$, we say that the pair $(V(x_k), f(x_k))$ dominates the pair $(V(x_l), f(x_l))$.
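The domination test of Definition 1 amounts to a componentwise comparison of pairs. The sketch below reads both inequalities as non-strict ("at least as good"); the type alias and function name are hypothetical and chosen only for illustration.

```python
from typing import Tuple

Pair = Tuple[float, float]  # (V(x), f(x))


def dominates(a: Pair, b: Pair) -> bool:
    """True if pair a is at least as good as pair b in both the
    constraint violation and the objective value."""
    return a[0] <= b[0] and a[1] <= b[1]


better = dominates((0.1, 2.0), (0.3, 2.5))      # smaller violation and objective
incomparable = dominates((0.1, 3.0), (0.3, 2.5))  # smaller violation, larger objective
print(better, incomparable)
```

Two pairs can be mutually non-dominating, as in the second call; this is what allows a filter to hold several entries at once.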

Definition 2. The kth filter is a list of pairs $\{(V(x_l), f(x_l))\}_{l<k}$ such that no pair dominates any other.

Let k denote the indices in the kth filter, i.e.,

Filter methods accept a trial point $x_k + d$ if its corresponding pair $(V(x_k + d), f(x_k + d))$ is dominated neither by any pair in the kth filter nor by the pair corresponding to $x_k$, i.e., $(V(x_k), f(x_k))$.

In a filter algorithm, one accepts a new pair (V (), ƒ()) if it cannot be dominated by other pairs in the current filter. Although the definition of filter is simple, it needs to be refined a little in order to guarantee the global convergence.

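Definition 3. A new trial point $x_k + d$ is said to be "acceptable to $x_l$" if either

$$V(x_k + d) - V(x_l) \le -\gamma_1 V(x_k + d) \tag{2.7a}$$

or

$$f(x_l) - f(x_k + d) \ge \gamma_2 V(x_k + d) \tag{2.7b}$$

is satisfied, where $\gamma_i \in (0, \tfrac{1}{2})$, $i = 1, 2$, are two scalars.

Definition 4. A new trial point $x_k + d$ is said to be "acceptable to the kth filter" if it is acceptable to $x_l$ for every index $l$ in the kth filter.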

Therefore, to accept a new pair into the current filter, we should test the conditions defined in Definition 4. If the new trial point is acceptable in the sense of Definitions 3 and 4, we may wish to add the corresponding pair to the filter. Meanwhile, any pair in the current filter dominated by the new pair is removed.
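The acceptance test and the filter update just described can be sketched as follows. This is an illustrative sketch only: it assumes the envelope conditions in the form suggested by Lemma 3.6, namely V_new - V_l <= -γ1 V_new or f_l - f_new >= γ2 V_new, and all function names and numerical values are hypothetical.

```python
from typing import List, Tuple

Pair = Tuple[float, float]  # (V(x), f(x))


def acceptable_to(trial: Pair, entry: Pair, gamma1: float, gamma2: float) -> bool:
    """Assumed form of Definition 3: sufficient decrease in V or in f
    relative to one filter entry."""
    v_new, f_new = trial
    v_old, f_old = entry
    return (v_new - v_old <= -gamma1 * v_new) or (f_old - f_new >= gamma2 * v_new)


def acceptable_to_filter(trial: Pair, filt: List[Pair], gamma1: float, gamma2: float) -> bool:
    """Definition 4: acceptable to every entry of the filter."""
    return all(acceptable_to(trial, entry, gamma1, gamma2) for entry in filt)


def add_to_filter(new: Pair, filt: List[Pair]) -> List[Pair]:
    """Insert a pair and drop every entry that the new pair dominates."""
    kept = [e for e in filt if not (new[0] <= e[0] and new[1] <= e[1])]
    kept.append(new)
    return kept


filt = [(1.0, 5.0), (0.5, 6.0)]
accepted = acceptable_to_filter((0.4, 5.5), filt, 0.1, 0.1)
rejected = acceptable_to_filter((0.6, 6.5), filt, 0.1, 0.1)
updated = add_to_filter((0.4, 5.5), filt)
print(accepted, rejected, updated)
```

In the example, the pair (0.4, 5.5) dominates the entry (0.5, 6.0), which is therefore removed when the new pair is added.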

In order to restrict V (xk), it still needs an upper bound condition for accepting a point. The trial point satisfies the upper bound condition if

holds, where the upper bound is a positive scalar. This bound is updated at each iteration and may converge to zero in some instances. We aim to control the constraint violation through this bound. If, in the current iteration k, a trial step is accepted and $\Phi(x_k, \sigma_k) = 0$, then the bound is kept unchanged in the next iteration. Otherwise the bound is reduced in the next iteration. The details of the update are given in Algorithm 2.1.

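Denote

$$\Delta q(d) = q(0) - q(d) = -\nabla f(x_k)^T d - \tfrac{1}{2} d^T B_k d \tag{2.9}$$

as the quadratic reduction of $f(x)$, and

$$\Delta f(d) = f(x_k) - f(x_k + d) \tag{2.10}$$

as the actual reduction of $f(x)$. In our algorithm, if a trial point $x_k + d$ is acceptable to the current filter and $x_k$, and

$$\Delta q(d) > 0, \tag{2.11}$$

then the sufficient reduction criterion of the objective function $f$ should be satisfied, i.e.,

$$\Delta f(d) \ge \eta\, \Delta q(d). \tag{2.12}$$

We are now in a position to state our filter SQP algorithm.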

Algorithm 2.1.

Step 1. Initialization.

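Given an initial point $x_0$, choose parameters $\eta \in (0, \tfrac{1}{2})$, $\gamma_1 \in (0, \tfrac{1}{2})$, $\gamma_2 \in (0, \tfrac{1}{2})$, $\gamma \in (0, 0.1)$, $r \in (0, 1)$, a positive initial upper bound on the constraint violation, $\rho_{\min} > 0$, a maximal radius greater than $\rho_{\min}$, a radius $\rho$ between $\rho_{\min}$ and the maximal radius, and $\sigma_0 \in (\gamma\rho, \rho)$. Set $k = 0$, Flag=1, and put $k$ into the index set of the filter.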

Step 2. Compute the search direction.

Compute the search direction d from the subproblem QP(xk, Bk, σk, ρ). If ρ > ρmin, set ds = d and Φs = Φ(xk, σk).

Step 3. Check for termination.

If d = 0 and V (xk) = 0, then stop;

Else if V (xk) > 0 and Φ(xk, σk ) - V (xk) = 0, then stop.

Step 4. Test to accept the trial point.

4.1. Check acceptability to the filter.

If the upper bound condition (2.8) holds for $x_k + d$, $\Phi(x_k, \sigma_k) = 0$, and $x_k + d$ is acceptable to both the kth filter and $x_k$, then go to 4.3.

Otherwise, if $\Phi(x_k, \sigma_k) = 0$, then set $\rho = \tfrac{1}{2}\rho$, choose $\sigma_k \in (\gamma\rho, \rho)$, and go to Step 2;

4.2. Compute tk , the first number t of the sequence {1, r, r2 , ... , } satisfying

Set Flag=0, d = ds, and go to 4.4.

4.3. Check for sufficient reduction criterion.

If $\Delta f(d) < \eta\, \Delta q(d)$ and $\Delta q(d) > 0$, then set $\rho = \tfrac{1}{2}\rho$, choose $\sigma_k \in (\gamma\rho, \rho)$, and go to Step 2;

otherwise go to 4.4.

4.4. Accept the trial point.

Set

and

Step 5. Augment the filter if necessary.

If Flag=1 and Δq(dk) < 0, then include (V (xk), ƒ(xk)) in the kth filter.

If Flag=0, then set k +1 = V (xk + tkdk); otherwise, set k +1 = k and leave the filter unchanged.

Step 6. Update.

Compute $B_{k+1}$. Choose $\sigma_{k+1} \in (\gamma\rho, \rho)$, set $k := k + 1$ and Flag=1, and go to Step 2.

Remark 1. The role of $d_s$ in Step 2 is to record a trial step $d$ computed with $\rho > \rho_{\min}$. By Step 4.1, we know that if $\Phi(x_k, \sigma_k) > 0$, then both conditions in Step 4.1 are violated. A trial iterate $x_k + d$ with $\Phi(x_k, \sigma_k) > 0$ may be considered a "worse" step which may be far away from the feasible region. So Step 4.2 is executed to reduce the constraint violation. The step $d_s$ is used in Step 4.2 because of the descent property of $d_s$ established in Lemma 3.4. In addition, it should be noted that, at the beginning of iteration k, the pair $(V(x_k), f(x_k))$ is not in the current filter, but $x_k$ must be acceptable to the current filter.
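The backtracking in Step 4.2 can be sketched as below. This is only an illustration under stated assumptions: the test is taken to have the Armijo-like form V(x_k + t d_s) - V(x_k) <= t η (Φ_s - V(x_k)), which is an assumed reading of condition (2.13); the trial step d_s and the value Φ_s are taken as given; and the function name, the toy constraint c(x) = exp(x) - 1 and the parameter values are hypothetical.

```python
import math
from typing import Callable, List


def backtrack_on_violation(
    V: Callable[[List[float]], float],
    x: List[float],
    d: List[float],
    phi: float,
    eta: float,
    r: float,
    max_trials: int = 60,
) -> float:
    """Return the first t in {1, r, r^2, ...} satisfying the assumed test
    V(x + t d) - V(x) <= t * eta * (phi - V(x))."""
    v0 = V(x)
    t = 1.0
    for _ in range(max_trials):
        x_new = [xi + t * di for xi, di in zip(x, d)]
        if V(x_new) - v0 <= t * eta * (phi - v0):
            return t
        t *= r
    raise RuntimeError("no acceptable step length found")


# Toy problem with one equality constraint c(x) = exp(x) - 1 = 0.
def V(x: List[float]) -> float:
    return abs(math.exp(x[0]) - 1.0)


x_k = [-3.0]
d_s = [10.0]  # step that minimizes the linearized violation over |d| <= 10
phi_s = abs((math.exp(-3.0) - 1.0) + math.exp(-3.0) * d_s[0])

t_k = backtrack_on_violation(V, x_k, d_s, phi_s, eta=0.25, r=0.5)
# Step 5 with Flag=0: the new upper bound is the violation at the accepted point.
next_bound = V([x_k[0] + t_k * d_s[0]])
print(t_k, next_bound)
```

Here the full step and the half step overshoot the feasible point and increase the violation, so two reductions are needed before the test is passed.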

Remark 2. Our algorithm has three loops: the loop 2-(4.1)-2, the loop 2-(4.3)-2, and the loop 2-6-2. Note that the radius ρ is allowed to decrease below ρmin in the loops 2-(4.1)-2 and 2-(4.3)-2, whereas in the loop 2-6-2 the radius ρ is never less than ρmin. At each iteration, the first trial radius ρ is greater than or equal to ρmin. Subsequently, the trial radius ρ keeps decreasing until either the filter acceptance criteria are satisfied or Step 4.2 is executed. Hence, ρ may be less than ρmin during the execution of the loops 2-(4.1)-2 and 2-(4.3)-2.

 

 

For the global convergence proof, we use the terminology first introduced by Fletcher, Leyffer and Toint [8]. We call $d$ an $f$-type step if $\Delta q(d) > 0$, in which case the sufficient reduction criterion (2.12) is required. If $d$ is accepted as the final step $d_k$ in the kth iteration, we refer to k as an $f$-type iteration.

Similarly, we call $d$ a $V$-type step if $\Delta q(d) \le 0$. If $d$ is accepted as the final step $d_k$ in iteration k, we refer to k as a $V$-type iteration. In addition, if $x_k$ is generated by Step 4.2, we also refer to it as a $V$-type iteration.

If $f(x_{k+1}) < f(x_k)$, then we regard the step $d_k$ as an $f$ monotone step. Obviously, an $f$-type step must be an $f$ monotone step.
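The step classification above can be illustrated with a short sketch. It assumes the quadratic model q(d) = ∇f(x_k)^T d + ½ d^T B_k d, so that the predicted reduction is Δq(d) = q(0) - q(d), and it treats every step with Δq(d) <= 0 as a V-type step. The helper names and the toy objective f(x) = x_1² + x_2² are hypothetical.

```python
from typing import Callable, List, Sequence


def predicted_reduction(grad_f: Sequence[float], B: Sequence[Sequence[float]],
                        d: Sequence[float]) -> float:
    """Delta q(d) = q(0) - q(d) for q(d) = g^T d + 0.5 d^T B d."""
    g_dot_d = sum(g * di for g, di in zip(grad_f, d))
    Bd = [sum(bij * dj for bij, dj in zip(row, d)) for row in B]
    d_B_d = sum(di * bdi for di, bdi in zip(d, Bd))
    return -(g_dot_d + 0.5 * d_B_d)


def actual_reduction(f: Callable[[List[float]], float], x: Sequence[float],
                     d: Sequence[float]) -> float:
    """Delta f(d) = f(x) - f(x + d)."""
    return f(list(x)) - f([xi + di for xi, di in zip(x, d)])


def step_type(delta_q: float) -> str:
    return "f-type" if delta_q > 0 else "V-type"


def sufficient_reduction(delta_f: float, delta_q: float, eta: float) -> bool:
    """Criterion Delta f(d) >= eta * Delta q(d)."""
    return delta_f >= eta * delta_q


def f(x: List[float]) -> float:
    return x[0] ** 2 + x[1] ** 2


x_k = [1.0, 1.0]
grad = [2.0, 2.0]
B_k = [[2.0, 0.0], [0.0, 2.0]]  # exact Hessian of the toy objective
d = [-0.5, -0.5]

delta_q = predicted_reduction(grad, B_k, d)
delta_f = actual_reduction(f, x_k, d)
kind = step_type(delta_q)
ok = sufficient_reduction(delta_f, delta_q, eta=0.25)
print(delta_q, delta_f, kind, ok)
```

Because the model is exact for this quadratic objective, the predicted and actual reductions coincide and the criterion holds for any η in (0, ½).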

 

3 Global convergence

In this section, we prove the global convergence of Algorithm 2.1. Firstly, we give some assumptions:

(A1) Let $\{x_k\}$ be generated by Algorithm 2.1, and let $\{x_k\}$ and $\{x_k + d_k\}$ be contained in a closed and bounded set $S \subset \mathbb{R}^n$;

(A2) All the functions $f$, $c_i$, $i \in \mathcal{E} \cup \mathcal{I}$, are twice continuously differentiable on $S$.

(A3) The matrix Bk is uniformly positive definite and bounded for all k.

Remark 3. Assumption (A1) is reasonable. It may be forced if, for example, the original problem involves a bounded box among its constraints.

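Remark 4. A consequence of Assumption (A3) is that there exist constants $\delta, M > 0$, independent of k, such that $\delta \|y\|^2 \le y^T B_k y \le M \|y\|^2$ for all $y \in \mathbb{R}^n$. Assumptions (A1) and (A2) imply boundedness of $\|\nabla^2 c_i(x)\|$, $i \in \mathcal{E} \cup \mathcal{I}$, and $\|\nabla^2 f(x)\|$ on $S$. Without loss of generality, we may assume $\|\nabla^2 c_i(x)\| \le M$, $i \in \mathcal{E} \cup \mathcal{I}$, and $\|\nabla^2 f(x)\| \le M$ for all $x \in S$.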
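For a single symmetric matrix, the bounds in Remark 4 are the extreme eigenvalues. The sketch below illustrates this for a 2 × 2 example, taking δ and M to be the smallest and largest eigenvalue and using the Euclidean norm; the matrix, the helper names and the test vectors are assumptions chosen for illustration.

```python
import math
from typing import Sequence, Tuple


def sym2x2_eigenvalues(a: float, b: float, c: float) -> Tuple[float, float]:
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], ascending."""
    mean = 0.5 * (a + c)
    radius = math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    return mean - radius, mean + radius


def quadratic_form(a: float, b: float, c: float, y: Sequence[float]) -> float:
    """y^T B y for B = [[a, b], [b, c]]."""
    return a * y[0] * y[0] + 2.0 * b * y[0] * y[1] + c * y[1] * y[1]


a, b, c = 2.0, 1.0, 2.0
delta, M = sym2x2_eigenvalues(a, b, c)

vectors = [(1.0, 0.0), (1.0, 1.0), (1.0, -1.0), (-0.3, 0.7)]
bounds_hold = all(
    delta * (y[0] ** 2 + y[1] ** 2) - 1e-12
    <= quadratic_form(a, b, c, y)
    <= M * (y[0] ** 2 + y[1] ** 2) + 1e-12
    for y in vectors
)
print(delta, M, bounds_hold)
```

Uniform positive definiteness in (A3) asks for a single pair (δ, M) that works for every $B_k$, not just for one matrix as in this example.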

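Lemma 3.1. Assume $\bar{x}$ is not a stationary point of $V$ in the sense that $0 \notin \partial V(\bar{x})$, where $\partial V$ denotes the Clarke subdifferential of $V$. Then there exist a scalar $\bar{\varepsilon} > 0$ and a neighborhood $N(\bar{x})$ of $\bar{x}$ such that $\Phi(x, \sigma) - V(x) < -\bar{\varepsilon}$ for all $\sigma > \gamma\rho_{\min}$ and all $x \in N(\bar{x})$.

Proof. By [2, Lemma 2.1], $0 \notin \partial V(\bar{x})$ implies $\Phi(\bar{x}, \gamma\rho_{\min}) - V(\bar{x}) < 0$, where $\gamma$ and $\rho_{\min}$ are from Algorithm 2.1. By the continuity of the function $\Phi(\cdot, \gamma\rho_{\min}) - V(\cdot)$ on $\mathbb{R}^n$, there exist a neighborhood $N(\bar{x})$ and a scalar $\bar{\varepsilon} > 0$ such that $\Phi(x, \gamma\rho_{\min}) - V(x) < -\bar{\varepsilon}$ whenever $x \in N(\bar{x})$. The condition $\sigma > \gamma\rho_{\min}$ together with the definition of $\Phi$ yields $\Phi(x, \sigma) - V(x) \le \Phi(x, \gamma\rho_{\min}) - V(x)$. Therefore, $\Phi(x, \sigma) - V(x) < -\bar{\varepsilon}$ holds for all $\sigma > \gamma\rho_{\min}$ and all $x \in N(\bar{x})$.

Remark 5. It can be seen that $\bar{\varepsilon}$ depends on the point $\bar{x}$ with $0 \notin \partial V(\bar{x})$ (i.e., $\bar{\varepsilon} = \bar{\varepsilon}(\bar{x})$), because it must satisfy $\Phi(\bar{x}, \sigma) - V(\bar{x}) < -\bar{\varepsilon}$.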

Lemma 3.2. Let $d_k = 0$ be a feasible point of QP$(x_k, B_k, \sigma_k, \rho_k)$. Then $x_k$ is a stationary point of $V(x)$. Moreover, if $x_k \in X$, then $x_k$ is a KKT point of the problem (NLP).

Proof. Since $d_k = 0$ implies $\Phi(x_k, \sigma_k) - V(x_k) = 0$, it follows from [2, Lemma 2.1] that $x_k$ is a stationary point of $V(x)$. If $V(x_k) = 0$ and $d_k = 0$, then it follows from [2, Lemma 2.2] that $x_k$ is a KKT point for the problem (NLP).

Lemma 3.3. Let Assumptions (A1)-(A3) hold and d be a feasible point of the subproblem QP(x, B, σ, ρ), then we have

for t [0, 1].

Proof. By Taylor Expansion formula, the feasibility of d and Assumption (A2), we have that for i and t [0, 1],

and for i and t [0, 1],

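where the vector $z_i$ is between $x$ and $x + td$. The term $\tfrac{1}{2} t^2 n M \rho^2$ in (3.15) and (3.16) is derived because

$$\tfrac{1}{2} t^2 \left| d^T \nabla^2 c_i(z_i)\, d \right| \le \tfrac{1}{2} t^2 M \|d\|_2^2 \le \tfrac{1}{2} t^2 n M \|d\|_\infty^2 \le \tfrac{1}{2} t^2 n M \rho^2,$$

where we use that $\|\cdot\|_2^2 \le n \|\cdot\|_\infty^2$. Formulae (3.15) and (3.16), combined with the definitions of $V(x)$ and $\Phi(x, \sigma)$, yield (3.14).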
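The norm inequality used in this proof is easy to check numerically. The sketch below samples random vectors inside the box with infinity norm at most ρ and verifies that the squared Euclidean norm is bounded by n times the squared infinity norm, and hence by nρ². The dimension, radius, seed and sample size are arbitrary illustrative choices.

```python
import random
from typing import Sequence


def norm2_squared(d: Sequence[float]) -> float:
    return sum(v * v for v in d)


def norm_inf(d: Sequence[float]) -> float:
    return max(abs(v) for v in d)


random.seed(0)
n, rho = 5, 0.3
tol = 1e-12  # guards against floating-point rounding only

checks = []
for _ in range(1000):
    d = [random.uniform(-rho, rho) for _ in range(n)]
    lhs = norm2_squared(d)
    mid = n * norm_inf(d) ** 2
    checks.append(lhs <= mid + tol and mid <= n * rho ** 2 + tol)

all_hold = all(checks)
print(all_hold)
```

Equality in the first inequality occurs only when all components of d have the same absolute value, which is why the factor n cannot be improved in general.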

The following lemma shows that the loop in Step 4.2 is finite.

Lemma 3.4. Let Assumptions (A1)-(A3) hold, η (0, ½) and satisfies 0 V (). Then, there exist a scalar > 0 and a neighborhood N () of such that for any x N () and any d feasible for the problem QP(x, B, σ, ρ) with γρmin < σ < ρ < , it holds that

for all t (0, ].

Proof. It follows from Lemma 3.1 that there exists a neighborhood N () and () > 0 such that Φ(x, σ) - V(x) < -() whenever x N (). Combining this with (3.14), we have

Hence, the inequality (3.17) holds for all t (0, ()], where

Hence, if 0 V (xk), (2.13) follows taking = x = xk and d = ds. It implies that the loop in Step 4.2 is finite.

The following lemma shows that the iterate sequence {xk} approaches a stationary point of V (x) when Step 4.2 of Algorithm 2.1 is invoked infinitely many times.

Lemma 3.5 If Step 4.2 of Algorithm 2.1 is invoked infinitely many times, then there exists an accumulation point of {xk } such that the sequence {k } converges to V (), where 0 V (), i.e., is a stationary point of V (x).

Proof. Since Step 4.2 is invoked infinitely many times. By Step 5 k +1 is reset by V (xk + tkdk) infinitely. From Step 2 and 4.2, we note that if Step 4.2 is invoked at iteration k, the radius ρ in QP(xk, Bk, σk, ρ) associated with ds is greater than or equal to ρmin. Define ={k | k +1 is reset by V (xk + tkdk)}. Obviously, is an infinite set. The inequality (2.13) ensures V (xk +1) = k +1 for all k . The upper bound condition (2.8) ensures V (xk +1) < k = k +1 for all k . Therefore, V (xk) < k for all k. This together with (2.13) yields k +1 < V (xk) < k for all k . For k , k +1 = k . Therefore, {k } is a monotonically decreasing sequence and also has a lower bound zero. Then there exists an accumulation point of {xk } such that {k +1} V (). If 0 V (), by Lemma 3.1, there exists a neighborhood N () of and () > 0, such that

whenever xk N (). This together with (2.13) yields

for k and xk N(). By the mechanism of Algorithm 2.1 and Lemma 3.4, we have tk > r(). It follows with (3.19) that

for k and xk N(). Letting k tend to infinity in above inequality, the limit in the left-hand side is zero while the limit in the right-hand side is less than zero, which is a contradiction. Therefore 0 V ().                                                                  

Lemma 3.6. Consider sequences $\{V(x_k)\}$ and $\{f(x_k)\}$ such that $f(x_k)$ is monotonically decreasing and bounded below. If for all k, either $V(x_{k+1}) - V(x_k) \le -\gamma_1 V(x_{k+1})$ or $f(x_k) - f(x_{k+1}) \ge \gamma_2 V(x_{k+1})$ holds, where the constants $\gamma_1$ and $\gamma_2$ are from (2.7a) and (2.7b), then $V(x_k) \to 0$ as $k \to +\infty$.

Proof. See [8, Lemma 1].                                                                                    

Lemma 3.7. Suppose Assumptions (A1)-(A3) hold. If there exists an infinite sequence of iterates $\{x_k\}$ on which $(V(x_k), f(x_k))$ is added to the filter, where $V(x_k) > 0$ and $\{f(x_k)\}$ is bounded below, then $V(x_k) \to 0$ as $k \to +\infty$.

Proof. Since inequalities (2.7a) and (2.7b) are the same as [8, (2.6)], the conclusion follows from Lemma 3.6 and [8, Corollary].                                                                                                        

Lemma 3.8. Suppose Assumptions (A1)-(A3) hold. Let d be a feasible point of QP(xk, Bk, σk, ρ). If Φ(xk, σk ) = 0, it then follows that

Proof. The proof of this lemma is very similar to the proof [8, Lemma 3]. By Taylor's theorem, we have

where y denotes some point on the line segment from xk to xk + d. This together with (2.9) and (2.10) implies

Then (3.20) follows from the boundedness of 2 ƒ(y) and Bk, and d < ρ. Since Φ(xk ,σk ) = 0, QP(xk, Bk, σk, ρ) reduces to (1.1). For i , it follows that

where yi denotes some point on the line segment from xk to xk + d. By feasibility of d,

and

follow in a similar way. It follows with the definition of V (x) that (3.21) holds.              

Lemma 3.9. Suppose Assumptions (A1)-(A3) hold. Let d be a feasible point of QP(xk, Bk, σk, ρ) with Φ(xk, σk ) = 0. Then xk + d is acceptable to the filter if ρ2 < , where .                                                                                   

Proof. It follows from (3.21) that V (xk + d) - τk < -γ1V (xk + d) holds if ρ2 < . By the definition of τk, (2.7a) is satisfied. Hence xk + d is acceptable to the filter.                                                                                                               

In order to prove that the iterate sequence generated from Algorithm 2.1 converges to a KKT point for the problem (NLP), some constraint qualification should be required, such as the well-known MFCQ. Thus we review its definition as follows.

Definition 5 (See [2]). MFCQ is said to be satisfied at $x$, with respect to the underlying constraint system $c_{\mathcal{E}}(x) = 0$, $c_{\mathcal{I}}(x) \le 0$, if there is a $z \in \mathbb{R}^n$ such that the gradients $\nabla c_i(x)$, $i \in \mathcal{E}$, are linearly independent and the following systems

are satisfied.

Proposition 3.10. Suppose that MFCQ is satisfied at x* X, then there exists a neighborhood N (x* ) of x* such that

1. MFCQ is satisfied at every point in N(x* ) X;

2. inequality

holds, where the vectors λk, µk generated by QP(xk, Bk, σk, ρk ) are multiplier vectors associated with xk N (x* ).

Proof. The first result is established in [20, Theorem 3]. The second result is established in [2, Theorem 5.1].                                                                                           

If MFCQ does not hold at a feasible x*, then the second statement of Proposition 3.10 cannot be guaranteed. All feasible points of the problem (NLP) at which MFCQ does not hold will be called non-MFCQ points.

The following lemma shows that (1.1) is consistent when xk approaches a feasible point at which MFCQ holds and both the quadratic reduction and the actual reduction of the objective function have sufficient reduction. Its proof is similar to that of [8, Lemma 5].

Lemma 3.11. Suppose Assumptions (A1)-(A3) hold and let x* S be a feasible point of problem (NLP) at which MFCQ holds but which is not a KKT point. Then there exists a neighborhood Nº of x* and positive constants , ν and , such that for all xk S Nº and all ρ for which

it follows that QP(xk, Bk, σk, ρ) with Φ(xk, σk ) = 0 has a feasible solution d at which the predicted reduction satisfies

the sufficient reduction condition (2.12) holds, and the actual reduction satisfies

Proof. Since x* is a feasible point at which MFCQ holds but it is not a KKT point, it follows that the vectors ci (x*), i are linearly independent, and there exists a vector s* that satisfies

where s* = 1. Let c(xk)+ = c(xk)T c(xk))-1c(xk)T and c(xk) denote the matrix with columns ci (xk), i . It follows from linear independence and continuity that there exists a neighborhood of x* in which c(xk)+ is bounded. Let P = -c(xk)+T c(xk) and s = (I - c(xk)c(xk)+) s* / (I -c(xk)c(xk)+) s* if is not empty, otherwise P = 0 and s = s*. Let p = P. It follows from (3.25) and (3.27) by continuity that there exists a (smaller) neighborhood * and a constant > 0 such that

when xk *. By definition of P, it follows that p = O(V (xk)), and thus we can choose the constant ν in (3.22) sufficiently large so that ρ> p for all xk *.

We now consider the solution of (1.1). The line segment defined by

for a fixed value of ρ > p. Since ρ > p, and P and s are orthogonal, it implies

From (3.29) and the definitions of P, s, dα satisfies the equality constraints c(xk) +c(xk)Tdα = 0 of (1.1) for all α [0, 1].

If xk * S and i \(x*), then there exists positive constants and , independent of ρ, such that

for all vectors s such that s < 1, by the continuity of ci (xk) and boundedness of ci (xk) on S. It follows that

for all vectors d such that Cd < ρ. Therefore, inactive constraints do not affect the solution of (1.1) if ρ satisfies ρ < /.

For active inequality constraints i (x*), it follows from (3.28) and (3.29) that ci (xk) +ci (xk)T d1 = ci (xk) +ci (xk)TP +(ρ - p)ci (xk)Ts < ci (xk) + ci (xk)TP - (ρ - p) < 0 if ρ > p + (ci (xk) +ci (xk)TP)/. We obtain from the definition of P that

Thus we can choose the constant ν in (3.22) sufficiently large so that ci (xk) + ci (xk)Td1 < 0, i (x*). Therefore d1 is feasible in (1.1) with respect to the active inequality constraints, and hence to all the constraints by above results. Combining with the fact that (1.1) is equivalent to QP(xk, Bk, σk, ρ) with Φ(xk, σk ) = 0, we obtain that QP(xk, Bk, σk, ρ) with Φ(xk, σk ) = 0 is consistent for all xk * and all ρ satisfying (3.22) for any value of < /.

Next we aim to obtain a bound on the predicted reduction Δq(d). We note that q(0) - q(d1) = - ƒ(xk)T (P + (ρ - p)s) - ½ d1T Bkd1. Using (3.28), bounds on Bk and P, and ρ> p = O(V (xk)), we have

If ρ < , then q(0) - q(d1) > ½ ρ + O(V (xk)). Since d1 is feasible and p = O(V (xk)), it follows that the predicted reduction (2.9) satisfies

for some sufficiently large and independent of ρ. Hence, (3.23) is satisfied if ρ > 6 V (xk)/. This condition can be achieved by making the constant ν in (3.22) sufficiently large.

Next we aim to prove (3.24). By (3.20) and (3.23), we have

Then, if ρ < (1 - η)/(3nM), it follows that (2.12) holds. By (2.12), (3.21) and (3.23), we have ƒ(xk) - ƒ(xk + d) - γ2V (xk + d) = Δƒ(d) - γ2V (xk + d) > ηρ - ½ γ2mnρ2 M > 0 if ρ < η/(γ2mnM). Therefore, we may define the constant in (3.22) to be the least of η/(γ2mnM) and the values (1 - η)/ (3nM), ρ < and /, as required earlier in the proof.                                                                                            

Lemma 3.12. Suppose Assumptions (A1)-(A3) hold and let x* S be a feasible point of problem (NLP) at which MFCQ holds but which is not a KKT point. Then there exists a neighborhood Nº of x* and a positive constant ν, such that for all xk S Nº, all ρ and all σk for which

it follows that Φ(xk, σk ) = 0 and QP(xk, Bk, σk, ρ) has a feasible solution d at which the predicted reduction satisfies

Proof. By Lemma 3.11, there exists a neighborhood N1 of x* and positive constants ν, such that for all xk S and all ρ and σk for which ν V (xk) < ρ < , it follows that the QP subproblem (1.1) has a feasible solution d at which the predicted reduction satisfies (3.32). Since the global optimality of d ensures that Δq(d) decreases monotonically as ρ decreases, the predicted reduction satisfies (3.32) whenever ν V (xk) < ρ.

From the earlier proof, if ν V (xk) < σk < , we also have that the QP subproblem (1.1) is consistent by taking ρ = σk . It means that the problem (2.4) has the optimal value 0. If σk increases to a value larger than , then the feasible region of the problem (2.4) with ρ = σk is also enlarged correspondingly and the optimal value is still 0. Therefore, Φ(xk, σk ) = 0 whenever ν V (xk) < σk.

The above two conclusions complete the proof.                                                      

As a last preparation for the proof of Theorem 3.14, we proceed as in [8] and show that the loops 2-(4.1)-2 and 2-(4.3)-2 terminate finitely.

Lemma 3.13. Suppose Assumptions (A1)-(A3) hold, then the loops 2-(4.1)-2 and 2-(4.3)-2 terminate finitely.

Proof. If $x_k$ is a KKT point for the problem (NLP), then $d = 0$ is the solution of QP$(x_k, B_k, \sigma_k, \rho)$ and Algorithm 2.1 terminates, and so do the loops 2-(4.1)-2 and 2-(4.3)-2. If $V(x_k) > 0$ and $\Phi(x_k, \sigma_k) - V(x_k) = 0$, then the algorithm stops by Step 3. In fact, it follows from [2, Lemma 2.1] that $V(x_k) > 0$ and $0 \in \partial V(x_k)$, which means that $x_k$ is a stationary point of $V(x)$ that is not feasible for the problem (NLP). If neither of the above situations occurs and the loops 2-(4.1)-2 and 2-(4.3)-2 do not terminate finitely, then $\rho \to 0$ by Algorithm 2.1. There are two cases to be considered.

Case (i). $V(x_k) > 0$ and $0 \notin \partial V(x_k)$.

(a) If i and ci (xk) > 0, then for all d (d < ρ),

if either ci (xk) = 0 or ρ < . Thus, for sufficiently small ρ, the equality constraints cannot be satisfied and (1.1) is inconsistent. Therefore, Φ(xk, σk) > 0.

(b) If i and ci (xk) > 0, a similar conclusion is obtained.

(c) If i and ci (xk) < 0, a similar conclusion is also obtained.

So, for some ρ sufficiently small, Φ(xk, σk ) > 0 and 0 V (xk). By Step 4, the procedure executes at 4.2 of Step 4. It follows from Lemma 3.4 that (2.13) holds. By the mechanism of Algorithm 2.1, the loop 2-(4.1)-2 terminates finitely.

Case (ii). V (xk) = 0. Then Φ(xk, σk ) = 0.

For the inactive constraints at xk , by a similar argument, it will still be inactive for sufficiently small ρ. Thus, we only need to consider the active constraints.

Since xk is not a KKT point, there exists a vector s, s = 1, and a scalar > 0 such that ƒ(xk)Ts < -, ci (xk)Ts = 0, i and ci (xk)Ts < 0, i (xk) = {i : ci (xk) = 0, i }. We note that q(0) - q(ρs) = -ρƒ(xk)Ts - ½ ρ2sT Bks. Using bound on Bk, we have q(0) - q(ρs) > ρ - ½ ρ2nM. If ρ < , then q(0) - q(ρs) > ½ ρ. Since ρs is feasible for QP(xk, Bk, σk, ρ), it follows that the predicted reduction (2.9) satisfies

where d is the solution of QP(xk, Bk, σk, ρ). If ρ < , it follows with (3.20) that Δƒ(d) > ηΔq(d) > 0. So the sufficient reduction condition (2.12) for an ƒ-type iteration is satisfied. Moreover, by (3.21), we have

if ρ < η/(γ2mnM). Thus, xk + d is acceptable to xk . Of course, the upper bound condition (2.8) is satisfied.

Finally, it follows with Lemma 3.9 that for a sufficiently small ρ, an ƒ-type iteration is generated and the loop 2-(4.3)-2 terminates finitely.                                                                                                            

Lemma 3.13 together with Lemma 3.4 implies that Algorithm 2.1 is well-defined. We are now able to adapt [8, Theorem 7] to Algorithm 2.1.

Theorem 3.14. Let Assumptions (A1)-(A3) hold and let $\{x_k\}$ be a sequence generated by Algorithm 2.1. Then there is an accumulation point that is a KKT point for the problem (NLP), or a non-MFCQ point for the problem (NLP), or a stationary point of $V(x)$ that is infeasible for the problem (NLP).

Proof. Since the loops 2-(4.1)-2 and 2-(4.3)-2 are finite, we only need to consider that the loop 2-6-2 is infinite. All iterates lie in S, which is bounded, so it follows that the iteration sequence has at least one accumulation point.

Case (i). Step 4.2 of Algorithm 2.1 is invoked finitely many times. Then Step 4.2 of Algorithm 2.1 is not invoked for all sufficiently large k.

Sub-case (i). There are infinitely many V-type steps in the main iteration sequence. Then, from Lemma 3.7, $V(x_k) \to 0$ and $\tau_k \to 0$ on this subsequence. Moreover, there exists a subsequence indexed by k ∈ Sφ of V-type iterations for which $x_k \to x^*$, $V(x_k) \to 0$ and $\tau_{k+1} < \tau_k$. One consequence is that $x^*$ is a feasible point for the problem (NLP). If MFCQ is not satisfied at $x^*$, then $x^*$ is a non-MFCQ point for the problem (NLP). We therefore assume that MFCQ is satisfied at $x^*$ and suppose, to reach a contradiction, that $x^*$ is not a KKT point.

From Lemma 3.12, we know that Φ(xk, σk ) = 0 and the step d is an ƒ-type step for all sufficiently large k if ρ > γ V (xk). So, xk+d is acceptable to the filter for sufficiently large k from Lemma 3.9 if ρ2 < . Thus, by Lemma 3.11 we deduce that if ρ satisfies

then k is an ƒ-type iterate for sufficiently large k.

Now we need to show that a value of ρ > νV (xk) can be found in the loop 2-(4.3)-2 such that k is an ƒ-type iterate for sufficiently large k. Since τk 0 when k( Sφ) +, the range (3.34) becomes

From the definition of Sφ, we know that τk +1 = V (xk) < τk . Because of the square root, the upper bound in (3.35) can be greater than twice the lower bound. From Algorithm 2.1, a value ρ > ρmin is chosen at the beginning of each iteration, then it will be greater than the upper bound in (3.35) for sufficiently large k. We can see that successively halving ρ in the loops 2-(4.1)-2 and 2-(4.3)-2 will eventually locate in the range of (3.35), or the right of this interval.Lemma 3.12 implies that it is not possible for any value of ρ > νV (xk) to produce an V-type step. Thus if k( Sφ) is sufficiently large, an ƒ-type iteration is generated, which contradicts the definition of Sφ. So x* is a KKT point.
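The halving argument used above can be illustrated in isolation: if a radius starts above an interval whose upper end is at least twice its lower end, repeated halving cannot jump over the interval. The sketch below is a generic illustration with hypothetical interval endpoints; it does not reproduce the actual bounds in (3.35).

```python
def halve_until_below(rho_start: float, upper: float) -> float:
    """Halve rho until it no longer exceeds `upper` and return the result."""
    rho = rho_start
    while rho > upper:
        rho *= 0.5
    return rho


lower, upper = 0.01, 0.03   # hypothetical interval with upper >= 2 * lower
rho_final = halve_until_below(1.0, upper)
inside = lower <= rho_final <= upper
print(rho_final, inside)
```

The last value above the interval exceeds `upper`, so its half exceeds `upper / 2`, which is at least `lower`; the first value not exceeding `upper` therefore lies in the interval.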

Sub-case (ii). There are finitely many V-type steps in the main iteration sequence. Then there exists a positive integer K such that, for all k > K, k is an $f$-type iteration. So $\{f(x_k)\}_{k>K}$ is strictly monotonically decreasing. It follows from Lemma 3.6 that $V(x_k) \to 0$ and hence any accumulation point $x^*$ of the main iteration sequence is a feasible point. Since $f(x)$ is bounded, we get that

It follows that Δƒ(dk) → 0 for k > K as k → +∞.
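The limit above can be obtained from a telescoping sum. The following worked derivation is a sketch under assumptions that are not restated in this section: for an ƒ-type iteration the next iterate is taken to be x_{k+1} = x_k + d_k, the actual reduction is taken to be Δƒ(d_k) = ƒ(x_k) − ƒ(x_k + d_k), and f_low denotes a hypothetical lower bound of ƒ on the iterates.

```latex
% Assumptions (not stated in this section): for an f-type iteration
% x_{k+1} = x_k + d_k and \Delta f(d_k) = f(x_k) - f(x_k + d_k) > 0,
% and f \ge f_{\mathrm{low}} on the iterates.
\begin{align*}
\sum_{k=K+1}^{N} \Delta f(d_k)
  &= \sum_{k=K+1}^{N} \bigl( f(x_k) - f(x_{k+1}) \bigr) \\
  &= f(x_{K+1}) - f(x_{N+1}) \\
  &\le f(x_{K+1}) - f_{\mathrm{low}} < +\infty .
\end{align*}
% The partial sums are increasing and bounded, so the series converges
% and its terms satisfy \Delta f(d_k) \to 0 as k \to +\infty.
```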

Now we assume that MFCQ is satisfied at x* and x* is not a KKT point. Similarly to sub-case (i), when

ƒ-type iterations are generated. The right-hand side of (3.37) is a constant, independent of k. Since the upper bound of (3.37) is a constant and the lower bound converges to zero, the upper bound eventually exceeds twice the lower bound. So successive halving will locate a value of ρ in this interval, or a value to the right of this interval. Hence, ρ > min{½, ρmin}. Since the global optimality of d ensures that Δq(d) decreases monotonically as ρ decreases, by Lemma 3.11, Δq(d) > min{½ , ρmin} holds even if ρ is greater than . This together with (2.12) yields that Δƒ(dk) > η min{½,ρmin}, which contradicts Δƒ(dk) → 0. Thus, x* is a KKT point.

Case (ii). Step 4.2 of Algorithm 2.1 is invoked infinitely many times. Then it follows from Lemma 3.5 that there exists an accumulation point x* of {xk} such that 0 V (x*), where is an infinite index set satisfying that any k is a V-type iteration number and k +1 = V (xk + tkdk). If V (x*) > 0, then x* is a stationary point of V (x) that is infeasible for the problem (NLP). If V (x*) = 0 and MFCQ fails to be satisfied at x*, then x* is a non-MFCQ point for the problem (NLP). If V (x*) = 0 and MFCQ holds at x*, then we prove that x* is a KKT point for the problem (NLP).

Suppose x* is not a KKT point for the problem (NLP), but V (x*) = 0, and MFCQ holds at x*. Then, from Lemma 3.5, k 0 on this sequence. Therefore, for all k , V (xk) < k 0 and k +1 = V (xk + tkdk). By Step 4.4, at any new iteration the first trial radius ρ is greater than or equal to ρmin. So σk > γρmin is true from the fact σk (γρ, ρ). Combining this with the fact V (xk) 0 as k() yields νV (xk) < σk < ρ for all sufficiently large k. It follows with Lemma 3.12 that Φ(xk, σk ) = 0 and (3.32) is satisfied for all sufficiently large k . Similar to Sub-case (i) of Case (i), k is an ƒ-type iterate for sufficiently large k . Hence Step 4.2 could not be invoked at this iteration and k +1 = k which contradicts the definition of . So x* is a KKT point.                        

 

4 Numerical results

We report numerical results for Algorithm 2.1, coded in Matlab, on constrained optimization problems. The implementation details are as follows:

(a) Termination criteria. Algorithm 2.1 stops if

(b) Update of Bk. Initialize B0 = I, where I is the identity matrix of appropriate dimension. Update Bk by the BFGS formula with Powell's modifications [19], which is described as follows:

set

where

and

(c) The parameters are chosen as: ρ0 = 5.0, ρmin = 10⁻⁴, η = 0.1, γ1 = γ2 = 2 × 10⁻⁴, ε = 10⁻⁶, 0 = 10 max{1, V (x0)}, σk = 0.9ρ for all k = 0, 1, 2, ... and all ρ > 0.
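The update in item (b) can be illustrated with a short sketch. The formulas of item (b) are not reproduced in this text, so the code below uses the form of Powell's damped BFGS update commonly given in textbooks, with the usual threshold 0.2; the authors' variant may differ. The names `damped_bfgs_update`, `s` (the step x_{k+1} − x_k), `y_hat` (the change in the gradient of the Lagrangian) and `damping` are assumptions introduced for illustration, and B is assumed symmetric positive definite.

```python
def matvec(B, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(bij * vj for bij, vj in zip(row, v)) for row in B]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def damped_bfgs_update(B, s, y_hat, damping=0.2):
    """Return a damped BFGS update of the symmetric matrix B.

    Textbook form of Powell's modification (an assumption, see lead-in):
    y_hat is blended with B s so that s^T y >= damping * s^T B s > 0,
    which keeps the updated matrix positive definite.
    """
    Bs = matvec(B, s)
    sBs = dot(s, Bs)
    if sBs <= 0.0:
        raise ValueError("B must be positive definite and s nonzero")
    sy = dot(s, y_hat)
    # theta = 1 gives the plain BFGS update; theta < 1 damps it.
    if sy >= damping * sBs:
        theta = 1.0
    else:
        theta = (1.0 - damping) * sBs / (sBs - sy)
    y = [theta * yh + (1.0 - theta) * bs for yh, bs in zip(y_hat, Bs)]
    sy_damped = dot(s, y)
    n = len(s)
    # B_new = B - (B s)(B s)^T / (s^T B s) + y y^T / (s^T y)
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / sy_damped
             for j in range(n)] for i in range(n)]


# Usage with B0 = I and a step along which the curvature estimate is negative.
B0 = [[1.0, 0.0], [0.0, 1.0]]
B1 = damped_bfgs_update(B0, s=[1.0, 0.0], y_hat=[-1.0, 0.0])
```

In the usage example the undamped update would divide by a negative s^T y and lose positive definiteness; the damped version replaces y_hat by a blend whose inner product with s equals 0.2 s^T B s.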

First, we consider the examples from [29]. Since EXAMPLE 3 in [29] is unbounded below, we omit it here. The numerical results for the other examples are given below.

EXAMPLE 1

x* = 0, ƒ (x* ) = 0. The iteration number of Algorithm 2.1 is 2.

EXAMPLE 2

x*= (1.224745, 1.224745, 1.224745, 1.224745)T , ƒ(x*) = 6. The iteration number of Algorithm 2.1 is 6.

EXAMPLE 4

x*= (0, 0, 2)T, ƒ (x*) = -2. The iteration number of Algorithm 2.1 is 4.

Compared with the methods in [28, 29], the algorithm in this paper requires less computation at each iteration.

In addition to the above numerical experiments, we also test some examples from [12] and compare the results with those in [13]. The detailed results of the numerical tests on these problems are summarized in Table 1, where NIT, NF, and NG denote the numbers of iterations, function evaluations, and gradient evaluations, respectively. The problems are numbered in the same way as in Hock and Schittkowski [12]. For example, HS022 represents problem 22 in Hock and Schittkowski [12].

 

 

For Problem HS022, Φ(x0) = 0.3941 implies that the QP subproblem is inconsistent at the first iteration. In all subsequent iterations the QP subproblems are consistent. Similarly, for Problem HS063, Φ(x0) = 1.25 indicates that the QP subproblem is inconsistent only at the first iteration. For all the other problems in Table 1, the QP subproblems are consistent at every iteration.

The above analysis shows that our algorithm deals with inconsistent QP subproblems effectively and is comparable to the algorithm in [13]. These numerical tests thus support the robustness of our algorithm.

Acknowledgement. We are deeply indebted to the editor Prof. José Mario Martínez and two anonymous referees, whose insightful comments greatly helped us to improve the quality of the paper.

 

REFERENCES

[1] C. Audet and J.E. Dennis. A pattern search filter method for nonlinear programming without derivatives. SIAM Journal on Optimization, 14 (2004), 980-1010.

[2] J.V. Burke and S.P. Han. A robust sequential quadratic programming method. Mathematical Programming, 43 (1989), 277-303.

[3] S. Bellavia, M. Macconi and B. Morini. STRSCNE: a scaled trust region solver for constrained nonlinear equations. Computational Optimization and Applications, 28 (2004), 31-50.

[4] S. Bellavia, M. Macconi and B. Morini. STRSCNE: http://ciro.de.unifi.it/STRSCNE/, (2007).

[5] H.Y. Benson, D.F. Shanno and R. Vanderbei. Interior-point methods for nonconvex nonlinear programming: jamming and numerical testing. Mathematical Programming, 99 (2004), 35-48.

[6] C.M. Chin and R. Fletcher. On the global convergence of an SLP-filter algorithm that takes EQP steps. Mathematical Programming, 96 (2003), 161-177.

[7] R. Fletcher and S. Leyffer. Nonlinear programming without a penalty function. Mathematical Programming, 91 (2002), 239-269.

[8] R. Fletcher, S. Leyffer and Ph.L. Toint. On the global convergence of a filter-SQP algorithm. SIAM Journal on Optimization, 13 (2002), 44-59.

[9] R. Fletcher, N.I.M. Gould, S. Leyffer, Ph.L. Toint and A. Wächter. Global convergence of a trust-region SQP-filter algorithm for general nonlinear programming. SIAM Journal on Optimization, 13 (2002), 635-659.

[10] C.C. Gonzaga, E. Karas and M. Vanti. A globally convergent filter method for nonlinear programming. SIAM Journal on Optimization, 14 (2004), 646-669.

[11] J. Gauvin. A necessary and sufficient regularity condition to have bounded multipliers in nonconvex programming. Mathematical Programming, 12 (1977), 136-138.

[12] W. Hock and K. Schittkowski. Test Examples for Nonlinear Programming Codes. Lecture Notes in Econom. and Mathematical Systems 187, Springer-Verlag, Berlin (1981).

[13] X.W. Liu and Y.X. Yuan. A robust algorithm for optimization with general equality and inequality constraints. SIAM Journal on Scientific Computing, 22 (2000), 517-534.

[14] F. John. Extremum problems with inequalities as subsidiary conditions. Studies and Essays Presented to R. Courant on his 60th Birthday (Interscience, New York, NY 1948), 187-204.

[15] E.W. Karas, A. Ribeiro, C. Sagastizábal and M. Solodov. A bundle-filter method for nonsmooth convex constrained optimization. Mathematical Programming, 116 (2009), 297-320.

[16] M. Macconi, B. Morini and M. Porcelli. Trust-region quadratic methods for nonlinear systems of mixed equalities and inequalities. Applied Numerical Mathematics, to appear.

[17] P.Y. Nie. A filter method for solving nonlinear complementarity problems. Applied Mathematics and Computation, 167 (2005), 677-694.

[18] P.Y. Nie. Sequential penalty quadratic programming filter methods for nonlinear programming. Nonlinear Analysis: Real World Applications, 8 (2007), 118-129.

[19] M.J.D. Powell. A fast algorithm for nonlinearly constrained optimization calculations, in Numerical Analysis, Proceedings, Biennial Conference, Dundee, Lecture Notes in Math. 630, G.A. Watson, ed., Springer-Verlag, Berlin, New York (1977), 144-157.

[20] S.M. Robinson. Stability theory for systems of inequalities, part II: differentiable nonlinear systems. SIAM Journal on Numerical Analysis, 13 (1976), 497-513.

[21] A.A. Ribeiro, E.W. Karas and C.C. Gonzaga. Global convergence of filter methods for nonlinear programming. SIAM Journal on Optimization, 19 (2008), 1231-1249.

[22] C. Sainvitu and Ph.L. Toint. A filter-trust-region method for simple-bound constrained optimization. Optimization Methods and Software, 22 (2007), 835-848.

[23] C. Shen, W. Xue and D. Pu. Global convergence of a tri-dimensional filter SQP algorithm based on the line search method. Applied Numerical Mathematics, 59 (2009), 235-250.

[24] S. Ulbrich. On the superlinear local convergence of a filter-SQP method. Mathematical Programming, 100 (2004), 217-245.

[25] M. Ulbrich, S. Ulbrich and L.N. Vicente. A globally convergent primal-dual interior filter method for nonconvex nonlinear programming. Mathematical Programming, 100 (2004), 379-410.

[26] A. Wächter and L.T. Biegler. Line search filter methods for nonlinear programming: motivation and global convergence. SIAM Journal on Optimization, 16 (2005), 1-31.

[27] A. Wächter and L.T. Biegler. Line search filter methods for nonlinear programming: local convergence. SIAM Journal on Optimization, 16 (2005), 32-48.

[28] J.L. Zhang and X.S. Zhang. A modified SQP method with nonmonotone linesearch technique. Journal of Global Optimization, 21 (2001), 201-218.

[29] G.L. Zhou. A modified SQP method and its global convergence. Journal of Global Optimization, 11 (1997), 193-205.

 

 

Received: 10/III/08.
Accepted: 03/IV/09.

 

 

#753/08.

 

 

* This research is supported by National Science Foundation of China (No. 10771162), Talents Introduction Foundation (No. F08027) and Innovation Program of Shanghai Municipal Education Commission (No. 09YZ408).