Abstract
In this short note, a sensitivity result for quadratic semidefinite programming is presented under a weak form of the second order sufficient condition. Based on this result, the local convergence of a sequential quadratic semidefinite programming algorithm is also extended to this weak second order sufficient condition. Mathematical subject classification: 90C22, 90C30, 90C31, 90C55.
Key words: semidefinite programming; second order sufficient condition; sequential quadratic programming; quadratic semidefinite programming; sensitivity; convergence
A sensitivity result for quadratic semidefinite programs with an application to a sequential quadratic semidefinite programming algorithm
Rodrigo Garcés (I); Walter Gómez (II, corresponding author); Florian Jarre (III)
I. EuroAmerica S.A., Av. Apoquindo 3885, Las Condes, Santiago, Chile. E-mail: rgarces@euroamerica.cl
II. Department of Mathematical Engineering, Universidad de La Frontera, Av. Francisco Salazar 01145, Temuco, Chile. E-mail: wgomez@ufro.cl
III. Institut für Mathematik, Universität Düsseldorf, Universitätsstraße 1, D-40225 Düsseldorf, Germany. E-mail: jarre@opt.uni-duesseldorf.de
1 Introduction
A sequential semidefinite programming (SSDP) algorithm for solving nonlinear semidefinite programs was proposed in [5, 7]. It is a generalization of the well-known sequential quadratic programming (SQP) method and considers linear semidefinite subproblems that can be solved using standard interior point packages. Note that linear SDP relaxations with a convex quadratic objective function can be transformed to equivalent linear SDP subproblems. However, as shown in [4], under standard assumptions, local superlinear convergence is possible only when the iterates are defined by SDP relaxations with a nonconvex quadratic objective function. Since this class of problems is no longer equivalent to the linear semidefinite programming case, we refer to the algorithm in this note as the Sequential Quadratic Semidefinite Programming (SQSDP) method.
In the papers [5, 7] a proof is given showing local quadratic convergence of the SSDP algorithm to a local minimizer assuming a strong second order sufficient condition. This condition ensures, in particular, that the quadratic SDP subproblems close to the local minimizers are convex, and therefore reducible to the linear SDP case. However, as pointed out in [4], there are examples of perfectly well-conditioned nonlinear SDP problems that do not satisfy the strong second order sufficient condition used in [5, 7].
These examples satisfy a weaker second order condition [10] that explicitly takes the curvature of the semidefinite cone into account.
In this short note we study the sensitivity of quadratic semidefinite problems (the subproblems of SQSDP), using the weaker second order condition. Based on this sensitivity result, the fast local convergence of the SQSDP method can also be established under the weaker assumption in [10]; in this case the quadratic SDP subproblems may be nonconvex.
The sensitivity results presented in this paper were used in [8] for the study of a local self-concordance property for certain nonconvex quadratic semidefinite programming problems.
2 Notation and preliminaries
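By $\mathcal{S}^m$ we denote the linear space of $m \times m$ real symmetric matrices. The space $\mathbb{R}^{m \times n}$ is equipped with the inner product
$$\langle A, B \rangle := \operatorname{trace}(A^T B) = \sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij} B_{ij}.$$
The corresponding norm is the Frobenius norm defined by $\|A\|_F = \sqrt{\langle A, A \rangle}$. The negative semidefinite order for $A, B \in \mathcal{S}^m$ is defined in the standard form, that is, $A \preceq B$ iff $A - B$ is a negative semidefinite matrix. The order relations $\prec$, $\succeq$ and $\succ$ are defined similarly. By $\mathcal{S}^m_+$ we denote the set of positive semidefinite matrices.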
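The notation above can be illustrated numerically. The following is a minimal sketch with NumPy; the helper names `inner`, `frobenius` and `preceq`, the tolerance, and the two test matrices are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def inner(A, B):
    """Trace inner product <A, B> = trace(A^T B)."""
    return np.trace(A.T @ B)

def frobenius(A):
    """Frobenius norm induced by the trace inner product."""
    return np.sqrt(inner(A, A))

def preceq(A, B, tol=1e-12):
    """A <= B in the semidefinite order: A - B is negative semidefinite.
    Assumes symmetric inputs; tol is an illustrative numerical tolerance."""
    return np.linalg.eigvalsh(A - B).max() <= tol

# Illustrative symmetric matrices (assumed data).
A = np.array([[1.0, 2.0], [2.0, -1.0]])
B = np.array([[3.0, 2.0], [2.0, 0.0]])

print("<A, B> =", inner(A, B))
print("||A||_F =", frobenius(A))
print("A preceq B:", preceq(A, B), " B preceq A:", preceq(B, A))
```

Here $A - B = \operatorname{diag}(-2, -1)$ is negative definite, so $A \preceq B$ holds while $B \preceq A$ does not.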
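The following simple lemma is used in the sequel.
Lemma 1 (See [7]). Let $Y, S \in \mathcal{S}^m$.
(a) If $Y, S \succeq 0$ then
$$YS + SY = 0 \iff YS = 0.$$
(b) If $Y + S \succ 0$ and $YS + SY = 0$ then $Y, S \succeq 0$.
(c) If $Y + S \succ 0$ and $YS + SY = 0$ then for any $\dot{Y}, \dot{S} \in \mathcal{S}^m$,
$$Y \dot{S} + \dot{Y} S = 0 \iff Y \dot{S} + \dot{Y} S + \dot{S} Y + S \dot{Y} = 0. \tag{2}$$
Moreover, $Y, S$ have representations of the form
$$Y = U \begin{pmatrix} Y_1 & 0 \\ 0 & 0 \end{pmatrix} U^T, \qquad S = U \begin{pmatrix} 0 & 0 \\ 0 & S_2 \end{pmatrix} U^T,$$
where $U$ is an $m \times m$ orthogonal matrix, $Y_1 \succ 0$ is a $(m - r) \times (m - r)$ diagonal matrix and $S_2 \succ 0$ is a $r \times r$ diagonal matrix, and any matrices $\dot{Y}, \dot{S} \in \mathcal{S}^m$ satisfying (2) are of the form
$$\dot{Y} = U \begin{pmatrix} \dot{Y}_1 & \dot{Y}_3 \\ \dot{Y}_3^T & 0 \end{pmatrix} U^T, \qquad \dot{S} = U \begin{pmatrix} 0 & \dot{S}_3 \\ \dot{S}_3^T & \dot{S}_2 \end{pmatrix} U^T,$$
where
$$Y_1 \dot{S}_3 + \dot{Y}_3 S_2 = 0. \tag{3}$$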
Proof. For (a), (c) see [7].
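(b) By contradiction we assume that $\lambda$ is a negative eigenvalue of $S$ and $u$ a corresponding eigenvector. The equality $YS + SY = 0$ implies that
$$0 = u^T (YS + SY) u = 2 \lambda \, u^T Y u, \quad \text{hence} \quad u^T Y u = 0.$$
Now using the fact that $Y + S \succ 0$, we have
$$0 < u^T (Y + S) u = u^T Y u + u^T S u = \lambda u^T u = \lambda \|u\|^2 < 0,$$
which is a contradiction. Hence, $S \succeq 0$. The same arguments give us $Y \succeq 0$.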
□
Remark 1. Due to (3) and the positive definiteness of the diagonal matrices $Y_1$ and $S_2$, it follows that $(\dot{Y}_3)_{ij} (\dot{S}_3)_{ij} < 0$ whenever $(\dot{Y}_3)_{ij} \neq 0$. Hence, if, in addition to (3), also $\langle \dot{Y}_3, \dot{S}_3 \rangle = 0$ holds true, then $\dot{Y}_3 = \dot{S}_3 = 0$.
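The block structure in Lemma 1(c) and the sign pattern in Remark 1 can be checked on a small random instance. This is a minimal sketch with NumPy; the sizes `m`, `r`, the random seed and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, r = 5, 2  # matrix size and size of the block S2 (illustrative choices)

# Random orthogonal U obtained from a QR factorization.
U, _ = np.linalg.qr(rng.standard_normal((m, m)))

# Positive diagonal blocks Y1 ((m-r) x (m-r)) and S2 (r x r), stored as vectors.
y1 = rng.uniform(0.5, 2.0, size=m - r)
s2 = rng.uniform(0.5, 2.0, size=r)
Y = U @ np.diag(np.concatenate([y1, np.zeros(r)])) @ U.T
S = U @ np.diag(np.concatenate([np.zeros(m - r), s2])) @ U.T

# Strict complementarity: YS + SY = 0 and Y + S positive definite.
anticommutator = Y @ S + S @ Y
min_eig_sum = np.linalg.eigvalsh(Y + S).min()

# Remark 1: pick any Ydot3 and solve Y1 Sdot3 + Ydot3 S2 = 0 entrywise
# (Y1 and S2 are diagonal, so the equation decouples over the entries).
Ydot3 = rng.standard_normal((m - r, r))
Sdot3 = -(Ydot3 * s2[None, :]) / y1[:, None]
products = Ydot3 * Sdot3  # entrywise products (Ydot3)_ij (Sdot3)_ij

print("||YS + SY||_F =", np.linalg.norm(anticommutator))
print("lambda_min(Y + S) =", min_eig_sum)
print("largest entrywise product =", products.max())
```

Since every entrywise product is negative, the inner product of the two off-diagonal blocks can only vanish when both blocks are zero, which is the argument used in Remark 1.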
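In the sequel we refer to the set of symmetric and strictly complementary matrices
$$\mathcal{C} := \{ (Y, S) \in \mathcal{S}^m \times \mathcal{S}^m \mid YS + SY = 0, \; Y + S \succ 0 \}. \tag{4}$$
As a consequence of Lemma 1(b), the set $\mathcal{C}$ is (not connected, in general, but) contained in $\mathcal{S}^m_+ \times \mathcal{S}^m_+$. Moreover, Lemma 1(c) implies that the rank of the matrices $Y$ and $S$ is locally constant on $\mathcal{C}$.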
2.1 Nonlinear semidefinite programs
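Given a vector $b \in \mathbb{R}^n$ and a matrix-valued function $G : \mathbb{R}^n \to \mathcal{S}^m$, we consider problems of the following form:
$$\min_{x \in \mathbb{R}^n} \; b^T x \quad \text{subject to} \quad G(x) \preceq 0. \tag{5}$$
Here, the function $G$ is at least $C^3$-differentiable.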
For simplicity of presentation, we have chosen a simple form of problem (5). All statements about (5) in this paper can be modified so that they apply to additional nonlinear equality and inequality constraints and to nonlinear objective functions. The notation and assumptions in this subsection are similar to the ones used in [8].
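The Lagrangian $\mathcal{L} : \mathbb{R}^n \times \mathcal{S}^m \to \mathbb{R}$ of (5) is defined as follows:
$$\mathcal{L}(x, Y) := b^T x + \langle G(x), Y \rangle. \tag{6}$$
Its gradient with respect to $x$ is given by
$$\nabla_x \mathcal{L}(x, Y) = b + \nabla_x \langle G(x), Y \rangle \tag{7}$$
and its Hessian by
$$\nabla_x^2 \mathcal{L}(x, Y) = \nabla_x^2 \langle G(x), Y \rangle. \tag{8}$$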
Assumptions.
(A1) We assume that $\bar{x}$ is a local minimizer of (5) that satisfies the Mangasarian-Fromovitz constraint qualification, i.e., there exists a vector $\Delta x \neq 0$ such that $G(\bar{x}) + DG(\bar{x})[\Delta x] \prec 0$, where by definition $DG(x)[s] = \sum_{i=1}^{n} s_i \, \frac{\partial}{\partial x_i} G(x)$.
Assumption (A1) implies that the first-order optimality condition is satisfied, i.e., there exist matrices $\bar{Y}, \bar{S} \in \mathcal{S}^m$ such that
$$\begin{aligned} G(\bar{x}) + \bar{S} &= 0, \\ \nabla_x \mathcal{L}(\bar{x}, \bar{Y}) &= 0, \\ \bar{Y} \bar{S} &= 0, \\ \bar{Y}, \bar{S} &\succeq 0. \end{aligned} \tag{9}$$
A triple $(\bar{x}, \bar{Y}, \bar{S})$ satisfying (9) will be called a stationary point of (5).
Due to Lemma 1(a) the third equation in (9) can be substituted by $\bar{Y} \bar{S} + \bar{S} \bar{Y} = 0$. This reformulation does not change the set of stationary points, but it reduces the underlying system of equations (via a symmetrization of $YS$) in the variables $(x, Y, S)$, such that it has now the same number of equations and variables. This is a useful step in order to apply the implicit function theorem.
(A2) We also assume that $\bar{Y}$ is unique and that $\bar{Y}, \bar{S}$ are strictly complementary, i.e. $(\bar{Y}, \bar{S}) \in \mathcal{C}$.
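The effect of the symmetrization on the size of the system can be made concrete by counting scalar unknowns and scalar equations. This is a minimal sketch; the function names and the sample dimensions are illustrative assumptions.

```python
import numpy as np

def sym_dim(m):
    """Number of free entries of an m x m symmetric matrix."""
    return m * (m + 1) // 2

def count_equations(n, m):
    """Scalar unknowns and equations of the stationarity system in (x, Y, S)."""
    unknowns = n + 2 * sym_dim(m)                   # x, Y and S
    with_product = sym_dim(m) + n + m * m           # YS = 0 is a full m x m block
    with_symmetrized = sym_dim(m) + n + sym_dim(m)  # YS + SY = 0 is symmetric
    return unknowns, with_product, with_symmetrized

# The product of two symmetric matrices is in general not symmetric,
# whereas YS + SY always is (illustrative data).
Y = np.array([[2.0, 1.0], [1.0, 0.0]])
S = np.array([[1.0, 0.0], [0.0, 3.0]])
product = Y @ S
symmetrized = Y @ S + S @ Y

print(count_equations(n=2, m=3))
print(np.allclose(product, product.T), np.allclose(symmetrized, symmetrized.T))
```

For $n = 2$ and $m = 3$ the unsymmetrized system has 17 equations in 14 unknowns, while the symmetrized one is square, which is the setting in which the implicit function theorem is usually applied.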
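According to Lemma 1(c), there exists an orthogonal matrix $U = [U_1, U_2]$ that simultaneously diagonalizes $\bar{Y}$ and $\bar{S}$. Here, $U_2$ has $r := \operatorname{rank}(\bar{S})$ columns and $U_1$ has $m - r$ columns. Moreover the first $m - r$ diagonal entries of $U^T \bar{S} U$ are zero, and the last $r$ diagonal entries of $U^T \bar{Y} U$ are zero. In particular, we obtain
$$\bar{Y} = U_1 \bar{Y}_1 U_1^T, \qquad \bar{S} = U_2 \bar{S}_2 U_2^T, \tag{10}$$
with positive definite diagonal matrices $\bar{Y}_1$ and $\bar{S}_2$.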
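A vector $h \in \mathbb{R}^n$ is called a critical direction at $\bar{x}$ if $b^T h = 0$ and it is the limit of feasible directions of (5), i.e. if there exist $h_k \in \mathbb{R}^n$ and $\epsilon_k > 0$ with $\lim_{k \to \infty} h_k = h$, $\lim_{k \to \infty} \epsilon_k = 0$, and $G(\bar{x} + \epsilon_k h_k) \preceq 0$ for all $k$. As shown in [1] the cone of critical directions at a strictly complementary local solution $\bar{x}$ is given by
$$C(\bar{x}) := \{ h \in \mathbb{R}^n \mid U_1^T \, DG(\bar{x})[h] \, U_1 = 0 \}. \tag{11}$$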
In the following we state second order sufficient conditions due to [10] that are weaker than the ones used in [5, 7].
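(A3) We further assume that $(\bar{x}, \bar{Y})$ satisfies the second order sufficient condition:
$$h^T \big( \nabla_x^2 \mathcal{L}(\bar{x}, \bar{Y}) + \mathcal{H}(\bar{x}, \bar{Y}) \big) h > 0 \quad \forall h \in C(\bar{x}) \setminus \{0\}. \tag{12}$$
Here $\mathcal{H}(\bar{x}, \bar{Y})$ is a nonnegative matrix related to the curvature of the semidefinite cone in $G(\bar{x})$ along direction $\bar{Y}$ (see [10]) and is given by its matrix entries
$$\mathcal{H}_{i,j} := -2 \, \langle \bar{Y}, \; G_i(\bar{x}) \, G(\bar{x})^\dagger \, G_j(\bar{x}) \rangle,$$
where $G_i(\bar{x}) := DG(\bar{x})[e_i]$ with $e_i$ denoting the $i$-th unit vector. Furthermore, $G(\bar{x})^\dagger$ denotes the Moore-Penrose pseudo-inverse of $G(\bar{x})$, i.e.
$$G(\bar{x})^\dagger = \sum_i \lambda_i^{-1} u_i u_i^T,$$
where $\lambda_i$ are the nonzero eigenvalues of $G(\bar{x})$ and $u_i$ corresponding orthonormal eigenvectors.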
Remark 2. The Moore-Penrose inverse $M^\dagger$ is a continuous function of $M$, when the perturbations of $M$ do not change its rank, see [3].
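The eigenvalue formula for the pseudo-inverse and the rank condition in Remark 2 can be illustrated as follows. This is a minimal sketch with NumPy; the helper `pinv_symmetric`, its tolerance and the test matrix are illustrative assumptions.

```python
import numpy as np

def pinv_symmetric(M, tol=1e-10):
    """Moore-Penrose pseudo-inverse of a symmetric matrix from its eigenpairs.
    Eigenvalues with absolute value below tol are treated as zero (assumption)."""
    lam, V = np.linalg.eigh(M)
    keep = np.abs(lam) > tol            # indices of the nonzero eigenvalues
    return (V[:, keep] / lam[keep]) @ V[:, keep].T

# A negative semidefinite matrix of rank 2 (illustrative data).
M = -np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, -1.0],
               [0.0, -1.0, 1.0]])
M_dagger = pinv_symmetric(M)

# A small perturbation that keeps the rank equal to 2 ...
M_same_rank = 1.001 * M
# ... and a small perturbation that raises the rank to 3.
M_full_rank = M - 1e-6 * np.eye(3)

change_same_rank = np.linalg.norm(pinv_symmetric(M_same_rank) - M_dagger)
change_full_rank = np.linalg.norm(pinv_symmetric(M_full_rank) - M_dagger)
print("rank preserved:", change_same_rank)
print("rank changed:  ", change_full_rank)
```

The rank-preserving perturbation changes the pseudo-inverse only slightly, whereas the rank-changing one introduces an eigenvalue of order $10^6$ in the pseudo-inverse, which is why the constant-rank condition is needed.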
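The curvature term can be rewritten as follows:
$$h^T \mathcal{H}(\bar{x}, \bar{Y}) h = -2 \, \langle \bar{Y}, \; DG(\bar{x})[h] \, G(\bar{x})^\dagger \, DG(\bar{x})[h] \rangle. \tag{14}$$
Note that in the particular case where $G$ is affine (i.e. $G(x) = \mathcal{A}(x) + C$, with a linear map $\mathcal{A}$ and $C \in \mathcal{S}^m$), the curvature term is given by
$$h^T \mathcal{H}(\bar{x}, \bar{Y}) h = -2 \, \langle \bar{Y}, \; \mathcal{A}(h) \, (\mathcal{A}(\bar{x}) + C)^\dagger \, \mathcal{A}(h) \rangle.$$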
The following very simple example of [4] shows that the classical second order sufficient condition is generally too strong in the case of semidefinite constraints, since it does not exploit the curvature of the non-polyhedral semidefinite cone.
It is a trivial task to check that the constraint $G(x) \preceq 0$ is equivalent to the inequality $x_1^2 + x_2^2 \le 1$, such that $\bar{x} = (0, -1)^T$ is the global minimizer of the problem.
The first order optimality conditions (9) are satisfied at $\bar{x}$ with associated multiplier
The strict complementarity condition also holds true, since
The Hessian of the Lagrangian at $(\bar{x}, \bar{Y})$ for this problem can be calculated as
It is negative definite, and the stronger second order condition is not satisfied.
In order to calculate the curvature term in (12) let us consider the orthogonal matrix $U$, which simultaneously diagonalizes $\bar{Y}$ and $G(\bar{x})$
The Moore-Penrose pseudoinverse matrix at $\bar{x}$ is then given by
and the matrix associated to the curvature becomes
Finally, every $h \in \mathbb{R}^2$ such that $\nabla f(\bar{x})^T h = 0$ has the form $h = (h_1, 0)^T$ with $h_1 \in \mathbb{R}$. Therefore, the weaker second order sufficient condition holds, i.e.,
$$h^T \big( \nabla_x^2 \mathcal{L}(\bar{x}, \bar{Y}) + \mathcal{H}(\bar{x}, \bar{Y}) \big) h > 0 \quad \forall h \in C(\bar{x}) \setminus \{0\}.$$
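The computations of this kind of example can be reproduced numerically. The sketch below assumes one concrete instance, namely minimizing $-x_1^2 - (x_2 - 1)^2$ subject to $G(x) = C + x_1 A_1 + x_2 A_2 \preceq 0$ with the matrices defined in the code. This instance is an assumption chosen to be consistent with the properties stated above (feasible set equal to the unit disk, minimizer $(0, -1)^T$, negative definite Hessian of the Lagrangian); it is not claimed to be the exact form used in [4].

```python
import numpy as np

# ASSUMED instance (hypothetical, chosen to match the description in the text):
#   minimize -x1^2 - (x2 - 1)^2  subject to  G(x) = C + x1*A1 + x2*A2 <= 0,
# where G(x) <= 0 (negative semidefinite) is equivalent to x1^2 + x2^2 <= 1.
C = -np.eye(3)
A1 = -np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
A2 = -np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
x_bar = np.array([0.0, -1.0])

G_bar = C + x_bar[0] * A1 + x_bar[1] * A2
grad_f = np.array([-2.0 * x_bar[0], -2.0 * (x_bar[1] - 1.0)])
hess_f = -2.0 * np.eye(2)  # G is affine, so this is the Hessian of the Lagrangian

# Multiplier Y_bar = y * w w^T with w spanning the null space of G(x_bar);
# the scalar y = 2 follows from the stationarity condition.
w = np.array([0.0, 1.0, 1.0])
Y_bar = 2.0 * np.outer(w, w)
S_bar = -G_bar
stationarity = grad_f + np.array([np.sum(A1 * Y_bar), np.sum(A2 * Y_bar)])

# Curvature matrix with entries -2 <Y_bar, G_i G^+ G_j>.
G_pinv = np.linalg.pinv(G_bar, rcond=1e-10)
grads = [A1, A2]
curvature = np.array([[-2.0 * np.sum(Y_bar * (Gi @ G_pinv @ Gj)) for Gj in grads]
                      for Gi in grads])

# Critical directions satisfy grad_f^T h = 0, i.e. h = (h1, 0).
h = np.array([1.0, 0.0])
weak_sosc_value = h @ (hess_f + curvature) @ h

print("stationarity residual:", stationarity)
print("eigenvalues of the Hessian:", np.linalg.eigvalsh(hess_f))
print("curvature matrix:\n", curvature)
print("h^T (Hessian + curvature) h =", weak_sosc_value)
```

In this assumed instance the Hessian of the Lagrangian is negative definite, yet the curvature term compensates along the critical direction, so the weaker condition (12) holds while the classical one fails.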
3 Sensitivity result
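Let us now consider the following quadratic semidefinite programming problem
$$\min_{x \in \mathbb{R}^n} \; b^T x + \tfrac{1}{2} x^T H x \quad \text{subject to} \quad \mathcal{A}(x) + C \preceq 0. \tag{17}$$
Here, $\mathcal{A} : \mathbb{R}^n \to \mathcal{S}^m$ is a linear function, $b \in \mathbb{R}^n$, $C \in \mathcal{S}^m$, and $H \in \mathcal{S}^n$. The data to this problem is
$$\mathcal{D} := [\mathcal{A}, b, C, H]. \tag{18}$$
In the next theorem, we present a sensitivity result for the solutions of (17), when the data $\mathcal{D}$ is changed to $\mathcal{D} + \Delta\mathcal{D}$ where
$$\Delta\mathcal{D} := [\Delta\mathcal{A}, \Delta b, \Delta C, \Delta H] \tag{19}$$
is a sufficiently small perturbation.
The triple $(\bar{x}, \bar{Y}, \bar{S}) \in \mathbb{R}^n \times \mathcal{S}^m \times \mathcal{S}^m$ is a stationary point for (17), if
$$\begin{aligned} \mathcal{A}(\bar{x}) + C + \bar{S} &= 0, \\ b + H \bar{x} + \mathcal{A}^*(\bar{Y}) &= 0, \\ \bar{Y} \bar{S} + \bar{S} \bar{Y} &= 0, \\ \bar{Y}, \bar{S} &\succeq 0, \end{aligned} \tag{20}$$
where $\mathcal{A}^*$ denotes the adjoint of $\mathcal{A}$.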
Remark 3. Below, we consider tiny perturbations $\Delta\mathcal{D}$ such that there is an associated strictly complementary solution $(x, Y, S)(\Delta\mathcal{D})$ of (20). For such $x$ there exists $U_1 = U_1(x)$ and an associated cone of critical directions $C(x)$. The basis $U_1(x)$ generally is not continuous with respect to $x$. However, the above characterization (11) of $C(\bar{x})$ under strict complementarity can be stated using any basis of the orthogonal space of $G(\bar{x})$. Since such a basis can be locally parameterized in a smooth way over the set $\mathcal{C}$ in (4), it follows that locally, the set $C(x)$ forms a closed point-to-set mapping.
The following is a slight generalization of Theorem 1 in [7].
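Theorem 1. Let the point $(\bar{x}, \bar{Y}, \bar{S})$ be a stationary point satisfying the assumptions (A1)-(A3) for the problem (17) with data $\mathcal{D}$. Then, for all sufficiently small perturbations $\Delta\mathcal{D}$ as in (19), there exists a locally unique stationary point $(\bar{x}(\mathcal{D} + \Delta\mathcal{D}), \bar{Y}(\mathcal{D} + \Delta\mathcal{D}), \bar{S}(\mathcal{D} + \Delta\mathcal{D}))$ of the perturbed program (17) with data $\mathcal{D} + \Delta\mathcal{D}$. Moreover, the point $(\bar{x}(\mathcal{D}), \bar{Y}(\mathcal{D}), \bar{S}(\mathcal{D}))$ is a differentiable function of the perturbation (19), and for $\Delta\mathcal{D} = 0$, we have $(\bar{x}(\mathcal{D}), \bar{Y}(\mathcal{D}), \bar{S}(\mathcal{D})) = (\bar{x}, \bar{Y}, \bar{S})$. The derivative $D_{\mathcal{D}}(\bar{x}(\mathcal{D}), \bar{Y}(\mathcal{D}), \bar{S}(\mathcal{D}))$ of $(\bar{x}(\mathcal{D}), \bar{Y}(\mathcal{D}), \bar{S}(\mathcal{D}))$ with respect to $\mathcal{D}$ evaluated at $(\bar{x}, \bar{Y}, \bar{S})$ is characterized by the directional derivatives
$$(\dot{x}, \dot{Y}, \dot{S}) := D_{\mathcal{D}}(\bar{x}(\mathcal{D}), \bar{Y}(\mathcal{D}), \bar{S}(\mathcal{D}))[\Delta\mathcal{D}]$$
for any $\Delta\mathcal{D}$. Here, $(\dot{x}, \dot{Y}, \dot{S})$ is the unique solution of the system of linear equations,
$$\begin{aligned} \mathcal{A}(\dot{x}) + \dot{S} &= -\Delta C - \Delta\mathcal{A}(\bar{x}), \\ H \dot{x} + \mathcal{A}^*(\dot{Y}) &= -\Delta b - \Delta H \bar{x} - \Delta\mathcal{A}^*(\bar{Y}), \\ \bar{Y} \dot{S} + \dot{Y} \bar{S} + \dot{S} \bar{Y} + \bar{S} \dot{Y} &= 0, \end{aligned}$$
for the unknowns $\dot{x} \in \mathbb{R}^n$, $\dot{Y}, \dot{S} \in \mathcal{S}^m$, where $\mathcal{A}^*$ denotes the adjoint of $\mathcal{A}$. Finally, the second-order sufficient condition holds at $(\bar{x}(\mathcal{D} + \Delta\mathcal{D}), \bar{Y}(\mathcal{D} + \Delta\mathcal{D}))$ whenever $\Delta\mathcal{D}$ is sufficiently small.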
This theorem is related to other sensitivity results for semidefinite programming problems (see, for instance, [2, 6, 11]). Local Lipschitz properties under strict complementarity can be found in [9]. In [10] the directional derivative is given as solution of a quadratic problem.
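The linear system of Theorem 1 can be assembled and solved on a small instance. The sketch below is only an illustration under stated assumptions: the problem data (a nonconvex quadratic objective over the unit disk written as a $3 \times 3$ semidefinite constraint), the helper names `svec`, `smat`, `A_op`, `A_adj`, and the choice to perturb only $b$ are all hypothetical and not taken from the paper. The adjoint is taken to be $\mathcal{A}^*(Y) = (\langle A_i, Y \rangle)_i$ for $\mathcal{A}(x) = \sum_i x_i A_i$.

```python
import numpy as np

# ASSUMED data (hypothetical instance, not taken from the paper):
#   minimize b^T x + 0.5 x^T H x  subject to  A(x) + C negative semidefinite,
# which equals -x1^2 - (x2 - 1)^2 up to a constant, over the unit disk.
n, m = 2, 3
p = m * (m + 1) // 2  # number of free entries of a symmetric m x m matrix
A_mats = [-np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]),
          -np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])]
C, H, b = -np.eye(m), -2.0 * np.eye(n), np.array([0.0, 2.0])
x_bar = np.array([0.0, -1.0])
Y_bar = 2.0 * np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 1.0], [0.0, 1.0, 1.0]])
iu = np.triu_indices(m)

def svec(M):
    """Upper-triangular entries of a symmetric matrix as a vector."""
    return M[iu]

def smat(v):
    """Inverse of svec: rebuild the symmetric matrix."""
    M = np.zeros((m, m))
    M[iu] = v
    return M + M.T - np.diag(np.diag(M))

def A_op(x):
    return sum(xi * Ai for xi, Ai in zip(x, A_mats))

def A_adj(Y):
    return np.array([np.sum(Ai * Y) for Ai in A_mats])

S_bar = -(A_op(x_bar) + C)

def linearized(z):
    """Left-hand side of the linear system applied to z = (xdot, Ydot, Sdot)."""
    xd, Yd, Sd = z[:n], smat(z[n:n + p]), smat(z[n + p:])
    r1 = A_op(xd) + Sd
    r2 = H @ xd + A_adj(Yd)
    r3 = Y_bar @ Sd + Yd @ S_bar + Sd @ Y_bar + S_bar @ Yd
    return np.concatenate([svec(r1), r2, svec(r3)])

J = np.column_stack([linearized(e) for e in np.eye(n + 2 * p)])

# Perturb only b (Delta A, Delta C and Delta H are zero in this sketch).
delta_b = np.array([1.0, 0.0])
rhs = np.concatenate([np.zeros(p), -delta_b, np.zeros(p)])
z_dot = np.linalg.solve(J, rhs)
x_dot, Y_dot, S_dot = z_dot[:n], smat(z_dot[n:n + p]), smat(z_dot[n + p:])

def residual(t):
    """Residual of the perturbed stationarity system at the first-order prediction."""
    x, Y, S = x_bar + t * x_dot, Y_bar + t * Y_dot, S_bar + t * S_dot
    r1 = A_op(x) + C + S
    r2 = b + t * delta_b + H @ x + A_adj(Y)
    r3 = Y @ S + S @ Y
    return np.sqrt(np.sum(r1**2) + np.sum(r2**2) + np.sum(r3**2))

print("rank of the linearization:", np.linalg.matrix_rank(J), "of", n + 2 * p)
print("x_dot =", x_dot)
print("residuals at t = 1e-2 and 1e-3:", residual(1e-2), residual(1e-3))
```

In this assumed instance the linearization has full rank, and the residual of the perturbed system at the first-order prediction decreases quadratically in the step size, as expected from a directional derivative.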
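Proof. Following the outline in [7] this proof is based on the application of the implicit function theorem to the system of equations (20). In order to apply this result we show that the matrix of partial derivatives of system (20) with respect to the variables $(x, Y, S)$ is regular. To this end it suffices to prove that the system
$$\begin{aligned} \mathcal{A}(\dot{x}) + \dot{S} &= 0, \\ H \dot{x} + \mathcal{A}^*(\dot{Y}) &= 0, \\ \bar{Y} \dot{S} + \dot{Y} \bar{S} + \dot{S} \bar{Y} + \bar{S} \dot{Y} &= 0 \end{aligned} \tag{22}$$
only has the trivial solution $\dot{x} = 0$, $\dot{Y} = \dot{S} = 0$.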
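Let $(\dot{x}, \dot{Y}, \dot{S}) \in \mathbb{R}^n \times \mathcal{S}^m \times \mathcal{S}^m$ be a solution of (22). Since $\bar{Y}$ and $\bar{S}$ are strictly complementary, it follows from part (c) of Lemma 1 that there exists an orthogonal matrix $U$ such that
$$\bar{Y} = U \tilde{Y} U^T, \qquad \bar{S} = U \tilde{S} U^T, \tag{23}$$
where
$$\tilde{Y} = \begin{pmatrix} \bar{Y}_1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \tilde{S} = \begin{pmatrix} 0 & 0 \\ 0 & \bar{S}_2 \end{pmatrix}, \tag{24}$$
with $\bar{Y}_1$, $\bar{S}_2$ diagonal and positive definite. Furthermore, the matrices $\dot{Y}, \dot{S} \in \mathcal{S}^m$ satisfying (22) fulfill the relations
$$\dot{Y} = U \dot{\tilde{Y}} U^T, \qquad \dot{S} = U \dot{\tilde{S}} U^T, \tag{25}$$
where
$$\dot{\tilde{Y}} = \begin{pmatrix} \dot{Y}_1 & \dot{Y}_3 \\ \dot{Y}_3^T & 0 \end{pmatrix}, \qquad \dot{\tilde{S}} = \begin{pmatrix} 0 & \dot{S}_3 \\ \dot{S}_3^T & \dot{S}_2 \end{pmatrix}, \qquad \bar{Y}_1 \dot{S}_3 + \dot{Y}_3 \bar{S}_2 = 0. \tag{26}$$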
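Using the decomposition given in (10), the first equation of (22), and (25) we have
$$U_1^T \mathcal{A}(\dot{x}) U_1 = -U_1^T \dot{S} U_1 = 0.$$
It follows that $\dot{x} \in C(\bar{x})$. Now using (14), (23-26), the first equation in (20), and the first equation in (22), we obtain
$$\dot{x}^T \mathcal{H}(\bar{x}, \bar{Y}) \dot{x} = -2 \langle \bar{Y}, \mathcal{A}(\dot{x}) (\mathcal{A}(\bar{x}) + C)^\dagger \mathcal{A}(\dot{x}) \rangle = 2 \langle \bar{Y}, \dot{S} \bar{S}^\dagger \dot{S} \rangle = 2 \operatorname{trace}(\bar{Y}_1 \dot{S}_3 \bar{S}_2^{-1} \dot{S}_3^T) = -2 \langle \dot{Y}_3, \dot{S}_3 \rangle.$$
In the same way, using the first two relations in (25) and the first two equations of (22), one readily verifies that
$$\dot{x}^T H \dot{x} = -\langle \mathcal{A}(\dot{x}), \dot{Y} \rangle = \langle \dot{S}, \dot{Y} \rangle = 2 \langle \dot{Y}_3, \dot{S}_3 \rangle.$$
Consequently
$$\dot{x}^T \big( H + \mathcal{H}(\bar{x}, \bar{Y}) \big) \dot{x} = 0.$$
This implies that $\dot{x} = 0$, since $\dot{x} \in C(\bar{x})$ and the second order sufficient condition (12) holds. Using Remark 1 it follows also that $\dot{Y}_3 = \dot{S}_3 = 0$.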
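By the first equation of (22), we obtain
$$\dot{S} = -\mathcal{A}(\dot{x}) = 0. \tag{28}$$
Thus, it only remains to show that $\dot{Y} = 0$. In view of (26) we have
$$\dot{Y} = U \begin{pmatrix} \dot{Y}_1 & 0 \\ 0 & 0 \end{pmatrix} U^T. \tag{29}$$
Now suppose that $\dot{Y}_1 \neq 0$. Since $\bar{Y}_1 \succ 0$, it is clear that there exists some $\bar{\tau} > 0$ such that $\bar{Y}_1 + \tau \dot{Y}_1 \succ 0$ for all $\tau \in (0, \bar{\tau}]$. If we define $\bar{Y}_\tau := \bar{Y} + \tau \dot{Y}$ it follows that
$$\bar{Y}_\tau = U \begin{pmatrix} \bar{Y}_1 + \tau \dot{Y}_1 & 0 \\ 0 & 0 \end{pmatrix} U^T \succeq 0, \qquad \bar{Y}_\tau \neq \bar{Y}.$$
Moreover, using (20), (22), and (28), one readily verifies that $(\bar{x}, \bar{Y}_\tau, \bar{S})$ also satisfies (20) for all $\tau \in (0, \bar{\tau}]$. This contradicts the assumption that $(\bar{x}, \bar{Y}, \bar{S})$ is a locally unique stationary point. Hence $\dot{Y}_1 = 0$ and by (29), $\dot{Y} = 0$.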
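We can now apply the implicit function theorem to the system
$$\begin{aligned} (\mathcal{A} + \Delta\mathcal{A})(x) + (C + \Delta C) + S &= 0, \\ (b + \Delta b) + (H + \Delta H) x + (\mathcal{A} + \Delta\mathcal{A})^*(Y) &= 0, \\ YS + SY &= 0. \end{aligned} \tag{30}$$
As we have just seen, the linearization of (30) at the point $(\bar{x}, \bar{Y}, \bar{S})$ is nonsingular. Therefore the system (30) has a differentiable and locally unique solution $(\bar{x}(\Delta\mathcal{D}), \bar{Y}(\Delta\mathcal{D}), \bar{S}(\Delta\mathcal{D}))$. By the continuity of $\bar{Y}(\Delta\mathcal{D})$, $\bar{S}(\Delta\mathcal{D})$ with respect to $\Delta\mathcal{D}$ it follows that for $\|\Delta\mathcal{D}\|$ sufficiently small, $\bar{Y}(\Delta\mathcal{D}) + \bar{S}(\Delta\mathcal{D}) \succ 0$, i.e. $(\bar{Y}(\Delta\mathcal{D}), \bar{S}(\Delta\mathcal{D})) \in \mathcal{C}$.
Consequently, by part (b) of Lemma 1 we have $\bar{Y}(\Delta\mathcal{D}), \bar{S}(\Delta\mathcal{D}) \succeq 0$. This implies that the local solutions of the system (30) are actually stationary points.
Note that the dimension of the image space of $\bar{S}(\Delta\mathcal{D})$ is constant for all $\|\Delta\mathcal{D}\|$ sufficiently small. According to Remark 2 it holds that $\bar{S}(\Delta\mathcal{D})^\dagger \to \bar{S}^\dagger$ when $\Delta\mathcal{D} \to 0$.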
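Finally we prove that the second-order sufficient condition is invariant under small perturbations $\Delta\mathcal{D}$ of the problem data $\mathcal{D}$. We just need to show that there exists $\bar{\varepsilon} > 0$ such that for all $\Delta\mathcal{D}$ with $\|\Delta\mathcal{D}\| < \bar{\varepsilon}$ it holds:
$$h^T \big( (H + \Delta H) + \mathcal{H}(\bar{x}(\Delta\mathcal{D}), \bar{Y}(\Delta\mathcal{D})) \big) h > 0 \quad \forall h \in C(\bar{x}(\Delta\mathcal{D})) \setminus \{0\},$$
where the curvature matrix is formed with the perturbed data. Since $C(\bar{x}(\Delta\mathcal{D}))$ is a cone, it suffices to consider unit vectors, i.e. $\|h\| = 1$. We assume by contradiction that there exist $\varepsilon_k \to 0$, $\{\Delta\mathcal{D}_k\}$ with $\|\Delta\mathcal{D}_k\| < \varepsilon_k$, and $\{h_k\}$ with $h_k \in C(\bar{x}(\Delta\mathcal{D}_k)) \setminus \{0\}$, $\|h_k\| = 1$, such that
$$h_k^T \big( (H + \Delta H_k) + \mathcal{H}(\bar{x}(\Delta\mathcal{D}_k), \bar{Y}(\Delta\mathcal{D}_k)) \big) h_k \le 0.$$
We may assume that $h_k$ converges to $h$ with $\|h\| = 1$, when $k \to \infty$. Since $\Delta\mathcal{D}_k \to 0$, we obtain from the already mentioned convergence $\bar{S}(\Delta\mathcal{D})^\dagger \to \bar{S}^\dagger$ and simple continuity arguments that:
$$0 < h^T \big( H + \mathcal{H}(\bar{x}, \bar{Y}) \big) h = \lim_{k \to \infty} h_k^T \big( (H + \Delta H_k) + \mathcal{H}(\bar{x}(\Delta\mathcal{D}_k), \bar{Y}(\Delta\mathcal{D}_k)) \big) h_k \le 0.$$
The left inequality of this contradiction follows from the second order sufficient condition since $h \in C(\bar{x}(0)) \setminus \{0\}$ due to Remark 3.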
□
4 Conclusions
The sensitivity result of Theorem 1 was used in [7] to establish local quadratic convergence of the SSDP method. By extending this result to the weaker form of second order sufficient condition, the analysis in [7] can be applied in a straightforward way to this more general class of nonlinear semidefinite programs. In fact, the analysis in [7] only used local Lipschitz continuity of the solution with respect to small changes of the data, which is implied by the differentiability established in Theorem 1.
Acknowledgements. The authors thank an anonymous referee for the helpful and constructive suggestions. This work was supported by Conicyt Chile under the grants Fondecyt Nr. 1050696 and Nr. 7070326.
Received: 27/IV/11.
Accepted: 24/V/11.
#CAM-362/11.
- [1] F. Bonnans and H. Ramírez C., Strong regularity of semidefinite programming problems. Informe Técnico, DIM-CMM, Universidad de Chile, Santiago, Departamento de Ingeniería Matemática, No CMM-B-05/06-137 (2005).
- [2] F. Bonnans and A. Shapiro, Perturbation Analysis of Optimization Problems. Springer Series in Operations Research (2000).
- [3] Å. Björck, Numerical methods for least squares problems. SIAM, Society for Industrial and Applied Mathematics, Philadelphia (1996).
- [4] M. Diehl, F. Jarre and C.H. Vogelbusch, Loss of Superlinear Convergence for an SQP-type method with Conic Constraints. SIAM Journal on Optimization, (2006), 1201-1210.
- [5] B. Fares, D. Noll and P. Apkarian, Robust control via sequential semidefinite programming. SIAM J. Control Optim., 40 (2002), 1791-1820.
- [6] R.W. Freund and F. Jarre, A sensitivity result for semidefinite programs. Oper. Res. Lett., 32 (2004), 126-132.
- [7] R.W. Freund, F. Jarre and C. Vogelbusch, Nonlinear semidefinite programming: sensitivity, convergence, and an application in passive reduced-order modeling. Math. Program., 109(2-3) (2007), 581-611.
- [8] R. Garcés, W. Gómez and F. Jarre, A self-concordance property for nonconvex semidefinite programming. Math. Meth. Oper. Res., 74 (2011), 77-92.
- [9] M.V. Nayakkankuppam and M.L. Overton, Conditioning of semidefinite programs. Math. Program., 85 (1999), 525-540.
- [10] A. Shapiro, First and second order analysis of nonlinear semidefinite programs. Math. Program., 77 (1997), 301-320.
- [11] J.F. Sturm and S. Zhang, On sensitivity of central solutions in semidefinite programming. Math. Program., 90 (2001), 205-227.
Publication dates: publication in this collection 26 Apr 2012; date of issue 2012.