Abstract
In this paper, two different approaches to solve underdetermined nonlinear system of equations are proposed. In one of them, the derivative-free method defined by La Cruz, Martínez and Raydan for solving square nonlinear systems is modified and extended to cope with the underdetermined case. The other approach is a Quasi-Newton method that uses the Broyden update formula and the globalized line search that combines the strategy of Grippo, Lampariello and Lucidi with the Li and Fukushima one. Global convergence results for both methods are proved and numerical experiments are presented.
underdetermined nonlinear systems; Quasi-Newton method; derivative-free line search; spectral step length; global convergence
Two derivative-free methods for solving underdetermined nonlinear systems of equations
N. EchebestI; M. L. SchuverdtII, *; R. P. VignauI
IDepartment of Mathematics, FCE, University of La Plata CP 172, 1900 La Plata Bs. As., Argentina
IICONICET, Department of Mathematics, FCE, University of La Plata CP 172, 1900 La Plata Bs. As., Argentina. E-mails: schuverd@mate.unlp.edu.ar / opti@mate.unlp.edu.ar / vignau@mate.unlp.edu.ar
ABSTRACT
In this paper, two different approaches to solve underdetermined nonlinear system of equations are proposed. In one of them, the derivative-free method defined by La Cruz, Martínez and Raydan for solving square nonlinear systems is modified and extended to cope with the underdetermined case. The other approach is a Quasi-Newton method that uses the Broyden update formula and the globalized line search that combines the strategy of Grippo, Lampariello and Lucidi with the Li and Fukushima one. Global convergence results for both methods are proved and numerical experiments are presented.
Mathematical subject classification: Primary: 65H10; Secondary: 90C57.
Key words: underdetermined nonlinear systems, Quasi-Newton method, derivative-free line search, spectral step length, global convergence.
1 Introduction
We consider the problem of determining a solution x* ∈ Rn that verifies the nonlinear system of equations
where F : Rn→ Rm is a continuously differentiable function and m < n, making special emphasis when m < n. We are interested in systems for which the Jacobian matrix of F, denoted by J(x), is not available or requires a prohibitive amount of storage. This situation is common when functional values come from experimental measures, for example: from physics, chemistry or economics.
Such kind of problems also appears as the feasible set of general nonlinear programming problems. Our main motivation is the application of the proposed methods as subalgorithms for finding feasible points in derivative-free nonlinear programming algorithms such as, for example two-phases algorithms [3, 8, 16, 26], feasible methods [1, 12, 18, 19, 20, 22, 23, 24, 25] and inexact restoration methods [15, 17].
The resolution of square nonlinear systems without using derivatives has been addressed using the spectral residual approach in [11] and the Broyden Quasi Newton approach in [13]. Some ideas of those papers are incorporated in the present work. In [11], the authors defined the derivative-free spectral algorithm for nonlinear equations called DF-SANE. Furthermore, global convergence was proved by using a derivative-free line search that combines the nonmonotone strategy of Grippo, Lampariello and Lucidi [9] with the Li and Fukushima scheme [13].
In [13], a Quasi Newton method based on the Broyden update formula and on a nonmonotone derivative-free line search was defined for the square case. Also in [4] an inexact Quasi-Newton method was introduced with a similar line search technique to the one introduced in [13] and using Bk = J(xk) periodically. Under appropriate hypotheses, global and superlinear convergence were proved in both papers.
In the present paper we define two approaches for the nonsquare system of equations based on the ideas appearing in [11] and [13]. The proposed algorithms use a generalization of the derivative-free nonmonotone line search defined in [11]. The search direction in [11] is computed using the residual vector F (xk) and the spectral coefficient [2]. For the underdetermined system, the current point is computed by considering a fixed number of Lm directions that are the solution of some appropriate square systems using a spectral coefficient under the idea explained in [11]. The direction in the second algorithm is computed as an approximate solution of a linear system using the nonsquare Broyden update formula for matrices. It is well established in the literature that, for the square case, this is the most used secant approximation to the Jacobian and it works very well locally [6]. As we mentioned before, the Broyden update formula has been previously used by Li and Fukushima and combined with a derivative-free line search for the square case. In [13] the current direction is the solution to the linear system
and the update matrix Bk+1 is defined as
where yk = F (xk) F (xk1), sk = xk xk1 and the parameter βk is chosen such that |βk 1| < < 1 for which Bk+1 is nonsingular. In this paper, when there is a solution of the linear system (2) we use such solution as a search direction and, when there is none, we solve the linear system approximately as proposed in [14]. Consequently, we avoid the necessity of choosing the parameter βk.
Under appropriate hypotheses, global convergence of the sequence generated by both methods will be proved. The global convergence result obtained for the algorithm based on the spectral ideas extends the convergence result in [11]. For the Quasi Newton method that uses the Broyden update formula, we obtain convergence using a Dennis-Moré type condition. We show that this condition can be dropped out for a particular derivative-free line search. Thus, both algorithms can be viewed as extensions of the well known methods defined in [11] and [13] for square nonlinear systems.
We consider the usual continuously differentiable merit function f: Rn → R, which consists in a measure of the residual F(x) at x, f(x) = ||F(x)||2.
The iterative algorithms generate a sequence {xk}, for k = 1, 2, ..., starting from a given initial point x0. A subsequence of {xk} will be indicated by {xk}k ∈ K where K is some infinite index set.
This paper is organized as follows. In Section 2 we present the algorithm that performs the derivative-free line search and we establish there some of its properties. In Section 3 we define DF-SAUNE Algorithm, a modification of DF-SANE Algorithm of [11] for handling efficiently problem (1) and we prove the global convergence results. In Section 4 we define the Quasi Newton method using the Broyden update formula and the derivative-free line search introduced in Section 2. We analyze the conditions under which it is possible to obtain global convergence. Both algorithms are tested and the numerical results are presented in Section 5. Finally, some conclusions are drawn in Section 6.
Notation.
-
||.|| denotes the Euclidean norm.
-
Given a matrix
B ∈
Rm
×
n,
N (
B) denotes the null space of the matrix
B.
-
For
i = 1, ...,
n; ei is the canonical vector in
Rn.
-
In denotes the identity matrix in
Rn
×
n.
-
2 The nonmonotone line search without derivatives
In this section we shall be concerned with the nonmonotone derivative-free line search that will be used in the methods defined in the following sections. As we mentioned before, the strategy is based on the line search proposed in [11]. Given the current iterate xk and a search direction dk, the algorithm looks for a steplength αk such that
where M is a nonnegative integer, 0 < γ < 1 and ηk = η < ∞.
This procedure combines the well known nonmonotone line search strategy for unconstrained optimization introduced by Grippo, Lampariello and Lucidi [9]:
with the Li-Fukushima derivative-free line search scheme [13]:
The combined strategy (4) produces a robust nonmonotone derivative-free line search that takes into account the advantages of both schemes. The line search (4) is strongly based on the fact that the search direction comes from the residual vector. For details we refer to [11]. In such paper this strategy was implemented and tested using an extensive set of numerical experiments which showed its competitiveness for square systems of nonlinear equations.
In this paper, given a general search direction dk, based on (5) and (6), we consider the following line search condition:
where ||dk||2 takes the place of f(xk) in (4), obtaining a more general strategy. For completeness, we establish here the implemented process.
Algorithm 1. Nonmonotone line search
Given d ∈ Rn, 0 < τmin < τmax < 1, 0 < γ < 1, M ∈ N, {ηk} such that ηk > 0 and ηk = η < ∞
Step 1:Compute = max {f(xk), ..., f (xmax{0, k-M+1})}
α+ = α- = 1
Step 2:If f(xk + α+d) < + ηk - γ||d||2,
define dk = d, αk = α+, xk+1 = xk + αkdk
else if f (xk - α-d) < + η k - γ || d ||2,
define dk = -d, αk = a-, xk+1 = xk + αkdk
else
choose α+new ∈ [τminα+, τmaxα+], α-new ∈ [τminα-, τmaxα-] α+ = α+new, α- = α-new and go to step 2
Proposition 2.1.Algorithm 1 is well defined.
Proof. See Proposition 1 of [11].
The new algorithms for solving (1) will follow the next procedure.
Algorithm 2. General Algorithm
Given x0, F(x0), M ∈ N, 0 < τmin < τmax < 1, 0 < ∈ < 1, 0 < γ < 1, {η k} such that ηk > 0 and ηk = η = ∞.
Set k ← 0
Step 1:If ||F(xk)|| < ∈ max{1, ||F(x0)||} stop.
Step 2: Compute a search direction dk.
Step 3:Find αk and xk+1 = xk + αkdk using Algorithm 1.
Update k ← k + 1 and go to Step 1.
By considering the procedure above it is possible to obtain results (see below) that will be used in the next sections for obtaining the convergence results. The proofs of Propositions 2.2 and 2.3 below follow from Propositions 2 and 3 in [11] updated for the line search condition (7). We establish them here for completeness.
Proposition 2.2.For all k ∈ N consider
and define ν(k) ∈ {(k 1)M + 1, ..., kM} the index for which f(xv(k)) = Uk. Then for all k = 1, 2, ...
where η =
Proof. We have that
then
Thus, by an induction argument we obtain:
for l = 1, 2, ....
Since ν(k + 1) ∈ {kM + 1, ..., kM + M}
Thus, for all k = 1 , 2, . . . we have that
as we wanted to prove.
Proposition 2.3.
Proof. For all k = 1, 2, ... we have
Writing the last inequality for k = 1, 2, ..., L and adding these L inequalities we obtain
Therefore
Thus, the series is convergent and
as claimed.
Proposition 2.4.The sequence {xk} generated by the General Algorithm is contained in
Proof. From Proposition 2.2 we have that, for all k > 1,
Therefore, as we wanted to prove.
The results obtained up to here depend strongly on the line search technique without taking into account the way in which the direction dk in the step 2 of the Algorithm 2 was computed.
From now on we will consider the set
and from Proposition 2.3 we have that
Observe that from the proof of Propositions 2.3 and 2.4 we obtain the following result:
Proposition 2.5.If we take M = 1 in Algorithm 1 then
-
the sequence {x
k} generated by the General Algorithm is contained in
-
the series
is convergent.
3 Derivative-Free Spectral Algorithm for solving Underdetermined Nonlinear Equations (DF-SAUNE)
In this section, we develop a derivative-free method based on the algorithm DF-SANE [11] updated for the underdetermined case. DF-SANE is a derivative-free method for solving square nonlinear systems that uses the ndimensional residual vector as a search direction together with a spectral step length and a globalization strategy that produces a nonmonotone process. The spectral coefficient is the inverse of an approximation of the Rayleigh quotient with respect to a secant approximation of the Jacobian:
where see [2, 11].
The iterative process in DF-SANE can be viewed as a Quasi Newton method considering the matrix together with the iteration where αk is computed using the derivative-free nonmonotone strategy (4).
In order to solve the underdetermined case we propose to combine the idea of the augmented Jacobian algorithm, see for example [29], with the spectral residual vector explained above. In order to do that we consider Lm as the ceil number of , that is,
For each j = 0, ..., Lm 1 we define the matrices Ej ∈ Rmxn as follows. If then Ej is the matrix whose rows are the m canonical vectors in Rn. If then for j = 0, ..., Lm 2, Ej is the same matrix defined above and ELm -1 is the matrix whose rows are the m canonical vectors in Rn.
Given xk, for j = 0, ..., Lm 1 we consider the direction dk+j ∈ Rn as the solution to the square linear system
where Ej was defined above, Vj ∈ R(nm)xn is the matrix whose rows are the canonical vectors in Rn that span N (Ej) and σk+j is a spectral coefficient. Thus we can observe that dk+j = σk+j (xk+j) and
Once the direction dk+j is obtained, the line search is performed using Algorithm 1.
Observe that the system (11) is a square system that resembles the one used in DF-SANE for obtaining its current direction.
Given an arbitrary initial point x0e Rn, the algorithm that allows us to obtain the next iterate is given below:
Algorithm 3. DF-SAUNE
Given x0, F(x0), 0 < y < 1, 0 < τmin < τmax < 1, σ0= 1, 0 < σmm < σmax < ∞, 0 < € < 1.
Set k ← 0.
Main Step:Given xk, F(xk), σk.
(1) If ||F(xk)|| < ∈ max{1, ||F(x0)||}, stop.
(2) For j = 0 : Lm 1
-
Compute dk+j = Ok+j
F(xk+j)
-
Find aj and xk+j+1 = xk+j + ajdk+j using the Algorithm 1.
-
If ||F(xk+j+1)|| < ∈ max{1, ||
F(
x
0)||
}, stop.
-
Compute s = xk+j+1 xk+j, y = F (xk+j+1)
F (xk+j) and σ
-
If |σ
| ∈ [σ
min, σ
max]
define σ
k+j+1 = 0. If not, choose σ
k+j+1 such that |σ
k+j+1|
e [σmm, σmax]
End
(3) Update k ← k + Lm and repeat the main step.
End
By (9) we have that
Thus, using (12) and considering that each |σk| ∈ [σmin, σmax] we obtain that
In the following Theorem we prove the main convergence result associated to DF-SAUNE Algorithm. The proof follows the idea of the Theorem 1 in [11] updated for the Algorithm 3.
Theorem 3.1.Assume that {xk}k∈N is the sequence generated by Algorithm 3. Then, for every limit point x* of {xk}k∈K there exists an index jo∈ {0, ..., Lm 1} such that
Proof. Let x* be a limit point of {xk}keK. Thus, there is an infinite index set K1 c K such that
Then, by (13),
We have two possibilities:
(1) The sequence {αk}k∈K1 does not tend to zero;
(2) The sequence {αk}k∈K1 tends to zero.
In the first case there exists an infinite sequence of indices K2⊂ K1 such that αk> c > 0 for all k ∈ K2. Then, by (15),
Since f is continuous this implies that f(x*) = 0.
Suppose that case 2 happens. Once the direction dk was computed, in the step 2 of the Algorithm 1, one tests the inequality
If this inequality does not hold, one tests the inequality
The first trial points at (16)-(17) are α+ = α = 1. Since αk= 0, there exists k0 ∈ K1 such that αk < 1 for all k ∈ K1 , k > k0. Therefore, for those iterations k, there are steps and that do not satisfy (16) and (17) respectively for which
So, for all k ∈ K1 , k > k0, we have that
Since |αk| ∈ [σmin, σmax], we have that (18) implies
and (19) implies
where
The inequality (20) implies that
So,
By Proposition 2.4, {f (xk)} is a sequence bounded above by a constant C > 0. Thus,
which implies that
So,
By the Mean Value Theorem, there exists ξk∈ [0, 1] such that
Since Lm is finite there exists j0∈ {0, ..., Lm 1}, K2⊂ K1 such that, for all k ∈ K2
Thus, we have that
Now, if σk > 0 for infinitely many indices k ∈ K2 the last inequality implies that, for k ∈ K2, k > k0
Using (21) and proceeding in the same way, we obtain that, for k ∈ K2, k > k0,
for some ξ'ke [0, 1].
Since {σk} and {f (xk)} are bounded we have that {||dk||} is bounded.
Then, using that → 0, → 0, and taking limits in (23) and (24), we obtain that
If σk < 0 for infinitely many indices, proceeding in an analogous way, we deduce the same equation as we wanted to prove. □
Remark 3.2. We have proved that there exists an index j0∈ {0, . . . , Lm 1 } such that the gradient of ||F(x)||2 at x* is orthogonal to the residual ET0F(x*). Note that when m = n, the result of Theorem 3.1 coincides with the convergence result obtained in [11]. Thus, we can say that DF-SAUNE Algorithm is an extension of DF-SANE.
Corollary 3.3.Assume that x* is a limit point of the sequence {xk}k∈K generated by Algorithm 3 and suppose that for all j ∈ { , . . . , Lm 1 } e have that {J(x*)v, v) = 0 for all v ∈ Rn, v = 0. Then F(x*) = 0.
4 Derivative-Free Quasi Newton method using the Broyden update formula (DF-QNB)
In this section we will define a Quasi Newton method based on the Broyden update formula for the matrices with the derivative-free line search defined in Algorithm 1 and we will establish the convergence results.
We will define the search direction as an approximate solution to the linear system Bkd = F (xk) and we will use the nonsquare rank one Broyden update formula:
where Sk = xk+1 xk and yk = F (xk+1) F (xk).
As previously stated, the aim is to solve the linear system accurately when it is possible and in an approximate way in any other case, as considered in [14].
Algorithm 4. DF-QNB
Given x0, B0 e Rmxn, F(x0), 0 < y < 1, 0 < θ0 < 1, A > 0, 0 < τmin < τmax < 1, 0< ∈ < 1.
Set k ← 0.
Step1:If ||F(xk)|| < || ∈ max{1, ||F(x0)||} stop.
Step 2: Coputing the direction dk
Step 2.1: Find d such that
If such direction d is found, define dk = d, Qk+1 = Qk and go to Step 3.
Step 2.2:Find a solution d of min || + F (xk) ||.
If d satisfies
define dk = d, θk+1 = θk and go to Step 3.
Else, set dk = 0, xk+1 = xk, θk+1 = and go to Step 5.
Step 3:Find ak and xk+1 = xk + αkdkusing Algorithm 1.
Step 4:Update Bk+1 using (25).
Step 5:Update k ← k + 1 and go to Step 1.
In [14], the subproblems in Step 2 were defined using two different matrices.
Remark 4.1. When m = n the Algorithm 4 is similar to the Algorithm defined in [13], in the sense that both of them use the Broyden update formula. In [13] the authors solve the linear system (2) by using the update (3). We acknowledge that, in large scale problems, it is difficult to solve linear systems, even when they do have a solution. Having that in mind we have tried to find an approximate solution in the sense of (27) for those cases.
The following theorems establish the necessary hypotheses for obtaining the convergence results.
Theorem 4.2.Assume that Algorithm 4 generates an infinite sequence {xk}. Suppose that
and there exists k0 e N such that, for all k > k0, θk = < 1. Then, every limit point of {xk}k∈K is a solution of (1), where K is given by (8).
Proof. Let x* be a limit point of {xk}k∈K, then there exists K1⊂ K such that lim xk = x*.
We know that
We will consider two cases: a)
Let suppose the first case. Then, we have that
(1) First we assume that in the process of the Algorithm 4, the direction dk was calculated finitely times solving the linear system Bkd + F(xk) = 0.
Then, there exists k1∈ K1 such that ∀k> k1, k ∈ K1, dk verifies the formula ||Bkd + F(xk)|| < θk||F(xk) = 0. Thus, for all k ∈ K1, k > max {k0, k1} we have that
This implies that
Taking limits for k ∈ K1, k > max{k0, k1} in the last expression, and using the continuity of F and J and (28)-(29), we obtain that
Since < 1, we have that ||F(x*)||= 0 as we wanted to prove.
(2) Let assume now that the direction dk was obtained infinitely many times solving the linear system Bkd + F(xk) = 0.
Then, there exists K2 ⊂ K1 such that for all k ∈ K2 we have that Bkdk + F(xk) = 0.
Thus, for all k ∈ K2,
Taking limits for k ∈ K2, k > k0 and using the continuity of F and J and (28)-(29), we obtain that
Let suppose the case b). Since the sequence {dk}k∈N is bounded (and then it is bounded in K1), there exists K2⊂ K1 and ∈ Rn such that dk.
In the line search, to choose the step αk the algorithm DF-QNB tests the following inequalities
The initial values of α+ and α- are 1. Since α k = 0 , there exists ∈ K2 such that αk < 1 for all k ∈ K2, k > . Thus, for those iterations k there exist steps and that do not satisfy (30) and So we have that
Considering (32), the following inequality holds
Since we obtain that
By the Mean Value Theorem there exists such that
Considering (33), we have that
Since we obtain that
By the Mean Value Theorem there exists such that
Taking limits in (34) and (35) when k → ∞, k ∈ K2, we obtain that
Thus,
Now, as we did before, we have to consider two cases: (i) the direction dk was calculated finitely many times solving the linear system Bkd + F (xk) = 0; and (ii) dk was obtained infinitely many times solving the linear system Bkd + F(xk) = 0.
Proceeding in analogous way as we did when and using (28) and (36) we obtain that
as we wanted to prove.
Observe that, according to Proposition 2.4, we have that ||F(xk)|| is bounded. Thus, if
we obtain the hypothesis (28). The last condition is known as a necessary and sufficient condition for obtaining q-superlinear convergence of classical Quasi Newton methods [6].
If we can not guarantee that then we can prove the following result which is similar to one obtained in [14].
Theorem 4.3. Suppose that, in Algorithm 4, 0k is increased infinitely many times and define
Assume that
Then, every limit point x* of the sequence {xk}k∈K* is a solution of (1) or it is a global minimizer of ||F(x*) + J(x*)(x x*)||.
Proof. If F(x*) = 0 we are done. Let us assume that ||F(x*)|| > 0. Suppose that x* is not a global minimizer of ||F(x*) + J(x*)(x x*)||, therefore, there exists d such that ||d|| and
thus
By (37) and the continuity of F and J, we have that
for all k large enough k ∈ K*. But, since ||d|| and xk = x* we have that, for all large enough k ∈ K*, | x* xk + d|| < Δ. So, (38) contradicts the fact that: θk→ 1 and a direction verifying (27) can be obtained.
A point x* that is a global minimizer of the function ||F (x*) + J (x*)(x x*)|| can be viewed as the solution of the linear least squares problem
where A = J(x*) and b = F(x*). The linear function is the affine model of the function F around x *. We can not expect, in general, to find x* such that F(x*) = 0 since the problem could not have a solution. Likewise, we can not expect to find x* such that F(x*) + J(x*)(x x*) = 0 since this is an underdetermined linear system of equations and J(x*) could not have full rank. Because of that, it seems reasonable to find a global minimizer of (39) when the problem has no solutions.
4.1 Analysis of the case M = 1
In this subsection we analyze the case in which the derivative-free line search used in Algorithm DF-QNB considers M = 1. By the presence of rk the line search is still nonmonotone but it imposes an almost monotone behavior of the merit function when xk is close to a solution. Thus, M = 1 is plausible considering we are working with a Quasi Newton method. M = 1, as pointed out in [11], could be inconvenient for the spectral residual method because it performs highly nonmonotone even in a neighborhood of an isolated solution.
Next, we will demonstrate that, in this case, our algorithm verifies the assumption (28).
Firstly, under this case, using Proposition 2.5 it can be observed that
Secondly, it will be convenient to define the matrix
Thus, and
Finally, we can use the result that appears in Lemma 2.6 of [13] and that we present here for completeness.
Lemma 4.4 (Lemma 2.6, [13]). Let us suppose that the set is bounded and that J(x) is Lipschitz continuous in Ω. If (40) is verified then
where
In particular, there is a subsequence of {pk} tending to zero.
Thus, we can prove the following convergence result.
Theorem 4.5.Assume that Algorithm 4 generates an infinite sequence {xk}, that M = 1 in the line search and that the hypotheses of Lemma 4.4 hold. Then, if there exists k0such that θk = < 1 for all k > kj, there is a limit point x * of {xk}k∈N that is a solution of (1).
Proof. By Proposition 2.5 we have that (40) holds. Thus, by Lemma 4.4, there is a subset ⊂ N such that
Let x* be a limit point of the subsequence . Note that
By Proposition 2.5 we have that || F(xk)|| is bounded. Thus, taking limit when k ∈ , k → ∞ and using that {dk} is bounded and (41), we obtain that ||(Ak+1 Bk)dk|| → 0. Also, by definition of Ak+1 we obtain that ||(J(xk) Ak+1)dk|| → 0. Thus, we prove that (28) happens for k ∈ K and the proof follows from Theorem 4.2. □
Observe that this particular line search improves the results of Theorem 4.2.
5 Numerical experiments
In this section we present some computational results obtained with a Fortran 77 implementation of DF-SAUNE and DF-QNB algorithms. All experiments were run on a personal computer with INTEL(R) Core (TM) 2 Duo CPU E8400 at 3.00 GHz and 3.23 GB of RAM.
As it is usual in derivative-free optimization articles we are interested in the number of function evaluations for both codes. We also report for medium size problems the CPU time obtained for both algorithms and we include a comparison with NEWUOA algorithm developed by M.J.D. Powell for unconstrained optimization [21]. In these experiments, NEWUOA algorithm solves the least squares problem: min ||F(x)||2.
5.1 Test problems
The problems used for these numerical experiments, of the form F(x) = 0, F : Rn→ W, m < n, were a set of problems defined by feasible sets of nonlinear programming problems in Hock and Schittkowski [10], where the number of variables ranges from 2 to 10, and the number of equations from 1 to 4. Also, conceiving tests for larger dimension problems, we tested some problems described in [5]. Some of the test problems analyzed here have been previously examined in [7] for problems with bound constraints and using derivatives in order to achieve similar results to those pursued here; that is, to solve under-determined nonlinear systems.
In Table 1 we show the data of the problems. In column 1 we show the number of the problem, in column 2 the source of the problem and in the last columns the number of equations (m) and variables (n).
Initial points were the same as in the cited references.
Remark 5.1. The case (a) in Problems 2 and 4 of [5] corresponds to the use of the initial point x0(1 : n) = 2. The case (b) in Problem 2 and 4 of [5] corresponds to the use of x0(1 : n) = 150 as initial point.
5.2 Implementation
In the implementations of DF-SAUNE and DF-QNB algorithms:
1. The parameters for Algorithm 1 were:
-
for small size problems.
-
for medium size problems.
2. The parameters for DF-SAUNE Algorithm were:
3. The parameters for DF-QNB Algorithm were:
4. In DF-QNB, the first B0 matrix was computed by finite differences as an aproximation to the Jacobian matrix in x0.
5. In DF-QNB, we have used ACCIM algorithm described in [27] to find a solution to Bkd = F(xk). For solving the least squares problem we have used BVLS algorithm described in [28].
6. The stopping condition for the two new algorithms was: ||F(xk)|| < 106 max{1, ||F(x0)||}.
7. The maximum number of function evaluations allowed was:
-
MAXFE = 5000, for small size problems,
-
MAXFE = 15000, when
n = 150,
-
MAXFE = 30000, when
n = 300.
For DF-QNB method we have also added the required evaluations to calculate the initial matrix B0.
The implementation of NEWUOA is the original version of M.J.D. Powell [21] with its stopping criterion, that is, the algorithm stops when the trust-region radius is lower than a tolerance ρend = 106. As previously mentioned, NEWUOA algorithm solved the least squares problem min ||F(x)||2 in our trials.
5.3 Numerical results
In Table 2 we show the results obtained by DF-SAUNE (DF-S) and DF-QNB (DF-B) algorithms taking into account the final value ||F (xend)|| and the number function evaluations. The results correspond to the stopping criterion satisfaction or to internal conditions that do not allow further improvement.
These results illustrate DF-QNB effectiveness, although DF-SAUNE has also a satisfactory behavior. We can see in 15 out of the 20 test problems DF-QNB used less function evaluations than DF-SAUNE. It is also worth mentioning that when DF-QNB requires more function evaluations than DF-SAUNE such difference becomes particulary significant. These problems are too small to consider useful showing CPU time readings.
In problem 5, we have seen that, in many interations the norm of the solution dk of the linear system Bkd + F(xk) = 0 is bigger than A and, in those iterations, DF-QNB has to solve subproblem (27). Because of that, the decrease of the merit function ||F(x)|| is very slow and the algorithm requires many functional evaluations.
In problem 19, DF-SAUNE, stopped without obtaining a satisfactory decrease in the residual when reaching the maximum number of function evaluations allowed. We believe that the bad perfomance of DF-SAUNE it is related to the strategy used to define the matrices Ej when . We think that it is an interesting future work to study a better strategy to complete the last matrix Ej in that specific case.
In Table 3 we show the number of iterations, number of function evaluations and the CPU time in seconds required by DF-SAUNE, DF-QNB and NEWUOA algorithms running the medium size problems 21, 22, 23, 24, 25 and 26. In that table we indicate the number of equations (m) and variables (n) of the problems.
Firstly, an overall review of the numerical results shows that final residual values are similar for all tested methods.
Secondly, we observe DF-SAUNE performed a more significant amount of function evaluations in the last two problems. It should be highlighted that CPU
time for DF-SAUNE was always substantially shorter than the one for DF-QNB. The reason is that DF-QNB algorithm has to solve a linear system of equations or a least squares problem in every iteration.
Finally we ran the well known NEWUOA solver in order to measure the number of function evaluations of our algorithms. Although NEWUOA was designed to solve unconstrained optimization problems, we can conclude that our methods performs satisfactorily.
6 Conclusions
Many practical optimization methods require specific algorithms for obtaining feasible points at every iteration. Thus, our aim in this paper was to define algorithms capable of dealing with the feasible set defined by nonsquare system of equations. We present two derivative-free algorithms that exploit this structure: one of them is based on the spectral residual approach and the other on the Broyden Quasi Newton method. The algorithms can be viewed as generalizations of the algorithms defined in [11] and [13], the last one combined with [14].
From a theoretical point of view we were able to obtain some convergence results. Under usual assumptions on the Jacobian matrix we establish global convergence of the scheme that uses the spectral residual idea. This convergence result can be seen as the underdetermined counterpart of the result presented in [11] for the square case.
For the Broyden Quasi Newton method we obtain global convergence under a Dennis Moré type condition. We have shown that this condition can be dropped out for a particular line search. It remains a challenge to reduce the restrictive hypotheses required in Theorem 4.3 of Section 4.
Numerical experiments suggest that the algorithms behave as expected. We consider both approaches are promising even though we also believe that it is necessary to test a more challenging set of problems in order to decide which is more suitable. Furthermore, such decision could depend on the requirements of each user.
Since many nonlinear programming problems have also box constraints, future research will consider the extension of this type of algorithms to solve under-determined nonlinear systems with bound constraints.
Acknowledgements. The authors are indebted to two anonymous referees and to the Guest Editor for their careful reading of this paper.
Received: 15/VIII/10.
Accepted: 05/I/11.
#CAM-323/10.
- [1] J. Abadie, J. Carpentier, Generalization of the Wolfe Reduced-Gradient Method to the Case of Nonlinear Constraints. Optimization, Edited by R. Fletcher, Academic Press, New York (1968), 3717.
- [2] J. Barzilai and J.M. Borwein, Two-point stepsize gradient methods. IMA J. of Numerical Analysis, 8 (1988), 141-148.
- [3] R.H.Bielschowsky and F.A.M. Gomes, Dynamic control of infeasibility in equalitty constrained optimization. SIAM J. on Opt., 19 (2008), 1299-1325.
- [4] E.G. Birgin, N. Krejic and J.M. Martínez, Globally convergent inexact quasiNewton methods for solving nonlinear systems. Numerical Algorithms, 32 (2003), 249-260.
- [5] N. Dan, N. Yamashita and M. Fukushima, Convergence properties of inexact Levenberg-Marquardt method under local error bound conditions. Opt. Methods and Software, 17 (2002), 605-626.
- [6] J.E. Dennis and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice-Hall, Englewood Cliffs, 1983.
- [7] J.B. Francisco, N. Krejic and J.M. Martinez, An interior point method for solving box-constrained underdetermined nonlinear systems. J. of Comp. and Applied Math., 177 (2005), 67-88.
- [8] N.I.M. Gould and Ph. L. Toint, Nonlinear programming without a penalty function or a filter. Math. Prog., 122 (2010), 155-196.
- [9] L. Grippo, F. Lampariello and S. Lucidi, Anonmonotone line search technique for Newton's method. SIAM J. on Numerical Analysis, 23 (1986), 707-716.
- [10] W. Hock and K. Schittkowski, Test Examples for Nonlinear Programming Codes Springer Series Lectures Notes in Economics Mathematical Systems, (1981).
- [11] W. La Cruz, J.M. Martinez and M. Raydan, Spectral residual method without gradient information for solving large-scale nonlinear systems of equations. Mathematics of Computation, 75 (2006), 1429-1448.
- [12] L.S. Lasdon, Reduced Gradient methods. Nonlinear Optimization 1981, edited by M.J.D. Powell, Academic Press, New York (1982), 235-242.
- [13] D.H. Li and M. Fukushima, A derivative-free line search and global convergence of Broyden-like method for nonlinear equations. Opt. Meth. and Software, 13 (2000), 181-201.
- [14] J.M. Martínez, Quasi-Inexact Newton methods with global convergence for solving constrained nonlinear systems. Nonlinear Analysis, 30 (1997), 1-7.
- [15] J.M. Martínez, Inexact Restoration method with Lagrangian tangent decrease and new merit function for nonlinear programming. J. of Opt. Theory and Applications, 111 (2001), 39-58.
- [16] J.M. Martinez, Two-Phase Model Algorithm with Global Convergence for Nonlinear Programming. J. of Opt. Theory and Applications, 96 (1998), 397-436.
- [17] J.M. Martinez and E.A. Pilotta, Inexact Restoration algorithms for constrained optimization. J. of Opt. Theory and Applications, 104 (2000), 135-163.
- [18] A. Miele, H.Y. Huang and J.C. Heideman, Sequential Gradient-Restoration Algorithm for the Minimization of Constrained Functions, Ordinary and Conjugate Gradient Version. J. of Opt. Theory and Applications, 4 (1969), 213-246.
- [19] A. Miele, A.V. Levy and E.E. Cragg, Modifications and Extensions of the Conjugate-Gradient Restoration Algorithm for Mathematical Programming Problems. J. of Opt. Theory and Applications, 7 (1971), 450-72.
- [20] A. Miele, E.M. Sims and V.K. Basapur, Sequential Gradient-Restoration Algorithm for Mathematical Programming Problems with Inequality Constraints, Part 1, Theory. Rice University, Aero-Astronautics Report No. 168 (1983).
- [21] M.J.D. Powell, The NEWUOA software for unconstrained optimization without derivatives. Nonconvex Opt. and its Applications, 83 (2006), 255-297.
- [22] M. Rom and M. Avriel, Properties of the Sequential Gradient-Restoration Algorithm (SGRA), Part 1: Introduction and Comparison with Related Methods. J. of Opt. Theory and Applications, 62 (1989), 77-98.
- [23] M. Rom and M. Avriel Properties of the Sequential Gradient-Restoration Algorithm (SGRA), Part 2: Convergence Analysis J. of Opt. Theory and Applications, 62 (1989), 99-126.
- [24] J.B. Rosen, The Gradient Projection Method for Nonlinear Programming, Part 1, Linear Constraints. SIAM J. on Applied Math., 8 (1960), 181-217.
- [25] J.B. Rosen, The Gradient Projection Method for Nonlinear Programming, Part 2, Nonlinear Constraints. SIAM J. on Applied Math., 9 (1961), 514-532.
- [26] J.B.Rosen, Two-Phase Algorithm for Nonlinear Constraint Problems. Nonlinear Prog. 3, Edited by O.L. Mangasarian, R.R. Meyer and S.M. Robinson, Academic Press, London and New York (1978), 97-124.
- [27] H.D. Scolnik, N. Echebest, M.T. Guardarucci and M.C. Vacchino, Incomplete Oblique Projections for Solving Large Inconsistent Linear Systems. Math. Prog. B, 111 (2008), 273-300.
- [28] P.B. Stark and R.L. Parker, Bounded Variable Least Squares: An Algorithm and Applications. J. Computational Statistics, 10 (1995), 129-141.
- [29] H.F. Walker and L.T. Watson, Least-Change secant update methods for under-determined systems. SIAM J. on Numerical Analysis, 27 (1990), 1227-1262.
Publication Dates
-
Publication in this collection
22 Mar 2011 -
Date of issue
2011
History
-
Received
15 Aug 2010 -
Accepted
05 Jan 2011