Two derivative-free methods for solving underdetermined nonlinear systems of equations

Echebest, N.; Schuverdt, M. L.; Vignau, R. P.

doi:10.1590/S1807-03022011000100011

Abstract

In this paper, two different approaches to solve underdetermined nonlinear system of equations are proposed. In one of them, the derivative-free method defined by La Cruz, Martínez and Raydan for solving square nonlinear systems is modified and extended to cope with the underdetermined case. The other approach is a Quasi-Newton method that uses the Broyden update formula and the globalized line search that combines the strategy of Grippo, Lampariello and Lucidi with the Li and Fukushima one. Global convergence results for both methods are proved and numerical experiments are presented.

underdetermined nonlinear systems; Quasi-Newton method; derivative-free line search; spectral step length; global convergence

Two derivative-free methods for solving underdetermined nonlinear systems of equations

N. Echebest^I; M. L. Schuverdt^{II, *}; R. P. Vignau^I

^IDepartment of Mathematics, FCE, University of La Plata CP 172, 1900 La Plata Bs. As., Argentina

^IICONICET, Department of Mathematics, FCE, University of La Plata CP 172, 1900 La Plata Bs. As., Argentina. E-mails: schuverd@mate.unlp.edu.ar / opti@mate.unlp.edu.ar / vignau@mate.unlp.edu.ar

ABSTRACT

In this paper, two different approaches to solve underdetermined nonlinear system of equations are proposed. In one of them, the derivative-free method defined by La Cruz, Martínez and Raydan for solving square nonlinear systems is modified and extended to cope with the underdetermined case. The other approach is a Quasi-Newton method that uses the Broyden update formula and the globalized line search that combines the strategy of Grippo, Lampariello and Lucidi with the Li and Fukushima one. Global convergence results for both methods are proved and numerical experiments are presented.

Mathematical subject classification: Primary: 65H10; Secondary: 90C57.

Key words: underdetermined nonlinear systems, Quasi-Newton method, derivative-free line search, spectral step length, global convergence.

1 Introduction

We consider the problem of determining a solution x* ∈ Rⁿ that verifies the nonlinear system of equations

where F : Rⁿ→ R^m is a continuously differentiable function and m < n, making special emphasis when m < n. We are interested in systems for which the Jacobian matrix of F, denoted by J(x), is not available or requires a prohibitive amount of storage. This situation is common when functional values come from experimental measures, for example: from physics, chemistry or economics.

Such kind of problems also appears as the feasible set of general nonlinear programming problems. Our main motivation is the application of the proposed methods as subalgorithms for finding feasible points in derivative-free nonlinear programming algorithms such as, for example two-phases algorithms [3, 8, 16, 26], feasible methods [1, 12, 18, 19, 20, 22, 23, 24, 25] and inexact restoration methods [15, 17].

The resolution of square nonlinear systems without using derivatives has been addressed using the spectral residual approach in [11] and the Broyden Quasi Newton approach in [13]. Some ideas of those papers are incorporated in the present work. In [11], the authors defined the derivative-free spectral algorithm for nonlinear equations called DF-SANE. Furthermore, global convergence was proved by using a derivative-free line search that combines the nonmonotone strategy of Grippo, Lampariello and Lucidi [9] with the Li and Fukushima scheme [13].

In [13], a Quasi Newton method based on the Broyden update formula and on a nonmonotone derivative-free line search was defined for the square case. Also in [4] an inexact Quasi-Newton method was introduced with a similar line search technique to the one introduced in [13] and using B_k = J(x_k) periodically. Under appropriate hypotheses, global and superlinear convergence were proved in both papers.

In the present paper we define two approaches for the nonsquare system of equations based on the ideas appearing in [11] and [13]. The proposed algorithms use a generalization of the derivative-free nonmonotone line search defined in [11]. The search direction in [11] is computed using the residual vector F (x_k) and the spectral coefficient [2]. For the underdetermined system, the current point is computed by considering a fixed number of L_m directions that are the solution of some appropriate square systems using a spectral coefficient under the idea explained in [11]. The direction in the second algorithm is computed as an approximate solution of a linear system using the nonsquare Broyden update formula for matrices. It is well established in the literature that, for the square case, this is the most used secant approximation to the Jacobian and it works very well locally [6]. As we mentioned before, the Broyden update formula has been previously used by Li and Fukushima and combined with a derivative-free line search for the square case. In [13] the current direction is the solution to the linear system

and the update matrix B_k+1 is defined as

where y_k = F (x_k) F (x_k1), s_k = x_k x_k1 and the parameter β_k is chosen such that |βk 1| < < 1 for which B_k+1 is nonsingular. In this paper, when there is a solution of the linear system (2) we use such solution as a search direction and, when there is none, we solve the linear system approximately as proposed in [14]. Consequently, we avoid the necessity of choosing the parameter β_k.

Under appropriate hypotheses, global convergence of the sequence generated by both methods will be proved. The global convergence result obtained for the algorithm based on the spectral ideas extends the convergence result in [11]. For the Quasi Newton method that uses the Broyden update formula, we obtain convergence using a Dennis-Moré type condition. We show that this condition can be dropped out for a particular derivative-free line search. Thus, both algorithms can be viewed as extensions of the well known methods defined in [11] and [13] for square nonlinear systems.

We consider the usual continuously differentiable merit function f: Rⁿ → R, which consists in a measure of the residual F(x) at x, f(x) = ||F(x)||².

The iterative algorithms generate a sequence {x_k}, for k = 1, 2, ..., starting from a given initial point x₀. A subsequence of {x_k} will be indicated by _{x^k}k_∈_K where K is some infinite index set.

This paper is organized as follows. In Section 2 we present the algorithm that performs the derivative-free line search and we establish there some of its properties. In Section 3 we define DF-SAUNE Algorithm, a modification of DF-SANE Algorithm of [11] for handling efficiently problem (1) and we prove the global convergence results. In Section 4 we define the Quasi Newton method using the Broyden update formula and the derivative-free line search introduced in Section 2. We analyze the conditions under which it is possible to obtain global convergence. Both algorithms are tested and the numerical results are presented in Section 5. Finally, some conclusions are drawn in Section 6.

Notation.

||.|| denotes the Euclidean norm.
Given a matrix

B ∈

R^m

^×

ⁿ,

N (

B) denotes the null space of the matrix

B.
For

i = 1, ...,

n; e_i is the canonical vector in

Rⁿ.
I_n denotes the identity matrix in

Rⁿ

^×

ⁿ.

2 The nonmonotone line search without derivatives

In this section we shall be concerned with the nonmonotone derivative-free line search that will be used in the methods defined in the following sections. As we mentioned before, the strategy is based on the line search proposed in [11]. Given the current iterate x_k and a search direction d_k, the algorithm looks for a steplength α_k such that

where M is a nonnegative integer, 0 < γ < 1 and η_k = η < ∞.

This procedure combines the well known nonmonotone line search strategy for unconstrained optimization introduced by Grippo, Lampariello and Lucidi [9]:

with the Li-Fukushima derivative-free line search scheme [13]:

The combined strategy (4) produces a robust nonmonotone derivative-free line search that takes into account the advantages of both schemes. The line search (4) is strongly based on the fact that the search direction comes from the residual vector. For details we refer to [11]. In such paper this strategy was implemented and tested using an extensive set of numerical experiments which showed its competitiveness for square systems of nonlinear equations.

In this paper, given a general search direction d_k, based on (5) and (6), we consider the following line search condition:

where ||d_k||² takes the place of f(x_k) in (4), obtaining a more general strategy. For completeness, we establish here the implemented process.

Algorithm 1. Nonmonotone line search

Given d ∈ Rⁿ, 0 < τ_min < τ_max < 1, 0 < γ < 1, M ∈ N, {η_k} such that η_k > 0 and η_k = η < ∞

Step 1:Compute = max {f(x_k), ..., f (x_{max{0, k-M+1})}}

α₊ = α_- = 1

Step 2:If f(x_k + α₊d) < + η_{k -}γ||d||²,

define dk = d, α_k = α₊, x_k+1 = x_k + α_kd_k

else if f (x_k - α_-d) < + η _k - γ || d ||²,

define d_k = -d, α_k = a_-, x_k+₁ = x_k + α_kd_k

else

choose α+new ∈ [τminα+, τmaxα+], α-new ∈ [τminα-, τmaxα-] α+ = α+new, α- = α-new and go to step 2

Proposition 2.1.Algorithm 1 is well defined.

Proof. See Proposition 1 of [11].

The new algorithms for solving (1) will follow the next procedure.

Algorithm 2. General Algorithm

Given x₀, F(x₀), M ∈ N, 0 < τ_min < τ_max< 1, 0 < ∈ < 1, 0 < γ < 1, {η _k} such that η_k > 0 and η_k = η = ∞.

Set k ← 0

Step 1:If ||F(x_k)|| < ∈ max{1, ||F(x0)||} stop.

Step 2: Compute a search direction d_k.

Step 3:Find αk and x_k+1 = x_k + α_kd_k using Algorithm 1.

Update k ← k + 1 and go to Step 1.

By considering the procedure above it is possible to obtain results (see below) that will be used in the next sections for obtaining the convergence results. The proofs of Propositions 2.2 and 2.3 below follow from Propositions 2 and 3 in [11] updated for the line search condition (7). We establish them here for completeness.

Proposition 2.2.For all k ∈ N consider

and define ν(k) ∈ {(k 1)M + 1, ..., kM} the index for which f(x_v(k)) = U_k. Then for all k = 1, 2, ...

where η =

Proof. We have that

then

Thus, by an induction argument we obtain:

for l = 1, 2, ....

Since ν(k + 1) ∈ {kM + 1, ..., kM + M}

Thus, for all k = 1 , 2, . . . we have that

as we wanted to prove.

Proposition 2.3.

Proof. For all k = 1, 2, ... we have

Writing the last inequality for k = 1, 2, ..., L and adding these L inequalities we obtain

Therefore

Thus, the series is convergent and

as claimed.

Proposition 2.4.The sequence {x_k} generated by the General Algorithm is contained in

Proof. From Proposition 2.2 we have that, for all k > 1,

Therefore, as we wanted to prove.

The results obtained up to here depend strongly on the line search technique without taking into account the way in which the direction d_k in the step 2 of the Algorithm 2 was computed.

From now on we will consider the set

and from Proposition 2.3 we have that

Observe that from the proof of Propositions 2.3 and 2.4 we obtain the following result:

Proposition 2.5.If we take M = 1 in Algorithm 1 then

the sequence {x

_k} generated by the General Algorithm is contained in
the series

is convergent.

3 Derivative-Free Spectral Algorithm for solving Underdetermined Nonlinear Equations (DF-SAUNE)

In this section, we develop a derivative-free method based on the algorithm DF-SANE [11] updated for the underdetermined case. DF-SANE is a derivative-free method for solving square nonlinear systems that uses the ndimensional residual vector as a search direction together with a spectral step length and a globalization strategy that produces a nonmonotone process. The spectral coefficient is the inverse of an approximation of the Rayleigh quotient with respect to a secant approximation of the Jacobian:

where see [2, 11].

The iterative process in DF-SANE can be viewed as a Quasi Newton method considering the matrix together with the iteration where α_k is computed using the derivative-free nonmonotone strategy (4).

In order to solve the underdetermined case we propose to combine the idea of the augmented Jacobian algorithm, see for example [29], with the spectral residual vector explained above. In order to do that we consider L_m as the ceil number of , that is,

For each j = 0, ..., L_m  1 we define the matrices E_j ∈ R^mxn as follows. If then E_j is the matrix whose rows are the m canonical vectors in Rⁿ. If then for j = 0, ..., L_m  2, E_j is the same matrix defined above and E_Lm _-1 is the matrix whose rows are the m canonical vectors in Rⁿ.

Given x_k, for j = 0, ..., L_m  1 we consider the direction d_k+j ∈ Rⁿ as the solution to the square linear system

where E_j was defined above, V_j ∈ R⁽ⁿ^m^)xn is the matrix whose rows are the canonical vectors in Rⁿ that span N (Ej) and σ_k+j is a spectral coefficient. Thus we can observe that d_k+j = σ_k+j (x_k+j) and

Once the direction d_k+j is obtained, the line search is performed using Algorithm 1.

Observe that the system (11) is a square system that resembles the one used in DF-SANE for obtaining its current direction.

Given an arbitrary initial point x₀e Rⁿ, the algorithm that allows us to obtain the next iterate is given below:

Algorithm 3. DF-SAUNE

Given x0, F(x0), 0 < y < 1, 0 < τmin < τmax < 1, σ₀= 1, 0 < σmm < σmax < ∞, 0 < € < 1.

Set k ← 0.

Main Step:Given x_k, F(x_k), σ_k.

(1) If ||F(x_k)|| < ∈ max{1, ||F(x₀)||}, stop.

(2) For j = 0 : L_m  1

Compute dk+j = Ok+j

F(xk+j)
Find aj and xk+j+1 = xk+j + ajdk+j using the Algorithm 1.
If ||F(xk+j+1)|| < ∈ max{1, ||

F(

x

₀)||

}, stop.
Compute s = xk+j+1 xk+j, y = F (xk+j+1)

 F (xk+j) and σ
If |σ

| ∈ [σ

_min, σ

_max]

define σ

_k+j+1 = 0. If not, choose σ

_k+j+1such that |σ

k+j+1|

e [σmm, σmax]

End

(3) Update k ← k + L_m and repeat the main step.

End

By (9) we have that

Thus, using (12) and considering that each |σ_k| ∈ [σ_min, σ_max] we obtain that

In the following Theorem we prove the main convergence result associated to DF-SAUNE Algorithm. The proof follows the idea of the Theorem 1 in [11] updated for the Algorithm 3.

Theorem 3.1.Assume that {xk}k∈N is the sequence generated by Algorithm 3. Then, for every limit point x* of {x_k}k∈_K there exists an index j_o∈ {0, ..., L_m 1} such that

Proof. Let x* be a limit point of {x_k}_keK. Thus, there is an infinite index set K₁ c K such that

Then, by (13),

We have two possibilities:

(1) The sequence {α_k}_k∈_K₁ does not tend to zero;

(2) The sequence {α_k}_k∈_K₁ tends to zero.

In the first case there exists an infinite sequence of indices K₂⊂ K₁ such that α_k> c > 0 for all k ∈ K₂. Then, by (15),

Since f is continuous this implies that f(x^*) = 0.

Suppose that case 2 happens. Once the direction dk was computed, in the step 2 of the Algorithm 1, one tests the inequality

If this inequality does not hold, one tests the inequality

The first trial points at (16)-(17) are α+ = α = 1. Since αk= 0, there exists k0 ∈ K1 such that αk < 1 for all k ∈ K1 , k > k0. Therefore, for those iterations k, there are steps and that do not satisfy (16) and (17) respectively for which

So, for all k ∈ K1 , k > k₀, we have that

Since |α_k| ∈ [σ_min, σ_max], we have that (18) implies

and (19) implies

where

The inequality (20) implies that

So,

By Proposition 2.4, {f (x_k)} is a sequence bounded above by a constant C > 0. Thus,

which implies that

So,

By the Mean Value Theorem, there exists ξ_k∈ [0, 1] such that

Since L_m is finite there exists j₀∈ {0, ..., L_m  1}, K₂⊂ K₁ such that, for all k ∈ K2

Thus, we have that

Now, if σ_k > 0 for infinitely many indices k ∈ K₂ the last inequality implies that, for k ∈ K2, k > k₀

Using (21) and proceeding in the same way, we obtain that, for k ∈ K2, k > k0,

for some ξ'_ke [0, 1].

Since {σ_k} and {f (x_k)} are bounded we have that {||d_k||} is bounded.

Then, using that → 0, → 0, and taking limits in (23) and (24), we obtain that

If σ_k < 0 for infinitely many indices, proceeding in an analogous way, we deduce the same equation as we wanted to prove. □

Remark 3.2. We have proved that there exists an index j₀∈ {0, . . . , L_m  1 } such that the gradient of ||F(x)||² at x* is orthogonal to the residual E^T₀F(x*). Note that when m = n, the result of Theorem 3.1 coincides with the convergence result obtained in [11]. Thus, we can say that DF-SAUNE Algorithm is an extension of DF-SANE.

Corollary 3.3.Assume that x^* is a limit point of the sequence {x_k}k∈K generated by Algorithm 3 and suppose that for all j ∈ { , . . . , L_m  1 } e have that {J(x*)v, v) = 0 for all v ∈ Rⁿ, v = 0. Then F(x*) = 0.

4 Derivative-Free Quasi Newton method using the Broyden update formula (DF-QNB)

In this section we will define a Quasi Newton method based on the Broyden update formula for the matrices with the derivative-free line search defined in Algorithm 1 and we will establish the convergence results.

We will define the search direction as an approximate solution to the linear system B_kd = F (x_k) and we will use the nonsquare rank one Broyden update formula:

where Sk = xk+1 xk and yk = F (xk+1) F (xk).

As previously stated, the aim is to solve the linear system accurately when it is possible and in an approximate way in any other case, as considered in [14].

Algorithm 4. DF-QNB

Given x₀, B₀ e R^mxn, F(x₀), 0 < y < 1, 0 < θ₀ < 1, A > 0, 0 < τ_min < τmax ^< ¹^, ⁰^< ∈ ^< ¹^.

Set k ← 0.

Step1:If ||F(xk)|| < || ∈ max{1, ||F(x₀)||} stop.

Step 2: Coputing the direction d_k

Step 2.1: Find d such that

If such direction d is found, define dk = d, Qk+1 = Qk and go to Step 3.

Step 2.2:Find a solution d of min || + F (x_k) ||.

If d satisfies

define d_k = d, θ_k+1 = θ_k and go to Step 3.

Else, set dk = 0, xk+1 = xk, θk+1 = and go to Step 5.

Step 3:Find ak and xk+1 = xk + αkd_kusing Algorithm 1.

Step 4:Update Bk+1 using (25).

Step 5:Update k ← k + 1 and go to Step 1.

In [14], the subproblems in Step 2 were defined using two different matrices.

Remark 4.1. When m = n the Algorithm 4 is similar to the Algorithm defined in [13], in the sense that both of them use the Broyden update formula. In [13] the authors solve the linear system (2) by using the update (3). We acknowledge that, in large scale problems, it is difficult to solve linear systems, even when they do have a solution. Having that in mind we have tried to find an approximate solution in the sense of (27) for those cases.

The following theorems establish the necessary hypotheses for obtaining the convergence results.

Theorem 4.2.Assume that Algorithm 4 generates an infinite sequence {x^k}. Suppose that

and there exists k⁰ e N such that, for all k > k⁰, θk = < 1. Then, every limit point of {x_k}k∈_K is a solution of (1), where K is given by (8).

Proof. Let x* be a limit point of {x_k}_k∈_K, then there exists K₁⊂ K such that lim xk = x^*.

We know that

We will consider two cases: a)

Let suppose the first case. Then, we have that

(1) First we assume that in the process of the Algorithm 4, the direction dk was calculated finitely times solving the linear system B_kd + F(x_k) = 0.

Then, there exists k₁∈ K₁ such that ∀_k> k₁, k ∈ K₁, d_k verifies the formula ||B_kd + F(x_k)|| < θ_k||F(x_k_{) = 0}_.Thus, for all k ∈ K₁, k > max {k0, k1} we have that

This implies that

Taking limits for k ∈ K₁, k > max{k₀, k₁} in the last expression, and using the continuity of F and J and (28)-(29), we obtain that

Since < 1, we have that ||F(x*)||= 0 as we wanted to prove.

(2) Let assume now that the direction d_k was obtained infinitely many times solving the linear system B_kd + F(xk) = 0.

Then, there exists K2 ⊂ K1 such that for all k ∈ K2 we have that Bkdk + F(xk) = 0.

Thus, for all k ∈ K2,

Taking limits for k ∈ K₂, k > k₀ and using the continuity of F and J and (28)-(29), we obtain that

Let suppose the case b). Since the sequence {d_k}_k∈N is bounded (and then it is bounded in K₁), there exists K₂⊂ K₁ and ∈ Rⁿ such that d_k.

In the line search, to choose the step α_k the algorithm DF-QNB tests the following inequalities

The initial values of α+ and α- are 1. Since α _k = 0 , there exists ∈ K₂such that α_k < 1 for all k ∈ K₂, k > . Thus, for those iterations k there exist steps and that do not satisfy (30) and So we have that

Considering (32), the following inequality holds

Since we obtain that

By the Mean Value Theorem there exists such that

Considering (33), we have that

Since we obtain that

By the Mean Value Theorem there exists such that

Taking limits in (34) and (35) when k → ∞, k ∈ K₂, we obtain that

Thus,

Now, as we did before, we have to consider two cases: (i) the direction dk was calculated finitely many times solving the linear system B_kd + F (x_k) = 0; and (ii) d_k was obtained infinitely many times solving the linear system Bkd + F(xk) = 0.

Proceeding in analogous way as we did when and using (28) and (36) we obtain that

as we wanted to prove.

Observe that, according to Proposition 2.4, we have that ||F(x_k)|| is bounded. Thus, if

we obtain the hypothesis (28). The last condition is known as a necessary and sufficient condition for obtaining q-superlinear convergence of classical Quasi Newton methods [6].

If we can not guarantee that then we can prove the following result which is similar to one obtained in [14].

Theorem 4.3. Suppose that, in Algorithm 4, 0k is increased infinitely many times and define

Assume that

Then, every limit point x* of the sequence {x_k}k∈_K* is a solution of (1) or it is a global minimizer of ||F(x*) + J(x*)(x x*)||.

Proof. If F(x^*) = 0 we are done. Let us assume that ||F(x^*)|| > 0. Suppose that x^* is not a global minimizer of ||F(x^*) + J(x^*)(x x^*)||, therefore, there exists d such that ||d|| and

thus

By (37) and the continuity of F and J, we have that

for all k large enough k ∈ K*. But, since ||d|| and x_k = x* we have that, for all large enough k ∈ K*, | x^* xk + d|| < Δ. So, (38) contradicts the fact that: θ_k→ 1 and a direction verifying (27) can be obtained.

A point x* that is a global minimizer of the function ||F (x*) + J (x*)(x x*)|| can be viewed as the solution of the linear least squares problem

where A = J(x*) and b = F(x*). The linear function is the affine model of the function F around x *. We can not expect, in general, to find x* such that F(x^*) = 0 since the problem could not have a solution. Likewise, we can not expect to find x* such that F(x*) + J(x*)(x x*) = 0 since this is an underdetermined linear system of equations and J(x^*) could not have full rank. Because of that, it seems reasonable to find a global minimizer of (39) when the problem has no solutions.

4.1 Analysis of the case M = 1

In this subsection we analyze the case in which the derivative-free line search used in Algorithm DF-QNB considers M = 1. By the presence of r_k the line search is still nonmonotone but it imposes an almost monotone behavior of the merit function when xk is close to a solution. Thus, M = 1 is plausible considering we are working with a Quasi Newton method. M = 1, as pointed out in [11], could be inconvenient for the spectral residual method because it performs highly nonmonotone even in a neighborhood of an isolated solution.

Next, we will demonstrate that, in this case, our algorithm verifies the assumption (28).

Firstly, under this case, using Proposition 2.5 it can be observed that

Secondly, it will be convenient to define the matrix

Thus, and

Finally, we can use the result that appears in Lemma 2.6 of [13] and that we present here for completeness.

Lemma 4.4 (Lemma 2.6, [13]). Let us suppose that the set is bounded and that J(x) is Lipschitz continuous in Ω. If (40) is verified then

where

In particular, there is a subsequence of {pk} tending to zero.

Thus, we can prove the following convergence result.

Theorem 4.5.Assume that Algorithm 4 generates an infinite sequence {x_k}, that M = 1 in the line search and that the hypotheses of Lemma 4.4 hold. Then, if there exists k₀such that θk = < 1 for all k > k_j, there is a limit point x * of {x_k}k∈N that is a solution of (1).

Proof. By Proposition 2.5 we have that (40) holds. Thus, by Lemma 4.4, there is a subset ⊂ N such that

Let x* be a limit point of the subsequence . Note that

By Proposition 2.5 we have that || F(x_k)|| is bounded. Thus, taking limit when k ∈ , k → ∞ and using that {d_k} is bounded and (41), we obtain that ||(A_k+₁ B_k)d_k|| → 0. Also, by definition of A_k+₁ we obtain that ||(J(x_k) A_k+₁)d_k|| → 0. Thus, we prove that (28) happens for k ∈ K and the proof follows from Theorem 4.2. □

Observe that this particular line search improves the results of Theorem 4.2.

5 Numerical experiments

In this section we present some computational results obtained with a Fortran 77 implementation of DF-SAUNE and DF-QNB algorithms. All experiments were run on a personal computer with INTEL(R) Core (TM) 2 Duo CPU E8400 at 3.00 GHz and 3.23 GB of RAM.

As it is usual in derivative-free optimization articles we are interested in the number of function evaluations for both codes. We also report for medium size problems the CPU time obtained for both algorithms and we include a comparison with NEWUOA algorithm developed by M.J.D. Powell for unconstrained optimization [21]. In these experiments, NEWUOA algorithm solves the least squares problem: min ||F(x)||².

5.1 Test problems

The problems used for these numerical experiments, of the form F(x) = 0, F : Rⁿ→ W, m < n, were a set of problems defined by feasible sets of nonlinear programming problems in Hock and Schittkowski [10], where the number of variables ranges from 2 to 10, and the number of equations from 1 to 4. Also, conceiving tests for larger dimension problems, we tested some problems described in [5]. Some of the test problems analyzed here have been previously examined in [7] for problems with bound constraints and using derivatives in order to achieve similar results to those pursued here; that is, to solve under-determined nonlinear systems.

In Table 1 we show the data of the problems. In column 1 we show the number of the problem, in column 2 the source of the problem and in the last columns the number of equations (m) and variables (n).

Thumbnail

Initial points were the same as in the cited references.

Remark 5.1. The case (a) in Problems 2 and 4 of [5] corresponds to the use of the initial point x0(1 : n) = 2. The case (b) in Problem 2 and 4 of [5] corresponds to the use of x₀(1 : n) = 150 as initial point.

5.2 Implementation

In the implementations of DF-SAUNE and DF-QNB algorithms:

1. The parameters for Algorithm 1 were:

for small size problems.
for medium size problems.

2. The parameters for DF-SAUNE Algorithm were:

3. The parameters for DF-QNB Algorithm were:

4. In DF-QNB, the first B0 matrix was computed by finite differences as an aproximation to the Jacobian matrix in x₀.

5. In DF-QNB, we have used ACCIM algorithm described in [27] to find a solution to B_kd = F(xk). For solving the least squares problem we have used BVLS algorithm described in [28].

6. The stopping condition for the two new algorithms was: ||F(xk)|| < 10⁶ max{1, ||F(x0)||}.

7. The maximum number of function evaluations allowed was:

MAX_FE = 5000, for small size problems,
MAX_FE = 15000, when

n = 150,
MAX_FE = 30000, when

n = 300.

For DF-QNB method we have also added the required evaluations to calculate the initial matrix B₀.

The implementation of NEWUOA is the original version of M.J.D. Powell [21] with its stopping criterion, that is, the algorithm stops when the trust-region radius is lower than a tolerance ρ_end = 10⁶. As previously mentioned, NEWUOA algorithm solved the least squares problem min ||F(x)||² in our trials.

5.3 Numerical results

In Table 2 we show the results obtained by DF-SAUNE (DF-S) and DF-QNB (DF-B) algorithms taking into account the final value ||F (x_end)|| and the number function evaluations. The results correspond to the stopping criterion satisfaction or to internal conditions that do not allow further improvement.

Thumbnail

Table 2

These results illustrate DF-QNB effectiveness, although DF-SAUNE has also a satisfactory behavior. We can see in 15 out of the 20 test problems DF-QNB used less function evaluations than DF-SAUNE. It is also worth mentioning that when DF-QNB requires more function evaluations than DF-SAUNE such difference becomes particulary significant. These problems are too small to consider useful showing CPU time readings.

In problem 5, we have seen that, in many interations the norm of the solution dk of the linear system B_kd + F(xk) = 0 is bigger than A and, in those iterations, DF-QNB has to solve subproblem (27). Because of that, the decrease of the merit function ||F(x)|| is very slow and the algorithm requires many functional evaluations.

In problem 19, DF-SAUNE, stopped without obtaining a satisfactory decrease in the residual when reaching the maximum number of function evaluations allowed. We believe that the bad perfomance of DF-SAUNE it is related to the strategy used to define the matrices Ej when . We think that it is an interesting future work to study a better strategy to complete the last matrix Ej in that specific case.

In Table 3 we show the number of iterations, number of function evaluations and the CPU time in seconds required by DF-SAUNE, DF-QNB and NEWUOA algorithms running the medium size problems 21, 22, 23, 24, 25 and 26. In that table we indicate the number of equations (m) and variables (n) of the problems.

Thumbnail

Firstly, an overall review of the numerical results shows that final residual values are similar for all tested methods.

Secondly, we observe DF-SAUNE performed a more significant amount of function evaluations in the last two problems. It should be highlighted that CPU

time for DF-SAUNE was always substantially shorter than the one for DF-QNB. The reason is that DF-QNB algorithm has to solve a linear system of equations or a least squares problem in every iteration.

Finally we ran the well known NEWUOA solver in order to measure the number of function evaluations of our algorithms. Although NEWUOA was designed to solve unconstrained optimization problems, we can conclude that our methods performs satisfactorily.

6 Conclusions

Many practical optimization methods require specific algorithms for obtaining feasible points at every iteration. Thus, our aim in this paper was to define algorithms capable of dealing with the feasible set defined by nonsquare system of equations. We present two derivative-free algorithms that exploit this structure: one of them is based on the spectral residual approach and the other on the Broyden Quasi Newton method. The algorithms can be viewed as generalizations of the algorithms defined in [11] and [13], the last one combined with [14].

From a theoretical point of view we were able to obtain some convergence results. Under usual assumptions on the Jacobian matrix we establish global convergence of the scheme that uses the spectral residual idea. This convergence result can be seen as the underdetermined counterpart of the result presented in [11] for the square case.

For the Broyden Quasi Newton method we obtain global convergence under a Dennis Moré type condition. We have shown that this condition can be dropped out for a particular line search. It remains a challenge to reduce the restrictive hypotheses required in Theorem 4.3 of Section 4.

Numerical experiments suggest that the algorithms behave as expected. We consider both approaches are promising even though we also believe that it is necessary to test a more challenging set of problems in order to decide which is more suitable. Furthermore, such decision could depend on the requirements of each user.

Since many nonlinear programming problems have also box constraints, future research will consider the extension of this type of algorithms to solve under-determined nonlinear systems with bound constraints.

Acknowledgements. The authors are indebted to two anonymous referees and to the Guest Editor for their careful reading of this paper.

Received: 15/VIII/10.

Accepted: 05/I/11.

#CAM-323/10.

[1] J. Abadie, J. Carpentier, Generalization of the Wolfe Reduced-Gradient Method to the Case of Nonlinear Constraints. Optimization, Edited by R. Fletcher, Academic Press, New York (1968), 3717.
[2] J. Barzilai and J.M. Borwein, Two-point stepsize gradient methods. IMA J. of Numerical Analysis, 8 (1988), 141-148.
[3] R.H.Bielschowsky and F.A.M. Gomes, Dynamic control of infeasibility in equalitty constrained optimization. SIAM J. on Opt., 19 (2008), 1299-1325.
[4] E.G. Birgin, N. Krejic and J.M. Martínez, Globally convergent inexact quasiNewton methods for solving nonlinear systems. Numerical Algorithms, 32 (2003), 249-260.
[5] N. Dan, N. Yamashita and M. Fukushima, Convergence properties of inexact Levenberg-Marquardt method under local error bound conditions. Opt. Methods and Software, 17 (2002), 605-626.
[6] J.E. Dennis and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice-Hall, Englewood Cliffs, 1983.
[7] J.B. Francisco, N. Krejic and J.M. Martinez, An interior point method for solving box-constrained underdetermined nonlinear systems. J. of Comp. and Applied Math., 177 (2005), 67-88.
[8] N.I.M. Gould and Ph. L. Toint, Nonlinear programming without a penalty function or a filter. Math. Prog., 122 (2010), 155-196.
[9] L. Grippo, F. Lampariello and S. Lucidi, Anonmonotone line search technique for Newton's method. SIAM J. on Numerical Analysis, 23 (1986), 707-716.
[10] W. Hock and K. Schittkowski, Test Examples for Nonlinear Programming Codes Springer Series Lectures Notes in Economics Mathematical Systems, (1981).
[11] W. La Cruz, J.M. Martinez and M. Raydan, Spectral residual method without gradient information for solving large-scale nonlinear systems of equations. Mathematics of Computation, 75 (2006), 1429-1448.
[12] L.S. Lasdon, Reduced Gradient methods. Nonlinear Optimization 1981, edited by M.J.D. Powell, Academic Press, New York (1982), 235-242.
[13] D.H. Li and M. Fukushima, A derivative-free line search and global convergence of Broyden-like method for nonlinear equations. Opt. Meth. and Software, 13 (2000), 181-201.
[14] J.M. Martínez, Quasi-Inexact Newton methods with global convergence for solving constrained nonlinear systems. Nonlinear Analysis, 30 (1997), 1-7.
[15] J.M. Martínez, Inexact Restoration method with Lagrangian tangent decrease and new merit function for nonlinear programming. J. of Opt. Theory and Applications, 111 (2001), 39-58.
[16] J.M. Martinez, Two-Phase Model Algorithm with Global Convergence for Nonlinear Programming. J. of Opt. Theory and Applications, 96 (1998), 397-436.
[17] J.M. Martinez and E.A. Pilotta, Inexact Restoration algorithms for constrained optimization. J. of Opt. Theory and Applications, 104 (2000), 135-163.
[18] A. Miele, H.Y. Huang and J.C. Heideman, Sequential Gradient-Restoration Algorithm for the Minimization of Constrained Functions, Ordinary and Conjugate Gradient Version. J. of Opt. Theory and Applications, 4 (1969), 213-246.
[19] A. Miele, A.V. Levy and E.E. Cragg, Modifications and Extensions of the Conjugate-Gradient Restoration Algorithm for Mathematical Programming Problems. J. of Opt. Theory and Applications, 7 (1971), 450-72.
[20] A. Miele, E.M. Sims and V.K. Basapur, Sequential Gradient-Restoration Algorithm for Mathematical Programming Problems with Inequality Constraints, Part 1, Theory. Rice University, Aero-Astronautics Report No. 168 (1983).
[21] M.J.D. Powell, The NEWUOA software for unconstrained optimization without derivatives. Nonconvex Opt. and its Applications, 83 (2006), 255-297.
[22] M. Rom and M. Avriel, Properties of the Sequential Gradient-Restoration Algorithm (SGRA), Part 1: Introduction and Comparison with Related Methods. J. of Opt. Theory and Applications, 62 (1989), 77-98.
[23] M. Rom and M. Avriel Properties of the Sequential Gradient-Restoration Algorithm (SGRA), Part 2: Convergence Analysis J. of Opt. Theory and Applications, 62 (1989), 99-126.
[24] J.B. Rosen, The Gradient Projection Method for Nonlinear Programming, Part 1, Linear Constraints. SIAM J. on Applied Math., 8 (1960), 181-217.
[25] J.B. Rosen, The Gradient Projection Method for Nonlinear Programming, Part 2, Nonlinear Constraints. SIAM J. on Applied Math., 9 (1961), 514-532.
[26] J.B.Rosen, Two-Phase Algorithm for Nonlinear Constraint Problems. Nonlinear Prog. 3, Edited by O.L. Mangasarian, R.R. Meyer and S.M. Robinson, Academic Press, London and New York (1978), 97-124.
[27] H.D. Scolnik, N. Echebest, M.T. Guardarucci and M.C. Vacchino, Incomplete Oblique Projections for Solving Large Inconsistent Linear Systems. Math. Prog. B, 111 (2008), 273-300.
[28] P.B. Stark and R.L. Parker, Bounded Variable Least Squares: An Algorithm and Applications. J. Computational Statistics, 10 (1995), 129-141.
[29] H.F. Walker and L.T. Watson, Least-Change secant update methods for under-determined systems. SIAM J. on Numerical Analysis, 27 (1990), 1227-1262.

Publication Dates

Publication in this collection
22 Mar 2011
Date of issue
2011

History

Received
15 Aug 2010
Accepted
05 Jan 2011

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

[1] [1] J. Abadie, J. Carpentier, Generalization of the Wolfe Reduced-Gradient Method to the Case of Nonlinear Constraints. Optimization, Edited by R. Fletcher, Academic Press, New York (1968), 3717.

[2] [2] J. Barzilai and J.M. Borwein, Two-point stepsize gradient methods. IMA J. of Numerical Analysis, 8 (1988), 141-148.

[3] [3] R.H.Bielschowsky and F.A.M. Gomes, Dynamic control of infeasibility in equalitty constrained optimization. SIAM J. on Opt., 19 (2008), 1299-1325.

[4] [4] E.G. Birgin, N. Krejic and J.M. Martínez, Globally convergent inexact quasiNewton methods for solving nonlinear systems. Numerical Algorithms, 32 (2003), 249-260.

[5] [5] N. Dan, N. Yamashita and M. Fukushima, Convergence properties of inexact Levenberg-Marquardt method under local error bound conditions. Opt. Methods and Software, 17 (2002), 605-626.

[6] [6] J.E. Dennis and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice-Hall, Englewood Cliffs, 1983.

[7] [7] J.B. Francisco, N. Krejic and J.M. Martinez, An interior point method for solving box-constrained underdetermined nonlinear systems. J. of Comp. and Applied Math., 177 (2005), 67-88.

[8] [8] N.I.M. Gould and Ph. L. Toint, Nonlinear programming without a penalty function or a filter. Math. Prog., 122 (2010), 155-196.

[9] [9] L. Grippo, F. Lampariello and S. Lucidi, Anonmonotone line search technique for Newton's method. SIAM J. on Numerical Analysis, 23 (1986), 707-716.

[10] [10] W. Hock and K. Schittkowski, Test Examples for Nonlinear Programming Codes Springer Series Lectures Notes in Economics Mathematical Systems, (1981).

[11] [11] W. La Cruz, J.M. Martinez and M. Raydan, Spectral residual method without gradient information for solving large-scale nonlinear systems of equations. Mathematics of Computation, 75 (2006), 1429-1448.

[12] [12] L.S. Lasdon, Reduced Gradient methods. Nonlinear Optimization 1981, edited by M.J.D. Powell, Academic Press, New York (1982), 235-242.

[13] [13] D.H. Li and M. Fukushima, A derivative-free line search and global convergence of Broyden-like method for nonlinear equations. Opt. Meth. and Software, 13 (2000), 181-201.

[14] [14] J.M. Martínez, Quasi-Inexact Newton methods with global convergence for solving constrained nonlinear systems. Nonlinear Analysis, 30 (1997), 1-7.

[15] [15] J.M. Martínez, Inexact Restoration method with Lagrangian tangent decrease and new merit function for nonlinear programming. J. of Opt. Theory and Applications, 111 (2001), 39-58.

[16] [16] J.M. Martinez, Two-Phase Model Algorithm with Global Convergence for Nonlinear Programming. J. of Opt. Theory and Applications, 96 (1998), 397-436.

[17] [17] J.M. Martinez and E.A. Pilotta, Inexact Restoration algorithms for constrained optimization. J. of Opt. Theory and Applications, 104 (2000), 135-163.

[18] [18] A. Miele, H.Y. Huang and J.C. Heideman, Sequential Gradient-Restoration Algorithm for the Minimization of Constrained Functions, Ordinary and Conjugate Gradient Version. J. of Opt. Theory and Applications, 4 (1969), 213-246.

[19] [19] A. Miele, A.V. Levy and E.E. Cragg, Modifications and Extensions of the Conjugate-Gradient Restoration Algorithm for Mathematical Programming Problems. J. of Opt. Theory and Applications, 7 (1971), 450-72.

[20] [20] A. Miele, E.M. Sims and V.K. Basapur, Sequential Gradient-Restoration Algorithm for Mathematical Programming Problems with Inequality Constraints, Part 1, Theory. Rice University, Aero-Astronautics Report No. 168 (1983).

[21] [21] M.J.D. Powell, The NEWUOA software for unconstrained optimization without derivatives. Nonconvex Opt. and its Applications, 83 (2006), 255-297.

[22] [22] M. Rom and M. Avriel, Properties of the Sequential Gradient-Restoration Algorithm (SGRA), Part 1: Introduction and Comparison with Related Methods. J. of Opt. Theory and Applications, 62 (1989), 77-98.

[23] [23] M. Rom and M. Avriel Properties of the Sequential Gradient-Restoration Algorithm (SGRA), Part 2: Convergence Analysis J. of Opt. Theory and Applications, 62 (1989), 99-126.

[24] [24] J.B. Rosen, The Gradient Projection Method for Nonlinear Programming, Part 1, Linear Constraints. SIAM J. on Applied Math., 8 (1960), 181-217.

[25] [25] J.B. Rosen, The Gradient Projection Method for Nonlinear Programming, Part 2, Nonlinear Constraints. SIAM J. on Applied Math., 9 (1961), 514-532.

[26] [26] J.B.Rosen, Two-Phase Algorithm for Nonlinear Constraint Problems. Nonlinear Prog. 3, Edited by O.L. Mangasarian, R.R. Meyer and S.M. Robinson, Academic Press, London and New York (1978), 97-124.

[27] [27] H.D. Scolnik, N. Echebest, M.T. Guardarucci and M.C. Vacchino, Incomplete Oblique Projections for Solving Large Inconsistent Linear Systems. Math. Prog. B, 111 (2008), 273-300.

[28] [28] P.B. Stark and R.L. Parker, Bounded Variable Least Squares: An Algorithm and Applications. J. Computational Statistics, 10 (1995), 129-141.

[29] [29] H.F. Walker and L.T. Watson, Least-Change secant update methods for under-determined systems. SIAM J. on Numerical Analysis, 27 (1990), 1227-1262.

Brasil

Brasil

Two derivative-free methods for solving underdetermined nonlinear systems of equations

Abstract

Publication Dates

History