

Matrix differential equations and inverse preconditioners

Jean-Paul Chehab

Laboratoire de Mathématiques Paul Painlevé, UMR 8524, Université de Lille 1, Bât. M2, 59655 Villeneuve d'Ascq cedex, France; and Laboratoire de Mathématiques, UMR 8628, Equipe Analyse Numérique et EDP, Bât. 425, Université Paris XI, 91405 Orsay cedex, France, E-mail: chehab@math.univ-lille1.fr

ABSTRACT

In this article, we propose to model the inverse of a given matrix as the state of a proper first order matrix differential equation. The inverse can correspond to a finite value of the independent variable or can be reached as a steady state. In both cases we derive corresponding dynamical systems and establish stability and convergence results. The application of a numerical time marching scheme is then proposed to compute an approximation of the inverse. The study of the underlying schemes can be done by using tools of numerical analysis instead of linear algebra techniques only. With our approach, we recover some known schemes but also introduce new ones. We derive in addition a masked dynamical system for computing sparse inverse approximations. Finally we give numerical results that illustrate the validity of our approach.

Mathematical subject classification: 65F10, 65F35, 65L05, 65L12, 65L20, 65N06.

Key words: matrix differential equation, numerical schemes, numerical linear algebra, preconditioning.

1 Introduction

The modern methods at our disposal for solving linear systems of equations, such as the preconditioned versions of GMRES [19] or Bi-CGSTAB [21], are robust and apply to many situations; they are intensively used for the numerical solution of large sparse linear systems arising from PDE discretization. For this reason preconditioning remains a central topic in numerical linear algebra, since it is the universal approach for accelerating the solution of a linear system by an iterative method. In practice, preconditioning a matrix improves its spectral properties, e.g., by concentrating the spectrum of the preconditioned matrix. There is no general method for efficiently building a preconditioner for a given nonsingular matrix: a large number of approaches have been developed depending on the properties of the considered matrix; surveys are presented, e.g., in [4, 19].

Let P be an n × n regular matrix and b ∈ ℝ^n. The preconditioning of the numerical solution of the linear system

Px = b     (1.1)

(with a descent method) consists in solving, at each step of the iterative process, an additional linear system

Kz = r,     (1.2)

where K is the preconditioning matrix and r the current residual. Of course, the additional computation carried by the solution of (1.2) is worthwhile only when this system is easy to solve and when K (respectively K^{-1}) resembles P (resp. P^{-1}).

When the preconditioning of (1.1) is obtained by approximating P, system (1.2) must be easy to solve. In the other case, it is P^{-1} that is approximated, which defines the so-called inverse preconditioner and makes the solution of (1.2) trivial: it reduces to a matrix-vector product.

Inverse preconditioners can be built in many ways: by minimizing an objective functional (e.g., the Frobenius norm of the residual [10]), by incomplete sparse factorization [9], or by building suitable convergent sequences, see [5, 10], in which the authors present sequences generated by descent methods such as MINRES or by Newton-like schemes. Polynomial preconditioning, which consists in approximating the inverse of a matrix by a suitable polynomial of it, has been developed and implemented on parallel computers, see [19].

In this paper we propose to model the inverse of a regular matrix as a state of a first order Matrix Differential Equation (MDE). This state can correspond to the solution of the MDE at a finite value of the independent variable, but also to an equilibrium point, depending on the equation. In this way, the implementation of any numerical integration scheme can produce an approximation of P^{-1}, i.e., an inverse preconditioner of P. The underlying schemes involve at least one multiplication of two matrices at each iteration, so the terms of the sequence of inverse preconditioners become denser even if the matrix P is sparse. However, it is possible to derive a masked dynamical system which preserves a given sparsity pattern, making our method suitable for computing sparse inverse preconditioners. From a technical point of view, the advantage of the dynamical system approach is that the classical tools of numerical analysis of differential equations, and not only linear algebra techniques, can be used for studying the processes.

This approach is very flexible since the construction of the numerical scheme is subject only to the choice of the modelling ODE and of a time marching scheme; it also allows one to study the schemes by using classical mathematical tools of ODE analysis and of numerical analysis of ODEs [15, 16, 20]. We mention that differential equation modeling for solving systems of equations, including linear algebra problems, has been considered in other situations: in [14, 17] for generating flows of matrices that preserve eigenvalues or singular values; in [8] for generating fixed point methods, the solution being defined as a stable steady state; in [13] for computing the square root of a matrix, by integrating a Riccati matrix differential equation (see also R. Bellman's book [3], chap. 10).

The article is organized as follows: in Section 2 we study a Riccati differential equation whose solution at a finite time is the inverse of one of the data; the derivation, the stability analysis and the study of an approximation scheme are given. In Section 3 we consider matrix differential equations for which one of the steady states is the inverse of a datum of the equation. In Section 4 we concentrate on the construction of sparse inverse preconditioners by considering the numerical integration of a so-called masked differential equation, when a sparsity pattern is fixed; error estimates are derived. Finally, in Section 5 we give numerical illustrations of the solution of linear systems when using the approximate inverses built in the previous sections.

2 Inverse at finite time

2.1 Derivation of the equation

Let P(t) and Q(t) be two square matrices depending on the scalar variable t, which belongs to an interval I. We assume that the coefficients of both P and Q are differentiable functions of t. We have

d(P(t)Q(t))/dt = (dP/dt)Q + P(dQ/dt).

Assume that P(t) is regular, i.e. invertible, for all t in I, and consider the particular situation Q(t) = P^{-1}(t), ∀t ∈ I. Then we have

(dP/dt)Q + P(dQ/dt) = d(Id)/dt = 0.

So,

dQ/dt = -P^{-1}(dP/dt)P^{-1} = -Q(dP/dt)Q

for all t ∈ I. If P(t) is supposed to be known, then Q(t) can be computed by integrating the matrix differential equation

dQ/dt = -Q(dP/dt)Q, Q(t_0) = P^{-1}(t_0), t_0 ∈ I.

Q is hence the solution of a matrix Riccati differential equation.

Let now P be a regular n × n matrix and Id the n × n identity matrix. The basic idea consists in defining P(t) as a simple path of regular matrices between a matrix P(0) that is easy to invert (so that Q(0) = P^{-1}(0) is available) and P(1) = P. We consider the function

P(t) = (1 - t)Id + tP.

Assume that P(t) is invertible for all t in [0,1]. The matrix Q(t) = P^{-1}(t) then satisfies the Cauchy problem

dQ/dt = -Q(P - Id)Q, Q(0) = Id,     (2.4)

and P^{-1} = Q(1); we assume that [0,1] ⊂ I.

We have the following result:

Lemma 1. P(t) is regular for all t in [0,1] iff SP(P) ⊂ ℝ^2 \ {(t,0), t ≤ 0}, where SP(P) denotes the spectrum of P, identified with a subset of ℝ^2.

Proof. The eigenvalues of P(t) are the numbers

λ(t) = (1 - t) + tλ, λ ∈ SP(P).

Taking the real and the imaginary parts of this expression, we set

f_1(t) = (1 - t) + tRe(λ), f_2(t) = tIm(λ).

Let us look for necessary and sufficient conditions to have f_1(t) = f_2(t) = 0 for some t ∈ [0,1]. Since f_1(0) = 1 and f_1(1) = Re(λ), by continuity f_1 vanishes in [0,1] if and only if Re(λ) ≤ 0. f_2 vanishes for t = 0 (but P(0) = Id is regular) or for Im(λ) = 0.

In conclusion, λ(t) vanishes for some t ∈ [0,1] if and only if there exists λ ∈ SP(P) such that Re(λ) ≤ 0 and Im(λ) = 0.

In particular, Lemma 1 applies when P is positive definite, as is the case, e.g., for discretization matrices of elliptic operators.

Remark 1. We can consider P(t) = (1 - t)P_0 + tP with P_0 a preconditioner of P. Of course, in this case, Lemma 1 applies with P replaced by P_0^{-1}P. The same considerations can be made with a more general path

P(t) = (1 - f(t))P_0 + f(t)P,

with

f : [0,1] → [0,1], f ∈ C^1([0,1]), f(0) = 0, f(1) = 1, f'(t) > 0, t ∈ ]0,1[.

2.2 Stability results

Let us now give some notation and technical results which will be used throughout the article.

2.2.1 Matrix norms

Let M be an n × n matrix. We denote by ||M|| any matrix norm of M and, in particular, by ||M||_2 and ||M||_F the 2-norm and the Frobenius norm of M, respectively. We shall also use the notation ||v||_2 for the 2-norm of a vector v ∈ ℝ^n; there will be no ambiguity in practice.

2.2.2 Hadamard Matrix Product

We denote by R * M the Hadamard product of R and M:

(R * M)_{i,j} = R_{i,j}M_{i,j}.

2.2.3 Matrix scalar product

We will use the following scalar product:

⟨⟨R, M⟩⟩ = Σ_{i,j=1}^{n} R_{i,j}M_{i,j},

which coincides with the sum of the coefficients of the Hadamard product of R and M. We also use the Euclidean scalar product of vectors of ℝ^n, which we denote by ⟨·,·⟩.

We begin with the following very simple but useful technical result:

Lemma 2. Let R and S be two n × n matrices. We have the inequalities

(i) |⟨⟨R, S⟩⟩| ≤ ||R||_F ||S||_F,

(ii) ||RS||_F ≤ ||R||_F ||S||_F,

(iii) ||S||_2 ≤ ||S||_F.

Proof. Assertion (i) follows from a simple application of the Cauchy-Schwarz inequality.

Let us prove (ii). We have

((RS)_{i,j})^2 = (Σ_k R_{i,k}S_{k,j})^2 ≤ (Σ_k R_{i,k}^2)(Σ_k S_{k,j}^2).

We now take the sum of these terms for i,j = 1,...,n. We obtain

||RS||_F^2 ≤ ||R||_F^2 ||S||_F^2.

Assertion (iii) is classical and is obtained by applying the Cauchy-Schwarz inequality to each component of Sv, for v ∈ ℝ^n, ||v||_2 = 1.
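As a quick numerical sanity check, the following Matlab lines (a minimal sketch with random matrices; the size is our own choice) evaluate the scalar product and the three inequalities of Lemma 2:

    n = 5;
    R = randn(n);  S = randn(n);
    ip = sum(sum(R.*S));                 % <<R,S>>: coefficientwise sum of R * S
    fprintf('(i)   %g <= %g\n', abs(ip), norm(R,'fro')*norm(S,'fro'));
    fprintf('(ii)  %g <= %g\n', norm(R*S,'fro'), norm(R,'fro')*norm(S,'fro'));
    fprintf('(iii) %g <= %g\n', norm(S,2), norm(S,'fro'));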

At this point, we can establish a stability result:

Proposition 1. Assume that P_0 and P are regular and that P_0^{-1}P satisfies the assumptions of Lemma 1. We set S(t) = (P - P_0)Q(t), where Q(t) solves the equation

dQ/dt = -Q(P - P_0)Q, Q(0) = P_0^{-1}.     (2.7)

Assume that ||S(0)||_F < 1. Then S(t) exists for all t in [0,1] and

||S(t)||_F ≤ ||S(0)||_F / (1 - t||S(0)||_F).

Proof. Multiplying each term of (2.7) on the left by P - P_0, we obtain

dS/dt = -S^2.

We now take the Hadamard product of each term with S(t), and consider the sum over all indices i,j = 1,...,n. We find

(1/2) d||S||_F^2/dt = -⟨⟨S^2, S⟩⟩ ≤ ||S||_F^3.

Hence,

||S(t)||_F ≤ y(t), where y(t) is the solution of the differential equation

dy/dt = y^2, y(0) = ||S(0)||_F.     (2.8)

We find

y(t) = y(0)/(1 - ty(0)).

Since ||S(0)||_F < 1, y(t) remains bounded on [0,1] and y(t) ≤ y(1).

Another stability result can be derived when assuming both P and P0 to be symmetric, positive definite (SPD). More precisely we have the next result:

Lemma 3. Assume that both P and P_0 are SPD. Then Q(t) = P(t)^{-1} is SPD for all t ∈ [0,1].

Proof. It suffices to prove that P(t) is SPD for all t Î [0,1]. The proof is straightforward starting from the definition of P(t):

P(t) = (1 - t)P0 + tP.

2.3 Construction of an inverse preconditioner by numerical integration

Let us subdivide I = [0,1] into N subintervals of the same length δ = 1/N, the step-length. The application of any (stable) time marching scheme to equation (2.4) generates a sequence Q_k, k = 1,...,N, Q_k being an approximation of Q(k/N). In particular, since Q(1) = P^{-1}, Q_N will be an inverse preconditioner for the matrix P.

For each time integration scheme, a method for computing a preconditioner is derived. We consider the following cases.

Forward Euler Scheme

We consider the sequence

Q_{k+1} = Q_k - δQ_k(P - Id)Q_k, Q_0 = Id, k = 0,...,N-1.     (2.9)

We have Q_N ≈ P^{-1}.

Second order Adams Bashforth (AB2)

We consider the sequence

Q_{k+1} = Q_k - δ[(3/2)Q_k(P - Id)Q_k - (1/2)Q_{k-1}(P - Id)Q_{k-1}], k = 1,...,N-1,

with Q_0 = Id and Q_1 given by one forward Euler step. We have Q_N ≈ P^{-1}.

Of course, further methods can be derived by considering, e.g., Runge-Kutta or more general Adams-Bashforth schemes, but, in practice, it is important to find a compromise between the accuracy and the cost of the computation, since each iteration requires (at least) the multiplication of three matrices.
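For illustration, here is a minimal Matlab sketch of the two schemes applied to (2.4); the test matrix P (a 1D finite difference Laplacian, which is SPD so that Lemma 1 applies) and the values of n and N are our own choices.

    n = 50;  N = 8;  delta = 1/N;
    P = full(gallery('tridiag', n, -1, 2, -1));   % SPD test matrix
    Id = eye(n);
    F = @(Q) -Q*(P - Id)*Q;                       % right-hand side of (2.4)

    Q = Id;                                       % forward Euler, scheme (2.9)
    for k = 1:N, Q = Q + delta*F(Q); end
    QN_euler = Q;

    Qold = Id;  Q = Qold + delta*F(Qold);         % AB2, started with one Euler step
    for k = 2:N
        Qnew = Q + delta*(1.5*F(Q) - 0.5*F(Qold));
        Qold = Q;  Q = Qnew;
    end
    QN_ab2 = Q;
    fprintf('||Id-P*Q||_F  Euler: %.3e   AB2: %.3e\n', ...
            norm(Id - P*QN_euler,'fro'), norm(Id - P*QN_ab2,'fro'));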

It is easy to see that the above schemes amount to approximating P^{-1} by a polynomial in P, Π_N(P), whose coefficients do not depend on P. The degree of Π_N grows exponentially with N. For instance, viewing the schemes as recurrences on a scalar variable x, we have

Euler: q_{k+1}(x) = q_k(x) - δ(x - 1)q_k(x)^2, q_0(x) = 1;

AB2: q_{k+1}(x) = q_k(x) - δ(x - 1)[(3/2)q_k(x)^2 - (1/2)q_{k-1}(x)^2].

These polynomials are approximations of the function x ↦ 1/x, as illustrated in Figure 1.
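The Euler recurrence can be evaluated directly as a scalar function; the following sketch (our own illustration, in the spirit of Figure 1) plots the resulting polynomial against x ↦ 1/x:

    N = 8;  delta = 1/N;
    x = linspace(0.1, 3, 200);
    q = ones(size(x));                              % q_0(x) = 1
    for k = 1:N, q = q - delta*(x - 1).*q.^2; end   % Euler recurrence
    plot(x, q, x, 1./x, '--');  legend('Euler polynomial', '1/x');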


Remark 2. Both of the numerical integration schemes given above compute the approximate inverse as a polynomial in P; this is hence a polynomial preconditioning. Several approaches to polynomial preconditioning have been proposed, based on truncated Neumann series [11] or on orthogonal polynomials [1]; see [2, 4] for a review. However, the point of view here is different and the underlying polynomials are also different.

At this point we give a convergence result for the Euler method (scheme (2.9)); more precisely, we give a consistency error bound. We have the following result:

Theorem 1. Assume that P - P_0 is regular and that P_0^{-1}P satisfies the hypothesis of Lemma 1. Let Q_N be the approximation of Q(1) obtained by N steps of the forward Euler scheme applied to (2.7). Assume that the solution of (2.7) is C^2 on [0,1]. Then there exists a constant C > 0, independent of N, such that

||P^{-1} - Q_N||_F ≤ C/N.

Proof. We have, by Taylor's formula: let k be fixed; there exists t_0 ∈ ]k/N, (k+1)/N[ such that

Q((k+1)/N) = Q(k/N) + (1/N)Q'(k/N) + (1/(2N^2))Q''(t_0).

Hence, using equation (2.7),

Q((k+1)/N) = Q(k/N) - (1/N)Q(k/N)(P - P_0)Q(k/N) + (1/(2N^2))Q''(t_0).

Subtracting the Euler recurrence from this relation and summing the local errors over k = 0,...,N-1, we obtain the estimate

||Q(1) - Q_N||_F ≤ (C'/N) max_{t∈[0,1]} ||Q''(t)||_F,

where C' depends on y(1), y(t) being the solution of (2.8); the result follows since Q(1) = P^{-1}.

The Euler scheme preserves symmetry. In particular, we can prove that for P, P_0 and P - P_0 SPD, and for N large enough, the matrices Q_k generated by (2.9) are SPD for k = 0,...,N.

3 Inverse matrix as steady state

3.1 The equations

Another way to reach P^{-1} is to consider differential equations for which one of the steady states is Q = P^{-1}. We consider the two following equations:

dQ/dt = Q(Id - PQ), Q(0) = Q_0,     (3.11)

which is a Riccati matrix differential equation, and its linearized version

dQ/dt = Id - PQ, Q(0) = Q_0.     (3.12)

In both equations P^{-1} is a steady state.

Remark 3. We can also proceed as in Section 2: we consider the Riccati equation dQ/dt = -Q(dP/dt)Q with the path function P(t):

P(t) = (1 - e^{-t})P + e^{-t}P_0.

It is easy to see that P(t) is invertible for all t ≥ 0 iff P satisfies the assumptions of Lemma 1 (see also Remark 1). The differential equation satisfied by Q(t) = P(t)^{-1} is then

dQ/dt = -e^{-t}Q(P - P_0)Q, Q(0) = P_0^{-1}.
We now give sufficient conditions for obtaining the convergence

lim_{t→+∞} Q(t) = P^{-1}.

We propose two approaches. The first one consists in deriving bounds on the Frobenius norm of the solution, assuming that the initial datum is close enough to the steady state. The second one concentrates on the symmetric positive definite case.

We begin with the following result:

Proposition 2. Let Q(t) be the solution of the matrix differential equation (3.11). Assume that ||Id - PQ_0||_F < 1. Then lim_{t→+∞} Q(t) = P^{-1}.

Proof. The matrix R(t) = Id - PQ(t) satisfies the equation

dR/dt = -R + R^2, R(0) = Id - PQ_0.

Then, taking the Hadamard product of each term with R(t) and taking the sum of all the coefficients, we obtain

(1/2) d||R||_F^2/dt = -||R||_F^2 + ⟨⟨R^2, R⟩⟩.

By the first two assertions of Lemma 2, we have

⟨⟨R^2, R⟩⟩ ≤ ||R^2||_F ||R||_F ≤ ||R||_F^3.

From the previous inequality, we infer that ||R(t)||_F is bounded from above by the solution of the scalar differential equation

dz/dt = -z + z^2, z(0) = ||R(0)||_F.

We have

z(t) = z(0)e^{-t} / (1 - z(0) + z(0)e^{-t}),

so z(t) → 0 as t → +∞ since z(0) < 1; hence the result.

This last result ensures the existence of the solution and the convergence to P^{-1} for initial conditions close enough to the steady state; no other properties of P or of Q_0 are required. We now give an existence and convergence result in the symmetric positive definite case. We have the following result:

Proposition 3. Assume that P and Q(0) are SPD matrices. Then the solution Q(t) of (3.11) is SPD for all t ≥ 0 and lim_{t→+∞} Q(t) = P^{-1}.

Proof. If Q(t) is regular for all t, then U(t) = Q(t)^{-1} satisfies the differential equation

dU/dt = P - U, U(0) = U_0 = Q(0)^{-1}.

We prove the proposition by studying U(t). We have

U(t) = (1 - e^{-t})P + e^{-t}U_0,

from which we infer that U(t) is SPD for all t ≥ 0. Indeed, U(t) is symmetric as a sum of symmetric matrices, and for every w ∈ ℝ^n, w ≠ 0, we have

⟨U(t)w, w⟩ = (1 - e^{-t})⟨Pw, w⟩ + e^{-t}⟨U_0w, w⟩ > 0,

since both P and U_0 are positive definite. Furthermore, we have immediately lim_{t→+∞} U(t) = P. Therefore U(t) is SPD, hence invertible, for all t ≥ 0. In conclusion Q(t) = U(t)^{-1} exists and is SPD for all t ≥ 0, and lim_{t→+∞} Q(t) = P^{-1}.

Proposition 4. Let Q(t) be the solution of (3.12). Assume P is positive definite. Then lim_{t→+∞} Q(t) = P^{-1}. Moreover, if P and Q_0 are SPD and commute, then Q(t) is SPD for all t ≥ 0.

Proof. As usual, we introduce the residual matrix R(t) = Id - PQ(t), which here satisfies the equation

dR/dt = -PR, R(0) = Id - PQ_0,

whose solution is

R(t) = e^{-tP}R(0).

Hence, if P is positive definite, lim_{t→+∞} R(t) = 0.

From the expression of R(t) we infer

Q(t) = (Id - e^{-tP})P^{-1} + e^{-tP}Q_0.

Q(t) is thus SPD when Q_0 and P are SPD and Q_0 and P commute.

Let us now establish the convergence in the Frobenius norm. If P is positive definite, there exists a strictly positive real number α such that

⟨Pv, v⟩ ≥ α||v||_2^2, ∀v ∈ ℝ^n,

the number α possibly depending on n.

Taking the Hadamard product of each term of the differential equation dR/dt = -PR with R and summing over all indices i,j, we get

(1/2) d||R||_F^2/dt = -⟨⟨PR, R⟩⟩ ≤ -α||R||_F^2.

Therefore,

d||R||_F/dt ≤ -α||R||_F.

By integration of each side of the last inequality, we obtain

||R(t)||_F ≤ e^{-αt}||R(0)||_F.

3.2 Construction of preconditioners by numerical integration

We introduce the discrete residual R_k = Id - PQ_k. The numerical integration of equation (3.11) by the forward Euler method generates the sequence

Q_{k+1} = Q_k + Δt_k(Q_k - Q_kPQ_k),

for which R_k satisfies the recurrence relation

R_{k+1} = (1 - Δt_k)R_k + Δt_kR_k^2.

We remark that for Δt_k = 1, the convergence is quadratic whenever ||R_0|| < 1, where ||·|| is any submultiplicative matrix norm. We recover in this case the Newton method derived from the scalar equation 1/x - p = 0, i.e. the iteration Q_{k+1} = Q_k(2Id - PQ_k), see [10].
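As an illustration, a minimal Matlab sketch of this Δt_k = 1 iteration follows; the shifted 1D Laplacian test matrix and the classical scaling Q_0 = P^T/(||P||_1 ||P||_∞), which makes ρ(Id - PQ_0) < 1, are our own choices:

    n = 50;  Id = eye(n);
    P = Id + full(gallery('tridiag', n, -1, 2, -1));  % well-conditioned SPD example
    Q = P'/(norm(P,1)*norm(P,inf));                   % rho(Id - P*Q0) < 1
    for k = 1:10
        Q = Q*(2*Id - P*Q);                           % R_{k+1} = R_k^2
        fprintf('k = %2d   ||Id - P*Q||_F = %.3e\n', k, norm(Id - P*Q, 'fro'));
    end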

Let us study the general case.

Theorem 2. We have the following results:

(i) Let Δt_k = Δt be constant. Assume that

ρ(Id - PQ_0) < 1 and 0 < Δt ≤ 1.

Then lim_{k→+∞} Q_k = P^{-1}.

(ii) Assume that ||R_0||_F < 1 and that

0 < Δt_k < 2/(1 + ||R_k||_F), ∀k ≥ 0.

Then lim_{k→+∞} ||R_k||_F = 0. Moreover, the convergence is quadratic for Δt_k = 1.

(iii) Assume that P and Q_0 are symmetric; then Q_k is also symmetric for all k ≥ 0.

Proof. From the relation

R_{k+1} = R_k((1 - Δt)Id + ΔtR_k),

we deduce that the convergence is guaranteed if ρ((1 - Δt)Id + ΔtR_k) < 1, say if

ρ(R_0) < 1 and 0 < Δt ≤ 1.

The first condition is verified, e.g., when Q_0 = γP^T with 0 < γ < 2/ρ(PP^T). Notice that if P is SPD, we can take Q_0 = γId with 0 < γ < 2/ρ(P). Hence assertion (i).

Let k be fixed. We have

R_{k+1} * R_{k+1} = (1 - Δt_k)^2 R_k * R_k + 2Δt_k(1 - Δt_k) R_k * R_k^2 + (Δt_k)^2 R_k^2 * R_k^2.

Taking the sum over all the indices and using Lemma 2, we obtain

||R_{k+1}||_F ≤ |1 - Δt_k| ||R_k||_F + Δt_k||R_k^2||_F ≤ (|1 - Δt_k| + Δt_k||R_k||_F) ||R_k||_F.

Therefore, if M_k = |1 - Δt_k| + Δt_k||R_k||_F < 1, say if ||R_k||_F < 1 and 0 < Δt_k < 2/(1 + ||R_k||_F), then ||R_{k+1}||_F ≤ M_k||R_k||_F with M_k < 1. The contraction holds in particular when 0 < Δt_k ≤ 1, and it is easy to prove by induction that if ||R_0||_F < 1 then ||R_{k+1}||_F ≤ M||R_k||_F, with M < 1. The convergence follows.

The particular case Δt_k = 1 gives R_{k+1} = R_k^2, and hence directly the estimate

||R_{k+1}||_F ≤ ||R_k||_F^2.

The convergence in the Frobenius norm is then quadratic in this case if ||R_0||_F < 1. Point (ii) is proved.

Finally, if Q_0 and P are symmetric, then using the relation Q_{k+1} = (1 + Δt_k)Q_k - Δt_kQ_kPQ_k, we show easily by induction that Q_k is symmetric for all k ≥ 0. This completes the proof.

Let us now consider the application of the Euler scheme to (3.12). The following sequence of matrices is generated:

Q_{k+1} = Q_k + Δt_k(Id - PQ_k).     (3.13)

We have the following result:

Theorem 3. Assume P is positive definite, set α = inf_{||v||_2=1} ⟨Pv, v⟩ > 0 and assume, for simplicity, that Δt_k = Δt. Then:

(i) If 0 < Δt < 2α/||P||_2^2, then the sequence Q_k generated by the scheme (3.13) converges to P^{-1}.

(ii) Assume in addition that P is symmetric, that Q_0 is SPD and that Q_0 and P commute. Then Q_k is symmetric for all k ≥ 0. Moreover, setting M = ||Id - ΔtP||_2 (< 1 under the condition of (i)) and a_0 = inf_{||x||_2=1} ⟨Q_0x, x⟩, if a_0 - Δt||R_0||_2/(1 - M) > 0 then Q_k is SPD for all k ≥ 0.

Proof. Assume first that Δt_k = Δt. Using the same notations, we have

R_{k+1} = (Id - ΔtP)R_k.

Thus, R_k → 0 if and only if ρ(Id - ΔtP) < 1, i.e. iff 0 < Δt < 2Re(λ)/|λ|^2 for every λ ∈ SP(P).

Let us now study the convergence in the Frobenius norm. Since P is positive definite, we recall that

α = inf_{||v||_2=1} ⟨Pv, v⟩ > 0.

We have

R_{k+1} * R_{k+1} = ((Id - ΔtP)R_k) * ((Id - ΔtP)R_k).

Hence, taking the sum over all indices, we obtain after simplifications

||R_{k+1}||_F^2 = ||R_k||_F^2 - 2Δt⟨⟨PR_k, R_k⟩⟩ + Δt^2||PR_k||_F^2.

Therefore

||R_{k+1}||_F^2 ≤ (1 - 2Δtα + Δt^2||P||_2^2) ||R_k||_F^2.

Finally, 1 - 2Δtα + Δt^2||P||_2^2 < 1 iff 0 < Δt < 2α/||P||_2^2, which gives the (sufficient) stability condition

0 < Δt < 2α/||P||_2^2.
Now, one can show by induction that if Q_0 and P commute, then Q_k and P commute for all k ≥ 0. Then, proceeding again by induction, it can be shown that Q_k is symmetric for all k ≥ 0. Notice that the condition PQ_0 = Q_0P is satisfied, e.g., by the choice Q_0 = Id.

Now we set

a_k = inf_{||x||_2=1} ⟨Q_kx, x⟩.

Let x ∈ ℝ^n, ||x||_2 = 1. We have

⟨Q_{k+1}x, x⟩ = ⟨Q_kx, x⟩ + Δt⟨R_kx, x⟩ ≥ a_k - Δt||R_k||_2.

But ||R_k||_2 ≤ ||Id - ΔtP||_2^k ||R_0||_2 = M^k||R_0||_2. Thus

a_{k+1} ≥ a_k - ΔtM^k||R_0||_2,

and therefore

a_k ≥ a_0 - Δt||R_0||_2 Σ_{j=0}^{k-1} M^j ≥ a_0 - Δt||R_0||_2/(1 - M) > 0.

This completes the proof.

By analogy between Euler's method and Richardson's iterations, it is natural to choose Δt_k so as to minimize ||R_{k+1}||_F. We have

R_{k+1} * R_{k+1} = ((Id - Δt_kP)R_k) * ((Id - Δt_kP)R_k).

Taking the sum over all indices, we obtain, after the usual simplifications,

||R_{k+1}||_F^2 = ||R_k||_F^2 - 2Δt_k⟨⟨PR_k, R_k⟩⟩ + Δt_k^2||PR_k||_F^2.

It follows that ||R_{k+1}||_F is minimized for

Δt_k = ⟨⟨PR_k, R_k⟩⟩ / ||PR_k||_F^2,

and we recover the iterations proposed in [10]; see also Section 4.
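A minimal Matlab sketch of this residual-minimizing iteration (the test matrix, initial guess and iteration count are our own choices):

    n = 50;  Id = eye(n);
    P = Id + full(gallery('tridiag', n, -1, 2, -1));  % SPD example
    Q = zeros(n);                                     % Q0 = 0, so R0 = Id
    for k = 1:30
        R = Id - P*Q;  PR = P*R;
        dt = sum(sum(PR.*R))/norm(PR,'fro')^2;        % dt_k = <<P R_k, R_k>>/||P R_k||_F^2
        Q = Q + dt*R;                                 % Euler step of (3.13)
    end
    fprintf('||Id - P*Q||_F = %.3e\n', norm(Id - P*Q, 'fro'));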

3.3 Steepest descent-like Schemes

The computation of a steady state by an explicit scheme can be sped up by enlarging the stability domain of the scheme, since this allows larger time steps; in this situation the accuracy of the time marching scheme is not fundamental. We can thus derive more stable methods by using parametrized one step schemes and fitting the parameters not to increase the accuracy, as in the classical schemes (Heun's, Runge-Kutta's), but to improve the stability.

For example, a method was introduced in [8] for computing fixed points iteratively with a larger descent parameter, starting from a specific numerical time scheme. It consists in integrating the differential equation

dU/dt = F(U)     (3.14)

by the two-step scheme

V_k = U_k + ΔtF(U_k), U_{k+1} = V_k + αΔt(F(V_k) - F(U_k)).     (3.15)

Here α is a parameter to be fixed. This scheme allows a larger stability region as compared to the forward Euler scheme. More precisely, when F(U) = b - PU, we have:

Lemma 4. Assume that P is positive definite; then the scheme is convergent iff

|1 - Δtλ + αΔt^2λ^2| < 1 for every λ ∈ SP(P).

Of course, one can define α and Δt iteratively so as to minimize the Euclidean norm of the residual, exactly as in the steepest descent method. The residual equation is

r_{k+1} = r_k - ΔtPr_k + αΔt^2P^2r_k.

We set for convenience

a = ||r_k||_2^2, b = ⟨Pr_k, r_k⟩, c = ||Pr_k||_2^2, d = ⟨P^2r_k, r_k⟩, e = ⟨P^2r_k, Pr_k⟩, f = ||P^2r_k||_2^2.

Hence, setting θ = Δt and μ = αΔt^2,

||r_{k+1}||_2^2 = a - 2θb + θ^2c + 2μd - 2θμe + μ^2f.

||r_{k+1}||_2 is minimized for the following definition of the parameters:

θ = (bf - de)/(cf - e^2), μ = (be - cd)/(cf - e^2),

i.e. Δt = θ and α = μ/θ^2. This gives rise to the steepest descent method derived from (3.15).
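The closed forms above (our reconstruction of the optimality system) translate into the following Matlab sketch for a linear system Pu = b; the test matrix and right-hand side are our own choices:

    n = 50;
    P = eye(n) + full(gallery('tridiag', n, -1, 2, -1));  % SPD example
    bvec = ones(n,1);  u = zeros(n,1);  r = bvec - P*u;
    for k = 1:40
        Pr = P*r;  P2r = P*Pr;
        b = r'*Pr;  c = Pr'*Pr;  d = r'*P2r;  e = P2r'*Pr;  f = P2r'*P2r;
        den = c*f - e^2;                  % > 0 by Cauchy-Schwarz (generically)
        theta = (b*f - d*e)/den;          % theta = dt
        mu    = (b*e - c*d)/den;          % mu = alpha*dt^2
        u = u + theta*r - mu*Pr;          % so that r <- r - theta*P*r + mu*P^2*r
        r = bvec - P*u;
    end
    fprintf('||r||_2 = %.3e\n', norm(r));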

4 Sparse inverse preconditioners

The iterative processes generated by numerical integration of the differential equations require at least one product of two matrices at each iteration. Hence, at each iteration, the inverse preconditioner matrix becomes denser, even if the initial datum and the matrix to invert are sparse.

We propose here a simple way to derive a dropping strategy from the numerical integration of a matrix differential equation. The notations are the same as in the previous sections. Consider the equation

dQ/dt = Id - PQ, Q(0) = Q_0.     (4.17)

Here P is a positive definite matrix, so lim_{t→+∞} Q(t) = P^{-1}, as shown in Section 3.

4.1 Derivation of the equations

Now, let ℳ be an n × n matrix with coefficients 0 or 1. The Hadamard product ℳ * P returns a matrix whose coefficients are those of P having the same indices as the nonzero coefficients of ℳ, so ℳ is a filter matrix which selects a sparsity pattern. More precisely, we have

(ℳ * P)_{i,j} = P_{i,j} if ℳ_{i,j} = 1, (ℳ * P)_{i,j} = 0 otherwise.

We assume that ℳ_{i,i} = 1, i = 1,...,n, so ℳ * Id = Id, where Id is the n × n identity matrix.

At this point, we consider the Hadamard product of each term of (4.17) with ℳ. We obtain the system

d(ℳ * Q)/dt = Id - ℳ * (PQ), (ℳ * Q)(0) = ℳ * Q_0.     (4.18)

To derive an autonomous equation with a sparse matrix S as unknown, we approximate ℳ * (PQ) by ℳ * (P(ℳ * Q)) and we obtain the new system

dS/dt = Id - ℳ * (PS), S(0) = ℳ * Q_0.     (4.19)
The matrix S(t) is sparse for all t. Indeed, we have the following result:

Lemma 5. The matrix equation (4.19) has a unique solution S(t) ∈ C^1([0,+∞[) and

ℳ * S(t) = S(t), ∀t ≥ 0.

Proof. The existence and the uniqueness of S(t) are established by standard arguments, (4.19) being a linear differential system.

We have

d(ℳ * S)/dt = ℳ * (dS/dt) = ℳ * Id - ℳ * (ℳ * (PS)) = Id - ℳ * (PS),

since ℳ * ℳ * A = ℳ * A for any matrix A and ℳ * Id = Id. Hence, since ℳ * S(0) = S(0), both ℳ * S and S solve the same Cauchy problem, and by uniqueness we can write ℳ * S(t) = S(t) for all t ≥ 0.
We will now show that S(t) is an approximation of Q(t).

4.2 A priori estimates

We have the following result:

Theorem 4. Assume that P is positive definite and let α = inf_{||v||_2=1} ⟨Pv, v⟩ > 0. Then there exists a constant C > 0, depending only on α, ||P||_2 and ||Id - PQ_0||_F, such that, for all t ≥ 0,

||(S - Q)(t)||_F ≤ C ||(1 - ℳ) * P^{-1}||_F,

where 1 is the neutral element of the Hadamard product (1_{i,j} = 1, i,j = 1,...,n).

Proof. Taking the difference of the equations (4.19) and (4.18), we get

d(S - ℳ * Q)/dt = -ℳ * (P(S - ℳ * Q)) - ℳ * (P(ℳ * Q - Q)).     (4.20)

The difference between (4.18) and (4.17) gives

d(ℳ * Q - Q)/dt = PQ - ℳ * (PQ).

From (4.20), we infer

(1/2) d||S - ℳ * Q||_F^2/dt = -⟨⟨ℳ * (P(S - ℳ * Q)), S - ℳ * Q⟩⟩ - ⟨⟨ℳ * (P(ℳ * Q - Q)), S - ℳ * Q⟩⟩.

Now, since S - ℳ * Q is masked, i.e. ℳ * (S - ℳ * Q) = S - ℳ * Q,

⟨⟨ℳ * (P(S - ℳ * Q)), S - ℳ * Q⟩⟩ = ⟨⟨P(S - ℳ * Q), S - ℳ * Q⟩⟩ ≥ α||S - ℳ * Q||_F^2,

we can write

(1/2) d||S - ℳ * Q||_F^2/dt ≤ -α||S - ℳ * Q||_F^2 + ||P||_2 ||ℳ * Q - Q||_F ||S - ℳ * Q||_F.

We let a = ||S - ℳ * Q||_F and we deduce from the previous inequality, by Young's inequality,

da^2/dt ≤ (-2α + η)a^2 + (1/η)||P||_2^2 ||ℳ * Q - Q||_F^2.     (4.24)

Here η is a strictly positive real number which will be chosen later on. We must now derive estimates for ||ℳ * Q - Q||_F. From the direct integration of (4.17), we get

PQ = Id - e^{-tP}(Id - PQ_0).

Therefore,

Q(t) = P^{-1}(Id - e^{-tP}(Id - PQ_0)),

so

(ℳ * Q - Q)(t) = (ℳ - 1) * (P^{-1}(Id - e^{-tP}(Id - PQ_0))).

We then can write

||(ℳ * Q - Q)(t)||_F ≤ ||(1 - ℳ) * P^{-1}||_F ||e^{-tP} - Id||_F ||Id - PQ_0||_F.

Substituting this last inequality in (4.24), we get

da^2/dt ≤ (-2α + η)a^2 + (1/η)||P||_2^2 ||(1 - ℳ) * P^{-1}||_F^2 ||e^{-tP} - Id||_F^2 ||Id - PQ_0||_F^2.

Now we choose η = α and we integrate this inequality: since a(0) = 0 and ||e^{-tP} - Id||_F is bounded on [0,+∞[ (P being positive definite), Gronwall's lemma gives

a(t) ≤ (||P||_2/α) ||(1 - ℳ) * P^{-1}||_F sup_{s≥0} ||e^{-sP} - Id||_F ||Id - PQ_0||_F.

Finally, summing this last estimate with ||(ℳ * Q - Q)(t)||_F and using ||(S - Q)(t)||_F ≤ ||(S - ℳ * Q)(t)||_F + ||(ℳ * Q - Q)(t)||_F, we obtain the announced bound.
We conclude this section with an important remark. The error bounds that we derived do not ensure that the sparse preconditioner S(t) is invertible for all t, or even for t ≥ t_0 > 0. However, in practice, the numerical implementation of time marching schemes for computing S(t), for t large, produces invertible matrices.

5 Numerical illustrations

5.1 Inverse matrix approximation

Consider the convection-diffusion problem

-Δu + a(x,y) ∂u/∂x + b(x,y) ∂u/∂y = f in Ω = ]0,1[^2, u = 0 on ∂Ω.

We discretize this problem by second order finite differences on an n × n grid and we define P as the underlying matrix. The numerical results we present were obtained by using Matlab 6 on a cluster of bi-processor 800 MHz (Pentium III) machines at Université Paris XI, Orsay, France.

5.1.1 Integration of finite time inverse matrix differential equation

We first consider the problem with

a(x,y) = 30e^{y^2-x^2}, b(x,y) = 50 sin(72x(1-x)y) sin(3πy), n = 30

(the matrix is of size 900 × 900) and a Chebyshev mesh in both directions. In Figure 2 we compare the preconditioners obtained with 2 iterations of Euler (Euler(2)), of Adams-Bashforth (AB2(2)) and of fourth order Runge-Kutta (RK4(2)). We observe that the more accurate the integration method, the more concentrated the spectrum of the preconditioned matrix.


5.1.2 Sparse inverse preconditioner case

We consider here the sparse approximation of the inverse of the finite difference discretization matrix of the operator

-Δ + 500 ∂_x + 20 ∂_y

on the domain ]0,1[^2 with homogeneous Dirichlet boundary conditions, on a regular grid. Here the sparsity pattern is defined by the n^2 × n^2 symmetric mask-matrix ℳ as follows:

ℳ_{i,j} = 1 if |i - j| ≤ 2 or if |i - j ± n| ≤ 1; ℳ_{i,j} = 0 in the other cases.
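For completeness, here is a Matlab sketch of the assembly of P and ℳ, under our own assumptions (lexicographic ordering with the x index varying fastest, centered differences):

    n = 20;  h = 1/(n+1);  e = ones(n,1);
    L1 = spdiags([-e 2*e -e], -1:1, n, n)/h^2;          % 1D second difference
    D1 = spdiags([-e zeros(n,1) e], -1:1, n, n)/(2*h);  % 1D centered first difference
    I1 = speye(n);
    P = kron(I1, L1) + kron(L1, I1) ...                 % -Laplacian
      + 500*kron(I1, D1) + 20*kron(D1, I1);             % + 500 d/dx + 20 d/dy
    [I, J] = meshgrid(1:n^2);
    M = double(abs(I - J) <= 2 | abs(abs(I - J) - n) <= 1);  % mask of Section 5.1.2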

In Figure 3 we represent approximations of the inverse matrix obtained by thresholding the coefficients at level ε, for different values of ε. This shows that a sparse approximation can be considered in this case.


As we can see in Figure 4, very few iterations are needed to obtain convergence. Of course the residual does not converge to 0, because the approximation of the inverse is sparse. This agrees with the error estimates for the continuous equation: a saturation is expected. In Figure 5, we have plotted the spectrum of the preconditioned matrix. We observe that the inverse preconditioner provides a concentration of the spectrum.



5.2 Preconditioned descent methods

The reduction of the condition number as well as the concentration of the spectrum of the preconditioned matrix allows faster convergence of descent methods.

As an illustration, we apply the explicit preconditioners computed above to the numerical solution of the convection-diffusion problem. To this end, we use the preconditioned Bi-CGSTAB method [21].

The discretization matrix is the same as above. The discrete problem to solve reads

Px = b.

We prepare the system by diagonal preconditioning, and we consider the equivalent problem

diag(P)^{-1}Px = diag(P)^{-1}b.

The exact solution x_e is a random vector and b = Px_e.

In Figure 6 we represent the residual (respectively the error) versus the iteration count when using Bi-CGSTAB and various preconditioned versions; the explicit preconditioners Q were generated, on the one hand, by two iterations of Euler, of AB2 and of RK4 and, on the other hand, by an ILU factorization with tolerance ε = 10^{-2}. The Euler and AB2 preconditioners improve the convergence of the unpreconditioned method, by factors of about 2 and 3 respectively. The RK4 preconditioner is comparable to the ILU one.
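As an illustration of the call pattern, here is a Matlab sketch of the preconditioned solve; the variables P and M are assumed to come from the assembly sketch of Section 5.1.2, and the masked iteration with a residual-minimizing step (Section 3.2) is our own choice for building the sparse approximate inverse S. Matlab's bicgstab takes a preconditioner argument, a function handle that applies the preconditioner solve to a vector; with an explicit approximate inverse this is simply r → S r.

    Dinv = spdiags(1./diag(P), 0, size(P,1), size(P,1));
    Pd = Dinv*P;  N = size(Pd,1);             % diagonally preconditioned system
    xe = rand(N,1);  bd = Dinv*(P*xe);        % random exact solution
    S = speye(N);                             % masked initial guess (M_ii = 1)
    for k = 1:20
        R = speye(N) - M.*(Pd*S);             % masked residual
        W = M.*(Pd*R);
        S = S + (full(sum(sum(W.*R)))/norm(W,'fro')^2)*R;
    end
    [x, flag, relres, iter] = bicgstab(Pd, bd, 1e-8, 400, @(r) S*r);
    fprintf('flag = %d, relres = %.2e, error = %.2e\n', flag, relres, norm(x - xe));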


In Figure 6, we also illustrate the improvement of the convergence provided by the sparse inverse preconditioner computed in the previous subsection.

6 Concluding remarks

The approach we have developed here is simple, rather general, and seems to apply to a large class of matrices. The advantage of this technique is that the underlying approximations can be studied with simple analysis tools; in addition, we recover particular known sequences of inverse preconditioners ([4, 9, 10]) and introduce new ones. The iterative schemes we introduced in this article are all based on the approximation of the inverse by a suitable polynomial: they can be considered as polynomial preconditioners, even though they are not directly related to those proposed, e.g., in [1], the point of view being different here. This suggests, as future work, analyzing them with tools from approximation theory.

The masked matrix differential equation approach provides a simple way to build efficient sparse inverse preconditioners for a fixed sparsity pattern. A natural next step would be to develop dropping strategies to improve the method.

We have applied here a dynamical modeling approach to the construction of inverse preconditioners. A similar approach can be developed for the solution of linear as well as nonlinear systems of equations, by deriving numerical schemes from suitable dynamical systems.

The examples we gave come from PDE discretization and are rather academic, but this is a first step to be considered before developing and applying the schemes to large-scale problems.

Acknowledgments. The author thanks Y. Chitour and J. Laminie, from Université Paris-Sud, for fruitful remarks.

Received: 14/III/06. Accepted: 20/X/06.

#657/06.

  • [1] S.F. Ashby, T.A. Manteuffel and P.E. Otto, A comparison of adaptive Chebyshev and least squares polynomial preconditioning for Hermitian positive definite linear systems, SIAM J. Sci. Stat. Comput., 13(1) (1992), 1-29.
  • [2] O. Axelsson, Iterative Solution Methods, Cambridge University Press, Cambridge, 1994, xiv+654 pp.
  • [3] R.E. Bellman, Introduction to Matrix Analysis, McGraw-Hill, New York, 1970, 2nd ed.
  • [4] M. Benzi, Preconditioning techniques for large linear systems: a survey. J. Comput. Phys., 182(2) (2002), 418-477.
  • [5] C. Brezinski, Projection Methods for Systems of Equations, North-Holland, 1997.
  • [6] C. Brezinski, Dynamical systems and sequence transformation, J. Phys. A: Math. Gen., 34 (2001), 10659-10669.
  • [7] C. Brezinski, Difference and differential equations and convergence acceleration algorithms. SIDE III-symmetries and integrability of difference equations (Sabaudia, 1998), 53-63, CRM Proc. Lecture Notes, 25, Amer. Math. Soc., Providence, RI, 2000.
  • [8] C. Brezinski and J.-P. Chehab, Nonlinear hybrid procedures and fixed point iterations, Numer. Func. Anal. Opt., 19(5-6) (1998), 415-487.
  • [9] G. Castro, J. Laminie, M. Sarrazin and A. Seghier, Factorized Sparse Inverse Preconditioner using Generalized Reflection Coefficients, Prépublication d'Orsay numéro 28 (10/7/1997).
  • [10] E. Chow and Y. Saad, Approximate Inverse Preconditioning for Sparse-Sparse Iterations, SIAM Journal of Scientific Computing, 19 (1998), 995-1023.
  • [11] S.C. Eisenstat, J.M. Ortega and C.T. Vaughan, Efficient polynomial preconditioning for the conjugate gradient method, SIAM J. Sci. Stat. Comput., 11(5) (1990), 859-872.
  • [12] A. Cuyt and L. Wuytack, Nonlinear Methods in Numerical Analysis, North-Holland, Amsterdam, 1987.
  • [13] F. Dubois and A. Saidi, Unconditionally stable scheme for Riccati equation, ESAIM Proc, 8 (2000), 39-52.
  • [14] U. Helmke and J.B. Moore, Optimization and Dynamical Systems, Comm. Control Eng. Series, Springer, London, 1994.
  • [15] M.W. Hirsch and S. Smale, Differential Equations, Dynamical Systems and Linear Algebra, Academic Press, London, 1974.
  • [16] J.H. Hubbard and B.H. West, Differential equations. A dynamical Systems Approach. Part I: Ordinary Differential Equations, Springer Verlag, New-York, 1991.
  • [17] J.B. Moore, R.E. Mahony and U. Helmke, Numerical gradient algorithms for eigenvalue and singular value calculations, SIAM J. Matrix Anal. Appl., 15(3) (1994), 881-902.
  • [18] Y. Saad, Practical implementation of polynomial preconditioning for Conjugate Gradient, SIAM J. Sci. Stat. Comp., 6 (1985), 865-881.
  • [19] Y. Saad, Iterative Methods for Sparse Linear Systems, SIAM, 1996.
  • [20] A.M. Stuart, Numerical analysis of dynamical systems, in Acta Numerica, 1994, Cambridge University Press, Cambridge, 1994, pp. 467-572.
  • [21] H.A. Van der Vorst, Bi-CGSTAB: a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 13 (1992), 631-644.
