
TECHNICAL PAPERS

Inverse and optimization problems in heat transfer

Marcelo J. Colaço (I); Helcio R. B. Orlande (II); George S. Dulikravich (III)

(I) Senior Member, ABCM; Dept. of Mechanical Engineering, Military Institute of Engineering – IME; Pç. Gen. Tiburcio, 80; 22290-270 Rio de Janeiro, RJ, Brazil; colaco@ime.eb.br

(II) Senior Member, ABCM; Dept. of Mechanical Engineering – POLI/COPPE; Federal University of Rio de Janeiro – UFRJ; Cx. Postal 68503; 21941-972 Rio de Janeiro, RJ, Brazil; helcio@mecanica.coppe.ufrj.br

(III) Dept. of Mechanical and Materials Engineering; Florida International University; College of Engineering, Room EC 3474; 10555 West Flagler St., Miami, FL 33174, U.S.A.; dulikrav@fiu.edu

ABSTRACT

This paper presents basic concepts of inverse and optimization problems. Deterministic and stochastic minimization techniques in finite and infinite dimensional spaces are reviewed; the advantages and disadvantages of each of them are discussed and a hybrid technique is introduced. Applications of the discussed techniques to inverse and optimization problems in heat transfer are presented.

Keywords: Inverse problems, optimization, heat transfer

Introduction

Inverse problems of heat transfer rely on temperature and/or heat flux measurements for the estimation of unknown quantities appearing in the mathematical formulation of physical problems in this field (Tikhonov and Arsenin, 1977; Beck and Arnold, 1977; Alifanov, 1994; Beck et al, 1985; Alifanov et al, 1995; Dulikravich and Martin, 1996; Ozisik and Orlande, 2000; Kurpisz and Nowak, 1995; Woodbury, 2002; Murio, 1993; Trujillo and Busby, 1997). As an example, inverse problems dealing with heat conduction have generally been associated with the estimation of an unknown boundary heat flux by using temperature measurements taken below the boundary surface. Therefore, while in the classical direct heat conduction problem the cause (boundary heat flux) is given and the effect (temperature field in the body) is determined, the inverse problem involves the estimation of the cause from the knowledge of the effect.

The use of inverse analysis techniques represents a new research paradigm. The results obtained from numerical simulations and from experiments are not simply compared a posteriori, but a close synergism exists between experimental and theoretical researchers during the course of study, in order to obtain the maximum information regarding the physical problem under consideration (Beck, 1999). In the recent past, inverse problems have evolved from a specific theoretical research topic to an important practical tool of engineering analysis (Tikhonov and Arsenin, 1977; Beck and Arnold, 1977; Alifanov, 1994; Beck et al, 1985; Alifanov et al, 1995; Dulikravich and Martin, 1996; Ozisik and Orlande, 2000; Kurpisz and Nowak, 1995; Woodbury, 2002; Murio, 1993; Trujillo and Busby, 1997; Beck, 1999; Sabatier, 1978; Hensel, 1991; Denisov, 1999; Yagola et al, 1999; Ramm et al, 2000; Morozov, 1984; Zubelli, 1999; Kirsch, 1996; Isakov, 1998). Nowadays, at least three international journals are specifically oriented to the community involved with the solution and application of inverse problems, both from the mathematical and from the engineering points of view. These include: Inverse Problems, Inverse and Ill-posed Problems and Inverse Problems in Science and Engineering. The International Conference on Inverse Problems in Engineering: Theory and Practice has been held every three years since 1993. Also, several other seminars and symposia have been held in different countries in the past, including the International Symposium on Inverse Problems in Engineering Mechanics in Japan, and the Inverse Problems, Design and Optimization Symposium in Brazil. More specifically for the heat transfer community, all major conferences in the field, such as the International Heat Transfer Conference, have special sessions or mini-symposia dedicated to inverse problems.

Inverse problems can be solved either with a parameter estimation approach or with a function estimation approach. If some information is available on the functional form of the unknown quantity, the inverse problem can be reduced to the estimation of a few unknown parameters. On the other hand, if no prior information is available on the functional form of the unknown, the inverse problem needs to be regarded as a function estimation problem in an infinite dimensional space of functions (Tikhonov and Arsenin, 1977; Beck and Arnold, 1977; Alifanov, 1994; Beck et al, 1985; Alifanov et al, 1995; Dulikravich and Martin, 1996; Ozisik and Orlande, 2000; Kurpisz and Nowak, 1995; Woodbury, 2002; Murio, 1993; Trujillo and Busby, 1997; Beck, 1999; Sabatier, 1978; Hensel, 1991; Denisov, 1999; Yagola et al, 1999; Ramm et al, 2000; Morozov, 1984; Zubelli, 1999; Kirsch, 1996; Isakov, 1998).

Inverse problems are mathematically classified as ill-posed, whereas standard heat transfer problems are well-posed. The solution of a well-posed problem must satisfy the conditions of existence, uniqueness and stability with respect to the input data (Hadamard, 1923). The existence of a solution for an inverse heat transfer problem may be assured by physical reasoning. On the other hand, the uniqueness of the solution of inverse problems can be mathematically proved only for some special cases. Also, the inverse problem is very sensitive to random errors in the measured input data, thus requiring special techniques for its solution in order to satisfy the stability condition.

Successful solution of an inverse problem generally involves its reformulation as an approximate well-posed problem and makes use of some kind of regularization (stabilization) technique. Although the solution techniques for inverse problems do not necessarily make use of optimization techniques, many popular methods are nowadays based on them (Tikhonov and Arsenin, 1977; Beck and Arnold, 1977; Alifanov, 1994; Beck et al, 1985; Alifanov et al, 1995; Dulikravich and Martin, 1996; Ozisik and Orlande, 2000; Kurpisz and Nowak, 1995; Woodbury, 2002; Murio, 1993; Trujillo and Busby, 1997; Beck, 1999; Sabatier, 1978; Hensel, 1991; Denisov, 1999; Yagola et al, 1999; Ramm et al, 2000; Morozov, 1984; Zubelli, 1999; Kirsch, 1996; Isakov, 1998).

Despite their similarities, inverse and optimization problems are conceptually different. Inverse problems are concerned with the identification of unknown quantities appearing in the mathematical formulation of physical problems, by using measurements of the system response. On the other hand, optimization problems generally deal with the minimization or maximization of a certain objective or cost function, in order to find design variables that will result in desired state variables. In addition, inverse and optimization problems involve other different concepts. For example, the solution technique for an inverse problem is required to cope with instabilities resulting from the noisy measured input data, while for an optimization problem the input data are given by the desired response(s) of the system. In contrast to inverse problems, solution uniqueness may not be an important issue for optimization problems, as long as the solution obtained is physically feasible and can be practically implemented. Engineering applications of optimization techniques are very often concerned with objectives such as minimum weight (e.g., lighter airplanes), minimum fuel consumption (e.g., more economical cars), maximum range (e.g., longer-range airplanes), etc. The necessity of finding the maximum or minimum values of some parameters (or functions) can be governed by economic factors, as in the case of fuel consumption, or by design characteristics, as in the case of the maximum range of an airplane. Sometimes, however, the decision is more subjective, as in the case of choosing a car model. In general, different designs can be idealized for a given application, but only a few of them will be economically viable. The best design is usually obtained by some min-max technique.

In this paper we address solution methodologies for inverse and single-objective optimization problems, based on minimization techniques. Several gradient-based and stochastic techniques are introduced, together with their basic implementation steps and algorithms. We present some deterministic methods, such as the Conjugate Gradient Method, the Newton Method and the Davidon-Fletcher-Powell Method (Tikhonov and Arsenin, 1977; Beck and Arnold, 1977; Alifanov, 1994; Beck et al, 1985; Alifanov et al, 1995; Dulikravich and Martin, 1996; Ozisik and Orlande, 2000; Kurpisz and Nowak, 1995; Woodbury, 2002; Murio, 1993; Trujillo and Busby, 1997; Beck, 1999; Daniel, 1971; Jaluria, 1998; Stoecker, 1989; Belegundu and Chandrupatla, 1999; Fletcher, 2000; Powell, 1977; Fletcher and Reeves, 1964; Hestenes and Stiefel, 1952; Polak, 1971; Beale, 1972; Davidon, 1959; Fletcher and Powell, 1963; Broyden, 1965; Broyden, 1967; Levenberg, 1944; Marquardt, 1963; Bard, 1974; Dennis and Schnabel, 1983; Moré, 1977). In addition, we present some of the most promising stochastic approaches, such as the Simulated Annealing Method (Corana et al, 1987; Goffe et al, 1994), the Differential Evolution Method (Storn and Price, 1996), Genetic Algorithms (Goldberg, 1989; Deb, 2002) and the Particle Swarm Method (Kennedy and Eberhart, 1995; Kennedy, 1999; Naka et al, 2001; Eberhart et al, 2001; Dorigo and Stützle, 2004). Deterministic methods are in general computationally faster than stochastic methods, although they can converge to a local minimum or maximum instead of the global one. On the other hand, stochastic algorithms can ideally converge to a global maximum or minimum, although they are computationally slower than the deterministic ones. Indeed, stochastic algorithms can require thousands of evaluations of the objective function and, in some cases, become impractical. In order to overcome these difficulties, we also discuss the so-called hybrid algorithm, which takes advantage of the robustness of the stochastic methods and of the fast convergence of the deterministic methods (Colaço et al, 2003a; Colaço et al, 2004; Colaço et al, 2003b; Dulikravich et al, 2003a; Dulikravich et al, 2003b; Dulikravich et al, 2003c; Dulikravich et al, 2004; Colaço et al, 2003c). Each technique provides a unique approach with varying degrees of convergence, reliability and robustness at different cycles during the iterative minimization procedure. A set of analytically formulated rules and switching criteria can be coded into the program to automatically switch back and forth among the different algorithms as the iterative process advances. Specific concepts for inverse and optimization problems are discussed and several examples in heat transfer are given in the paper.

Objective Function

Basic Concept

For the solution of inverse problems, as considered in this paper, we make use of minimization techniques of the same kind as those used in optimization problems. Therefore, the first step in establishing a procedure for the solution of either inverse or optimization problems is the definition of an objective function. The objective function is the mathematical representation of the aspect under evaluation, which must be minimized (or maximized). It can be mathematically stated as

where x1, x2, ..., xN are the variables of the problem under consideration that can be modified in order to find the minimum value of the function U.
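The corresponding expression is not reproduced above; a generic statement consistent with this description, written here only as a sketch, is

\[
\min_{\mathbf{x}} \; U(\mathbf{x}), \qquad U(\mathbf{x}) = U(x_1, x_2, \ldots, x_N)
\]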

The relationship between U and x can, most of the time, be expressed by a physical/mathematical model. However, in some cases establishing such a relationship is impractical or even impossible, and the variation of U with respect to x must be determined experimentally.

Constraints

Usually the variables x1, x2, ..., xN appearing in the objective function formulation are only allowed to vary within pre-specified ranges. Such constraints are due, for example, to physical or economic limitations.

There are two types of constraints. The first is the equality constraint, which can be represented by

This kind of constraint can represent, for example, the pre-specified power of an automobile.

The second type of constraint is called an inequality constraint and is represented by

This can represent, for example, the maximum temperature allowed in a gas turbine engine.
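The expressions for the two types of constraints are not reproduced above; generic forms consistent with the text, given here only as a sketch, are

\[
g_j(\mathbf{x}) = 0, \quad j = 1, \ldots, J \qquad \text{(equality constraints)}
\]
\[
h_l(\mathbf{x}) \leq 0, \quad l = 1, \ldots, L \qquad \text{(inequality constraints)}
\]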

Optimization Problems

For optimization problems, the objective function U can be, for example, the fuel consumption of an automobile, and the variables x1, x2, ..., xN can be the aerodynamic profile of the car, the material of the engine, the type of wheels used, the ground clearance, etc.

Inverse Problems and Regularization

For inverse problems, the objective function usually involves the squared difference between measured and estimated variables of the system under consideration. As a result, some statistical aspects regarding the unknown parameters and the measured errors need to be examined in order to select an appropriate objective function. In the examples presented below we assume that temperature measurements are available for the inverse analysis. The eight standard assumptions (Beck and Arnold, 1977) regarding the statistical description of the problem are:

1. The errors are additive, that is,

where Yi is the measured temperature, Ti is the actual temperature and ei is the random error.

2. The temperature errors ei have a zero mean, that is,

where E(·) is the expected value operator. The errors are then said to be unbiased.

3. The errors have constant variance, that is,

which means that the variance of Yi is independent of the measurement.

4. The errors associated with different measurements are uncorrelated. Two measurement errors ei and ej, where i ≠ j, are uncorrelated if the covariance of ei and ej is zero, that is,

Such is the case if the errors ei and ej have no effect on or relationship to each other.

5. The measurement errors have a normal (Gaussian) distribution. By taking into consideration the assumptions 2, 3 and 4 above, the probability distribution function of ei is given by

6. The statistical parameters describing ei, such as the standard deviation σ, are known.

7. The only variables that contain random errors are the measured temperatures. The measurement times, measurement positions, dimensions of the heated body, and all other quantities appearing in the formulation of the inverse problem are all accurately known.

8. There is no prior information regarding the quantities to be estimated, which can be either parameters or functions. If such information exists, it can be utilized to obtain improved estimates.

If all of the eight statistical assumptions stated above are valid, the objective function, U, that provides minimum variance estimates is the ordinary least squares norm (Beck and Arnold, 1977) (i.e., the sum of the squared residuals) defined as

where Y and T(x) are the vectors containing the measured and estimated temperatures, respectively, and the superscript T indicates the transpose of the vector. The estimated temperatures are obtained from the solution of the direct problem with estimates for the unknown quantities.
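The expression itself is not shown above; the standard form of the ordinary least squares norm consistent with this description (Beck and Arnold, 1977) is

\[
U(\mathbf{x}) = [\mathbf{Y} - \mathbf{T}(\mathbf{x})]^{T} [\mathbf{Y} - \mathbf{T}(\mathbf{x})] = \sum_{i=1}^{I} [Y_i - T_i(\mathbf{x})]^{2}
\]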

If the values of the standard deviations of the measurements are quite different, the ordinary least squares method does not yield minimum variance estimates (Beck and Arnold, 1977). In such a case, the objective function is given by the weighted least squares norm, Uw, defined as

where W is a diagonal weighting matrix. Such a matrix is taken as the inverse of the covariance matrix of the measurement errors in cases where the other statistical hypotheses remain valid (Beck and Arnold, 1977).
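A standard form of the weighted least squares norm consistent with this description is

\[
U_{W}(\mathbf{x}) = [\mathbf{Y} - \mathbf{T}(\mathbf{x})]^{T} \mathbf{W} [\mathbf{Y} - \mathbf{T}(\mathbf{x})]
\]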

If we consider that some information regarding the unknown parameters is available, we can use the maximum a posteriori objective function in the minimization procedure (Beck and Arnold, 1977). Such an objective function is defined as:

where x is assumed to be a random vector with known mean µ and known covariance matrix V. Therefore, the mean µ and the covariance matrix V provide a priori information regarding the parameter vector x to be estimated. Such information can be available from previous experiments with the same experimental apparatus or even from literature data. By assuming the validity of the other statistical hypotheses described above regarding the experimental errors, the weighting matrix W is given by the inverse of the covariance matrix of the measurement errors (Beck and Arnold, 1977).
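The expression is not reproduced above; a standard form of the maximum a posteriori objective function consistent with this description, given here only as a sketch, is

\[
U_{MAP}(\mathbf{x}) = [\mathbf{Y} - \mathbf{T}(\mathbf{x})]^{T} \mathbf{W} [\mathbf{Y} - \mathbf{T}(\mathbf{x})] + (\boldsymbol{\mu} - \mathbf{x})^{T} \mathbf{V}^{-1} (\boldsymbol{\mu} - \mathbf{x})
\]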

If the inverse heat transfer problem involves the estimation of only a few unknown parameters, such as the estimation of a thermal conductivity value from transient temperature measurements in a solid, the minimization of the objective functions given above can be stable. However, if the inverse problem involves the estimation of a large number of parameters, such as the recovery of the unknown transient heat flux components f(ti) ≡ fi at times ti, i = 1, ..., I, excursion and oscillation of the solution may occur. One approach to reduce such instabilities is to use the procedure called Tikhonov's regularization (Tikhonov and Arsenin, 1977; Beck and Arnold, 1977; Alifanov, 1994; Beck et al, 1985; Alifanov et al, 1995; Dulikravich and Martin, 1996; Ozisik and Orlande, 2000; Kurpisz and Nowak, 1995; Woodbury, 2002; Murio, 1993; Trujillo and Busby, 1997; Beck, 1999; Sabatier, 1978; Hensel, 1991; Denisov, 1999; Yagola et al, 1999; Ramm et al, 2000; Morozov, 1984; Zubelli, 1999; Kirsch, 1996; Isakov, 1998), which modifies the least squares norm by the addition of a term such as

where α* (> 0) is the regularization parameter and the second summation on the right is the whole-domain zeroth-order regularization term. In Eq. (7.a), fi is the heat flux at time ti, which is supposed to be constant in the interval ti - Δt/2 < t < ti + Δt/2, where Δt is the time interval between two consecutive measurements. The values chosen for the regularization parameter α* influence the stability of the solution as the minimization of Eq. (7.a) is performed. As α* → 0 the solution may exhibit oscillatory behavior and become unstable, since the summation of fi² terms may attain very large values and the estimated temperatures tend to match those measured. On the other hand, with large values of α* the solution is damped and deviates from the exact result.
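Equation (7.a) is not reproduced above; a standard whole-domain zeroth-order Tikhonov functional consistent with this description, given here only as a sketch, is

\[
U[\mathbf{f}] = \sum_{i=1}^{I} [Y_i - T_i(\mathbf{f})]^{2} + \alpha^{*} \sum_{i=1}^{I} f_i^{2}
\]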

The whole-domain Tikhonov's first-order regularization procedure involves the minimization of the following modified least squares norm:

For α* → 0, exact matching between estimated and measured temperatures is obtained as the minimization of U[f(t)] is performed and the inverse problem solution becomes unstable. For large values of α*, when the second summation in Eq. (7.b) is dominant, the heat flux components fi tend to become constant for i = 1, 2, ..., I, that is, the first derivative of f(t) tends to zero. Instabilities in the solution can be alleviated by proper selection of the value of α*. Tikhonov (Tikhonov and Arsenin, 1977) suggested that α* should be selected so that the minimum value of the objective function would be equal to the sum of the squares of the errors expected for the measurements, which is known as the Discrepancy Principle.
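Equation (7.b) is not reproduced above; a standard whole-domain first-order Tikhonov functional consistent with this description, given here only as a sketch, is

\[
U[\mathbf{f}] = \sum_{i=1}^{I} [Y_i - T_i(\mathbf{f})]^{2} + \alpha^{*} \sum_{i=1}^{I-1} (f_{i+1} - f_i)^{2}
\]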

Alternative approaches to the Tikhonov regularization scheme described above are the use of Beck's Sequential Function Specification Method (Beck et al, 1985) or of Alifanov's Iterative Regularization Methods (Alifanov, 1994; Alifanov et al, 1995; Ozisik and Orlande, 2000). Beck's sequential function specification method is a quite stable inverse analysis technique. This is due to the averaging property of the least-squares norm and because the measurement at the time when the heat flux is to be estimated is used in the minimization procedure together with a few measurements taken at future time steps (Beck et al, 1985). In Alifanov's regularization methods, the number of iterations plays the role of the regularization parameter α* and the stopping criterion is chosen so that reasonably stable solutions are obtained. Therefore, there is no need to modify the original objective function, as opposed to Tikhonov's or Beck's approaches. The iterative regularization approach is sufficiently general and can be applied to both parameter and function estimations, as well as to linear and non-linear inverse problems (Alifanov, 1994; Alifanov et al, 1995; Ozisik and Orlande, 2000).

Minimization Techniques

In this section, we present deterministic and stochastic techniques for the minimization of an objective function U(x) and the identification of the parameters x1, x2, ..., xN, which appear in the objective function formulation. Basically, this type of minimization problem is solved in a space of finite dimension N, which is the number of unknown parameters. For many minimization problems, the unknowns cannot be recast in the form of a finite number of parameters and the minimization needs to be performed in an infinite dimensional space of functions (Tikhonov and Arsenin, 1977; Beck and Arnold, 1977; Alifanov, 1994; Beck et al, 1985; Alifanov et al, 1995; Dulikravich and Martin, 1996; Ozisik and Orlande, 2000; Kurpisz and Nowak, 1995; Woodbury, 2002; Murio, 1993; Trujillo and Busby, 1997; Beck, 1999; Sabatier, 1978; Hensel, 1991; Denisov, 1999; Yagola et al, 1999; Ramm et al, 2000; Morozov, 1984; Zubelli, 1999; Kirsch, 1996; Isakov, 1998; Hadamard, 1923; Daniel, 1971). A powerful and straightforward technique for the minimization in a functional space is the conjugate gradient method with adjoint problem formulation (Alifanov, 1994; Alifanov et al, 1995; Ozisik and Orlande, 2000; Daniel, 1971). Therefore, this technique is also described below, as applied to the solution of an inverse problem of function estimation.

Deterministic Methods

These types of methods, as applied to non-linear minimization problems, generally rely on establishing an iterative procedure, which, after a certain number of iterations, will hopefully converge to the minimum of the objective function. The iterative procedure can be written in the following general form (Beck and Arnold, 1977; Alifanov, 1994; Alifanov et al, 1995; Jaluria, 1998; Stoecker, 1989; Belegundu and Chandrupatla, 1999; Fletcher, 2000; Bard, 1974; Dennis and Schnabel, 1983):

where x is the vector of design variables, α is the search step size, d is the direction of descent and k is the number of iterations.
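Equation (8) is not reproduced above; the general iterative form consistent with this description is

\[
\mathbf{x}^{k+1} = \mathbf{x}^{k} + \alpha^{k} \mathbf{d}^{k}
\]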

An iteration step is acceptable if Uk+1 < Uk. The direction of descent d will generate an acceptable step if and only if there exists a positive definite matrix R such that d = –R∇U (Bard, 1974).

In fact, such a requirement results in directions of descent that form an angle greater than 90º with the gradient direction. A minimization method in which the directions are obtained in this manner is called an acceptable gradient method (Bard, 1974).

A stationary point of the objective function is one at which ∇U = 0. The most that we can hope for from any gradient method is that it converges to a stationary point. Convergence to the true minimum can be guaranteed only if it can be shown that the objective function has no other stationary points. In practice, however, one usually reaches the local minimum in the valley where the initial guess for the iterative procedure was located (Bard, 1974).

Steepest Descent Method

In this section we will introduce the most basic gradient method, which is the Steepest Descent Method. The basic idea of this method is to move downwards on the objective function along the direction of highest variation, in order to locate its minimum value. Therefore, the direction of descent is given by

since the gradient direction is the one that gives the fastest increase of the objective function. Despite being the natural choice for the direction of descent, the use of the negative gradient direction is not very efficient. Usually the steepest descent method starts with large variations in the objective function but, as the minimum is approached, the convergence rate becomes very low.

The optimum choice for the search step size, α, is the one that minimizes the objective function along the direction of descent. Thus, a univariate search method needs to be employed in order to find the search step size at each iteration. In the case of a unimodal function some classical procedures can be used, such as the Dichotomous Search, the Fibonacci Search, the Golden Section Search and Cubic Interpolation, among others. However, in some realistic cases, the variation of the objective function with the search step size is not unimodal and, then, more robust techniques are required, such as the exhaustive search method or a technique based on exhaustive interpolation (Jaluria, 1998; Stoecker, 1989; Belegundu and Chandrupatla, 1999; Fletcher, 2000; Bard, 1974; Dennis and Schnabel, 1983).

Figure 1 illustrates the iterative procedure for the Steepest Descent Method.
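As an illustration of the procedure sketched in Fig. 1, the minimal Python example below implements the steepest descent iteration with d = -∇U and a golden section search for the step size; the quadratic test function, the search interval and all parameter values are hypothetical choices made only for this sketch.

```python
import numpy as np

def golden_section(phi, a=0.0, b=1.0, tol=1e-6):
    """Minimize the one-dimensional function phi(alpha) on [a, b] (assumed unimodal)."""
    ratio = (np.sqrt(5.0) - 1.0) / 2.0
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    while abs(b - a) > tol:
        if phi(c) < phi(d):
            b, d = d, c
            c = b - ratio * (b - a)
        else:
            a, c = c, d
            d = a + ratio * (b - a)
    return 0.5 * (a + b)

def steepest_descent(U, grad_U, x0, max_iter=500, gtol=1e-8):
    """Iterate x^{k+1} = x^k + alpha^k d^k with d^k = -grad U(x^k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_U(x)
        if np.linalg.norm(g) < gtol:      # (near-)stationary point reached
            break
        d = -g                            # steepest descent direction
        alpha = golden_section(lambda s: U(x + s * d))   # univariate search for the step size
        x = x + alpha * d
    return x

# Hypothetical quadratic objective, used only to exercise the algorithm.
A = np.array([[4.0, 1.0], [1.0, 2.0]])
U = lambda x: 0.5 * x @ A @ x
grad_U = lambda x: A @ x
print(steepest_descent(U, grad_U, x0=[3.0, -2.0]))   # approaches the minimum at [0, 0]
```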


Conjugate Gradient Method

The Conjugate Gradient Method improves the convergence rate of the Steepest Descent Method, by choosing directions of descent that are a linear combination of the gradient direction with directions of descent of previous iterations (Alifanov, 1994; Alifanov et al, 1995; Ozisik and Orlande, 2000; Daniel, 1971; Jaluria, 1998; Stoecker, 1989; Belegundu and Chandrupatla, 1999; Fletcher, 2000; Powell, 1977; Fletcher and Reeves, 1964; Hestenes and Stiefel, 1952; Polak, 1971, Beale, 1972; Davidon, 1959). In a general form, the direction of descent is given by:

where γk and ψk are the conjugation coefficients. This direction of descent is used in the iterative procedure specified by Eq. (8).
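Equation (10) is not reproduced above; a general direction of descent consistent with the text and with the Powell-Beale algorithm described below, given here only as a sketch, is

\[
\mathbf{d}^{k} = -\nabla U(\mathbf{x}^{k}) + \gamma^{k} \mathbf{d}^{k-1} + \psi^{k} \mathbf{d}^{q}
\]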

The superscript q in Eq. (10) denotes the iteration number where a restarting strategy is applied to the iterative procedure of the conjugate gradient method. Restarting strategies were suggested for the conjugate gradient method of parameter estimation in order to improve its convergence rate (Powell, 1977).

Different versions of the Conjugate Gradient Method can be found in the literature depending on the form used for the computation of the direction of descent given by Eq. (10) (Alifanov, 1994; Alifanov et al, 1995; Ozisik and Orlande, 2000; Daniel, 1971; Jaluria, 1998; Stoecker, 1989; Belegundu and Chandrupatla, 1999; Fletcher, 2000; Powell, 1977; Fletcher and Reeves, 1964; Hestenes and Stiefel, 1952; Polak, 1971; Beale, 1972; Davidon, 1959). In the Fletcher-Reeves version, the conjugation coefficients γk and ψk are obtained from the following expressions:

where || . || denotes the Euclidean norm in the vector space.
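The Fletcher-Reeves expressions themselves are not reproduced above; the standard form consistent with this description, given here only as a sketch, is

\[
\gamma^{k} = \frac{\|\nabla U(\mathbf{x}^{k})\|^{2}}{\|\nabla U(\mathbf{x}^{k-1})\|^{2}}, \qquad \psi^{k} = 0 \quad \text{for all } k
\]

with γ0 = 0 for the first iteration.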

In the Polak-Ribiere version of the Conjugate Gradient Method the conjugation coefficients are given by:
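The corresponding expression is not shown above; the standard Polak-Ribiere form, given here only as a sketch with ψk = 0, is

\[
\gamma^{k} = \frac{[\nabla U(\mathbf{x}^{k})]^{T} [\nabla U(\mathbf{x}^{k}) - \nabla U(\mathbf{x}^{k-1})]}{\|\nabla U(\mathbf{x}^{k-1})\|^{2}}, \qquad \psi^{k} = 0
\]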

Based on previous work by Beale (Beale, 1972), Powell (Powell, 1977) suggested the following expressions for the conjugation coefficients, which give the so-called Powell-Beale's version of the Conjugate Gradient Method:

In accordance with Powell (Powell, 1977), the application of the conjugate gradient method with the conjugation coefficients given by Eqs. (13.a,b) requires restarting when gradients at successive iterations tend to be non-orthogonal (which is a measure of the local non-linearity of the problem) and when the direction of descent is not sufficiently downhill. Restarting is performed by making ψk = 0 in Eq. (10).

The non-orthogonality of gradients at successive iterations is tested by the following equation:

where ABS (.) denotes the absolute value.

A non-sufficiently downhill direction of descent (i.e., the angle between the direction of descent and the negative gradient direction is too large) is identified if either of the following inequalities is satisfied

or

We note that the coefficients 0.2, 1.2 and 0.8 appearing in Eqs. (14.a-c) are empirical and are the same used by Powell (Powell, 1977).

In Powell-Beale's version of the conjugate gradient method, the direction of descent given by Eq. (10) is computed in accordance with the following algorithm for k > 1 (Powell, 1977):

STEP 1: Test the inequality (14.a). If it is true set q = k-1.

STEP 2: Compute γk with Eq. (13.a).

STEP 3: If k = q+1, set ψk = 0. If k ≠ q+1, compute ψk with Eq. (13.b).

STEP 4: Compute the search direction dk with Eq. (10).

STEP 5: If k ≠ q+1, test the inequalities (14.b,c). If either one of them is satisfied, set q = k-1 and ψk = 0. Then recompute the search direction with Eq. (10).

The Steepest Descent Method, with the direction of descent given by the negative gradient equation, would be recovered with γk = ψk = 0 for any k in Eq. (10). The same procedures used for the evaluation of the search step size in the Steepest Descent Method are employed for the Conjugate Gradient Method.

Figure 2 shows the iterative procedure for the Fletcher-Reeves version of the Conjugate Gradient Method.


Newton-Raphson Method

While the Steepest Descent and the Conjugate Gradient Methods use gradients of the objective function in their iterative procedures, the Newton-Raphson Method also uses information about the second derivatives of the objective function in order to achieve a faster convergence rate (which does not necessarily mean a lower computational cost).

Let us consider an objective function U(x), which is, at least, twice differentiable. The Taylor expansion of U(x) around a vector xk, where x – xk = h, is given by (Beck, 1977; Jaluria, 1998; Belegundu and Chandrupatla, 1999; Broyden, 1965; Broyden, 1967; Levenberg, 1944; Marquardt, 1963; Bard, 1974; Dennis and Schnabel, 1983):

where ∇U(x) is the gradient (vector of first-order derivatives) of the objective function and D²U(x) is the Hessian (matrix of second-order derivatives) of the objective function.

If the objective function U(x) is twice differentiable, the Hessian is symmetric, and we can write

A necessary condition for the minimization of the objective function is that its gradient vanishes. Therefore, from Eq. (16) we have:

and the vector that locally optimizes the function U(x) is given by

Therefore, the iterative procedure of the Newton-Raphson method, given by Eq. (18), can be written in the same form as Eq. (8) by setting αk = 1 and

However, the Newton-Raphson method converges to the extreme point closest to the initial guess, regardless of whether it is a maximum, a minimum or a saddle point. For this reason it is common to introduce a search step size to be taken along the direction of descent for this method, so that it is written as:
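The resulting expression is not reproduced above; a sketch of the iteration consistent with Eq. (8) and with the development above is

\[
\mathbf{x}^{k+1} = \mathbf{x}^{k} + \alpha^{k} \mathbf{d}^{k}, \qquad \mathbf{d}^{k} = -[D^{2}U(\mathbf{x}^{k})]^{-1} \nabla U(\mathbf{x}^{k})
\]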

Figure 3 shows the iterative procedure for the Newton-Raphson Method.


Although the convergence rate of the Newton-Raphson method is quadratic, the calculation of the Hessian is computationally expensive. As a result, other methods that approximate the Hessian with simpler and less expensive forms have been developed. Some of these methods are described next.

Quasi-Newton Methods

In these types of methods, the Hessian matrix appearing in the Newton-Raphson's Method is approximated in such a way that it does not involve second derivatives. Usually, the approximations for the Hessian are based on first derivatives. As a result, the Quasi-Newton Methods have a slower convergence rate than the Newton-Raphson Method, but they are computationally faster (Beck, 1977; Jaluria, 1998; Belegundu and Chandrupatla, 1999; Broyden, 1965; Broyden, 1967; Levenberg, 1944; Marquardt, 1963; Bard, 1974; Dennis and Schnabel, 1983).

Let us define a new matrix H, which is an approximation for the inverse of the Hessian, that is,

The direction of descent for the Quasi-Newton methods is thus given by:

and the matrix H is iteratively calculated as

where I is the identity matrix. Note that, for the first iteration, the method starts as the Steepest Descent Method.

Different Quasi-Newton methods can be found in the literature depending on the choice of the matrices M and N. For the Davidon-Fletcher-Powell (DFP) Method (Davidon, 1959; Fletcher and Powell, 1963), such matrices are given by

where

Note that, since the matrix H is iteratively calculated, some errors can be propagated and, in general, the method needs to be restarted after a certain number of iterations. Also, since the matrix M depends on the choice of the search step size α, the method is very sensitive to this choice. A variation of the DFP method is the Broyden-Fletcher-Goldfarb-Shanno (BFGS) Method (Davidon, 1959; Fletcher and Powell, 1963; Broyden, 1965; Broyden, 1967), which is less sensitive to the choice of the search step size. For this method, the matrices M and N are calculated as

Figure 4 shows the iterative procedure for the BFGS Method.
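In practice, quasi-Newton updates of the BFGS type are available in standard numerical libraries; the short sketch below uses SciPy's general-purpose minimizer on a hypothetical test function (the function, gradient and starting point are illustrative assumptions, not part of the original paper).

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical objective: the Rosenbrock function, a common minimization test case.
def U(x):
    return (1.0 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2

def grad_U(x):
    return np.array([
        -2.0 * (1.0 - x[0]) - 400.0 * x[0] * (x[1] - x[0]**2),
        200.0 * (x[1] - x[0]**2),
    ])

# BFGS builds an approximation to the inverse of the Hessian using gradient information only.
result = minimize(U, x0=np.array([-1.2, 1.0]), jac=grad_U, method="BFGS")
print(result.x)   # expected to approach the minimum at (1, 1)
```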


Levenberg-Marquardt Method

The Levenberg-Marquardt Method was first derived by Levenberg (Levenberg, 1944) in 1944, by modifying the ordinary least squares norm. Later, in 1963, Marquardt (Marquardt, 1963) derived basically the same technique by using a different approach. Marquardt's intention was to obtain a method that would tend to the Gauss method in the neighborhood of the minimum of the ordinary least squares norm, and would tend to the steepest descent method in the neighborhood of the initial guess used for the iterative procedure. This method actually converts a matrix that approximates the Hessian into a positive definite one, so that the direction of descent is acceptable.

The method rests on the observation that if P is a positive definite matrix, then A + λP is positive definite for sufficiently large λ. If A is an approximation for the Hessian, we can choose P as a diagonal matrix whose elements coincide with the absolute values of the diagonal elements of A (Bard, 1974).

The direction of descent for the Levenberg-Marquardt method is given by (Bard, 1974):

and the step size is taken as αk = 1. Note that for large values of λk a small step is taken along the negative gradient direction. On the other hand, as λk tends to zero, the Levenberg-Marquardt method tends to an approximation of Newton's method based on the matrix A. Usually, the matrix A is taken as that for the Gauss method (Beck, 1977; Ozisik and Orlande, 2000; Bard, 1974).
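The expression for the direction of descent is not reproduced above; a sketch consistent with the matrices A and P defined in the preceding paragraph is

\[
\mathbf{d}^{k} = -[\mathbf{A}^{k} + \lambda^{k} \mathbf{P}^{k}]^{-1} \nabla U(\mathbf{x}^{k})
\]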

Evolutionary and Stochastic Methods

In this section some evolutionary and stochastic methods, such as Genetic Algorithms, Differential Evolution, Particle Swarm and Simulated Annealing, will be discussed. Evolutionary methods, in contrast to deterministic methods, do not in general rely on a strong mathematical basis and do not make use of the gradient of the objective function as a direction of descent. They tend to mimic nature in order to find the minimum of the objective function, by selecting, in an organized fashion, the points where such a function is going to be computed.

Genetic Algorithms

Genetic algorithms are heuristic global optimization methods that are based on the process of natural selection. Starting from a randomly generated population of designs, the optimizer seeks to produce improved designs from one generation to the next. This is accomplished by exchanging genetic information between designs in the current population, in what is referred to as the crossover operation. Hopefully this crossover produces improved designs, which are then used to populate the next generation (Goldberg, 1989; Deb, 2002).

The basic genetic algorithm works with a collection or population of potential solutions to the minimization problem. The algorithm works in an iterative manner. At each iteration, also called a generation, three operators are applied to the entire population of designs. These operators are selection, crossover, and mutation. For the operators to be effective, each potential solution or design must be represented as a collection of finite parameters, also called genes. Each design must have a unique sequence of these parameters that defines it. This collection of genes is often called the chromosome. The genes themselves are often encoded as binary strings, though they can also be represented as real numbers. The length of the binary string determines how precisely the value, also known as the allele of the gene, is represented.

The genetic algorithm applied to an optimization problem proceeds as follows. The process begins with an initial population of random designs. Each gene is generated by randomly generating 0's and 1's. The chromosome strings are then formed by combining the genes together. This chromosome defines the design. The objective function is evaluated for each design in the population. Each design is assigned a fitness value, which corresponds to the value of the objective function for that design. In the case of minimization, a higher fitness is assigned to designs with lower values of the objective function.

Next, the population members are selected for reproduction, based upon their fitness. The selection operator is applied to each member of the population. The selection operator chooses pairs of individuals from the population who will mate and produce offspring. In the tournament selection scheme, random pairs are selected from the population and the individual with the higher fitness of each pair is allowed to mate.

Once a mating pair is selected, the crossover operator is applied. The crossover operator essentially produces new designs or offspring by combining the genes from the parent designs in a stochastic manner. In the uniform crossover scheme, it is possible to obtain any combination of the two parents' chromosomes. Each bit in each gene in the chromosome is assigned a probability that crossover will occur (for example, 50% for all genes). A random number between 0 and 1 is generated for each bit in each gene. If a number greater than 0.5 is generated, then that bit is replaced by the corresponding bit in the gene from the other parent. If it is less than 0.5, the original bit in the gene remains unchanged. This process is repeated for the entire chromosome for each of the parents. When complete, two offspring are generated, which may replace the parents in the population.

The mutation process follows next. When the crossover procedure is complete and a new population is formed, the mutation operator is applied. Each bit in each gene in the design is subjected to a chance for a change from 0 to 1, or vice versa. The chance is known as the mutation probability, which is usually small. This introduces additional randomness into the process, which helps to avoid local minima. Completion of the mutation process signals the end of a design cycle. Many cycles may be needed before the method converges to an optimum design.

For more details or for the numerical implementation of Genetic Algorithms, the reader is referred to (Goldberg, 1989; Deb, 2002).

Differential Evolution

The Differential Evolution Method is an evolutionary method based on Darwin's theory of the evolution of the species. It was created in 1995 by two researchers from Berkeley (Kenneth Price and Rainer Storn) (Storn and Price, 1996) as an alternative to the Genetic Algorithm Method. Following the theory of the evolution of the species, the strongest members of a population will be more capable of surviving under certain environmental conditions. During the mating process, the chromosomes of two individuals of the population are combined in a process called crossover. During this process mutations can occur, which can be good (an individual with a better objective function) or bad (an individual with a worse objective function). The mutations are used as a way to escape from local minima. However, their excessive usage can prevent convergence of the method.

The method starts with a randomly generated population in the domain of interest. Then, successive combinations of chromosomes and mutations are performed, creating new generations until an optimum value is found.

The iterative process of the Differential Evolution Method is given by:

where

xi is the i-th individual vector of parameters.

α, β and γ are three members of the population matrix P, randomly chosen.

F is a weighting factor that defines the mutation amplitude (0.5 < F < 1).

k is the number of generations.

δ1 and δ2 are Dirac delta functions that define the mutation.

In the minimization process, if U(xk+1) < U(xk), then xk+1 replaces xk in the population matrix P. Otherwise, xk is kept in the population matrix.

The binomial crossover is given as

where CR is a factor that defines the crossover (0.5 < CR < 1) and R is a random number with uniform distribution between 0 and 1.
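The mutation and crossover expressions are not reproduced above; a common way of writing the combined operation for the j-th component of the i-th individual, given here only as a standard sketch and not necessarily the exact form omitted above, is

\[
x_{i,j}^{k+1} =
\begin{cases}
\alpha_{j} + F\,(\beta_{j} - \gamma_{j}), & \text{if } R_{j} \leq CR \\
x_{i,j}^{k}, & \text{otherwise}
\end{cases}
\]

where α, β and γ are the three randomly chosen members of the population and Rj is a uniformly distributed random number in [0, 1] drawn for each component.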

Figure 5 shows the iterative procedure for the Differential Evolution Method.


Particle Swarm

Another evolutionary method is the one that uses the concepts of particle swarms. Such a method was created in 1995 by an electrical engineer (Russell Eberhart) and a social psychologist (James Kennedy) (Kennedy and Eberhart, 1995; Kennedy, 1999; Naka et al, 2001; Eberhart et al, 2001) as an alternative to the Genetic Algorithm Method. This method is based on the social behavior of various species and tries to balance the individuality and sociability of the individuals in order to locate the optimum of interest. The original idea of Kennedy and Eberhart came from the observation of birds looking for nesting places. When the individuality is increased, the search for alternative places for nesting is also increased. However, if the individuality becomes too high, the individual might never find the best place. On the other hand, when the sociability is increased, the individual learns more from its neighbors' experience. However, if the sociability becomes too high, all the individuals might converge to the first minimum found, which is possibly a local minimum.

In this method, the iterative procedure is given by:

where:

xi is the i-th individual vector of parameters.

vi = 0, for k=0.

r1i and r2i are random numbers with uniform distribution between 0 and 1.

pi is the best value found for the vector xi.

pg is the best value found for the entire population.

0 < α < 1; 1 < β < 2

In Eq. (29.b), the second term on the right hand side represents the individuality and the third term the sociability. The first term on the right-hand side represents the inertia of the particles and, in general, must be decreased as the iterative process runs. In this equation, the vector pi represents the best value ever found for the i-th component vector of parameters xi during the iterative process. Thus, the individuality term involves the comparison between the current value of the i-th individual xi with its best value in the past. The vector pg is the best value ever found for the entire population of parameters (not only the i-th individual). Thus the sociability term compares xi with the best value of the entire population in the past.

Figure 6 shows the iterative procedure for the Particle Swarm Method.
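A minimal Python sketch of the particle swarm updates described above is given below, paralleling the procedure of Fig. 6; the population size, the coefficients α and β, the bounds and the multimodal test function are assumptions made only for illustration.

```python
import numpy as np

def particle_swarm(U, lower, upper, n_particles=30, n_iter=200, alpha=0.7, beta=1.5, seed=0):
    """Minimize U with the updates described in the text:
    v <- alpha*v + beta*r1*(p_i - x) + beta*r2*(p_g - x) and x <- x + v."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    x = rng.uniform(lower, upper, size=(n_particles, lower.size))  # random initial population
    v = np.zeros_like(x)                                           # v_i = 0 for k = 0
    p = x.copy()                                                   # best position of each particle
    p_val = np.array([U(xi) for xi in x])
    g = p[np.argmin(p_val)]                                        # best position of the whole swarm
    for _ in range(n_iter):
        r1 = rng.random((n_particles, 1))                          # weight of the individuality term
        r2 = rng.random((n_particles, 1))                          # weight of the sociability term
        v = alpha * v + beta * r1 * (p - x) + beta * r2 * (g - x)
        x = np.clip(x + v, lower, upper)
        val = np.array([U(xi) for xi in x])
        better = val < p_val                                       # update individual best positions
        p[better], p_val[better] = x[better], val[better]
        g = p[np.argmin(p_val)]                                    # update swarm best position
    return g

# Hypothetical multimodal test function (Rastrigin), used only for demonstration.
U = lambda x: 10.0 * x.size + np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x))
print(particle_swarm(U, lower=[-5.12, -5.12], upper=[5.12, 5.12]))
```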


Simulated Annealing

This method is based on the thermodynamics of the cooling of a material from a liquid to a solid phase (Corana et al, 1987; Goffe et al, 1994). If a liquid material (e.g., a liquid metal) is slowly cooled down and left for a sufficiently long time close to the phase change temperature, a perfect crystal will be created, which has the lowest internal energy state. On the other hand, if the liquid material is not left for a sufficiently long time close to the phase change temperature, or if the cooling process is not sufficiently slow, the final crystal will have several defects and a high internal energy state. This is similar to the quenching process used in metallurgical applications.

We can say that gradient-based methods "cool down too fast", going rapidly to an optimum location which, in most cases, is not the global one but a local one. Differently from the gradient-based methods, the Simulated Annealing Method can move in any direction, escaping from possible local minimum or maximum values. Consider, for example, the Boltzmann probability function

This equation expresses the idea that a system in thermal equilibrium has its energy distributed probabilistically among different energy states E. In this equation, K is the Boltzmann constant. Such an equation tells us that, even at low temperatures, there is a chance, even if small, that the system is at a high energy level, as illustrated in Fig. 7. Thus, there is a chance for the system to get out of a local minimum and search for the global one.
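The Boltzmann probability function referred to above can be written, in a standard sketch form, as

\[
P(E) \propto \exp\!\left(-\frac{E}{K\,T}\right)
\]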


Figure 8 shows the iterative procedure for the Simulated Annealing Method. The procedure starts by generating a population of individuals of the same size as the number of variables (n = m), in such a way that the population matrix is a square matrix. Then, the initial temperature (T), the reducing ratio (RT), the number of cycles (Ns) and the number of iterations of the annealing process (Nit) are selected. After Ns·n function evaluations, each element of the step length V is adjusted, so that approximately half of all function evaluations are accepted. The suggested value for the number of cycles is 20. After Nit·Ns·n function evaluations, the temperature (T) is changed by the factor RT. The value suggested for the number of iterations by Corana et al (Corana et al, 1987) is MAX(100, 5n).


The iterative process is given by the following equation:

where R is a random number with uniform distribution between 0 and 1 and V is a step size that is continuously adjusted.

To start, the algorithm randomly chooses a trial point within the step length V (a vector of length N) of the user-selected starting point. The function is evaluated at this trial point (xi1) and its value is compared to its value at the initial point (xi0). In a minimization problem, all downhill moves are accepted and the algorithm continues from that trial point. Uphill moves may be accepted; the decision is made by the Metropolis criterion. It uses T (temperature) and the size of the uphill move in a probabilistic manner

The smaller T and the size of the uphill move are, the more likely it is that the move will be accepted. If the trial point is accepted, the algorithm moves on from that point. If it is rejected, another point is chosen for a trial evaluation.
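The Metropolis acceptance test referred to above can be sketched, for an uphill move that increases the objective function by ΔU > 0, as accepting the move whenever

\[
\exp\!\left(-\frac{\Delta U}{T}\right) > R'
\]

where R' is a random number with uniform distribution between 0 and 1.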

Each element of V is periodically adjusted, so that half of all function evaluations in that direction are accepted. The number of accepted function evaluations is represented by the variable Ni. Thus the variable r represents the ratio of accepted over total function evaluations for an entire cycle Ns and it is used to adjust the step length V.

A decrease in T is imposed upon the system with the RT variable by using

where i denotes the i-th iteration. Thus, as T declines, uphill moves are less likely to be accepted and the percentage of rejections rises. Given the scheme for the selection of V, V falls. Thus, as T declines, V falls and the algorithm focuses upon the most promising area for optimization.
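The temperature reduction rule referred to above can be sketched as

\[
T_{i+1} = R_T \; T_{i}
\]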

The parameter T is crucial for the successful use of the algorithm. It influences V, the step length over which the algorithm searches for the minimum. For a small initial T, the step length may be too small; thus not enough values of the function will be evaluated to find the global minimum. To determine a starting temperature that is consistent with optimizing a function, it is worthwhile to perform a trial run first. The user should set RT = 1.5 and T = 1.0. With RT > 1.0, the temperature increases and V rises as well. Then a value of T that produces a sufficiently large V should be selected.

Hybrid Methods

The so-called hybrid methods are a combination of deterministic and evolutionary/stochastic methods, in such a way that the advantages of each are exploited. Hybrid methods usually employ an evolutionary/stochastic method to locate the region where the global minimum lies and then switch to a deterministic method in order to approach it more rapidly.

As an example, consider the hybrid method illustrated in Fig. 9. The main module is the Particle Swarm Method, which does almost the entire optimization task. When some percentage of the particles find a minimum (let us say, some birds have already found their best nesting place), the algorithm switches to the Differential Evolution Method and the particles (birds) are forced to breed. If there is an improvement of the objective function, the algorithm returns to the Particle Swarm Method, meaning that some other region is more likely to contain a global minimum. If there is no improvement of the objective function, this can indicate that this region already contains the expected global minimum, and the algorithm switches to the BFGS Method in order to locate it more precisely. Finally, the algorithm returns again to the Particle Swarm Method in order to check if there are any changes in the minimum location, and the entire procedure is repeated for a few more iterations (e.g., 5).


More involved Hybrid Methods, dealing with the application of other deterministic and stochastic methods, can be found in references (Colaço et al, 2003a; Colaço et al, 2004; Colaço et al, 2003b; Dulikravich et al, 2003a; Dulikravich et al, 2003b; Dulikravich et al, 2003c; Dulikravich et al, 2004; Colaço et al, 2003c).

Function Estimation Approach

The methods described above were applied to the minimization of an objective function U(x), where x = [x1, x2, ..., xN] is the vector of parameters of the problem under consideration that can be modified in order to find the minimum of U(x). Therefore, the minimization is performed in a parameter space of dimension N. Several optimization or inverse problems rely on functions, instead of parameters, which need to be selected for the minimization of an objective function. In these cases, the minimization needs to be performed in an infinite dimensional space of functions and no a priori assumption is required regarding the functional form of the unknown functions, except for the functional space that they belong to (Alifanov, 1994; Alifanov et al, 1995; Ozisik and Orlande, 2000). A common selection is the Hilbert space of square integrable functions in the domain of interest.

In this paper we use the conjugate gradient method with adjoint problem formulation for the minimization of the objective function. We illustrate here the function estimation approach as applied to the solution of a specific inverse problem involving the following diffusion equation (Colaço et al, 2003c):

where r* denotes the vector of coordinates and the superscript * denotes dimensional quantities.

Equation (34) can be used for the modeling of several physical phenomena, such as heat conduction, groundwater flow and tomography. We focus our attention here on a one-dimensional version of Eq. (34), written in dimensionless form as

and subject to the following boundary and initial conditions.

Notice that in the direct problem, the diffusion coefficient function D(x) and the source term distribution function µ(x) are regarded as known quantities, so that a direct (analysis) problem is concerned with the computation of T(x,t).

Inverse Problem

For the inverse problem of interest here, the functions D(x) and µ(x) are regarded as unknown. Such functions will be simultaneously estimated by using measurements of T(x,t) taken at appropriate locations in the medium or on its boundaries. Such measurements may contain random errors. These measurement errors are assumed to be uncorrelated, additive, normally distributed, with zero mean, and with a known constant standard deviation.

Practical applications of this inverse problem include the identification of non-homogeneities in the medium, such as inclusions, obstacles or cracks, the determination of thermal diffusion coefficients and of the distribution of heat sources, as well as groundwater flow and tomography problems in which both D(x) and µ(x) vary.

The basic steps of the conjugate gradient method of function estimation are discussed below.

Conjugate Gradient Method with Adjoint Problem

For the simultaneous estimation of the functions D(x) and µ(x) with the conjugate gradient method with adjoint problem, we consider the minimization of the following objective functional

where Ym(t) are the transient measurements of T(x,t) taken at the positions xm, m = 1, ..., M. The estimated dependent variable T[xm,t;D(x),µ(x)] is obtained from the solution of the direct problem (35.a-d) at the measurement positions xm, m = 1, ..., M, with estimates for D(x) and µ(x). When dealing with an optimization problem, instead of an inverse problem, Ym(t) represents the desired temperature at the positions xm, m = 1, ..., M. We note in Eq. (36) that, for simplicity in the analysis developed below, the measurements Ym(t) are assumed to be continuous in the time domain.
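Equation (36) is not reproduced above; a standard form of the objective functional consistent with this description, given here only as a sketch (the factor 1/2 and the final time tf are assumptions), is

\[
U[D(x), \mu(x)] = \frac{1}{2} \sum_{m=1}^{M} \int_{t=0}^{t_f} \left\{ T[x_m, t; D(x), \mu(x)] - Y_m(t) \right\}^{2} dt
\]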

The use of the conjugate gradient method for the minimization of the objective functional (36) requires the solution of auxiliary problems, known as sensitivity and adjoint problems.

The sensitivity function, solution of the sensitivity problem, is defined as the directional derivative of T(x,t) in the direction of the perturbation of the unknown function (Alifanov, 1994; Alifanov et al, 1995; Ozisik and Orlande, 2000). Since the present problem involves two unknown functions, two sensitivity problems are required for the estimation procedure, resulting from perturbations in D(x) and µ(x).

The sensitivity problem for ΔTD(x,t) is obtained by assuming that the dependent variable T(x,t) is perturbed by εΔTD(x,t) when the diffusion coefficient D(x) is perturbed by εΔD(x), where ε is a real number. The sensitivity problem resulting from perturbations in D(x) is then obtained by applying the following limiting process:

where Lε(Dε) and L(D) are the direct problem formulations written in operator form for perturbed and unperturbed quantities, respectively. The application of the limiting process given by Eq. (37) results in the following sensitivity problem:

A limiting process analogous to that given by Eq. (37), obtained from the perturbation εΔµ(x), results in the following sensitivity problem for ΔTµ(x,t):

A Lagrange multiplier λ(x,t) is utilized in the minimization of the functional (36) because the estimated dependent variable T[xm,t;D(x),µ(x)] appearing in such a functional needs to satisfy a constraint, which is the solution of the direct problem. Such a Lagrange multiplier, needed for the computation of the gradient equations (as will be apparent below), is obtained through the solution of problems adjoint to the sensitivity problems given by Eqs. (38.a-d) and (39.a-d) (Alifanov, 1994). Despite the fact that the present inverse problem involves the estimation of two unknown functions, thus resulting in two sensitivity problems as discussed above, one single problem, adjoint to problems (38.a-d) and (39.a-d), is obtained.

In order to derive the adjoint problem, the governing equation of the direct problem, Eq. (35.a), is multiplied by the Lagrange multiplier λ(x,t), integrated in the space and time domains of interest and added to the original functional (36). The following extended functional is obtained:

where δ is the Dirac delta function.

Directional derivatives of U[D(x),µ(x)] in the directions of perturbations in D(x) and µ(x) are respectively defined by

where U[Dε,µ] and U[D,µε] denote the extended functional (40) written for perturbed D(x) and µ(x), respectively.

After letting the above directional derivatives of U[D(x),µ(x)] go to zero, which is a necessary condition for the minimization of the extended functional (40), and after performing some lengthy but straightforward manipulations (Alifanov, 1994; Alifanov et al, 1995; Ozisik and Orlande, 2000), the following adjoint problem for the Lagrange multiplier λ(x,t) is obtained:

During the limiting processes used to obtain the adjoint problem, applied to the directional derivatives of U[D(x),µ(x)] in the directions of perturbations in D(x) and µ(x), the following integral terms are respectively obtained:

By invoking the hypotheses that D(x) and µ(x) belong to the Hilbert space of square integrable functions in the domain 0 < x < 1, it is possible to write (Alifanov, 1994; Alifanov et al, 1995; Ozisik and Orlande, 2000):

Hence, by comparing Eqs. (43.a,b) and (44.a,b) we obtain the gradient components of U[D,µ] with respect to D(x) and µ(x), respectively, as

An analysis of Eqs. (45.a) and (42.b,c) reveals that the gradient component with respect to D(x) is null at x = 0 and at x = 1. As a result, the initial guess used for D(x) is never changed by the iterative procedure of the conjugate gradient method at such points, which can create instabilities in the inverse problem solution in their neighborhoods.

For the simultaneous estimation of D(x) and µ(x), the iterative procedure of the conjugate gradient method is written respectively as (Alifanov, 1994; Alifanov et al, 1995; Ozisik and Orlande, 2000)

where d_D^k(x) and d_µ^k(x) are the directions of descent for D(x) and µ(x), respectively; α_D^k and α_µ^k are the search step sizes for D(x) and µ(x), respectively; and k is the number of iterations.
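These updates take the standard form used in function estimation (Alifanov, 1994; Ozisik and Orlande, 2000), which Eqs. (46.a,b) presumably follow:

D^{k+1}(x) \;=\; D^{k}(x) \;-\; \alpha_D^{k}\, d_D^{k}(x), \qquad \mu^{k+1}(x) \;=\; \mu^{k}(x) \;-\; \alpha_\mu^{k}\, d_\mu^{k}(x)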

For the iterative procedure for each unknown function, the direction of descent is obtained as a linear combination of the gradient direction with directions of descent of previous iterations. The directions of descent for the conjugate gradient method for D(x) and µ(x) can be written respectively as

where γ_D^k, γ_µ^k, ψ_D^k and ψ_µ^k are the conjugation coefficients. The superscripts q_D and q_µ in Eqs. (47.a,b) represent the iteration numbers at which a restarting strategy is applied to the iterative procedure for the estimation of D(x) and µ(x), respectively (Powell, 1977).

Different versions of the conjugate gradient method can be found in the literature, depending on how the conjugation coefficients are computed. In this work we use the so-called Powell-Beale's version of the conjugate gradient method because of its superior robustness and convergence rate in non-linear problems (Colaço et al, 2003c). The conjugation coefficients for this version of the conjugate gradient method are given by:

where γ_D^k = γ_µ^k = ψ_D^k = ψ_µ^k = 0 for k = 0.

Powell-Beale's version of the conjugate gradient method is restarted by making the conjugation coefficient ψ_D^k = 0 (or ψ_µ^k = 0) if gradients at successive iterations are too far from being orthogonal (which is a measure of the nonlinearity of the problem) or if the direction of descent is not sufficiently downhill. For further details, the reader is referred to (Colaço et al, 2003c).
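The restart logic can be summarized by the following schematic Python sketch, in which the unknown functions and gradients are represented by vectors of nodal values, so that the inner products of the continuous formulation reduce to dot products. The function name, the arguments d_restart and y_restart (the direction stored at the last restart and the gradient change across the following step) and the 0.2/0.8/1.2 thresholds follow common implementations of Powell (1977); they are choices of this sketch, not notation taken from the paper. The sign convention matches the text: the direction d is built from the gradient itself and the update subtracts a positive step along d, so that -d is the downhill direction.

import numpy as np

def powell_beale_direction(g, g_prev, d_prev, d_restart, y_restart):
    """One Powell-Beale direction update for a discretized gradient g.

    Returns (d, d_restart, y_restart). A restart stores d_restart = g and
    clears y_restart, which is then recorded on the following iteration.
    """
    gg = np.dot(g, g)
    # Restart if gradients at successive iterations are far from orthogonal
    # (also used for the very first iteration).
    if g_prev is None or abs(np.dot(g, g_prev)) >= 0.2 * gg:
        return g, g, None

    dg = g - g_prev
    gamma = np.dot(g, dg) / np.dot(d_prev, dg)      # conjugation coefficient (gamma^k)
    if y_restart is None:
        # First iteration after a restart: record the gradient change across that step.
        y_restart = dg
        d = g + gamma * d_prev
    else:
        psi = np.dot(g, y_restart) / np.dot(d_restart, y_restart)   # coefficient psi^k
        d = g + gamma * d_prev + psi * d_restart

    # Restart as well if -d is not sufficiently downhill.
    if not (0.8 * gg <= np.dot(d, g) <= 1.2 * gg):
        return g, g, None
    return d, d_restart, y_restart

A restart simply falls back to the gradient direction, restoring steepest-descent behavior whenever conjugacy is lost.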

The search step sizes α_D^k and α_µ^k, appearing in the expressions of the iterative procedures for the estimation of D(x) and µ(x), Eqs. (46.a,b), respectively, are obtained by minimizing the objective functional at each iteration along the specified directions of descent. If the objective functional given by Eq. (36) is linearized with respect to α_D^k and α_µ^k, closed form expressions can be obtained for such quantities as follows (Alifanov, 1994; Alifanov et al, 1995; Ozisik and Orlande, 2000):

where

In Eqs. (50.a-e), ΔT_D^k(x,t) and ΔT_µ^k(x,t) are the solutions of the sensitivity problems given by Eqs. (38.a-d) and (39.a-d), respectively, obtained by setting ΔD^k(x) = d_D^k(x) and Δµ^k(x) = d_µ^k(x).
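If the functional has the integral least-squares form used throughout this section, the linearization T^{k+1} ≈ T^k - α_D^k ΔT_D^k - α_µ^k ΔT_µ^k, followed by setting the derivatives of the functional with respect to the two step sizes to zero, leads to a 2 x 2 linear system of the generic form below, whose five distinct integrals are presumably those denoted by Eqs. (50.a-e):

\alpha_D^k \int_0^{t_f} \sum_m (\Delta T_D^k)^2\, dt \;+\; \alpha_\mu^k \int_0^{t_f} \sum_m \Delta T_D^k\, \Delta T_\mu^k\, dt \;=\; \int_0^{t_f} \sum_m (T^k - Y_m)\, \Delta T_D^k\, dt

\alpha_D^k \int_0^{t_f} \sum_m \Delta T_D^k\, \Delta T_\mu^k\, dt \;+\; \alpha_\mu^k \int_0^{t_f} \sum_m (\Delta T_\mu^k)^2\, dt \;=\; \int_0^{t_f} \sum_m (T^k - Y_m)\, \Delta T_\mu^k\, dt

with all quantities evaluated at the sensor locations x_m.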

The use of the conjugate gradient method for the simultaneous estimation of D(x) and µ(x) can be suitably arranged in a systematic and straightforward computational procedure, which is omitted here for the sake of brevity, but can be readily adapted from those found in reference (Ozisik and Orlande, 2000). The conjugate gradient method of function estimation belongs to the class of iterative regularization methods (Alifanov, 1994). For this class of methods, the stopping criterion for the computational procedure is chosen so that sufficiently accurate and smooth solutions are obtained for the unknown functions. Although different approaches can be used for the specification of the tolerance for the stopping criterion, we use in this work the discrepancy principle (Alifanov, 1994).
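For reference, one possible arrangement is sketched below in Python. The names solve_direct, compute_functional, compute_gradients, compute_step_sizes and direction_update are hypothetical placeholders introduced only for this sketch; they stand for user-supplied routines wrapping the direct, adjoint and sensitivity solvers formulated above, and direction_update may be, for instance, the Powell-Beale sketch given earlier. The tolerance tol is the discrepancy-principle tolerance discussed next.

def estimate_D_and_mu(D, mu, Y, tol, max_iters, solve_direct,
                      compute_functional, compute_gradients,
                      compute_step_sizes, direction_update):
    """Schematic arrangement of the conjugate gradient function-estimation loop.

    D, mu : initial guesses for the unknown functions (1-D numpy arrays of nodal values)
    Y     : measured data
    tol   : discrepancy-principle stopping tolerance
    """
    gD_prev = gM_prev = dD = dM = None
    rD, rM = (None, None), (None, None)                  # restart bookkeeping per unknown
    for _ in range(max_iters):
        T = solve_direct(D, mu)                          # direct problem
        if compute_functional(T, Y) <= tol:              # discrepancy-principle stopping test
            break
        gD, gM = compute_gradients(T, Y, D, mu)          # adjoint problem -> gradient components
        dD, *rD = direction_update(gD, gD_prev, dD, *rD) # directions of descent (Powell-Beale)
        dM, *rM = direction_update(gM, gM_prev, dM, *rM)
        aD, aM = compute_step_sizes(T, Y, D, mu, dD, dM) # sensitivity problems -> step sizes
        D, mu = D - aD * dD, mu - aM * dM                # simultaneous updates of D(x) and mu(x)
        gD_prev, gM_prev = gD, gM
    return D, mu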

With the use of the discrepancy principle, the iterative procedure of the conjugate gradient method is stopped when the difference between measured and estimated variables is of the order of the standard deviation, σ, of the measurements, that is, when

Therefore, the iterative procedure is stopped when

where the tolerance, φ, is obtained by substituting Eq. (51) into the definition of the functional given by Eq. (36), that is,
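Under the assumption that the functional has the integral least-squares form used above, with M sensors and residuals of the order of σ over the whole time interval 0 < t < t_f, this substitution gives a tolerance of the order of

\varphi \;\approx\; M\, \sigma^{2}\, t_f

(or φ ≈ M I σ², with I measurements per sensor, if the functional is written as a sum over discrete measurement times).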

Applications

We now present some applications of inverse and optimization problems in heat transfer. The first example deals with the simultaneous estimation of the spatial distributions of the diffusion coefficient and of the source term in a 1D diffusion problem. This is the same kind of inverse problem for which the conjugate gradient method of function estimation was derived above. The other four examples of applications deal with convective heat transfer. The first two involve the solution of inverse problems in irregularly shaped channels (forced convection) and cavities (natural convection), respectively, while the other two involve the optimization of externally induced electric or magnetic fields in order to control the solidification characteristics of melts.

Simultaneous Estimation of Spatially-Dependent Diffusion Coefficient and Source Term in a Diffusion Problem (Colaço et al, 2003c)

This work deals with the simultaneous estimation of the spatially varying diffusion coefficient and of the source term distribution in a one-dimensional diffusion problem. The problem formulation is given by Eqs. (35.a-d). This problem can be physically associated with the detection of material non-homogeneities, such as inclusions, obstacles or cracks, with applications in heat conduction, groundwater flow detection and tomography. Two solution techniques are applied to the inverse problem under consideration, namely: the conjugate gradient method with adjoint problem and a hybrid optimization algorithm. For the hybrid optimization algorithm, stabilization of the solution of the inverse problem was obtained with Tikhonov's first-order regularization. Thus, the objective function was rewritten in the form

where α_1 is the first-order Tikhonov regularization parameter.
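A typical discrete form of such a regularized objective, written here for the nodal values D_i and µ_i on the N control volumes of the discretization (the exact form of Eq. (54) may differ in its details), is

S_{\alpha_1}[D,\mu] \;=\; S[D,\mu] \;+\; \alpha_1 \sum_{i=1}^{N-1} \Big[ (D_{i+1}-D_i)^2 + (\mu_{i+1}-\mu_i)^2 \Big]

so that the penalty vanishes for constant functions and grows with their spatial variation.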

The test cases examined below in dimensionless form are physically associated with a heat conduction problem in a homogeneous steel bar of length 0.050 m. The diffusion coefficient and the spatial distribution of the source term are supposed to vary from base values of D(x) = 54 W/m K and µ(x) = 10^5 W/m^3 K, which result in dimensionless base values of D_c = 1 and µ_c = 5, respectively. The base values for the diffusion coefficient and source term distribution are associated with solid-solid phase transformations in steels. The final time is assumed to be 60 seconds, resulting in a dimensionless value of t_f = 0.36. Fifty measurements are supposed to be available per temperature sensor.

Figure 10 shows the results obtained with the conjugate gradient method and with the measurements of two non-intrusive sensors, for a step variation of D(x) and for constant µ(x). The simulated measurements in this case contained random errors with standard deviation σ = 0.01 Y_max, where Y_max is the maximum absolute value of the measured variable. The initial guesses used for the iterative procedure of the conjugate gradient method and for the hybrid optimizer were D(x) = 0.9 and µ(x) = 4.5. We note in Fig. 10 that quite good results were obtained for such a strict test case involving a discontinuous variation for D(x), even with only two non-intrusive sensors, by using the conjugate gradient method of function estimation.


Figure 11 shows the results obtained for the same test case shown in Fig. 10, but with 10 sensors. We note in Fig. 11 that more accurate results are obtained for both D(x) and µ(x) when more sensors are used in the inverse analysis.


Figure 12 shows results similar to those presented in Fig. 10, obtained with the hybrid optimization algorithm by using 2 non-intrusive sensors, with the regularization parameter set to zero. The x-axis in this figure represents the index i of the control volume, so that i = 80 corresponds to x = 1.0. Note that the estimated functions suffer from large oscillations resulting from the ill-posed character of the inverse problem.


In order to find the best value of the first-order Tikhonov regularization parameter α_1, we used the L-curve shown in Fig. 13. In this figure, the x-axis represents the second term appearing on the RHS of Eq. (54) and the y-axis represents the first term appearing on the RHS of Eq. (54). The best choice for α_1 is the one that minimizes both terms represented by the x and y axes. The optimum value for α_1 in this case was 0.0001.
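This selection can be automated as in the minimal sketch below, in which run_hybrid_optimizer and l_curve_choice are hypothetical names: the former stands for a wrapper that minimizes Eq. (54) for a fixed α_1 and returns the residual and penalty terms of that equation, and the corner of the L-curve is located here by a simple maximum-curvature heuristic applied to the log-log curve.

import numpy as np

def l_curve_choice(alphas, run_hybrid_optimizer):
    """Pick the regularization parameter at the corner of the L-curve.

    alphas               : candidate values of alpha_1 (e.g. logarithmically spaced)
    run_hybrid_optimizer : callable alpha -> (residual_term, penalty_term)
    """
    residual, penalty = zip(*(run_hybrid_optimizer(a) for a in alphas))
    x, y = np.log10(residual), np.log10(penalty)
    # Discrete curvature of the parametric log-log curve; endpoints are excluded.
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    curvature = np.abs(dx * ddy - dy * ddx) / (dx**2 + dy**2) ** 1.5
    return alphas[int(np.argmax(curvature[1:-1])) + 1]

# Example: alphas = np.logspace(-6, -2, 9) brackets the value alpha_1 = 0.0001 reported above.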


Figure 14 shows the estimated functions obtained with such value of the regularization parameter and the hybrid optimization approach, with the measurements of two non-intrusive sensors. Note that the oscillations are eliminated because of the stabilization introduced by the first-order regularization term. However, the estimated function for D(x) is in very poor agreement with the exact one. This is probably due to the fact that this case, involving 50 measurements per sensor and only two sensors, is underdetermined, that is, the number of unknown parameters is larger than the number of measurements. Note that, for the hybrid optimization approach, 2 parameters are estimated for each of the control volumes used for the discretization of the domain, corresponding to the values of D(x) and µ(x) at the control volume. Therefore, a total of 160 parameters were estimated in this case. A comparison of the functions estimated for D(x) with the conjugate gradient method and with the hybrid optimization approach by using only two sensors (see Figs. 10 and 14, respectively) shows that the conjugate gradient method is not as sensitive to the fact that the problem is underdetermined. This is because the measured data are used in the source-function term of the adjoint problem in the function estimation approach with the conjugate gradient method, so that the information from the sensors at the boundaries is extended to the rest of the domain through the adjoint function λ(x,t). On the other hand, the accuracies of the functions estimated for µ(x) with the conjugate gradient method and with the hybrid optimization approach, with the measurements of two non-intrusive sensors (see Figs. 10 and 14, respectively), are similar. This is due to the fact that the exact µ(x) is constant and the initial guess used for both approaches was relatively near the exact function.


Figure 15 shows the estimated functions obtained with the hybrid optimization approach and the measurements of 10 sensors evenly spread in the medium, with α_1 = 0.0001. It is interesting to note that the function estimated for µ(x) with 10 sensors is in much better agreement with the exact one than that obtained with 2 sensors. A comparison of Figs. 11 and 15 shows that similar results are obtained for the simultaneous estimation of D(x) and µ(x), by using either the conjugate gradient method or the hybrid optimization approach, when ten sensors are used in the inverse analysis.


We note that the use of the regularization parameter, based on Tikhonov's technique, produced the stabilization necessary to obtain stable results for the estimation of both functions with the hybrid optimization approach. In fact, completely unstable results were obtained when the regularization technique was not used, as a result of the ill-posed character of the inverse problem under consideration. Also, the results presented above show that the two solution approaches examined in this paper are not overly sensitive to measurement errors: qualitatively similar results were obtained by using errorless simulated measurements.

Finally, Fig. 16 shows the convergence history of the hybrid optimizer for the estimation of the functions presented in Fig. 15, where one can see that no further reduction in the cost function is obtained after approximately 500 iterations.


Inverse Forced Convection Problem of Simultaneous Estimation of Two Boundary Heat Fluxes in Irregularly Shaped Channels (Colaço and Orlande, 2001)

This work deals with the use of the conjugate gradient method of function estimation for the simultaneous identification of two unknown boundary heat fluxes in channels with laminar flows. The irregularly shaped channel in the physical domain is transformed into a parallel plate channel in the computational domain by using an elliptic scheme of numerical grid generation. The direct problem, as well as the auxiliary problems and the gradient equations required for the solution of the inverse problem with the conjugate gradient method, are formulated in terms of generalized boundary-fitted coordinates. Therefore, this solution approach can be readily applied to forced convection boundary inverse problems in channels of any shape. The direct and auxiliary problems are solved with the finite volume method. The numerical solution of the direct problem is validated by comparison with benchmark solutions for smoothly expanding channels.

For the results presented below, we considered, as an example of application of the present solution approach, test cases involving the laminar flow of water (ρ = 1000.52 kg/m^3, µ = 0.001 kg/m s, k = 0.597 W/m K, c_p = 4.1818 x 10^3 J/kg ºC) inside a channel with a smooth expansion, as illustrated in Fig. 17.


Test cases involving the estimation of the time and spatial variations of the heat fluxes were examined. For the estimation of the unknown functions we considered available the readings of 28 sensors uniformly distributed along the channel and located at the second control volume away from each of the walls. Simulated measurements with a standard deviation σ = 0.01 T_max were used for the inverse analysis. Figures 18.a and 18.b show that the estimated functions are quite accurate, even for the strict exact functions tested here and for measurements with random errors. These figures present the results obtained for t = 999 s and 1665 s, respectively.



Inverse Natural Convection Problem of Simultaneous Estimation of Two Boundary Heat Fluxes In Irregular Cavities (Colaço and Orlande, 2004)

This paper deals with the use of the conjugate gradient method of function estimation with adjoint problem for the simultaneous identification of two boundary conditions in natural convection inverse problems in two-dimensional irregular cavities. This is a much more involved inverse problem than that addressed above for forced convection, because, for natural convection, the energy equation is coupled with the continuity and momentum equations. As a result, the counterparts of these equations for the sensitivity and adjoint problems are also coupled.

The unknown boundary conditions are estimated with no a priori information about their functional forms. Irregular geometries in the physical domain are transformed into regular geometries in the computational domain by using an elliptic scheme of numerical grid generation. The methodology is applied to cases involving an annular cavity, as depicted in Fig. 19, where the position and time dependent heat fluxes are unknown at the inner and outer surfaces. These two surfaces are supposed to be maintained at the constant temperatures Th and Tc, respectively.


For the results presented here we considered natural convection of air with physical properties ρ = 1.19 kg/m^3; µ = 1.8 x 10^-5 kg/m s; β = 0.00341 K^-1; Pr = 0.70; k = 0.2624 W/m K; c_p = 1020.4 J/kg ºC. The test cases analyzed below correspond to a Rayleigh number of 5 x 10^4, where the characteristic length used was L = R2 - R1. For this Rayleigh number, R2 was taken as 54.4 mm and R1 as 22.9 mm, while the temperatures at the walls η = N and η = 1 were taken as Th = 30 ºC and Tc = 20 ºC, respectively.

We now examine the results obtained with simulated measurements containing random errors of standard deviation σ = 0.6 ºC. For this test case, the measurements of 27 sensors, located near each of the boundaries, were assumed available for the inverse analysis. The sensors were located 1.08 mm below the surface at η = 1 and 1.25 mm below the surface at η = N. Figures 20.a,b show that the estimated functions are in very good agreement with the exact ones for such test case, although some oscillations are observed in the inverse problem solution, especially near the sharp variation around 2 seconds at η = N (see Fig. 20.a).


Optimization Problem in Magnetohydrodynamics (Colaço et al, 2003b; Dulikravich et al, 2003a; Dulikravich et al, 2003b; Dulikravich et al, 2003c; Dulikravich et al, 2004)

This section presents the optimization problem of achieving desired features of a melt undergoing solidification through the application of an external magnetic field, whose intensity and spatial distribution are obtained with the hybrid optimization method. The intensities of the magnets along the boundaries of the container are parameterized with B-splines. The inverse problem is then formulated as finding the magnetic boundary conditions (the B-spline coefficients) such that the temperature gradients along the direction of the gravity acceleration are minimized.

Transient Navier-Stokes and Maxwell equations were discretized using the Finite Volume Method in a generalized curvilinear non-orthogonal coordinate system. For the phase change problems, an enthalpy formulation was used. The code was validated against analytical and numerical benchmark results with very good agreement in both cases.

First, let us demonstrate the inverse determination of the magnetic boundary conditions that create a certain pre-specified flow field within the domain. Figure 21 shows the geometry and the boundary conditions for the test cases considered here. The height and length of the square container were equal to 0.15 m. The top and bottom walls were kept thermally insulated. The left boundary was kept at a "hot" temperature while the right wall was kept at a "cold" temperature. For the first test case, there was no phase change, since the "hot" and "cold" temperatures were above the melting temperature.


The four walls were subjected to unknown magnetic field distributions whose directions were made orthogonal to each wall. In order to satisfy the magnetic flux conservation equation

the following periodic conditions were imposed:

The objective was to minimize the natural convection effects by reducing the gradient of temperature along the y direction, thus trying to obtain a temperature profile similar to those obtained for pure conduction. The objective function to be minimized is then formulated as:

The magnetic field boundary conditions were inversely determined at either four or six points equally spaced along each of the four boundaries and interpolated using B-splines for the other points at those boundaries. The magnetic boundary conditions at x = 0.15 m and y = 0.15 m were then obtained using the periodic conditions from Eq. (56) and Eq. (57).
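In generic terms (the exact discretization used in the cited works may differ), the design variables are the B-spline coefficients c_i of the wall-normal magnetic field on each boundary,

B_n(s) \;=\; \sum_i c_i\, N_i(s)

where the N_i are B-spline basis functions of the arc length s along the wall, and an objective of the form described above is

F(c) \;=\; \int_{\Omega} \left( \frac{\partial T}{\partial y} \right)^{2} d\Omega

minimized subject to the flow, energy and magnetic-field equations and to the constraints of Eqs. (56) and (57).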

The fluid analyzed was silicon. For the first test case, the temperature difference Th - Tc was set equal to 0.65 K, which gives a Rayleigh number of 10^5.

Figure 22 shows streamlines, isotherms and heat fluxes on all four boundaries predicted without any magnetic flux applied and with no phase change (left column), as well as the streamlines, isotherms and heat fluxes on all four boundaries resulting from the optimized magnetic boundary conditions with six points on each boundary (right column). One can see that the gradients of temperature in the y direction are reduced. Figure 23 shows the optimized magnetic field boundary conditions at x = 0 and y = 0 and Fig. 24 shows the convergence history of the process.




As a second test case, we minimized the curvature of the isotherms in a solidifying process after a pre-specified time from the start of the solidification process. The temperature difference Th - Tc was set equal to 6.5 K (Th = 1686 K, Tc = 1676.5 K) and the length of the square container was taken as 0.069 m, which gives a Rayleigh number of 10^5. The solidus and liquidus temperatures were equal to 1681.0 K and 1686.0 K, respectively. Thus, a mushy region exists between the phases. The initial condition was set as T0 = Th. Then, the solidifying process started at the right wall, where T = Tc.

Figure 25 shows the streamlines, isotherms and heat fluxes on all four boundaries for this test case without any magnetic flux applied, predicted at 500 seconds (left column), as well as the streamlines, isotherms and heat fluxes on all four boundaries resulting from the optimization of six B-spline points for the estimation of the magnetic boundary conditions on each boundary (right column). The boundary conditions at the other points were interpolated using B-splines. One can see that the curvature of the isotherms is smaller than in the case without any magnetic field applied.


Figure 26 shows the optimized variation of the magnetic fields orthogonal to x = 0 and y = 0 boundaries. Figure 27 shows the convergence history of the optimization process.



Optimization Problem in Electrohydrodynamics (Colaço et al, 2003a; Colaço et al, 2004)

In another test case we dealt with the inverse determination of the electric boundary conditions that create some pre-specified flow-field features within some region. Figure 28 shows the geometry and the boundary conditions for the configuration considered here.


The height and length of the closed container filled with the electrically conducting liquid were equal to 33.3 mm and 66.7 mm, respectively. The vertical walls were kept thermally insulated. The bottom boundary was kept at a "hot" temperature while the top wall was kept at a "cold" temperature. A slightly triangular temperature profile was applied to the bottom wall in order to create a preferential direction for the thermally induced fluid flow.

The vertical walls were subjected to unknown electric potential boundary conditions. The electrically charged particles were assumed to enter the fluid from the walls where the electric potential was applied. The objective was to minimize the natural convection effects by reducing the gradient of temperature along the x direction, thus trying to obtain a temperature profile similar to those obtained for pure conduction. The objective function to be minimized was then formulated as

The electric boundary conditions were inversely determined at six points equally spaced along each of the vertical walls and parameterized using B-splines for the other points of these boundaries. In this case we considered natural convection of gallium arsenide. The temperature difference Th - Tc was set equal to 1.0 K, which gives a Rayleigh number of 1.9 x 10^4.
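For reference, the discrete evaluation of this kind of temperature-gradient objective on a computed temperature field can be sketched as follows; the array T, the spacings dx and dy and the axis convention are assumptions of this sketch, not notation from the paper.

import numpy as np

def thermal_gradient_penalty(T, dx, dy, axis=1):
    """Discrete analogue of the integral of (dT/dx)^2 over the cavity.

    T    : 2-D array of temperatures, T[j, i] at (y_j, x_i)
    axis : 1 penalizes dT/dx (this electro-hydrodynamic case);
           axis=0 would penalize dT/dy, as in the magneto-hydrodynamic cases above.
    """
    spacing = dx if axis == 1 else dy
    dT = np.gradient(T, spacing, axis=axis)   # central differences in the interior
    return float(np.sum(dT**2) * dx * dy)     # cell-area weighting approximates the integral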

For the first test case, there was no phase change, since the "hot" and "cold" temperatures were above the melting temperature (Th = 1521.5 K; Tc = 1520.5 K).

Figure 29 shows the streamlines, isotherms and heat fluxes on all four walls predicted for the first test case without any electric field applied and with no phase change (left column). Figure 29 also shows the streamlines, isotherms and heat fluxes on the four walls when using six points on each vertical wall for the estimation of the electric boundary conditions (right column). One can see that the gradients of temperature in the x direction are reduced close to the top and bottom walls, and that the isotherms tend to become horizontal, similar to what would be obtained if the gravity vector were acting in the horizontal direction.


Figure 30 shows the optimized electric potential and Fig. 31 shows the convergence history.



In a second test case, we tried to minimize the curvature of the isotherms in a solidifying process after a pre-specified time from the start of solidification. Figure 32 shows (left column) the results obtained for a Rayleigh number equal to 1.9 x 10^4 without any electric field applied. In this case, the "hot" and "cold" temperatures were equal to 1511.5 K and 1510.5 K, respectively. Figure 32 also shows the results obtained with an optimized electric potential acting in the horizontal direction (right column). Note that the isotherms are smoother than those for the case without any electric field applied.


Figure 33 shows the optimized electric potential and Fig. 34 shows the convergence history for the hybrid optimizer.



Acknowledgements

This work was supported by CNPq, CAPES and FAPERJ, agencies for the fostering of science of the Brazilian government and of the state of Rio de Janeiro government. The support provided by UTA (University of Texas at Arlington), FIU (Florida International University) and NSF-USA is also greatly appreciated.

Presented at ENCIT2004 – 10th Brazilian Congress of Thermal Sciences and Engineering, Nov. 29 – Dec. 03, 2004, Rio de Janeiro, RJ, Brazil.

Technical Editor: Atila P. Silva Freire.

  • Alifanov, O. M., 1994, "Inverse Heat Transfer Problems", Springer-Verlag, New York.
  • Alifanov, O. M., Artyukhin, E. and Rumyantsev, A., 1995, "Extreme Methods for Solving Ill-Posed Problems with Applications to Inverse Heat Transfer Problems", Begell House, New York.
  • Bard, Y. B., 1974, "Nonlinear Parameter Estimation", Acad. Press, New York.
  • Beale, E.M.L., 1972, "A Derivation of Conjugate Gradients", Numerical Methods for Nonlinear Optimization, F. A. Lootsma (editor), pp. 39-43.
  • Beck, J. V. and Arnold, K. J., 1977, "Parameter Estimation in Engineering and Science", Wiley Interscience, New York.
  • Beck, J. V., Blackwell, B. and St. Clair, C. R., 1985, "Inverse Heat Conduction: Ill-Posed Problems", Wiley Interscience, New York.
  • Beck, J.V., 1999, "Sequential Methods in Parameter Estimation", 3rd International Conference on Inverse Problems in Engineering, Tutorial Session, Port Ludlow, WA.
  • Belegundu, A.D and Chandrupatla, T. R., 1999, "Optimization Concepts And Applications In Engineering", Prentice Hall.
  • Broyden, C.G., 1965, "A Class of Methods for Solving Nonlinear Simultaneous Equations", Math. Comp., vol. 19, pp. 577-593.
  • Broyden, C.G., 1967, "Quasi-Newton Methods and Their Applications to Function Minimization", Math. Comp., vol. 21, pp. 368-380.
  • Colaço, M.J., Dulikravich, G.S. and Martin, T. J., 2003a, "Optimization of Wall Electrodes for Electro-Hydrodynamic Control of Natural Convection Effects During Solidification", In: ASME International Mechanical Engineering Congress & Exposition, Washington, DC, November.
  • Colaço, M.J., Dulikravich, G.S. and Martin, T. J., 2004, "Optimization of Wall Electrodes for Electro-Hydrodynamic Control of Natural Convection Effects During Solidification", Materials and Manufacturing Processes, Vol. 19, No. 4, pp. 719-736.
  • Colaço, M.J., Dulikravich, G.S. and Martin, T.J., 2003b, "Reducing Convection Effects in Solidification by Applying Magnetic Fields Having Optimized Intensity Distribution" (Keynote Lecture), In: ASME Summer Heat Transfer Conference, Las Vegas, Nevada, July.
  • Colaço, M.J., Orlande, H.R.B., Dulikravich, G.S. and Rodrigues, F. A., 2003c, "A Comparison of Two Solution Techniques for the Inverse Problem of Simultaneously Estimating the Spatial Variations of Diffusion", In: ASME International Mechanical Engineering Congress & Exposition, November.
  • Colaço, M.J., Orlande, H.R.B., 2001, "Inverse Forced Convection Problem Of Simultaneous Estimation of Two Boundary Heat Fluxes in Irregularly Shaped Channels", Numerical Heat Transfer, Part A, Vol. 39, pp. 737-760.
  • Colaço, M.J. and Orlande, H.R.B., 2004, "Inverse Natural Convection Problem of Simultaneous Estimation of Two Boundary Heat Fluxes in Irregular Cavities", International Journal of Heat and Mass Transfer, Vol. 47, pp. 1201-1215.
  • Corana, A., Marchesi, M., Martini, C. and Ridella, S., 1987, "Minimizing Multimodal Functions of Continuous Variables with the 'Simulated Annealing Algorithm", ACM Transactions on Mathematical Software, vol. 13, pp. 262-280.
  • Daniel, J.W., 1971, "The Approximate Minimization of Functionals", Prentice-Hall, Englewood Cliffs.
  • Deb, K., 2002, "Multi-Objective Optimization Using Evolutionary Algorithms", John Wiley & Sons.
  • Dennis, J. and Schnabel, R., 1983, "Numerical Methods for Unconstrained Optimization and Nonlinear Equations", Prentice Hall.
  • Denisov, A. M., 1999, "Elements of the Theory of Inverse Problems", VSP, Netherlands.
  • Dorigo, M. and Stützle, T., 2004, "Ant Colony Optimization", MIT Press, Cambridge, MA.
  • Dulikravich, G. S. and Martin, T. J., 1996, "Inverse Shape and Boundary Condition Problems and Optimization in Heat Conduction", Chapter 10 in Advances in Numerical Heat Transfer, 1, 381-426, Minkowycz, W. J. and Sparrow, E. M. (eds.), Taylor and Francis.
  • Dulikravich, G.S., Colaço, M.J., Martin, T.J. and Lee, S., 2003a, "Magnetized Fiber Orientation and Concentration Control in Solidifying Composites", In: Symposium on Materials Processing Under the Influence of Electrical and Magnetical Fields, TMS Annual Meeting, San Diego, CA, March.
  • Dulikravich, G.S., Colaço, M.J., Martin, T.J. and Lee, S., 2003b, "An Inverse Method Allowing User-Specified Layout of Magnetized Micro-Fibers in Solidifying Composites", Journal of Composite Materials, vol. 37, no. 15, pp. 1351-1365.
  • Dulikravich, G.S., Colaço, M.J., Dennis, B.H., Martin, T.J. and Lee, S., 2003c, "Optimization of Intensities, and Orientations of Magnets Controlling Melt Flow During Solidification", In: Symposium on Materials Processing Under the Influence of Electrical and Magnetical Fields, TMS Annual Meeting, San Diego, CA, March.
  • Dulikravich, G.S., Colaço, M.J., Dennis, B.H., Martin, T.J. and Lee, S., 2004, "Optimization of Intensities, and Orientations of Magnets Controlling Melt Flow During Solidification", Materials and Manufacturing Processes, vol. 19, iss. 4, pp. 695-718.
  • Eberhart, R., Shi, Y. and Kennedy, J., 2001, "Swarm Intelligence", Morgan Kaufmann.
  • Fletcher, R., 2000, "Practical Methods of Optimization", John Wiley & Sons.
  • Fletcher, R. and Reeves, C.M., 1964, "Function Minimization by Conjugate Gradients", Comput. Journal, vol.7, pp.149-154.
  • Fletcher, R. and Powell, M.J.D., 1963, "A Rapidly Convergent Descent Method for Minimization", Computer J., vol. 6, pp. 163-168.
  • Goffe, W.L., Ferrier, G.D. and Rogers, J., 1994, "Global Optimization of Statistical Functions with Simulated Annealing", Journal of Econometrics, vol. 60, pp. 65-99.
  • Goldberg, D.E., 1989, "Genetic Algorithms in Search, Optimization, and Machine Learning", Addison-Wesley Pub Co.
  • Hadamard, J., 1923, "Lectures on Cauchy's Problem in Linear Differential Equations", Yale University Press, New Haven, CT.
  • Hensel, E., 1991, "Inverse Theory and Applications for Engineers", Prentice Hall, New Jersey.
  • Hestenes, M.R. and Stiefel, E., 1952, "Method of Conjugate Gradients for Solving Linear Systems", J. Res. Nat. Bur. Standards Sect. B, vol. 49, pp.409-436.
  • Isakov, V., 1998, "Inverse Problems for Partial Differential Equations", Applied Mathematical Sciences, vol. 127, Springer.
  • Jaluria, Y., 1998, "Design and Optimization of Thermal Systems", McGraw Hill.
  • Kennedy, J. and Eberhart, R.C., 1995, "Particle Swarm Optimization", Proceedings of the 1995 IEEE International Conference on Neural Networks, vol. 4, pp. 1942-1948.
  • Kennedy, J., 1999, "Small Worlds and Mega-Minds: Effects of Neighborhood Topology on Particle Swarm Performance", Proceedings of the 1999 Congress of Evolutionary Computation, IEEE Press, vol. 3, pp. 1931-1938.
  • Kirsch, A., 1996, "An Introduction to the Mathematical Theory of Inverse Problems", Applied Mathematical Sciences, vol. 120, Springer.
  • Kurpisz, K. and Nowak, A. J., 1995, "Inverse Thermal Problems", WIT Press, Southampton, UK.
  • Levenberg, K., 1944, "A Method for the Solution of Certain Non-linear Problems in Least Squares", Quart. Appl. Math., 2, 164-168.
  • Marquardt, D. W., 1963, "An Algorithm for Least Squares Estimation of Nonlinear Parameters", J. Soc. Ind. Appl. Math, 11, 431-441.
  • Moré, J. J., 1977, "The Levenberg-Marquardt Algorithm: Implementation and Theory", In: Numerical Analysis, Lecture Notes in Mathematics, Vol. 630, G. A. Watson (ed.), Springer-Verlag, Berlin, 105-116.
  • Morozov, V. A., 1984, "Methods for Solving Incorrectly Posed Problems", Springer Verlag, New York.
  • Murio, D. A., 1993, "The Mollification Method and the Numerical Solution of Ill-Posed Problems", Wiley Interscience, New York.
  • Naka, S., Yura, T.G. and Fukuyama, T., 2001, "Practical Distribution State Estimation Using Hybrid Particle Swarm Optimization", Proceedings IEEE Power Engineering Society Winter Meeting, Columbus, Ohio, January 28 - February 1.
  • Ozisik, M.N. and Orlande, H.R.B., 2000, "Inverse Heat Transfer: Fundamentals and Applications", Taylor and Francis, New York.
  • Polak, E., 1971, "Computational Methods in Optimization", Academic Press, New York.
  • Powell, M.J.D., 1977, "Restart Procedures for the Conjugate Gradient Method", Mathematical Programming, vol. 12, pp.241-254.
  • Ramm, A. G., Shivakumar, P.N. and Strauss, A. V. (eds.), 2000, "Operator Theory and Applications", Amer. Math. Soc., Providence.
  • Sabatier, P. C., (ed.), 1978, "Applied Inverse Problems", Springer Verlag, Hamburg.
  • Stoecker, W. F., 1989, "Design of Thermal Systems", McGraw Hill.
  • Storn, R. and Price, K.V., 1996, "Minimizing the Real Function of the ICEC'96 Contest by Differential Evolution", IEEE Conf. on Evolutionary Computation, pp. 842-844.
  • Tikhonov, A. N. and Arsenin, V. Y., 1977, "Solution of Ill-Posed Problems", Winston & Sons, Washington, DC.
  • Trujillo, D. M. and Busby, H. R., 1997, "Practical Inverse Analysis in Engineering", CRC Press, Boca Raton.
  • Woodbury, K., 2002, "Inverse Engineering Handbook", CRC Press, Boca Raton.
  • Yagola A. G., Kochikov, I.V., Kuramshina, G. M. and Pentin, Y. A., 1999, "Inverse Problems of Vibrational Spectroscopy", VSP, Netherlands.
  • Zubelli, J. P., 1999, "An Introduction to Inverse Problems: Examples, Methods and Questions", Institute of Pure and Applied Mathematics, Rio de Janeiro, Brazil.