Multi-objective optimization with Kriging surrogates using “moko”, an open source package

Many modern real-world designs rely on the optimization of multiple competing goals. For example, most components designed for the aerospace industry must meet some conflicting expectations. In such applications, low weight, low cost, high reliability, and easy manufacturability are desirable. In some cases, bounds for these requirements are not clear, and performing mono-objective optimizations might not provide a good landscape of the required optimal design choices. For these cases, finding a set of Pareto optimal designs might give the designer a comprehensive set of options from which to choose the best design. This article shows the main features and functionalities of an open source package, developed by the present authors, to solve constrained multi-objective problems. The package, named moko (acronym for Multi-Objective Kriging Optimization), was built under the open source programming language R. Popular Kriging based multiobjective optimization strategies, as the expected volume improvement and the weighted expected improvement, are available in the package. In addition, an approach proposed by the authors, based on the exploration using a predicted Pareto front is implemented. The latter approach showed to be more efficient than the two other techniques in some case studies performed by the authors with moko.


INTRODUCTION
Multi-objective optimization is a field of interest in many real-world applications. Engineering design has multiple and conflicting goals and, most of the time, the relationship between the decision space (design variables domain) and the outcome is highly complex. Around the 2000's, some comprehensive reviews were made that cover many of the basic ideas and challenges related to many-objective optimization (Van Veldhuizen and Lamont, 1998;Zitzler, 1999;Deb et al., 2002b). Today, there are many efficient algorithms for dealing with multi-objective optimization, most of them based on evolutionary techniques (evolutionary multi-objective optimization algorithms -EMOAs). Some state-of-the-art algorithms can be found in Corne et al., (2001); Zitzler et al., (2001); Deb et al., (2002a); Emmerich et al., (2005); Beume et al., (2007) ;Deb, (2014); Chen et al., (2015). However, even when using such efficient frameworks, if the evaluation of the designs is too time consuming, a surrogate approach is usually used to alleviate the computational burden.
To address this issue, the present authors have developed an open-source package based on Kriging interpolation models. The package, named moko (acronym for Multi-Objective Kriging Optimization), is written in R language and presents three multi-objective optimization algorithms: (i) MEGO, in which the efficient global optimization algorithm (EGO) is applied to a single-objective function consisting of a weighted combination of the objectives, (ii) HEGO, which involves sequential maximization of the expected hypervolume improvement, and (iii) MVPF, an approach based on sequential minimization of the variance of the predicted Pareto front. The first two are implementations of well-known approaches of the literature, and MVPF is a novel technique first presented in Passos and Luersen (2016) and in Passos and Luersen (2018), in which a composite panel with curved fibers is optimized for multiple objectives.
In R language, packages are the fundamental units of reproducible codes. They must include reusable R functions, the documentation that describes how to use them and sample data. For developing the moko code, the authors followed the guidelines and good practices provided by Wickham (2015). This resulted in a package that has been accepted and is available at the Comprehensive R Archive Network (CRAN), the official public repository for R packages.
We highlight that the present paper is a revised and extended version of the work presented in the conference MecSol2017 (Passos and Luersen, 2017b).

Kriging Basics
Kriging is a spatial interpolation method and was originated in the field of geosciences (Krieg, 1951). Given a black box function : d y D     , we can model it as being a realization of a stochastic process  , D The idea of Kriging is to predict unknown values of   y x based on the conditional distribution of   x  given some priori knowledge. This priori knowledge comes from a finite set of n-observations of   y x . In practical terms, the Kriging surrogate model can be seen simply as the following Gaussian random field containing initial known design points (usually called design of experiments (DOE)); are the true responses evaluated at X and   m x and   2 s x are scalar fields representing the mean and variance which define the Gaussian random field. The process of building the mean and variance descriptors for the Gaussian field is well covered in the literature (Sacks et al., 1989;Jones et al., 1998;Forrester et al., 2008;Roustant et al., 2012;Scheuerer et al., 2013).
To illustrate the Kriging surrogate, consider the unidimensional scalar function given by which is evaluated at the following DOE points: The Kriging model for this case is presented in Figure 1. In this figure, the unknown exact function is depicted in bold line and   m x (the predictor's mean value) in dashed line. A section view is drawn on the right side to show the probability density of Figure 1: Example of a Kriging surrogate for a scalar mono-variable function.
As already mentioned, moko package presents three different Kriging-based multi-objective techniques, which are discussed in the following subsections. The derivation of the Kriging predictor and the design of experiments (DOE) concept are not covered in this paper. The reader can find a comprehensive mathematical description of Latin American Journal of Solids and Structures, 2018, 15(10 Thematic Section), e74 3/17 these subjects in Forrester et al. (2008). The Kriging models are built using the DiceKriging R package (Roustant et al., 2012).

The MOKO paradigm
Most of surrogate-based multi-objective algorithms uses an iterative approach. The most common framework is to interactively search for infill points that maximizes some search criteria (like the expected improvement EI). All algorithms implemented in 'moko' follows that paradigm and Figure 2 shows the workflow of 'moko'. The efficient global optimization algorithm (EGO) was proposed by Jones et al. (1998) for mono-objective optimization. It consists in, from an initial set of samples X , a Kriging model is built using the responses of a highfidelity model 1 , and EGO sequentially maximizes the expected improvement (EI) and updates the model at each iteration (including re-estimation of the hyperparameters).
The basic idea of the EI criterion is that by sampling a new point x  the results will be improved by or 0 otherwise, where min y is the lowest value of the responses y obtained so far.
Obviously, the value of this improvement is not known in advance because   y x  is unknown. However, the expectation of this improvement can be obtained using the information from the Kriging predictor. The EI criterion has important properties for sequential exploitation and exploration (filling) of the design space: it is null at points already visited (thus preventing searches in well-known regions and increasing the possibility of convergence) and positive at all other points, and its magnitude increases with predicted variance (favoring searches in unexplored regions) and decreases with the predicted mean (favoring searches in regions with low predicted values). The EGO algorithm can be easily adapted to a multi-objective framework by scalarizing the objective vector into a single function (Knowles, 2006). The constrains of the optimization problem can be considered simply by building independent metamodels for each constraint and multiplying the EI of the composed objective function by the probability of each constraint to be met (Sasena et al., 2002). The MEGO algorithm can be summarized as follows: , where  is drawn uniformly at random from a set of evenly distributed unit vectors and  is an arbitrary small value which we set to 0.05; (c) Build Kriging models for f  and for the constraints g ; (d) Find x  that maximizes the constrained expected improvement: (e) Evaluate the "true" values of   f x  and   x g  using high-fidelity models and update the database.

end while
More details regarding scalarization functions like sampling  can be found in Miettinen and Mäkelä (2002), Nikulin et al. (2012) and Ruiz et al. (2015). The parameter 0   , which must be a small positive value, is the socalled augmentation coefficient and it is used to assure the Pareto optimal solution can be found for any reference point (Ruiz et al. 2015).
Here, in order to handle the constrains, the EI is computed using a custom modified version of the DiceOptim::EI function provided in Ginsbourger et al. (2013). In addition, in all approaches, the optimized Latin hypercube is built using the R package lhs (Carnell, 2012).

Expected Hypervolume Improvement (EHVI -HEGO)
The hypervolume improvement (EHVI) infill criterion is based on the theory of the hypervolume indicator (Zitzler and Thiele, 1998), a metric of dominance of non-dominated the solutions have. This metric consists in the size of the hypervolume fronted by the non-dominated set bounded by reference maximum points. Figure 3 illustrates this concept for two objectives. The hollow circles points represent dominated designs, while the filled ones represent Pareto optimal solutions. The area hatched with solid lines is the dominated portion of the objective space, while the area hatched with dashed lines represents the area that would be additionally dominated when a new observation (the crossed square point) is added. This additional 'area' is called the hypervolume improvement (since it can be generalized for more than two objectives).
The EHVI is thus the expected improvement at the hypervolume size we would get by sampling a new point x  . Here, the EHVI is computed using the R package GPareto (Binois and Picheny, 2016). The original GPareto::EHI function does not account for constrains, so a custom modification was implemented (available in the moko package). The algorithm used here (HEGO -Hyper Efficient Global Optimization) is similar to the MEGO algorithm and can be summarized as follows: 1. Generate an initial DOE X using an optimized Latin hypercube; 2. Evaluate X using high-fidelity models and store responses of for the mobjectives and p -constraints; 3. while computational budget not exhausted do: (c) Find x  that maximizes the constrained expected hypervolume improvement: (d) Evaluate the "true" values of   f x  and   x g  using high-fidelity models and update the database.

end while
For this and the previous approach, a simulated annealing (SA) algorithm is used to maximize the infill criteria. SA algorithm is provided by the R package GenSA (Xiang et al., 2013).

Minimization of the Variance of the Kriging-Predicted Front (MVPF)
The MVPF technique is based on iterative improvement of the fidelity of the predicted Pareto set. The idea is to build a Pareto front from a given initial set of Kriging models (one for each cost or constraint function) using the mean of the predictor of each model as the input function. From the estimated front P , the design with higher variance x  (i.e., the most isolated in the decision space) has its "true" value evaluated using the high fidelity model.
A new set of Kriging models is then rebuilt, and the process is repeated until a stopping criterion is met. The MVPF algorithm can be summarized as follows: 1. Generate an initial DOE X using an optimized Latin hypercube; 2. Evaluate X using high-fidelity models and store responses of for the mobjectives and p -constraints; 3. while computational budget not exhausted do: (a) For each of the m -objectives and p -constraints, build a Kriging model; (b) Generate a Pareto set P using the mean predictor of the Kriging models using a state-of-the-art multi-objective optimization algorithm (such as NSGA-II); (d) Evaluate the "true" values of   f x  and   g x  using high-fidelity models and update the database.

end while
NSGA-II, proposed by Deb et al. (2002a), is an improved version of the NSGA (Non-dominated Sorting Genetic Algorithm), proposed by Srinivas and Deb (1994). The NSGA-II implementation used here is the one provided in the R package 'mco' (Mersmann, 2014). We highlight that instead of NSGA-II another efficient multi-objective algorithm could be employed.

MOKO PACKAGE: SOME FEATURES AND USAGE
The installation of 'moko' package can be done simply by downloading it from CRAN servers and running the following commands: install.packages('moko') library(moko) To make use of any of the three multi-objective optimization algorithms implemented in moko package, the first step is to create a set of initial sampling points (the DOE -design of experiments). This can be done in different ways, however, we suggest employing an optimized Latin hypercube. A randomly generated unit Latin hypercube of dimension k with n samples can be created using only base functions of R by running: doe <-replicate(k, sample(0:n, n))/n Alternatively, an optimized Latin hypercube can be generated using the R package lhs (Carnell, 2012) by running: doe <-lhs::optimumLHS(n, k) The second step (also common for all algorithms) is to evaluate the high fidelity model responses (objectives and constrains) at the DOE points. The way moko was built, for all algorithms it is expected a single function that returns a vector of responses. If the user has multiple functions for the objectives and constraints, they must be combined into a single one as: return(c(f1, f2, f3, g1, g2)) } Note that fun expects, as input argument, a single design vector x . Then, the cost functions are evaluated at each point of the DOE by running: res <-t(apply(doe, 1, fun)) Thereafter, with the output and the DOE, the Kriging models can be built. For this, the authors developed an object class named mkm (multi-objective Kriging model). This object is a structured list of km objects (Kriging models from the DiceKriging package). For instance, to build a mkm object for a fun that returns the first three values as objectives can be created by: model <-mkm(doe, res, modelcontrol = list(objective = 1:3)) Note that if the objective slot of the modelcontrol list is NULL (i.e., the index of the objectives are not provided) the function assigns that every response of fun is an object (i.e. unconstrained multi-objective optimization problem). On the other hand, if the user inputs the indexes for the objectives, the algorithm assumes the remainder responses are constrains. Now, to run nsteps iterations of the implemented methods for an object of class mkm, the following should be executed: new_model_MEGO

EXAMPLES
Two analytical examples selected from the literature are used to access the robustness of the proposed multiobjective optimization frameworks. These examples are easy-to-compute optimization problems, which allows comparisons with a direct optimization approach performed here by the NSGA-II algorithm.
The quality of the Pareto sets found is compared using the inverted generational distance (IGD) metric (Shimoyama et al., 2013). The IGD can be defined as where T and P are the true and the current Pareto sets, T is the number of designs in the true Pareto set and t and p are normalized vectors of length m of the m-objectives of the true and the actual Pareto sets, respectively, and   d  is a distance metric (in this case the Manhattan distance metric). Hence, the IGD corresponds to the average distance between all designs in the true set and the closest design of the current set. Thus, the lower the IGD, the better the method is.

Binh and Korn Problem
The Binh and Korn optimization problem is defined as (Binh and Korn, 1997): For this example, the "true" Pareto front is obtained by direct optimization using the NSGA-II algorithm with a population size of 500 over 100 generations. The resulting near-uniform Pareto set for 500  T is shown in  Note that the domain was scaled to the unit square, thus the input must be multiplied by   5, 3 . This step is optional, however, it is highly recommended since it makes the code cleaner.
To pass the constraints to the NSGA-II optimizer one must first split the function by creating an objective function and a constraint function. Both functions receive a vector x with size 2 and also return vectors with size 2.
Also, mco::nsga2 consider a violation of the constraint if the return argument is lesser than 0 . This is the reason for the minus sign preceding fun(x) in fun.cons.

Results
For each strategy, 50 independent runs are performed. Figure 5 shows the IGD histories during the optimization problem. For each run, the IGD values are averaged every five evaluations to avoid overcrowding the graph. The resulting data, gathered from the 50 independent runs, are represented by boxes. The median is shown as a bold horizontal line, and the whiskers correspond to ±1.5 IQR (interquartile range).

The Nowacki Beam
Based on a problem described by Nowacki (1980), the aim is to design a tip loaded cantilever beam (see Figure   7) for minimum cross-sectional area A and minimum bending stress  . The beam length is 1 500 mm l  and it is subject to a tip load force of 5000 N F  . The cross-section of the beam is rectangular, with breadth b and height h , which are the design variables. The design is constrained by five requisites and the optimization problem can be formally defined as the following:

Results
For this example, the results are arranged similar to the previous one and are shown in Figure 9.  Also, the average number of designs on the estimated Pareto fronts were 43.8 for HEGO, 23.56 for MEGO and 48.48 for VMPF.

Car-side Impact Problem
The mathematical formulation of the problem is presented in Jain and Deb (2014) and is reproduced here as follows: find:   This problem aims at minimizing the weight of car and at the same time minimize the pubic force experienced by a passenger and the average velocity of the V-Pillar responsible for withstanding the impact load. All the three objectives are conflicting, therefore, a three-dimensional trade-off front is expected. There are ten constraints involving limiting values of abdomen load, pubic force, velocity of V-Pillar, rib deflection, etc. There are seven design variables describing thickness of B-Pillars, floor, crossmembers, door beam, roof rail, etc.

Implementation
The so called 'car-side impact problem' (Jain and Deb, 2014) is not yet implemented on 'moko' (but it will be available in a future update). However, building this analytical function in R is simple: stop('x must be a vector of length 7 or a matrix with 7 columns') x1 <- Here, we used a higher number of generations to ensure a proper Pareto front is obtained. To illustrate, Jain and Deb (2014) solve the same problem using the NSGA-III algorithm using a population of only 156 individuals over 500 generations.
For this example, a computational budget of 80 high-fidelity model evaluations is also set. Of these 80 evaluations, 20 are spent building the initial DOE using an optimized Latin hypercube and the remainder are used with infill points. In each run, the same DOE is used for the three strategies to avoid random effects caused by different DOEs. This routine is implemented as follows: fun <-function (

Results
For this example, the results are arranged similar to the previous ones and are shown in Figure 11.

Integrating 'moko' with an external application
The greatest justification of using 'moko' (or any surrogate-based optimization algorithm) is to reduce computational cost when running complex optimization tasks. Indeed, the examples provided in this article would be optimized much faster by using a state-of-the-art multi-objective algorithm such as NSGA-II or SPEA-2 directly on the analytical functions. However, when the run time of a single call of a single objective or constraint is about several seconds or even minutes the number of required calls used by those algorithms makes the process unfeasible. This subsection aims to provide a small guidance for those who want to integrate 'moko' with an external application. The core function used for this is the base R 'system' function. This function invokes the operational system (OS) command specified by the 'command' field. The following example shows how this function could be used to call ANSYS Composite PrepPost: system('"C:/Program Files/ANSYS Inc/v170/ACP/ACP.exe" --batch=1 C:/scripts/scriptfile.py') In this example, the ANSYS executable is located on the folder "C:/Program Files/ANSYS Inc/v170/ACP". The guidelines on how to execute the structural simulation are contained on a script file called "scriptfile.py" on the directory "C:/scripts". How to create scripts for structural simulations on ANSYS is out of the scope of this paper, but the reader can find plenty of information on the internet and product manuals.
Since most computer aided engineering (CAE) software do not provide an application programming interface (API) to communicate with external tools, the easiest way to do this task is by the usage of plain text files ('.txt') or even comma separated files ('.csv'). In R programming, this can be done with the commands 'read.

FUTURE IMPLEMENTATIONS AND CONTRIBUTIONS
As 'moko' is an open source package, the interested reader is encouraged to look inside the code, test for bugs, request new features and even modify the code freely. The authors use the github 2 platform to interface with the users of 'moko'. There, anyone can chat with the developer, report bugs, and request for pulls of updated custom code.