Peak Ground Acceleration Model Predictions Utilizing Two Metaheuristic Optimization Techniques

Peak ground acceleration (PGA) is frequently used to describe ground motions, and defining it accurately for a given zone is critical for structural engineering design. This study developed novel models for predicting PGA using an Artificial Neural Network-Gravitational Search Algorithm (ANN-GSA) hybrid and Response Surface Methodology (RSM). The paper presents PGA predictions for the seismotectonics of Iraq, considered the first such attempt for the Iraqi region. The earthquake magnitude, the average shear-wave velocity, the focal depth, and the distance between the station and the earthquake source were used as predictors. The proposed models are constructed using a database of 187 previous ground-motion records; this dataset is also utilized to evaluate the effect of the PGA's parameters. In general, the results demonstrate that the newly proposed models exhibit a high degree of correlation, near-perfect mean values, a low coefficient of variance, few errors, and an acceptable performance index value compared to actual PGA values. Moreover, the composite ANN-GSA model performs better than the RSM model.


INTRODUCTION
Seismic hazard analysis is a critical step in the engineering phase. Seismological characteristics of earthquakes include their distance, magnitude, soil effects, and kind of faulting. The engineering parameters of an earthquake can be classified into two broad categories: 1) parameters in the response domain; and 2) parameters in the time domain. Pseudo-spectral acceleration (PSA) is a frequently used response-domain parameter. Peak ground acceleration (PGA), peak ground velocity (PGV), and peak ground displacement (PGD) are the three major time-domain parameters. Either category can be used to evaluate the hazards associated with construction. It has been demonstrated that spectral parameters are more effective than time-domain parameters (Luco and Cornell 2007). However, time-domain parameters are more appropriate for applications due to their independence from the structures under consideration (Al-Zuhairi et al., 2021). As a result, PGV, PGD, and PGA are frequently used in seismic risk assessments.
PGA is a well-known earthquake engineering metric that can be utilized for structural analysis and risk assessment during a seismic event. This critical component can be approximated using various techniques, including physical modelling and on-site inspections (Alavi and Gandomi 2011). However, implementing such methods is inconvenient, time-consuming, costly, and frequently impossible (Gandomi et al., 2011). Another method for analyzing PGA uses attenuation relationships, which are critical in seismic analysis. PGA is often defined using several independent variables, including the magnitude of the earthquake, the distance between the source and the site, the local site conditions, and the features of the earthquake source (Güllü and Erçelebi 2007, Gandomi et al. 2011).
Due to the tremendous complexity and non-linearity of the PGA, it is not easy to establish a correlation between it and the predictors. Soft computing techniques (SCTs) are widely utilized in engineering research to handle a variety of classification and prediction problems (Hanoon et al. 2017a,b, Banyhussan et al. 2020), and more recently to predict ground-motion characteristics (Alavi et al. 2011, Gandomi et al. 2011). SCTs are typically used to resolve complex numerical optimization problems and nonlinear systems. Numerous research problems in a variety of fields of science have been theoretically and analytically articulated utilizing soft computing techniques (Hanoon et al., 2021). Numerous classifications are incorporated in the SCT family, for example, ANFIS (Adaptive Network-based Fuzzy Inference System), ANN (Artificial Neural Networks), SVM (Support Vector Machine), FL (Fuzzy Logic), and OA (Optimization Algorithms) (Jang and Topal 2014). Each category of soft computing also has a fine-grained set of algorithms; for example, the OA category includes the GSA (gravitational search algorithm), PSO (particle swarm optimization), GA (genetic algorithm), ABC (Artificial Bee Colony), ACO (Ant Colony Optimization), and DE (Differential Evolution). Pragmatic modelling and design using SCTs are still hotly debated topics, particularly in engineering modelling. SCTs are based on experimental data rather than theoretical and/or analytical derivations, which is the primary distinction between SCTs and traditional models. SCTs are frequently complex and often cannot be expressed explicitly; as a result, they are best suited for inclusion as a component of a larger computer program, which limits their application (Hanoon et al., 2021). The artificial neural network (ANN) is the most widely used forecasting method in soft computing, having been successfully employed to address complicated pattern identification and analysis issues in a wide variety of domains, including earthquake engineering. ANNs have been widely used in recent years due to their superior pattern-recognition capability, which is advantageous for various problems. The number of hidden nodes in an ANN model is critical, as an overfitted model can otherwise result (Hanoon et al. 2021, Hason et al. 2021).
Using experimental and physical correlations, predictive model equations based on site geology and event propagation have been developed. The regression-analysis findings at a specific location with a massive amount of data are explained using empirical models based on mathematical procedures (Mahmod et al., 2017). Seismic wave models can also be utilized when there is insufficient data available to make an accurate determination. Response Surface Methodology (RSM) and Design of Experiments (DOE) have lately been used to create correlations between attenuation and other information-processing methods (Hason et al. 2020). RSM relies on the results of experiments to predict future outcomes. In engineering, the RSM technique can be used to construct an acceptable model for establishing a relationship between the causes and the potential answers to a given issue (Hason et al., 2020).
The primary goal of optimization methods is to find, from a set of parameters, values that minimize or maximize objective functions under restrictions. The ANN-GSA and RSM are used in this study to create two models of peak ground acceleration. The proposed models are based on a large dataset of 187 events of powerful earthquakes that struck Iraq (2004-2020). This research primarily contributes to the expanding body of knowledge on PGA and seismic-activity assessment, especially in Iraqi tectonic regions. Furthermore, this study also facilitates the usage of combined ANN-GSA and RSM methods for earthquake and hazard forecasting. To date, this is the first attempt to derive explicit and implicit PGA models for the Iraq tectonic region from a variety of parameters. However, there are many unanswered questions about the extent to which these variables affect PGA. Thus, a factorial design of trials is used to examine the impact of various characteristics on PGA in Iraq. The current research is based on data from international earthquake stations, which is a limitation. The selected parameters (REpi, Mw, Vs30, and FD) that were considered to affect the PGA response are also subject to the constraints herein. A central composite design (CCD) was used to model the interactions between the elements. According to the results, Mw and Vs30 have the most significant impact on PGA, followed by FD, while REpi has the most negligible impact on PGA.

STUDY AREA
Iraq is located within longitudinal (45°38'-45°48') and latitudinal (28°5'-37°22') coordinates in south-western Asia (Hasan et al. 2014). The topography of Iraq is like a basin, as it is surrounded on the north and east by mountains about 3,500 meters above sea level. Iraq has a total surface area of almost 437 thousand km² (Abbas et al. 2020).
Figure 1 depicts the seismic and topographical conditions in the research region (Iraq) and the adjacent areas. After the collision of the Anatolian plate with Iran's plate in the north and northeast of Iraq, the Bitlis-Zagros Fold and Thrust Belt was formed; it stretches from Turkey and Iraq to the Strait of Hormuz and produced a magnitude Mw 7.3 earthquake in November 2017 that killed 539 people and injured thousands of others in both countries (Abdulnaby et al. 2014, Shafiqu and Sa'ur 2016). The seismicity of the plate boundary is linked to a variety of boundaries in the Gulf of Aden and the Red Sea, which are both spreading. The frontiers of Iraq, Iran, and Turkey, where the Zagros and Bitlis tectonic meeting zones intersect, are experiencing several earthquakes (Ghalib et al., 2006).
There are numerous faults in the Mesopotamian Foredeep, including the Badra-Amarah fault (Iraq's most seismically active fault), the Euphrates fault (the boundary between the Mesopotamian Foredeep and the Stable Platform), as well as the Al-Refaee, Kut, and Hummar faults (north of Basra). For the most part, Iraq is divided into three main tectonic areas, from northeast to southwest: 1) the Bitlis-Zagros Fold and Thrust Belt; 2) the Mesopotamian Foredeep; and 3) the Inner (Stable) Arabian Plate (Fouad and Sissakian 2011, Onur et al. 2017). Seismological and seismotectonic studies in Iraq have clearly shown that seismic activity varies from moderate to high in the northern and north-eastern areas and decreases in the southern and south-eastern regions of Iraq (Abd Alridha and Jasem 2013).

A description of the database's structure
As illustrated in Figure 2, more than 1800 historical earthquakes with magnitudes ranging from 3.0 to 7.3 struck the research region. Table 1 shows that 187 ground motions occurred between 2004 and 2020, with magnitudes of 4.5 to 7.3, representing the mild, moderate, strong, and significant earthquakes in the research area and the surrounding area.
In order to build and verify the suggested models, the datasets were split into 150 records and 37 records, respectively. A wide range of moment magnitudes (Mw), average shear-wave velocities (Vs30), focal depths (FD), and closest epicentre distances (REpi) are all included in the database. In order to determine the FD, we used worldwide datasets (NOAA, CSEM-EMSC, IRIS, and the USGS), including the REpi and Mw values. The actual PGA and Vs30, on the other hand, were sourced from the GSHAP and USGS databases, respectively. The mapping and comparison were made using ArcGIS. With the availability of input data, models derived using SCTs may be anticipated and used for further progress. When it comes to modelling processes, the data quantity is equally important, as it influences the model's correctness concerning its intended form. Aside from that, the size of the sample and the parameter combinations affect the performance of a model updated based on these inputs. To further explain how seismic parameter details were included in the suggested models, Table 1 summarizes the input data. According to the recommendations, the best dataset-to-input-variable ratio is 3 for model applicability and greater than 5 for additional safety (Frank and Todeschini 1994). Thus, of the 187 datasets, 150 datasets (80%) were utilized to create the models, while the remaining 37 datasets (20%) were used to verify the models. In the current study, the ratios 150/5 = 30 and 37/5 = 7.4 were obtained for the construction and testing datasets, respectively, both comfortably above the recommended minimum of 5.
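The 80/20 split described above can be sketched as follows; the record values and the fixed seed are placeholders, not the study's actual database:

```python
import numpy as np

# Hypothetical sketch: splitting 187 ground-motion records into a
# model-building set (80%) and a verification set (20%), mirroring
# the 150/37 split used in the study. Record values are synthetic.
rng = np.random.default_rng(seed=0)
n_records = 187
# Each record: [Mw, Vs30 (m/s), FD (km), REpi (km)] -- placeholder ranges
records = rng.uniform(low=[4.5, 150.0, 2.0, 5.0],
                      high=[7.3, 900.0, 40.0, 300.0],
                      size=(n_records, 4))

indices = rng.permutation(n_records)
n_build = int(round(0.8 * n_records))      # 150 records
build_set = records[indices[:n_build]]
verify_set = records[indices[n_build:]]    # remaining 37 records

print(build_set.shape, verify_set.shape)
```

Any shuffling scheme that preserves the 150/37 proportions would serve equally well here.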

A Computed Intelligence (CI)
Computed Intelligence (CI), also known as soft computing, is commonly utilized to solve nonlinear systems, complicated mathematical optimization queries, and non-differentiable problems (Jang and Topal 2014). One of the main objectives of the present paper is to precisely predict the PGA of Iraq's tectonic regions employing the composite of ANN and GSA. The composite ANN-GSA algorithm is generated from four effective independent input variables (Mw, Vs30, FD, REpi) against the dependent response PGA as follows:

PGA = f(Mw, Vs30, FD, REpi)  (1)

Artificial Neural Network (ANN)
An ANN represents a data-processing system, which has been produced as a comprehensive numerical form of natural neural networks. In general, the main benefits of utilizing ANNs are their ability to detect errors and to keep training to improve their achievement when dealing with new learning data (Gholami et al., 2013). ANN techniques are capable of modelling the complicated numerical relationship between the input parameters (i.e., Mw, Vs30, FD, and REpi) and the target response (i.e., PGA). A backpropagation (BP) neural network and the Levenberg-Marquardt (LM) training method were used in the current study to prepare, examine, and verify the results. The reason for choosing the LM training algorithm, besides its activity and performance, is that it produces fewer localization failures (Payal et al. 2014). Nevertheless, the LM algorithm needs a significant quantity of operating memory (Kukolj and Levi 2004).
In the ANN, prior to preparation, examination, and verification of the data, several settings must be determined first, i.e., the variables, the number of hidden neurons, the learning rate, and the number of outputs. Based on the selected variables in this research, represented by four parameters (Mw, Vs30, FD, and REpi), the ANN design comprises three zones to determine the response PGA of the tectonic study area: (i) an input zone; (ii) a hidden zone; and (iii) an output zone, as displayed in Figure 3. The input zone comprised the four parameters Mw, Vs30, FD, and REpi. The input variables were weighted by the individual connection strengths and summed for every neuron of the hidden zone to measure the PGA output in the output zone. Tan-sigmoidal activation functions and linear activation functions were utilized in the hidden and output zones, respectively, to cover all varieties of the response (PGA) values. The number of neurons in the hidden zone and the learning rate were chosen according to the GSA algorithm, which determines their most suitable values to achieve the optimum solution. Hence, the performance of the ANN can be improved. This ANN algorithm incorporates the GSA to create a new algorithm, called the composite ANN-GSA, with a minimum error of PGA predictions.
The BP (backpropagation) algorithm in multi-zone feed-forward networks is the appropriate standard algorithm based on the mathematical design of training complex nonlinear connections. The performance index of the BP algorithm is the least mean square error (LMS), which can be determined by calculating the difference between the objective and the network outputs (Eq. 2):

MSE = (1/n) Σ (t_i − o_i)²  (2)

where n, t_i, and o_i represent the number of learning configurations, the target outputs, and the network outputs, respectively.
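The 4-16-1 architecture with a tan-sigmoid hidden layer, a linear output layer, and the mean-squared-error index of Eq. (2) can be sketched as below; the weights and patterns are random placeholders, not the authors' trained network:

```python
import numpy as np

# Illustrative sketch of a 4-16-1 feed-forward pass with a
# tan-sigmoid hidden layer and a linear output, plus the
# least-mean-square-error index of Eq. (2). All values are random.
rng = np.random.default_rng(1)

W1 = rng.normal(size=(16, 4))   # input -> hidden weights
b1 = rng.normal(size=16)
W2 = rng.normal(size=(1, 16))   # hidden -> output weights
b2 = rng.normal(size=1)

def forward(x):
    """Predict PGA from normalized [Mw, Vs30, FD, REpi]."""
    h = np.tanh(W1 @ x + b1)    # tan-sigmoid hidden layer
    return W2 @ h + b2          # linear output layer

X = rng.uniform(size=(10, 4))           # 10 normalized input patterns
targets = rng.uniform(size=10)
outputs = np.array([forward(x)[0] for x in X])

# Performance index: mean squared error between targets and outputs
mse = np.mean((targets - outputs) ** 2)
print(outputs.shape)
```

In the study itself, the weights would be fitted by Levenberg-Marquardt backpropagation rather than drawn at random.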

GSA technique
There are several excellent methods for finding a rational solution in software, and one of the finest is the heuristic algorithm (Rashedi et al., 2009). GSA is a contemporary heuristic algorithm based on Newton's laws of gravitation and motion. An acceptable variable set and sufficient input-parameter values are necessary for an ANN to reflect the correlation between inputs and output; consequently, the challenge concerns the number of hidden nodes and the learning-rate values used in an ANN algorithm. Developing hybridized soft computing models has overcome these drawbacks. In this study, these problems are addressed using the gravitational search algorithm (GSA) to search for the optimal values. The PSO algorithm is one of the distinguished optimization approaches introduced in the literature within soft computing implementations for related structural engineering problems (Yaseen et al. 2018, Chen et al. 2018, Kaveh and Talatahari 2012). In order to determine the optimal ANN variables (the number of neurons in every hidden layer and the learning rate), the GSA (heuristic algorithm) is merged with the ANN. According to the suggested algorithm, agents are treated as objects whose performance is measured by their masses.
In this method, objects are examined and their masses are used to gauge performance, following Newton's equations of motion and gravity (Schutz 2003). The effect of a mass on other masses is illustrated in Figure 4. The heavier masses in this figure are linked to good outcomes and travel more slowly than the lighter masses, which guarantees the exploitation step of the algorithm.
The expressions x_i^d, d, and N represent the position of the ith agent in the dth dimension, the dimension of the search space, and the number of agents, respectively. The gravitational force acting on the ith agent from the jth agent can be formulated as (Rashedi et al. 2009):

F_ij^d(t) = G(t) × [M_i(t) M_j(t) / (R_ij(t) + ε)] × (x_j^d(t) − x_i^d(t))  (4)

where M_j is the jth agent's mass, G(t) is the gravitational constant at time t, R_ij(t) is the Euclidean distance between the ith and jth agents, and ε is a small constant. The gravitational constant decays with time:

G(t) = G_0 exp(−α t / T)  (5)

where G_0 is the gravitational constant's initial value and T is the maximum number of iterations (the total age of the system). The total force applied to the ith agent is denoted as:

F_i^d(t) = Σ_{j ∈ Kbest, j ≠ i} rand_j F_ij^d(t)  (6)

where Kbest is the set of the first k agents with the best fitness (objective-function) values; k decreases linearly with time (Rashedi et al. 2009) until its value becomes 2% of the initial number of agents at the last iteration, and rand_j is a random number in the interval (0, 1). According to Newton's laws of motion, the ith agent's acceleration, velocity, and position are given by Eqs. (8-10) at the tth iteration:

a_i^d(t) = F_i^d(t) / M_i(t)  (8)
v_i^d(t + 1) = rand_i × v_i^d(t) + a_i^d(t)  (9)
x_i^d(t + 1) = x_i^d(t) + v_i^d(t + 1)  (10)

Here fit_i(t) represents the ith agent's fitness value at the tth iteration. The stages of the suggested algorithm are shown in Figure 5. In the GSA, gravitational and inertial masses are presumed to be equal. A more significant inertia mass improves search accuracy, since the agent's movement is slow; in contrast, a larger gravitational mass attracts more agents, resulting in a quicker convergence rate.
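The update loop described by these equations can be sketched as follows. This is a minimal GSA in the spirit of Rashedi et al. (2009) applied to a toy sphere objective; the constants (G0, alpha) and the fitness function are illustrative stand-ins for the study's ANN-tuning objective:

```python
import numpy as np

# Minimal GSA sketch; constants and the objective are illustrative.
rng = np.random.default_rng(2)

def fitness(x):                      # toy objective: sphere function
    return np.sum(x ** 2, axis=1)

n_agents, dim, T = 20, 2, 100
G0, alpha, eps = 100.0, 20.0, 1e-12
X = rng.uniform(-5, 5, size=(n_agents, dim))
V = np.zeros((n_agents, dim))

for t in range(T):
    fit = fitness(X)
    best, worst = fit.min(), fit.max()
    m = (fit - worst) / (best - worst + eps)       # normalized masses
    M = m / (m.sum() + eps)
    G = G0 * np.exp(-alpha * t / T)                # Eq. (5)
    # Kbest shrinks roughly linearly to ~2% of agents at the last step
    k = max(1, int(round(n_agents * (1 - 0.98 * t / T))))
    kbest = np.argsort(fit)[:k]
    A = np.zeros((n_agents, dim))
    for i in range(n_agents):
        for j in kbest:
            if j == i:
                continue
            R = np.linalg.norm(X[i] - X[j])
            # force / M_i gives the acceleration contribution directly
            A[i] += rng.random() * G * M[j] * (X[j] - X[i]) / (R + eps)
    V = rng.random((n_agents, dim)) * V + A        # Eq. (9)
    X = X + V                                      # Eq. (10)

print(float(fitness(X).min()))
```

In the paper's setting, each agent would encode a candidate (number of hidden neurons, learning rate) pair and the fitness would be the ANN's prediction error.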
The total data were divided into two main parts. The first part (80% of the total data) was itself divided into three subsets:
1. 70% of the data was utilized during the training phase.
2. 15% of the data was used during the testing phase.
3. 15% of the data was used during the validation phase.
External data were used to further verify the model's accuracy in the second main part of the data, which accounts for 20% of the total data and whose values are not included in the first part.
In implementing any ANN model, the main problem is the selection of the datasets correlated to the issue under examination. To achieve stability during the analysis of this study, all input and output datasets were normalized between 0 and 1 by applying Eq. (17) before training the network, utilizing the max-min normalization criterion. The normalization of the data record prioritizes all of the computation parameters equally:

X_norm = (X − X_min) / (X_max − X_min)  (17)

where X, X_norm, X_min, and X_max are the actual parameter value, the normalized value of the specific parameter, and the minimum and maximum values of the database, respectively.
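Eq. (17)'s max-min normalization amounts to a per-column rescaling, which can be sketched as below; the four records are placeholders, not the study's database:

```python
import numpy as np

# Sketch of the max-min normalization of Eq. (17): every column is
# rescaled to [0, 1] before training. Records are placeholders.
records = np.array([[5.1, 350.0, 10.0,  40.0],
                    [6.4, 620.0, 18.0, 120.0],
                    [7.3, 900.0, 33.0, 260.0],
                    [4.5, 180.0,  5.0,  15.0]])   # [Mw, Vs30, FD, REpi]

col_min = records.min(axis=0)
col_max = records.max(axis=0)
normalized = (records - col_min) / (col_max - col_min)

print(normalized.round(3))
```

Predictions made on normalized inputs must, of course, be de-normalized with the inverse mapping before being reported as PGA values.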

Statistical model designation by RSM
To construct a new statistical model, numerous phases should be considered: (i) deriving the final model according to the available datasets; and (ii) carrying out a parametric analysis according to the principles of engineering and problem physics. The first phase is carried out using the Response Surface Methodology (RSM). The second phase focuses on engineering principles and must be carried out as a parametric analysis by an engineer who understands the issue being modelled. This study mainly aims to analyze the effect of individual parameters on PGA.
Second-order polynomial or quadratic models can model and evaluate problems using RSM, a statistical and mathematical technique (Box and Draper 1987). An optimal output value can be found by exposing the solution to various factors using this technique (Montgomery 2017). The central composite design (CCD) technique of response surface methodology becomes extremely flexible whenever the intended parameters' preliminary lower and upper limits have been surpassed. Using the CCD process, it is still possible to obtain the best possible values for parameters that lie well outside the initially fixed range. This study aims to learn more about what goes into calculating the PGA. To determine how design parameters interact, the CCD approach of RSM was utilized in conjunction with the DOE method (a statistical approach). This method checks the impact of the parameters on the selected response (Antony 2014).
Through factorial designs, it is possible to discover the ideal combination of factors and their relationships, which is impossible in standard optimization techniques. Aside from that, mathematical models can be generated using these designs as a starting point. For the RSM, Eq. (18) represents a second-order polynomial model, which is commonly used to assess the impact of many variables on a response based on the collected datasets:

y = β_0 + Σ β_i x_i + Σ β_ii x_i² + Σ β_ij x_i x_j + ε  (18)

where y denotes the estimated response, x_i and x_j are coded variables, β_0 denotes the constant, β_i denotes the linear coefficients, β_ii denotes the quadratic coefficients, and β_ij denotes the interactive coefficients (Montgomery 2017). The most significant characteristics illustrating PGA behaviour were picked after conducting a trial study and a literature review (Gandomi et al. 2016). As a result, the formulation of PGA must take into account the link between the response and the specified parameters as described before in Eq. (1), PGA = f(Mw, Vs30, FD, REpi). The RSM mechanism is depicted in Figure 7.
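Fitting the second-order polynomial of Eq. (18) reduces to ordinary least squares on a design matrix of linear, quadratic, and interaction terms. The sketch below uses synthetic data generated from a known quadratic surface for two coded variables, so the recovered coefficients can be checked:

```python
import numpy as np

# Hedged sketch of fitting the Eq. (18) polynomial by least squares
# for two coded variables x1, x2; the data are synthetic.
rng = np.random.default_rng(3)
x1, x2 = rng.uniform(-1, 1, 50), rng.uniform(-1, 1, 50)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + 0.6 * x1**2 + 0.3 * x1 * x2

# Design matrix: constant, linear, quadratic, and interaction terms
X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 3))  # recovers [2.0, 1.5, -0.8, 0.6, 0.0, 0.3]
```

Statistical packages such as Minitab automate this fit and add the ANOVA diagnostics discussed later in the paper.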

RESULTS AND DISCUSSION
The results of the proposed PGA models in terms of the ANN-GSA algorithm (composite PGA_ANN-GSA) and the RSM (PGA_RSM) are presented and discussed in this section, followed by a comparison of the proposed models relative to the actual values (PGA_act.). Validation and verification are conducted to investigate the degree of accuracy of the implicit and explicit models. Finally, the impact of the independent variables on the dependent response (PGA) is examined to determine which parameter has a significant influence during the selection of the ground-motion components.

Composite ANN-GSA Algorithm model (PGA_ANN-GSA)
It should be noted that no single algorithm produces an exact outcome for all optimization problems. Here, the GSA algorithm was run with different population sizes (60, 80, and 100) in the parameter settings to enable the algorithm to choose the best population that accomplished the minimum objective function, as shown in Table 2. To achieve the minimum errors between the actual and predicted PGAs, the training operation of the ANN was replicated numerous times utilizing a high number of iterations (i.e., 1000).
MATLAB software was used to run the GSA algorithm with the selected population sizes to obtain the number of neurons in the hidden layer and the learning rate, as shown in Table 3. The mean absolute error (MAE), which represents the objective function of the composite ANN-GSA for various population sizes, is demonstrated in Figure 8. This figure shows that the optimal solution is obtained with a population size of 100, since it reaches the lowest MAE after about 125 iterations, compared to the population sizes of 60 and 80, which need more time, about 160 and 275 iterations respectively, to reach the minimum error. According to the results of the proposed composite ANN-GSA, the ANN was run based on the variables that achieved the minimum MAE over the population sizes, resulting in a high calculation accuracy of the proposed PGA model.

ANN training and validation
To choose the best number of neurons in every hidden layer and the best learning rate, the GSA is utilized to optimize the ANN run, which operated using the input-parameter values (Mw, Vs30, FD, and REpi) and the output response of the actual PGA for Iraq's tectonic regions. The potential of different ANN structures was evaluated. Figure 9 depicts the ANN structure with four input variables, sixteen hidden neurons in the hidden layer, and a single output variable. This structure was evaluated to determine the optimized circumstances of the PGA for Iraq's tectonic region.

PGA model using RSM (PGA_RSM)
To capture the PGA response against the other variables (Mw, Vs30, FD, and REpi), response surface methodology (RSM) is employed using Minitab software. The results, with the ANOVA (analysis of variance) and model summary, are demonstrated in Table 3. Moreover, the T-value and P-value are two vital factors in evaluating the correlation between variables, and the two are connected, going hand in hand (Winship and Zhuo 2020). When the T-value is close to 0 (negative or positive), it is more probable that there is no notable variation between variables, as shown in Table 3. The P-value is utilized to determine the strength of evidence in the data provided. Generally, the lower the P-value, the greater the sample evidence for a significant correlation (Chaubey 1993). By convention, a P-value higher than 5% is called not statistically significant, and vice versa (Altman and Bland 1995). It should be noted from Table 3 that P-values below 0.05 indicate the corresponding factors are significant in the PGA findings.
A realistic measure of the degree of multi-collinearity in a regression is the Variance Inflation Factor (VIF), which quantifies how strongly two or more predictors in the regression are associated; for more information, see (Robinson and Schumacker 2009). The VIF is, therefore, an essential part of examining interaction effects in multiple regression. According to Table 3, the VIF values are around 1, so the regressions are well shaped and free of multi-collinearity. As mentioned previously, out of the 187 datasets, 80% (i.e., 150 datasets) are adopted for the explicit-model fabrication process, and the remaining 20% (37 datasets) are considered for verifying the final model. A fit factorial multi-linear regression is conducted to find the best correlation between the dependent response and the independent parameters (PGA versus Mw, Vs30, FD, and REpi), from which the explicit equation is formulated. Following the building of the model, it is essential to assess its statistical trustworthiness. From a statistical point of view, one condition has to be satisfied by a model for its reliability to be approved: normalized residuals. The term residual denotes the variation between an experimentally determined value and the one achieved by the model.
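The VIF diagnostic mentioned above can be computed directly as VIF_j = 1/(1 − R_j²), where R_j² comes from regressing predictor j on the remaining predictors. The sketch below uses synthetic, nearly uncorrelated predictors as stand-ins for Mw, Vs30, FD, and REpi:

```python
import numpy as np

# Sketch of the Variance Inflation Factor check; predictor columns
# are synthetic stand-ins, not the study's database.
rng = np.random.default_rng(4)
X = rng.normal(size=(150, 4))          # nearly uncorrelated predictors

def vif(X, j):
    """VIF of column j: regress it on the other columns (with intercept)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]
print([round(v, 2) for v in vifs])     # values near 1 -> little collinearity
```

Values well above 1 (commonly thresholds of 5 or 10 are cited) would instead signal problematic collinearity between the predictors.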
The findings of the PGA fit multi-linear regression analysis are shown in Figure 13.

Comparative study
Datasets comprising 20% (37 records) of the overall 187 records are employed to evaluate the proposed PGA models produced by the ANN-GSA algorithm and the RSM. These data were not used in the model-building process. Comparisons are conducted between the prediction models, and the outcomes achieved by the proposed models on the verification records are displayed in this section. The outcomes show that the proposed composite model (PGA_ANN-GSA) performs better than the model produced by the RSM (PGA_RSM).
The standard deviation (SD) was used to measure the data variance; a smaller SD means less data variance, and conversely. The coefficient of variation (CoV) measures the actual amount of relative variation. The correlation coefficient (R) alone is an inadequate indicator of a model's predicting effectiveness, since R is not sensitive to output values multiplied by a constant (Gandomi et al. 2011). Accordingly, another function should be employed to assess the proposed models' performance. The suggested models' performance may be assessed by comparing their performance indices (PI), which combine the RMSE with R (Eq. 20), where PGA_act. refers to the actual values, PGA_pred. refers to the predicted outputs, and the corresponding means are the average actual and predicted values for n samples.
The value of PI lies between 0 and +∞, and the relationship between PI and R is inverse. Thus, for a perfect prediction performance, the PI value should be close to 0, which means a higher R with a lower RMSE. As shown in Table 5, the PI values are very small, about 0.0613 and 0.0862 for PGA_ANN-GSA and PGA_RSM respectively, which indicates that the proposed models closely predict the measured values of PGA_act. According to Table 5, the statistical performance of the composite model produced by the ANN-GSA algorithm exhibits better outcomes relative to the model generated by the RSM approach in terms of R, RMSE, CoV, mean, and PI. As shown in Figure 14, the histogram of the (PGA_act./PGA_pred.) ratio illustrates the high accuracy of the suggested equations in prediction: a model has a respectable level of predictive accuracy when the ratio of actual to predicted values is close to 1.0.
In addition, the absolute relative error (ARE) may be a preferred method for assessing the model prediction capabilities through the relative error distributions (Bagheri et al., 2012). ARE can be computed as:

ARE (%) = |(PGA_act. − PGA_pred.) / PGA_act.| × 100

Ideally, the frequency should decrease as ARE% increases. This is seen distinctly in Figure 15, where the proposed models have the highest frequency at the lowest ARE (less than 10%) and the lowest frequency at the largest ARE (more than 17.5%). Therefore, the two predicted PGA models have a very satisfactory error distribution.
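The comparison statistics used in this section (RMSE, R, the CoV of the actual/predicted ratio, and ARE%) can be sketched as below; the actual/predicted pairs are illustrative placeholders, not the study's verification records:

```python
import numpy as np

# Sketch of the comparison statistics: RMSE, correlation coefficient
# R, coefficient of variation of the actual/predicted ratio, and the
# absolute relative error (ARE%). The data pairs are placeholders.
actual = np.array([1.2, 0.8, 2.5, 3.1, 0.6, 1.9])       # PGA_act. (m/s^2)
predicted = np.array([1.1, 0.9, 2.4, 3.3, 0.7, 1.8])    # PGA_pred.

rmse = np.sqrt(np.mean((actual - predicted) ** 2))
r = np.corrcoef(actual, predicted)[0, 1]
ratio = actual / predicted
cov = ratio.std(ddof=1) / ratio.mean()                  # coefficient of variation
are = np.abs((actual - predicted) / actual) * 100.0     # ARE%

print(round(rmse, 3), round(r, 3))
```

A histogram of `ratio` corresponds to Figure 14, and a histogram of `are` to the error distribution of Figure 15.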

Parametric analyses
To evaluate the impact of individual parameters on PGA_act., Figures 16(a-c) show the PGA as a function of pairs of parameters: Mw with Vs30, FD, and REpi, respectively. It can be observed that increases in the individual variables (Vs30 and Mw) up to a specific range lead to an increase in PGA values, as depicted in Figure 16. By contrast, PGA_act. decreases to less than 2.5 m/s² with increasing REpi. Thus, the tested parametric analyses could directly enhance the assessment of PGA_act. through the selection of suitable parameters. The results of the current study were examined further in terms of the interaction plot for the PGA_act. response, as demonstrated in Figure 17. Different levels of one factor may produce different responses at different levels of another; as a result, there is an interplay between the various variables. Any point where the lines cross denotes a relationship between two variables, and non-parallel lines in the interaction plots (Figure 17) indicate such interactions.

Screening analysis
Screening analysis, or analysis of a factorial design, is of extreme significance for selecting the necessary input parameters. Screening analysis offers a valuable method for assessing and evaluating the contributions of every predictive parameter to the response. To achieve that, Minitab software is utilized to conduct the screening analysis by inputting the actual values of the response against the other parameters. The findings of the screening analysis are shown in Figure 18, employing a standardized diagram. This figure demonstrates that the significant parameters impacting the peak ground acceleration (PGA) value are, in order, the earthquake magnitude (Mw) and the average shear velocity (Vs30), followed by FD and REpi.

CONCLUSIONS
The contribution of this study includes facilitating artificial intelligence for earthquake and hazard predictions. The following are the main findings of this study:
• The collected datasets can be utilized to construct a novel model for forecasting peak ground acceleration, especially for the Iraq tectonic region.
• Soft computing algorithms and Response Surface Methodology (RSM) are effective, practical, and efficient techniques for engineering problems, providing an optimized solution with sufficient accuracy utilizing various parameters, and can be easily employed for predicting PGA values.

• The implicit and explicit models produced by this study are considered the first attempt to predict the PGA values for the Iraqi tectonic zone. Besides, the models could be efficiently employed to provide estimates of the PGA values in a spreadsheet or by hand calculation.
• Minimum values are obtained for the absolute relative error (%) for both models. These statistical results indicate the solid accuracy and compatibility of the forecasts made by the soft computing models. The statistical results show that the RSM approach displays slightly lower accuracy compared to the GSA-ANN technique.

• The Mw and the VS30 reflect the PGA's most influential factors, in accord with the standardized plot.
• Data that can be utilized to predict the maximum PGA are rarely available. As a result, further data should be used to assess and refine the proposed peak ground acceleration models and to investigate a broader range of parameters. Consequently, one of the most prominent limitations facing the current paper is its restriction to data within the Iraq region; the current study therefore suggests generalizing the current models to a wider range that includes data from other countries. Further improvements may be made to obtain more accurate results, including employing the parameters used in this study for related domains. Besides, extra data with a wide range of durations and parameters for Iraq's tectonic zones are recommended for comparison with the findings obtained herein, since this research is the first attempt to propose an Iraqi PGA formula.

Figure 1 :
Figure 1: Seismo-geographical map of Iraq and the Arabian Peninsula (red arrows: plate motion in cm per year; blue lines: plate boundaries; red lines: country boundaries) (Abdulnaby et al. 2020).

Figure 3 :
Figure 3: A schematic design of the neural network.

Figure 4 :
Figure 4: Effect of a mass on other masses. Eq. (3) defines the location of the i-th agent.

Figure 5 :
Figure 5: Flowchart of the steps of the GSA algorithm.

3.2.3 Composite ANN-GSA development
The network's weights are prepared from a dataset named the preparing (training) data. Through the preparing trial, the network's parameters are optimized. Besides, the preparation system includes two significant actions: initialization and training.
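For readers unfamiliar with the flowchart of Figure 5, the gravitational search algorithm can be sketched minimally as follows. The sphere test objective, the agent count, and the decay schedule of the gravitational constant G are illustrative choices, not the exact settings used in this study:

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):
    """Toy objective to minimize (stand-in for the ANN training error)."""
    return float(np.sum(x ** 2))

n_agents, dim, iters = 20, 3, 100
X = rng.uniform(-5.0, 5.0, (n_agents, dim))   # agent positions
V = np.zeros((n_agents, dim))                 # agent velocities

for t in range(iters):
    fit = np.array([sphere(x) for x in X])
    best, worst = fit.min(), fit.max()
    # Masses from fitness: best agent -> mass ~1, worst agent -> mass 0
    m = (fit - worst) / (best - worst - 1e-12)
    M = m / (m.sum() + 1e-12)                 # normalized masses
    G = 100.0 * np.exp(-20.0 * t / iters)     # decaying gravitational constant
    # Acceleration of each agent (its own mass cancels out of force/mass)
    A = np.zeros_like(X)
    for i in range(n_agents):
        for j in range(n_agents):
            if i != j:
                diff = X[j] - X[i]
                dist = np.linalg.norm(diff) + 1e-12
                A[i] += rng.random() * G * M[j] * diff / dist
    # Velocity and position updates (cf. Eq. (3) for the agent locations)
    V = rng.random((n_agents, dim)) * V + A
    X = np.clip(X + V, -5.0, 5.0)             # keep agents in the search box

best_found = min(sphere(x) for x in X)
print("best objective found:", best_found)
```

In the composite model, this search loop would drive the ANN's weight initialization instead of the sphere function, with the network error as the fitness.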

Figure 6 :
Figure 6: Schematic diagram of the proposed composite ANN-GSA algorithm.

Figure 9 :
Figure 9: The structure of the developed ANN for the proposed PGA model. In the ANN, the learning dataset of the composite ANN-GSA is divided into three categories: training data, testing data, and validation data, as shown in the regression graph of Figure 10. The x-axis (target) and y-axis (output) represent the actual and predicted PGA, respectively. The statistical results of the correlation coefficient (R) for the training, testing, and validation records show a good correlation between the actual and predicted PGA produced by the ANN-GSA algorithm. Most residual error values (differences between target and output) are remarkably small and located near the zero line, as revealed in the error histogram with 20 bins of Figure 11. The performance of the LM algorithm (Levenberg-Marquardt backpropagation) throughout the training development is plotted in Figure 12 using MATLAB R2019b. The variation of the MSE (mean square error) with training epochs is drawn, where the best validation performance is 0.897 at epoch 2.
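The split-and-correlate procedure behind Figures 10-12 can be illustrated with a toy network. The sketch below uses synthetic stand-in data and plain gradient descent (in place of the Levenberg-Marquardt training performed in MATLAB) to fit a one-hidden-layer network and report the correlation coefficient R on the three subsets:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in data: 4 inputs (Mw, VS30, FD, REpi) -> PGA-like target
X = rng.uniform(0.0, 1.0, (187, 4))
y = (1.5 * X[:, 0] - 0.8 * X[:, 1] + 0.3 * X[:, 2] - 0.5 * X[:, 3]
     + 0.05 * rng.standard_normal(187))

# 70/15/15 split into training, validation, and testing indices
idx = rng.permutation(187)
tr, va, te = idx[:131], idx[131:159], idx[159:]

# One hidden layer with tanh units, trained by full-batch gradient descent
W1 = rng.standard_normal((4, 8)) * 0.5
b1 = np.zeros(8)
W2 = rng.standard_normal(8) * 0.5
b2 = 0.0
lr = 0.1
for epoch in range(3000):
    H = np.tanh(X[tr] @ W1 + b1)              # hidden activations
    err = (H @ W2 + b2) - y[tr]               # output residual
    # Backpropagation of the mean-square-error gradient
    gW2 = H.T @ err / len(tr)
    gb2 = err.mean()
    gH = np.outer(err, W2) * (1 - H ** 2)
    gW1 = X[tr].T @ gH / len(tr)
    gb1 = gH.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

def r_coef(subset):
    """Correlation coefficient R between network output and target."""
    out = np.tanh(X[subset] @ W1 + b1) @ W2 + b2
    return np.corrcoef(out, y[subset])[0, 1]

print("R train:", r_coef(tr), "R val:", r_coef(va), "R test:", r_coef(te))
```

A regression plot such as Figure 10 then simply scatters the network output against the target for each subset, with R annotated per panel.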

Figure 10:
Figure 10: Plot of the regression analysis (LM algorithm) for the proposed model (composite ANN-GSA) (Minitab). The residuals are shown in the outline for demonstration purposes. The figures showing the distribution of differences between predicted and observed values reveal few outliers in the dataset. The findings of the PGA-fit multi-linear regression analysis are shown in Figure 13 (Minitab). These graphs afford good evidence to confirm the use of the suggested PGA model. The reliability is verified since the condition of model authenticity has been satisfied.

Figure 14:
Figure 14: Frequency of the ratio of actual to predicted PGA values.
Figure 17(a) reveals a significant interaction between Mw and VS30. Similar outcomes are indicated for FD, as depicted in Figure 17(b). By contrast, the interaction between REpi and the response is slightly weak, as presented in Figure 17(c).

Figure 17 :
Figure 17: Plot of the interaction between Mw and the other parameters of the fitted response (PGA).

Figure 18 :
Figure 18: The findings of the screening analysis

Table 2
shows the GSA parameters, the neurons in each hidden layer, and the ANN learning rate based on population size.
Latin American Journal of Solids and Structures, 2022, 19(3), e447
The coefficient of variance (CoV) reflects the correctness between the output and input data. According to Pimentel-Gomes (2000), a CoV value of less than 10% indicates great precision, values of 20-30% indicate low precision, and values above 30% indicate very low precision. Table 4 demonstrates that both proposed models have reasonable CoV values. The CoV values for the two models (ANN-GSA and RSM) were 8.739% and 10.362%, respectively, showing great accuracy in determining the target values. Furthermore, excellent mean values adjacent to 1.0 were obtained for both models (1.061 and 1.07). The RSM shows slightly lower accuracy compared with the GSA-ANN technique.
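The two accuracy measures quoted above can be reproduced for any pair of actual/predicted series. A minimal sketch (the PGA values below are made-up examples, not the study's records):

```python
import numpy as np

def accuracy_measures(actual, predicted):
    """Mean and coefficient of variance (CoV, %) of the actual/predicted ratio."""
    ratio = np.asarray(actual, float) / np.asarray(predicted, float)
    mean = ratio.mean()
    cov = ratio.std(ddof=1) / mean * 100.0   # sample CoV, in percent
    return mean, cov

# Made-up PGA values (m/s^2) for illustration only
actual    = np.array([0.80, 1.20, 2.50, 3.10, 0.60])
predicted = np.array([0.75, 1.30, 2.40, 2.90, 0.65])
mean_ratio, cov = accuracy_measures(actual, predicted)
print(f"mean ratio = {mean_ratio:.3f}, CoV = {cov:.2f}%")
# Pimentel-Gomes (2000): CoV below 10% indicates great precision
```

A mean ratio near 1.0 indicates an unbiased model, while the CoV measures the scatter of the predictions around that mean.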

Table 4
The comparative details of actual and predicted PGA values.

Table 5
Statistical comparison of the performances of the developed PGA models.