A nonlinear time-series prediction methodology based on neural networks and tracking signals

Paper aims: This paper presents a nonlinear time series prediction methodology that uses Neural Networks together with the Tracking Signals method to detect bias and assess responsiveness to non-random changes in the time series. Originality: This study contributes an innovative nonlinear time series prediction methodology. Furthermore, Design of Experiments was applied to simulate datasets and to analyze the Average Run Length results, identifying the conditions under which the methodology is efficient. Research method: Datasets were generated to simulate different nonlinear time series by changing the error of the series. The methodology was applied to the datasets and Design of Experiments was used to evaluate the results. Lastly, a case study based on total oil and grease was performed. Main findings: The results showed that the proposed prediction methodology is an effective way to detect bias in the process when an error is introduced in the nonlinear time series, since the mean and the standard deviation of the error have a significant impact on the Average Run Length.

Time series forecasting has been applied in several areas, such as studies of the effects of adjuvant endocrine therapy on health-related quality of life, supply chains (Mircetic et al., 2022), and econometric approaches (Matta et al., 2021).
Researchers often search for the forecasting method whose predictions come closest to the real data, and several studies have been conducted to evaluate time series forecasting methods (Liu et al., 2021; Verma et al., 2021; Mao & Xiao, 2019).
Traditional methods include time series regression, Auto-Regressive Integrated Moving Average (ARIMA), and exponential smoothing based on linear models (Kumar & Murugan, 2017). All these methods assume a linear relationship among the past values of the forecast variable and therefore nonlinear patterns cannot be captured by these models (Wong et al., 2010).
Artificial Neural Network (ANN) models have been proposed over the last few years to obtain accurate forecasting results (Bandeira et al., 2020), as attempts to improve on the conventional linear and nonlinear approaches. In this context, ANN-based approaches to time series forecasting have produced convincing results for nonlinear models in recent decades (Corzo & Solomatine, 2007; Hippert & Taylor, 2010; Xiao et al., 2012).
Considering the dynamic nature of time series forecasting, it is not unusual for the model to need explicit updating after a certain number of time periods (Deng et al., 2004). Therefore, monitoring plays an important role in accurate forecasting (Makridakis & Wheelwright, 1989). Monitoring is important to determine whether a deviation occurs in the time series and whether corrective action in the model needs to be taken to bring the forecasting process back under control. Improper monitoring of results may lead to uncertain predictions and incorrect decisions. Nevertheless, the questions of when and how to update the model parameters are yet to be answered. Some studies have compared monitoring methods for time series (Gardner Junior, 1985; Superville, 2019; Brence & Mastrangelo, 2006), and different monitoring approaches have been proposed in the forecasting area. Tracking signal methods have been used to check the bias of forecasting methods (Sabeti et al., 2016) and to warn when there are unexpected outcomes from the forecast (Kumar & Murugan, 2017). Tracking signals can automatically detect changes in the forecast errors when the forecast is misbehaving. Various tracking signal measures exist, one of the earliest being the cumulative sum (CUSUM) tracking signal first proposed by Brown (1959).
Although the use of tracking signals has been common practice in traditional time series forecasting, the issue has not been widely addressed in intelligent time series forecasting such as ANN. Some studies used tracking signals during ANN training to select the best ANN model (Yu & Lai, 2005; Kumar & Murugan, 2017), but they are not commonly used for monitoring the forecasts.
Intelligent time series forecasting can learn and/or infer from historical data; however, the notion that an intelligent mechanism is capable of resolving all problems automatically is a misconception (Deng et al., 2004).
Although similar issues have been noted by several authors (Berry & Linoff, 1997; Deboeck, 1994; Gardner Junior, 1985; Superville, 2019; Brence & Mastrangelo, 2006), very few studies in the literature have addressed them when combining artificial neural networks and tracking signals. Therefore, this paper proposes a nonlinear time series prediction methodology based on neural network forecasting that uses the tracking signal method to detect bias and assess responsiveness to non-random changes in the time series. Different from many studies published in the area, this study generates synthetic datasets to simulate different changes in the time series, compares the monitoring performance using the concept of Average Run Length (ARL), and applies Design of Experiments (DOE) to evaluate the performance of the tracking signals, identifying the conditions under which the predictor is efficient.
The paper is structured as follows: Section 2 consists of a brief background review of nonlinear time series forecasting based on multilayer perceptron and tracking signals, respectively. Section 3 contains the simulation study showing the experimental design and the forecasting system and Section 4 presents a case study. Finally, Section 5 states the main conclusions of this paper and some discussions.

Nonlinear time-series
In time series models, historical data of the variable to be forecast are analyzed in an attempt to identify a data pattern. Then, assuming that it will continue in the future, this pattern is extrapolated to produce forecasts (Armstrong, 2001; Krishnamurthy, 2006). Classical time series models can be classified into two categories: linear models and nonlinear models.
Linear time series models have been widely used in recent years. According to Balestrassi et al. (2009), a stochastic time series is said to be linear if it can be written as shown in Equation 1, where μ is a constant, the ϕ_i are real numbers with ϕ_0 = 1, and {α_t} is a sequence of independent and identically distributed (IID) random variables with a well-defined, continuous distribution function. Any stochastic process that does not satisfy the condition of Equation 1 is said to be nonlinear (Tsay, 2005).
However, the time series problems found in human activities and nature are mostly nonlinear. Therefore, research on time series prediction is mainly focused on nonlinear models, and there has been increasing interest in extending the classical framework of Box & Jenkins (1970) to incorporate nonstandard properties such as nonlinearity, non-Gaussianity, and heterogeneity.
Many nonlinear time series models have been proposed in the statistical literature, such as the Threshold Autoregressive, Nonlinear Moving Average, Smooth Transition Autoregressive, and Bilinear models (Tong, 1978; Amiri, 2015; Hamilton, 1989). A deeper theory of nonlinear time series can be found in Priestley (1980). The basic idea underlying these nonlinear models is to let the conditional mean evolve over time according to some simple parametric nonlinear function. Table 1 shows the nonlinear time series implemented and simulated in the present study. In each case, e_t ~ N(0,1) is assumed to be independent and identically distributed. These three time series models were chosen to represent a variety of problems with different time series characteristics. For example, some of the series have autoregressive (AR) or moving average (MA) correlation structures. The AR part involves regressing the variable on its own lagged values, while the MA part models the error term as a linear combination of error terms occurring contemporaneously and at various times in the past.
In contrast to the traditional piecewise linear model, which allows model changes to occur in the time space, the Threshold Autoregressive (TAR) model uses the threshold space to improve the linear approximation. A time series is said to follow a k-regime self-exciting TAR (SETAR) model with a threshold variable. A criticism of the SETAR model is that its conditional mean equation is not continuous; the thresholds are the discontinuity points of the conditional mean function. In response to this criticism, smooth TAR (STAR) models have been proposed (Chan & Tong, 1986).
The Nonlinear Moving Average (NMA) model specifies that the output variable depends nonlinearly on the current and various past values of a stochastic term (De Gooijer & Hyndman, 2006).
The Bilinear model is a natural extension of nonlinearity employing second-order terms in the expansion to improve the approximation (Tsay, 2005). This model was introduced by Granger & Anderson (1978) and has been widely investigated. Properties of bilinear models, such as stationarity conditions, are often derived by putting the model in a state-space form and using the state transition equation to express the state as a product of past innovations and random coefficient vectors.
Although the properties of these models tend to overlap somewhat, each is able to capture a wide variety of nonlinear behavior. In most time series, however, this kind of modeling is even more complex due to features such as high frequency, daily and weekly seasonality, calendar effects on weekends and holidays, high volatility, and the presence of outliers (Balestrassi et al., 2009).
Forecasting methods predict values in the future based on a given time series dataset, making assumptions about the future by evaluating historical data (Santos et al., 2020).
Artificial Neural Networks are among the most used forecasting methods and are widely accepted as a technology offering an alternative way to tackle complex and ill-defined problems (Yu & Lai, 2005). The main reason for the increased popularity of ANNs is that these models are able to approximate almost any nonlinear function arbitrarily closely. The ANN model can approximate any well-behaved nonlinear relationship to an arbitrary degree of accuracy, in much the same way that an ARMA model provides a good approximation of general linear relationships (Chen & Chen, 1995; Hornik, 1993).
Examples of ANN for nonlinear time series include Multilayer Perceptron (MLP), Radial Basis Function (RBF), Generalized Regression Neural Network (GRNN), and Support Vector Machine (SVM).
Multilayer Perceptron is one of the most popular network types (Aizenberg et al., 2016; Olson et al., 2012) and, in many problem domains, seems to offer the best possible performance for describing a relationship between independent and dependent variables (Kialashaki & Reisel, 2013). MLP is a feedforward network trained with backpropagation learning algorithms (Zhai et al., 2016) and consists of one input layer, one or more hidden layers, and one output layer, as shown in Figure 1. A hidden layer is a group of neurons that have a specific function and are processed as a whole. Theoretical results prescribe that an MLP with one hidden layer (three-layer perceptron) is capable of approximating any continuous function (Hornik, 1993). Each layer consists of neurons, and the neurons in two adjacent layers are fully connected with respective weights, while the neurons within the same layer are not connected (Balestrassi et al., 2009). For each neuron in the hidden or output layer, the following input-output transformation is defined in Equation 2.
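In its standard form, consistent with the variable definitions below, this transformation can be written as

$v = f\left( w_0 + \sum_{h=1}^{H} w_h\, u_h \right)$  (2)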
where v is the output, H is the total number of neurons in the previous layer, u_h is the output of the h-th neuron in the previous layer, w_h is the corresponding connection weight, w_0 is the bias, and f is the nonlinear activation function. A backpropagation neural network is called self-adaptive because the simulated neurons in the network organize themselves continuously according to the feedback of the output and of the whole network (Haykin, 2009).
During training, the values of w_h are continuously adjusted according to the feedback obtained from the real value of the response variable (Mo et al., 2017), and the weights and biases are optimized by minimizing the sum of the squares of the differences between the desired output and the estimated output (Balestrassi et al., 2009). The training of the network is carried out until it reaches a stable state, i.e., when there are no more significant changes in the values of the synaptic weights (Haykin, 2009). The learning rate represents the rate at which the weights are adjusted; a higher learning rate allows the network to converge more rapidly, but the chances of reaching a non-optimal solution are greater.
Therefore, ANN can be widely used for the purpose of modelling nonlinear problems. One of the main advantages of ANN is that it is not necessary to know in advance a mathematical model that represents the data set (Chang & Tseng, 2017). Thus, ANN can describe nonlinear processes with good accuracy (Sun et al., 2017).

Tracking signals
Monitoring is an important component of a time series forecasting system since there is no guarantee that the past behavior and characteristics of the system continue in the future. In the forecasting and time series fields, tracking signals are used to monitor forecasting systems. Tracking signals have been applied to forecast errors and have proven useful in determining whether processes should be allowed to continue uninterrupted or if intervention is required to bring the process back in control (Krishnamurthy, 2006).
Tracking signals, in general, are ratios of the forecast errors to the mean absolute deviation. If the ratio exceeds a pre-specified limit, the forecasting approach is reexamined to see whether the pattern has changed and whether some action needs to be taken (Krishnamurthy, 2006).
The first tracking signal was proposed by Brown (1959) and subsequently analyzed by several researchers, including Trigg (1964) and Gardner Junior (1985).
One of the most common tracking signals, shown in Equation 3, compares the cumulative sum (CUSUM) of the errors at the end of each period with an unsmoothed mean absolute deviation (MAD). The CUSUM tracking signal is presented in most standard production management texts and is recommended by The Association for Operations Management (Ravi, 2014).
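A common formulation of this signal, consistent with the definitions below, is

$TS_t = \dfrac{\sum_{i=1}^{t} e_i}{MAD_t}, \qquad MAD_t = \dfrac{AD_t}{t} = \dfrac{1}{t}\sum_{i=1}^{t} \lvert e_i \rvert$  (3)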
where the forecast error e_t is the actual time series value X_t minus the forecast F_t, and AD_t is the cumulative sum of the absolute errors.
The cumulative error can be either positive or negative, so the tracking signal can be positive or negative as well. If the forecast value is higher than the actual value, the model is over-forecasting and the TS will be negative (Kumar & Murugan, 2017).
If the TS stays between the control limits, the forecast model is working correctly (Kumar & Murugan, 2017). The control limits are determined using the concept of average run length (ARL). The ARL is the average length of time until the tracking signal exceeds the control limits, starting from an arbitrarily selected point in time (Sun et al., 2017), and it determines the probability of detecting changes in the time series. If the time series has no changes, the in-control average run length (ARL0) measures the probability of a false alarm, defined as a Type I error. Therefore, the control limits should be defined by simulation to yield a desired probability of a false alarm.
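As an illustration only (the original study used Matlab; the function and variable names below are illustrative, not taken from the paper), the CUSUM tracking signal and its comparison against fixed control limits can be sketched in Python as follows:

```python
import numpy as np

def tracking_signal(errors):
    """CUSUM tracking signal: cumulative forecast error divided by the running MAD."""
    errors = np.asarray(errors, dtype=float)
    periods = np.arange(1, len(errors) + 1)
    cusum = np.cumsum(errors)                     # cumulative sum of forecast errors
    mad = np.cumsum(np.abs(errors)) / periods     # unsmoothed mean absolute deviation
    return cusum / mad

def first_out_of_control(ts, lcl=-4.0, ucl=4.0):
    """1-based index of the first TS value outside the control limits, or None."""
    out = np.where((ts > ucl) | (ts < lcl))[0]
    return int(out[0]) + 1 if out.size else None

# Example: N(0, 1) forecast errors with a hypothetical mean shift after sample 100
rng = np.random.default_rng(0)
errors = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(0.8, 1.0, 50)])
signal = tracking_signal(errors)
print(first_out_of_control(signal))
```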

The nonlinear time-series prediction methodology
The proposed methodology has the objective of performing nonlinear time series forecasting and monitoring the forecast errors to detect bias in the time series model, thereby ensuring forecasting accuracy.
The methodology is divided into five steps, as shown in Figure 2:
1) Define the variable to be controlled and obtain the historical time series data;
2) With the selected data, perform time series forecasting using an ANN (MLP);
3) From the real and predicted values, calculate the forecast errors and obtain the TS; the upper control limit (UCL) and the lower control limit (LCL) of the tracking signal chart are also defined;
4) Collect the actual data corresponding to the periods predicted by the forecasting model and compare them with the predicted values;
5) Calculate the TS values and monitor the forecast errors through the tracking signal chart. If the TS exceeds the control limits, the ANN is trained again; otherwise, return to step 4.
The methodology can also be described by the pseudocode in Table 2, which is associated with the steps in Figure 2. It is analyzed and validated in the next sections through a simulation study and a case study.
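The loop below is a minimal Python sketch of steps 4 and 5 (the authors' own pseudocode is in Table 2 and is not reproduced here); fit_mlp, forecast_next, and tracking_signal are placeholders for the ANN training, one-step-ahead forecasting, and tracking signal routines described above:

```python
def monitor_forecasts(series, n_train, lcl, ucl, fit_mlp, forecast_next, tracking_signal):
    """Sketch of steps 4-5: compare actual vs. predicted values and retrain on bias."""
    model = fit_mlp(series[:n_train])                   # steps 1-2: train the ANN on historical data
    errors, retrain_points = [], []
    for t in range(n_train, len(series)):
        predicted = forecast_next(model, series[:t])    # one-step-ahead forecast (step 4)
        errors.append(series[t] - predicted)            # forecast error e_t = X_t - F_t
        ts_now = tracking_signal(errors)[-1]            # current tracking signal (step 5)
        if ts_now > ucl or ts_now < lcl:                # bias detected: update the model
            retrain_points.append(t)
            model = fit_mlp(series[:t + 1])
            errors = []                                 # restart the CUSUM after retraining
    return retrain_points
```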

Simulation study
In this section, the simulation study of the nonlinear time series predictor is examined to test different scenarios and to generalize the results. First, the DOE applied to the simulated datasets is presented. The datasets contain a part of the original time series and a part in which the series error (ε_t) is modified. Next, the time series forecasting and the tracking signals are applied. Finally, the ARL results of the tracking signals are analyzed by DOE, making it possible to determine how much the series error (ε_t) can be modified while the tracking signals remain efficient in detecting the series shift.

Design of experiments
DOE is a commonly used technique for processes in which experiments are planned to find an optimal and robust solution through the combination of input variables at different levels (Lee et al., 2007; Dascalescu et al., 2008). The DOE technique can also be used for simulation problems (Figure 3). Applied to simulation, DOE increases the transparency of simulation model behavior and the effectiveness of reporting simulation results (Lorscheid et al., 2012). Furthermore, it allows the factors used in the simulation to be controlled and produces better and faster results than trial-and-error simulation. Therefore, DOE is a useful and necessary part of the analysis of simulations (Lee et al., 2007).
To generate time series datasets distinct from the original models (Table 1), the series error (ε_t) present in the time series models was modified by changing its mean and standard deviation in a controlled way through the DOE technique. The original series error (ε_t) follows a normal distribution N(0; 1).
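For illustration, one simulated dataset at a given design point could be generated as in the sketch below; the recursion used here is a generic nonlinear autoregression standing in for the actual Table 1 models (which are not reproduced in this text), and mu_mod and sigma_mod are the hypothetical modified mean and standard deviation of ε_t defined by the DOE described next:

```python
import numpy as np

def simulate_dataset(mu_mod, sigma_mod, n_original=100, n_modified=50, seed=0):
    """Series driven by N(0, 1) errors at first and by N(mu_mod, sigma_mod) afterward."""
    rng = np.random.default_rng(seed)
    eps = np.concatenate([rng.normal(0.0, 1.0, n_original),
                          rng.normal(mu_mod, sigma_mod, n_modified)])
    y = np.zeros(eps.size)
    for t in range(1, eps.size):
        # generic nonlinear recursion used only for illustration (not a Table 1 model)
        y[t] = 0.7 * y[t - 1] * np.exp(-y[t - 1] ** 2) + eps[t]
    return y

series = simulate_dataset(mu_mod=0.8, sigma_mod=0.5)   # one hypothetical design point
```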
The levels assumed for the mean of the series error (ε_t) were defined considering the concept of effect size. Effect size is defined as an estimate of the magnitude of the relationship between variables or of the difference between two samples (Rosenthal, 1994). Cohen (1988) classified the effect size as small (d = 0.2) when the difference between two samples is difficult to see with the naked eye, and as large (d = 0.8) when the difference between two samples is evident to the naked eye.
According to Cohen (1988), for d = 0.2, 58% of the modified series error will lie above the mean of the original series error; the two error distributions will overlap by 92%; and there is a 56% chance that a randomly chosen element from the modified series error will be greater than a randomly chosen element from the original series error. For d = 0.8, 79% of the modified series error will lie above the mean of the original series error; the two error distributions will overlap by 69%; and there is a 71% chance that a randomly chosen element from the modified series error will be greater than a randomly chosen element from the original series error. This is depicted in Figure 4.
The standard deviation of the series error (ε_t) was simulated between 0.5 and 3.0, according to the control chart concept. Control rules take advantage of the normal curve, in which 68.26% of all data lie within plus or minus one standard deviation of the average, 95.44% within plus or minus two standard deviations, and 99.73% within plus or minus three standard deviations (Cohen, 1988), as shown in Figure 5.
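These percentages follow directly from the standard normal distribution; the quick check below (not part of the original study) reproduces them from the usual Cohen's d relations:

```python
from math import sqrt
from scipy.stats import norm

for d in (0.2, 0.8):
    above_mean = norm.cdf(d)             # share of modified errors above the original mean
    overlap = 2 * norm.cdf(-d / 2)       # overlap between the two error distributions
    superiority = norm.cdf(d / sqrt(2))  # chance a modified error exceeds an original one
    print(f"d = {d}: {above_mean:.0%}, {overlap:.0%}, {superiority:.0%}")
# d = 0.2: 58%, 92%, 56%   |   d = 0.8: 79%, 69%, 71%
```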
Thus, a summary of the DOE factors and their levels is given in Table 3. Considering the factors in Table 3 and the response surface methodology (RSM), the design matrix was constructed using statistical software with DOE routines.
RSM is a class of DOE that is widely used because of its simplicity and effectiveness in designing experiments; it minimizes the number of experiments required for a given number of factors and levels (Montgomery, 2009). The objective of RSM is to explore the relationship between the response and the factors studied in an experiment (Amdoun et al., 2018). The mathematical model of RSM is a second-order polynomial equation, whose advantage is that it is easy to estimate and can then be applied to approximate the response (Cui et al., 2012).
The design matrix was obtained with RSM for 2 factors, 5 center points, and 2 replications, resulting in the 26 experiments presented in Table 4. The design matrix is a guide indicating the combinations of mean and standard deviation used to generate the modified series error (ε_t) for the time series simulation. For each time series model, we generated 26 series containing the modified series errors (ε_t), according to the design matrix. Afterward, 100 replications of each of the 26 datasets were generated; each dataset contains 100 samples of the original series, in order to obtain a false alarm rate of 0.01 (Bischak & Trietsch, 2007), and 50 samples with the modified series error (ε_t).
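The run count can be verified with a minimal construction of a rotatable central composite design in coded units; the decoding of the coded levels to the mean and standard deviation ranges shown below is an assumption for illustration, not taken from Table 4 of the paper:

```python
from math import sqrt

alpha = sqrt(2)                                  # rotatable axial distance for 2 factors
factorial = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
axial = [(-alpha, 0), (alpha, 0), (0, -alpha), (0, alpha)]
center = [(0, 0)] * 5
design = (factorial + axial + center) * 2        # 13 design points x 2 replications = 26 runs

# assumed decoding: mean in [0.2, 0.8] and standard deviation in [0.5, 3.0] at the +/-1 levels;
# the low axial level of the standard deviation falls just below zero and would be truncated in practice
runs = [(0.5 + 0.3 * m, 1.75 + 1.25 * s) for m, s in design]
print(len(runs))                                 # 26
```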

Time-series forecasting
After the datasets were obtained, an MLP was implemented for each time series model using Matlab® software. For the original time series, the MLP was trained on 50 samples, using 80% of the observations, selected at random, for training and 10% for validation; the remaining 10% of the observations were allocated to the testing set. In the training process, the sampling seed was fixed at a constant value. The neural network used was the three-layer perceptron, because it presents excellent performance when dealing with nonlinear datasets (Kialashaki & Reisel, 2013), and theoretical results prescribe that an MLP with one hidden layer is capable of approximating any continuous function (Mo et al., 2017). The number of hidden units was set to 10, the training function was Bayesian Regularization, and the learning rate was set to 0.01. The number of steps ahead to predict was set to 1 or 2 according to the lag seasonality of the model. The steps ahead to predict, or forecasting horizon, represent how many steps ahead of the lagged input values the predicted output lies. In this case, due to the short synthetic time series and considering the error propagation throughout multi-step prediction, only one step ahead was used. The output of the network can be combined with previous input values, shifted by one time step, and repeated predictions made. Since the runtime depends mainly on the minimum error to be reached, and this error is not linear, it is not correct to say that predicting two steps ahead doubles the runtime of predicting one step ahead (Balestrassi et al., 2009).
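A rough Python analogue of this setup is sketched below; the original study used Matlab's Bayesian Regularization training, for which scikit-learn has no direct equivalent, so MLPRegressor with its default Adam solver is used here only as an approximate stand-in (10 hidden units, learning rate 0.01, 10% validation split, one-step-ahead lagged inputs). These helpers could also play the role of the fit_mlp and forecast_next placeholders used in the earlier monitoring sketch.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_lagged(y, lags=2):
    """Build lagged input rows and one-step-ahead targets from a series."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([y[i:len(y) - lags + i] for i in range(lags)])
    return X, y[lags:]

def fit_mlp(train_series, lags=2, seed=0):
    X, target = make_lagged(train_series, lags)
    model = MLPRegressor(hidden_layer_sizes=(10,),      # one hidden layer, 10 neurons
                         learning_rate_init=0.01,       # learning rate of 0.01
                         early_stopping=True,
                         validation_fraction=0.1,       # 10% validation set
                         max_iter=5000, random_state=seed)
    model.fit(X, target)
    return model

def forecast_next(model, history, lags=2):
    """One-step-ahead forecast from the last `lags` observations."""
    x = np.asarray(history[-lags:], dtype=float).reshape(1, -1)
    return float(model.predict(x)[0])
```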
After the MLP was implemented, another 50 samples of the original time series were predicted, and the tracking signals were calculated with Equation 3. The initial values of the CUSUM (Equation 4) were set to zero, as suggested by Gardner Junior (1985) and McClain (1988). Then, the control limits (CL) for the tracking signals were determined by simulation based on the probability of a false alarm, defined as a Type I error (Gardner Junior, 1985). The Type I error is the probability of a tracking signal exceeding the control limits when the time series has no changes. In this paper, the simulation assumed a desired false alarm rate between 0.01 and 0.0105, considering a confidence interval (CI) of 99%, which represents an ARL0 between 100 and 95 samples of the original series. Next, the modified part of the series was fed into the forecasting model and 50 samples were predicted. Figure 6 shows an example of the dataset generated for the STAR1 model and the time series forecast using the MLP (ANN), and Figure 7 shows an example for the Bilinear model. In both cases, it can be noted that from sample 100 onwards the forecasting model does not fit the data well and the forecast error e_t increases once the time series error (ε_t) has changed. After that, the tracking signals for the modified time series were also calculated. Figures 8, 9, and 10 show an example of the methodology applied to the STAR1 series for the training samples, the original series, and the modified series, respectively.
The forecast error e_t for the original time series is normally distributed, whereas for the modified series it is not normally distributed and presents larger values. It can also be noted that, for the modified series, the tracking signals exceed the control limits early.
From the tracking signals, the ARL values were obtained for each dataset, and the average results for the STAR1, BL1, and NMA models are presented in Table 5.
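In terms of the earlier sketches, the run length for one replication is the number of post-change samples until the tracking signal first leaves the control limits, and the ARL is that count averaged over the replications in which a signal occurred (the function and variable names below are illustrative):

```python
def run_length(ts_values, n_original, lcl, ucl):
    """Post-change samples until the TS first exceeds the limits (None if it never does)."""
    for i, value in enumerate(ts_values[n_original:], start=1):
        if value > ucl or value < lcl:
            return i
    return None

def average_run_length(run_lengths):
    detected = [r for r in run_lengths if r is not None]
    return sum(detected) / len(detected) if detected else float("inf")
```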
Given the ARL results, the DOE statistics were analyzed. It was thus possible to establish mathematical relationships between the responses and the input parameters using the Ordinary Least Squares (OLS) method and Analysis of Variance (ANOVA). Furthermore, the residuals were analyzed to ensure that they were uncorrelated and normally and randomly distributed (Priestley, 1980). Therefore, we could obtain the general equation of each model, described in Equations 8, 9, and 10, which present uncoded coefficients. All terms are significant at a significance level of 5%.
Afterward, it is possible to analyze which factors are more significant and have a greater influence on the final result. For better visualization of the factor significance, Figures 11, 12, and 13 present the Pareto chart for each model. For STAR1 and NMA, the mean has a relatively larger effect than the standard deviation, while for BL1 the standard deviation has more influence.
At this point, it is necessary to understand how each significant factor influences the model. An analysis of the main effects (Figures 14, 15, and 16) and of the interaction plots (Figures 17, 18, and 19) was therefore developed for STAR1, BL1, and NMA, respectively. As inferred from Figure 14, for the STAR1 model the ARL decreases until the mean reaches the value of 0.7 and then increases again until the mean reaches 0.8. In addition, an increase in the standard deviation implies a decrease in the ARL results, making it easier to detect the change in the time series. Besides, there is a significant interaction between the mean and the standard deviation, as shown in Figure 17. The smallest ARL occurs when the mean is greater than 0.65 and the standard deviation is equal to 0.5, but if the mean is smaller than 0.65, the best ARL is obtained with a standard deviation equal to 3.
Analyzing Figures 15 and 16, the effect of the mean on the ARL for BL1 and NMA follows the same trend as for STAR1. For the BL1 model, the ARL results perform best when the standard deviation reaches 2.8 and then worsen again until it reaches 3.5. Moreover, Figure 18 also shows an interaction between the mean and the standard deviation: the smallest ARL occurs when the mean is about 0.5 and the standard deviation is equal to 3, but if the mean is greater than 0.5, the best ARL is obtained with a standard deviation equal to 1.75.
The NMA model presents a small variation in the ARL values as the standard deviation changes, showing a better ARL with a standard deviation of 3.5. Figure 19 also shows an interaction between the mean and the standard deviation for the NMA, which presents a smaller ARL with a mean greater than 0.5 and a standard deviation of 3.5. For mean values smaller than 0.5, the best ARL is reached with a standard deviation of 0.5.
The effect on the ARL results of the changes in the mean and standard deviation of the series error can also be seen in Figures 20, 21, and 22. The STAR1 model is the one that presents the greatest variation in ARL results according to the changes in the series error.
Then, the desirability analysis was performed to find the optimal solution that minimizes the ARL response. The desirability function, defined by Harrington (1965) and Derringer & Suich (1980), is one of the approaches used for factor optimization (Candioti et al., 2014). It is based on transforming all the obtained responses from different scales into a scale-free value. The values of the desirability (D) function lie between 0 and 1: the value 0 is attributed when the factors give an undesirable response, while the value 1 corresponds to the optimal performance (Derringer & Suich, 1980). Table 6 presents the results of the desirability analysis for the three nonlinear time series, obtained following a procedure similar to that shown for the STAR1 model in Figure 23. For STAR1, the smallest ARL = 11.79 is obtained with a mean equal to 0.89 and a standard deviation equal to 0.02, with D = 1. For BL1, the smallest ARL = 15.62 is obtained with a mean equal to 0.59 and a standard deviation equal to 2.41, with D = 1. For NMA, the smallest ARL = 15.45 is obtained with a mean equal to 0.80 and a standard deviation equal to 3.52, with D = 1.
The main findings of the experimental results are summarized as follows.
• The prediction methodology presents better performance when applied to the STAR1 model, since the change in the time series can be detected earlier than in the other models.
• For STAR1 and NMA, the change in the mean of the error has a significant effect on the ARL results, with the smallest ARL obtained when the change in the mean is large, while BL1 presents a smaller ARL with a moderate change in the mean.
• The results also show that the standard deviation has a significant effect on the ARL results, mainly for the BL1 and NMA models, which have the best ARL results when the change in the standard deviation is moderate to high, while the smallest ARL for STAR1 is obtained with a small change in the standard deviation. This could be expected theoretically, since BL1 and NMA have a moving average (MA) correlation structure and a change in the standard deviation of the error can imply greater variability in the y values.

Total oil and grease
The primary processing of petroleum consists of the separation of oil, gas, and water obtained from the extraction of crude oil. In this process, the produced water is a complex mixture that can be characterized by the total oil and grease (TOG) (Yang, 2011;Ray & Engelhardt, 1992). TOG is also widely used for environmental surveillance purposes (Brasil, 2007).
The TOG value obtained from a chemical analysis of produced water is extremely method dependent. According to Yang (2011), the main reference methods for measuring TOG are infrared absorption, gravimetric, and gas chromatography. Among these, gravimetric is considered in the present study. Some advantages of gravimetric analysis are related to its simplicity and low cost. However, it has a disadvantage in terms of sensitivity, since its lower detection limit varies from 5 to 10 mg/L (Igunnu & Chen, 2014).
The TOG value obtained by the gravimetric method considers both the dispersed oil fraction, which represents the oil present in the produced water in the form of small droplets, and the dissolved oil fraction, which is defined as the amount of oil in the produced water in soluble form (Ray & Engelhardt, 1992).

Application of the methodology
The case study used 100 samples of TOG obtained from an oil platform in Brazil. First, the MLP was trained on 50 samples with the parameters set as in the simulation study. Afterward, the data were predicted and the tracking signals were calculated from the forecast errors. It was then possible to determine the control limits for the tracking signals, assuming a desired false alarm rate of 0.01. The calculated control limits were equal to -4 and 4.
Then, new data were input into the predictor and new forecasts and tracking signals were calculated. This process continued until a tracking signal exceeded the control limit at the 54th sample. Figure 24 shows the real TOG series and the forecast, as well as the forecast error and the tracking signal.
Therefore, it was necessary to apply the MLP again and update the forecasting model. The tracking signals and the control limits were then recalculated, and another 30 samples were predicted with this forecasting model until the tracking signal exceeded the upper limit, as shown in Figure 25.
The forecasting model was then updated once more and used for the remaining samples, all of which stayed inside the control limits of the tracking signal chart (Figure 26). Unlike many studies that update the forecasting model for each sample or use the same model for all samples, the case study showed that the predictor was able to detect changes in the TOG data and that the forecasting model needed to be updated only twice.

Conclusion
Time series forecasting is widely used in several areas to make reasonable inferences about the future, and the monitoring of forecast errors is essential to ensure forecasting accuracy. Although intelligent tools such as neural networks have been applied in time series forecasting for some time, the problem of monitoring the forecasting process has not been widely addressed. Sometimes there might not be enough current data available; in other cases, as time goes by, more recent data become available while some old historical observations might distort the current structure of the system, or the existing patterns or relationships obtained from the past might not continue in the future (Deng et al., 2004). Hence, detecting any changes in the system and bringing the process back under control is very important to ensure reliable predictions and correct decision-making.
Motivated by that, this paper evaluated a nonlinear time series prediction methodology that uses the tracking signal method to detect bias and assess responsiveness to non-random changes in the time series, since the forecasting model is static once the training is over. Moreover, this paper presented a case study using TOG data to exemplify the application of the time series prediction methodology.
The study consisted of analyses based on 100 replications of 26 synthetic datasets generated to simulate different situations by changing the mean and the standard deviation of the series error for the STAR1, BL1, and NMA models. Each dataset contained 100 samples of the original series and 50 samples with the modified error. The MLP was implemented on the original time series samples with the training parameters carefully set. The CUSUM tracking signals were applied to the forecast errors, and the results were plotted on control charts. Then, the modified time series were inserted into the forecast model, and the ARL results were obtained and compared using ANOVA with a significance level of 5%.
The results presented in this paper show that the prediction methodology performed better when applied to STAR1, because it was able to detect the change in the series earlier than for the other models. For STAR1 and NMA, the smallest ARL results are obtained when the change in the mean is large, while BL1 presents a smaller ARL with a moderate change in the mean. The results also show that, for the BL1 and NMA models, the best ARL occurs when the change in the standard deviation is moderate to high, while the smallest ARL for STAR1 is obtained with a small change in the standard deviation.
In general, the proposed prediction methodology is an effective way to detect bias in the process when an error is introduced in the nonlinear time series, since the mean and the standard deviation of the series error have a significant impact on the ARL. By using this methodology, the forecasting model can be kept up to date, thereby improving forecast accuracy.
This simulation study also differs from others in the method used to compare the ARL results. The DOE technique was applied to simulate the datasets, using an arrangement of experiments with controlled factor levels, and to identify the conditions under which the tracking signals are efficient. Other papers present the usual ARL approach, in which a large number of different datasets is tested, which can lead to restricted conclusions without the formal statistical analysis that DOE provides. Another difference is the calculation of the control limits, which were determined using the false alarm rate concept, unlike other papers that use a fixed control limit.
Many other studies can still be performed, for example: (1) analysis of other nonlinear time series models; (2) evaluation of the time series prediction methodology using forecasting methods other than MLP; or (3) using tracking signal methods other than CUSUM.
Further, this paper may be a reference for researchers looking to improve the accuracy of nonlinear time series forecasting.