Persistence effect determination of variability in forecasting of agricultural and road machinery national production

The objective of this research was to forecast the Brazilian national production of agricultural and road machinery in the short term using the BOX & JENKINS methodology and to determine the persistence effect. Data were obtained from the National Association of Automotive Vehicle Manufacturers (ANFAVEA) for the period from January 1960 to October 2019, totaling 718 monthly observations. The Autoregressive Integrated Moving Average (ARIMA) and Autoregressive Conditional Heteroscedasticity (ARCH) methodologies were used. An ARIMA (2,1,1)-ARCH (2) model was fitted and a persistence of 0.60 was determined, indicating that the instability in the series will last for a long period of time.


INTRODUCTION
The product industry assumed a prominent position in terms of technological development, especially with regard to the agricultural machinery and implements segment (GONÇALVES et al., 2015).
An advanced search of the Web of Science database using the search string TS = ((ARIMA OR forecasting OR forecast) AND ("road machinery" OR "agricultural machinery")) returned 9 papers, all of them written in English. Of these 9 papers, 3 are in automation control systems, 3 in computer science, interdisciplinary applications, 2 in multidisciplinary agriculture, 2 in management, 2 in mechanical engineering, and 1 each in computer science, artificial intelligence; electrical engineering; manufacturing engineering; computer science, information systems; and multidisciplinary engineering (a paper may be indexed in more than one category).
Brazil is considered one of the most productive countries in agribusiness, but none of these papers addresses the forecasting of agricultural and road machinery production. We consider it relevant to estimate the demand for agricultural tractors using a univariate linear model. AI (2015), analyzing economic development as a function of agricultural mechanization, used nonlinear relationship models to predict demand based on GDP. In this study, a linear model for the level is combined with a nonlinear model for the volatility in order to determine the persistence and the short-term forecast. JU et al. (2013) used agriculture-related data to make predictions through neural networks and fuzzy logic, exploiting the full potential power of agricultural machinery. Here, models of the general ARIMA and ARCH classes will be used together.
The technical evolution of this sector has generated a supply of equipment with advanced technology, which contributes more and more to field productivity (VIAN et al., 2013). One of the alternatives for supporting production control and planning systems is forecasting through time series analysis (MARCHESAN & SOUZA, 2010).
The volatility of the world economy affects the predicted values, both in the short and in the long term. The BOX & JENKINS models are able to capture the behavior of the series level and are useful for forecasting it, while volatility models capture the variability of the series. Total agricultural and road machinery corresponds to wheeled tractors, crawler tractors, motorized cultivators, grain harvesters, cane harvesters and backhoes, in units.
The research problem is to identify whether the agricultural and road machinery production series can be represented in level by a linear model from the ARIMA family and in variability by a model from the ARCH family.
The aim of the research is to determine the best short-term forecast values for Brazilian agricultural and road machinery production. Accurate forecasts are made with Autoregressive Integrated Moving Average (ARIMA) models, which are an effective econometric approach for level prediction (ROCKENBACH et al., 2016). Jointly estimated ARIMA-ARCH models are a novelty in this productive sector because the mean is forecast while the volatility is also taken into account.

MATERIALS AND METHODS
National production of agricultural and road machinery was collected from the National Association of Automotive Vehicle Manufacturers website (http://www.anfavea.com.br/estatisticas.html), from January 1960 to October 2019, totaling 718 monthly observations. The variable comprises wheeled tractors, crawler tractors, motorized cultivators, grain harvesters, cane harvesters and backhoes, in units.
The variable under analysis was fitted by the BOX & JENKINS methodology, which is based on an iterative cycle: a) Identification, in which candidate models are selected based on the autocorrelation function (ACF) and the partial autocorrelation function (PACF); b) Estimation, in which the AR or MA filters defined in the previous step are estimated by Least Squares or Maximum Likelihood estimators; c) Verification, in which the fitted model is checked for residuals with white noise characteristics and fit statistics are computed; d) Forecasting, in which the best model is used to forecast future values, the evaluation statistics are computed and the DIEBOLD-MARIANO test (1995) is applied to decide the best forecasting model (MORETIN & TOLOI, 2004; PEREIRA & REQUEIJO, 2008).
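The identification step described above can be illustrated with a minimal sketch that differences a series once and computes its sample ACF and PACF. This is not the software used in the study; the function names and the toy series are hypothetical, and the PACF is obtained with the Durbin-Levinson recursion, which is one common way of computing it.

```python
import math


def difference(series, d=1):
    """Apply d rounds of first differencing (the 'I' in ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(v * v for v in dev) / n
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / n / c0
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    out = [1.0]
    phi_prev = []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[1]
            phi = [phi_kk]
        else:
            num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [
                phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
            ] + [phi_kk]
        out.append(phi_kk)
        phi_prev = phi
    return out


# Hypothetical toy series, NOT the ANFAVEA data.
toy = [5, 7, 6, 9, 8, 11, 10, 13, 12, 15]
diffed = difference(toy, d=1)
bound = 1.96 / math.sqrt(len(diffed))  # approximate 95% white-noise band
for lag, (r_k, p_k) in enumerate(zip(acf(diffed, 3), pacf(diffed, 3))):
    print(lag, round(r_k, 3), round(p_k, 3), "band:", round(bound, 3))
```

Lags whose ACF or PACF values fall outside the approximate band are the usual candidates for MA and AR terms, respectively.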
The general model proposed by BOX & JENKINS (1970) is represented in equation 1.

$$Z_t = \mu + \phi_1 Z_{t-1} + \phi_2 Z_{t-2} + \dots + \phi_p Z_{t-p} + a_t - \theta_1 a_{t-1} - \dots - \theta_q a_{t-q} \qquad (1)$$

Model adequacy is assessed through residual analysis: the residuals must present white noise characteristics, that is, zero mean, constant variance and no autocorrelation, and the model must present the lowest values of the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), equations 2a and 2b respectively. The forecast values are evaluated by the U-Theil statistic, which indicates whether the fitted model gives better forecasts than the mean series values, and by the Mean Squared Error (MSE) and the Root Mean Squared Error (RMSE), which identify the better forecasting model, according to equations 3a, 3b and 3c. The best forecasting model is then selected by the DIEBOLD-MARIANO (DM) test (1995), whose null hypothesis is that the competing forecasts have the same accuracy.
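As an illustration of the selection and evaluation statistics named above, the sketch below uses common textbook forms of AIC, BIC, MSE, RMSE and Theil's U. These forms are assumptions and may differ in detail from the equations used in the study; in particular, Theil's U is written here in its U2 form, which compares the model against a naive "no change" forecast. All function names and numbers are hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion (textbook form, assumed)."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion (textbook form, assumed)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def mse(actual, forecast):
    """Mean squared forecast error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared forecast error."""
    return math.sqrt(mse(actual, forecast))


def theil_u(actual, forecast):
    """Theil's U (U2 form, assumed): model errors relative to the errors
    of a naive forecast that repeats the previous observed value.
    actual[0] only serves as the starting value of the naive forecast."""
    num = sum((actual[t] - forecast[t]) ** 2 for t in range(1, len(actual)))
    den = sum((actual[t] - actual[t - 1]) ** 2 for t in range(1, len(actual)))
    return math.sqrt(num / den)


# Hypothetical values for illustration only.
actual = [100.0, 110.0, 105.0, 120.0]
forecast = [100.0, 108.0, 107.0, 118.0]
print("MSE:", mse(actual, forecast))
print("RMSE:", rmse(actual, forecast))
print("Theil U:", theil_u(actual, forecast))
print("AIC:", aic(-100.0, 3), "BIC:", bic(-100.0, 3, 718))
```

A Theil's U below 1 means the model beats the naive benchmark, and among competing models the one with the lowest AIC and BIC is preferred.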
Where $a_t$ is the error, defined as the difference between the value of the original series and its forecast, and $Z_{t-1}$ is the value of the original series lagged by one period. Models with values close to zero are the best suited, and forecasts are made six steps ahead. To decide which alternative forecast is better, the modified DIEBOLD-MARIANO (DM) test was used. The DM test is more versatile than the alternative tests for choosing the best forecast, and it uses the critical values of the Student's t distribution with (n-1) degrees of freedom. In addition, the modified DM test brings an improvement in cases where the sample size is small (HARVEY et al., 1997). The residuals of the best fitted ARIMA model are then tested for conditional heteroscedasticity with the LM-ARCH test (ENGLE, 1982), which shows whether the squared residuals have autoregressive characteristics that can be represented by the Generalized Autoregressive Conditional Heteroscedastic (GARCH) model, presented in equation 4.
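The modified DIEBOLD-MARIANO comparison described above can be sketched as follows. The squared-error loss, the function name and the toy forecast errors are assumptions made for illustration; the small-sample correction factor is the one proposed by HARVEY et al. (1997). The returned statistic would be compared with Student's t critical values with (n-1) degrees of freedom.

```python
import math


def modified_dm(errors_a, errors_b, h=1):
    """Harvey-Leybourne-Newbold corrected Diebold-Mariano statistic.

    errors_a, errors_b: forecast errors of two competing models.
    h: forecast horizon (autocovariances up to lag h-1 are used).
    Squared-error loss is an assumption of this sketch.
    """
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # Long-run variance of the loss differential; it can turn out
    # non-positive in small samples when h > 1, which this sketch
    # does not handle.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    dm = d_bar / math.sqrt(long_run_var / n)
    correction = math.sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
    return correction * dm


# Hypothetical forecast errors for two competing models.
errors_a = [1.0, -2.0, 1.5, -1.0, 2.0, -1.5]
errors_b = [0.5, -1.0, 1.0, -0.5, 1.0, -1.0]
print("modified DM statistic:", modified_dm(errors_a, errors_b, h=1))
```

A positive statistic indicates larger losses for the first model, so the second model would be preferred if the value exceeds the critical value.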
In the GARCH (p, q) model, p represents the number of parameters of the autoregressive part, associated with $\beta_1$, q represents the number of parameters of the moving average part, associated with $\alpha_1$, and $\alpha$ represents the mean of the process.
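For reference, the conditional variance of a GARCH (p, q) process is commonly written as below, following the convention above in which p counts the $\beta$ terms and q counts the $\alpha$ terms. The symbol $\sigma_t^2$ for the conditional variance and the subscript in the constant $\alpha_0$ are notation assumed here, not taken from the study.

```latex
\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \, a_{t-i}^2 + \sum_{j=1}^{p} \beta_j \, \sigma_{t-j}^2
```

Setting p = 0 removes the $\beta$ terms and leaves a pure ARCH (q) model, in which the conditional variance depends only on past squared residuals.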
If the squared residuals from the linear model present significant autocorrelation and the LM-ARCH test is statistically significant, there is evidence of heteroscedasticity, and an ARCH model should be fitted and its parameters estimated. The sum of the parameters α and β represents the volatility persistence: the closer α + β is to 1, the greater the persistence, indicating a long period of instability.
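The idea behind the LM-ARCH test mentioned above can be sketched for a single lag: the squared residuals are regressed on their own first lag and the number of observations times the R² of that regression is compared with a chi-square critical value. The restriction to one lag, the function name and the toy residuals are assumptions of this sketch; testing more lags requires a multiple regression.

```python
def arch_lm_one_lag(residuals):
    """LM statistic for ARCH effects using one lag of squared residuals.

    Regresses a_t^2 on a constant and a_{t-1}^2 and returns m * R^2,
    where m is the number of usable observations. Under the null of no
    ARCH effects the statistic is approximately chi-square with 1 degree
    of freedom (5% critical value: 3.841).
    """
    sq = [a * a for a in residuals]
    y = sq[1:]   # a_t^2
    x = sq[:-1]  # a_{t-1}^2
    m = len(y)
    mean_x = sum(x) / m
    mean_y = sum(y) / m
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = sxy * sxy / (sxx * syy)  # R^2 of a simple regression
    return m * r_squared


# Hypothetical residuals: a calm stretch followed by a turbulent one,
# which mimics a volatility cluster.
residuals = [0.1, -0.1] * 10 + [2.0, -2.0] * 10
lm = arch_lm_one_lag(residuals)
print("LM statistic:", lm, "reject homoscedasticity at 5%:", lm > 3.841)
```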

RESULTS AND DISCUSSION
The ADF test applied to the series in level considered it stationary (p = 0.413). According to the KPSS test, the series in level was nonstationary, with LM calc = 0.7525 against LM tab = 0.4630 at 5%, while in first differences, with LM calc = 0.0975 against LM tab = 0.4630 at 5%, the series was stationary. The Phillips-Perron test applied to the series in level considered it stationary, with p = 0.001. In the Elliott-Rothenberg-Stock Point-Optimal test, with P calc = 6.95 and P stat = 3.26, the series in level was considered stationary. In the Ng-Perron test, the series in level was nonstationary, with MZa = -3.026 against MZa = -8.10 at 5%, and became stationary in first differences, with MZa = -22.83 against MZa = -8.10 at 5%. Because the unit root tests gave conflicting results, the series was treated as non-stationary for model fitting.
The competing ARIMA models are displayed in table 1; the best fit was an ARIMA(2,1,1)-ARCH(2) model. To corroborate the presence of volatility, the LM-ARCH test was carried out, which showed that the residuals are not homoscedastic (p = 0.5257).
Table 1 - Competing models of the ARIMA and GARCH general class for Brazil's agricultural and road machinery production series.

The sum of the parameter values corresponding to $a_{t-1}^2$ and $a_{t-2}^2$ in the ARCH(2) model, taken from table 1, is 0.25 + 0.35 = 0.60. The value 0.60 was close to one, indicating that the series was highly persistent and showing that the volatility of domestic production of agricultural and road machinery will affect the series level over long periods of time.
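The persistence calculation can be illustrated with a short sketch that sums the ARCH(2) coefficients reported in the text and shows how a variance shock decays over six steps. The assignment of 0.25 to the first lag and 0.35 to the second, the constant `ALPHA_0` and the starting squared residuals are assumptions, since they are not given in the text; the function names are hypothetical.

```python
ALPHA_1, ALPHA_2 = 0.25, 0.35  # coefficients reported in the text (order assumed)
ALPHA_0 = 1.0                  # hypothetical constant, not from the study


def persistence(alphas, betas=()):
    """Volatility persistence: sum of the ARCH and GARCH coefficients."""
    return sum(alphas) + sum(betas)


def forecast_variance(alpha_0, alphas, last_sq_resid, steps):
    """Multi-step conditional variance forecasts for an ARCH(q) model.

    last_sq_resid: the q most recent squared residuals, newest first.
    Future squared residuals are replaced by their forecast variance.
    """
    history = list(last_sq_resid)
    out = []
    for _ in range(steps):
        var = alpha_0 + sum(a * s for a, s in zip(alphas, history))
        out.append(var)
        history = [var] + history[:-1]
    return out


p = persistence((ALPHA_1, ALPHA_2))
print("persistence:", round(p, 2))
print("long-run variance:", ALPHA_0 / (1.0 - p))
# Hypothetical shock: the last two squared residuals are well above
# the long-run variance.
for step, var in enumerate(
    forecast_variance(ALPHA_0, (ALPHA_1, ALPHA_2), [10.0, 10.0], 6), start=1
):
    print(step, round(var, 4))
```

The forecasts move back toward the long-run variance α0 / (1 - persistence), and they do so more slowly the closer the persistence is to one.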
Future values were projected from November 2019 to April 2020, as shown in figure 1. The model adequacy is displayed in table 1. Figure 2 represents the original and expected variance series values from November 2019 to April 2020. The first volatility cluster was identified in May 1980, when monthly production showed an 87.02% increase compared to the previous month.
This volatility effect can be explained by the country's economic instability in relation to agricultural and livestock activities. The 1960s were marked by industrialization, city growth, population increase and high purchasing power, but also by food shortage (EMBRAPA, 2017). To solve the problem, the government instituted rural credit policies in the country to foster modernization and increase agricultural and livestock productivity (BIANCHINI, 2015).
The volatility cluster formed in May 1980 can be explained by production to meet the demand for agricultural machinery intended to subsidize family farming, especially soybean farming. It was also observed that production declined significantly, which may be explained by the end of the harvest. Another factor that contributed to the reduction in production was the credit model adopted in the 1970s, during the so-called Economic Miracle. At the end of the decade the sector started to enter a crisis, due to changes in international policies that affected the Brazilian economy. The credit line was made equal to the other financing lines, and resources were scarce and expensive and were not adjusted to agricultural prices, causing delinquency in rural credit (BIANCHINI, 2015).
The economic crisis reduced sales, resulting in bankruptcies, closures of activities and the denationalization of a large number of companies in this industrial segment (AMATO NETO, 1985), which explains the reduction of production in the subsequent period, called the Lost Decade.
The second highest volatility peak occurred in February 2017, when production increased by 97.59% compared to January. In this same period the country was in the process of modernization, and agribusiness production chains impacted agricultural activities, which had a substantial share of GDP. In 2016 agribusiness accounted for 46% of the value of exports and generated 23% of GDP in total. In 2017, agribusiness employed 4.12 million people, services 5.67 million and the agribusiness input segment 227.9 thousand (BRASIL, 2017).
This volatility cluster can be explained by the increase in exports: besides the supply of national demand, the international market was favorable because of the low dollar price. The high production of agricultural and road machinery occurs mainly in the first months of the year, so that the machinery is available to work the soybean crop, which takes place between February and May.
The third largest volatility cluster occurred in September 1995, with a 41.99% increase in total production of agricultural and road machinery, which may be explained by the emergence of the National Family Farming Strengthening Program (PRONAF) and by macroeconomic stabilization (EMBRAPA, 2017; BIANCHINI, 2015).
In the first half of the 1990s problems with rural credit were persistent. In this post-crisis scenario of the agricultural model, and with the end of the military regime, farmers' organizations rearticulated themselves in the National Confederation of Agriculture (CNA) and the Brazilian Agribusiness Association (ABAG). The mobilization of family farmers in 1995 allowed the consolidation of PRONAF, establishing a differentiated credit line for the productive restructuring of family farming. PRONAF's rural credit was created by Central Bank Resolution 2191 of August 24, 1995 (BIANCHINI, 2015), and its first year of operation provided a small number of loans for agricultural production, which was reflected in increased production in the same year (SCHNEIDER, 2004).

CONCLUSION
The ARIMA(2,1,1)-ARCH(2) model was fitted to predict production and showed that the national production of agricultural and road machinery had linear and nonlinear components. The fitted model shows that an ARIMA model alone would not be able to capture all the movements in the series. The benefit of estimating linear and nonlinear models jointly goes beyond the simultaneous estimation of the parameters: it also describes the volatility behavior of the series over the period.
The volatility captured in the national production of agricultural and road machinery was caused by variation in credit development policies. For future analysis, the incorporation of exogenous macroeconomic variables is suggested, in order to improve the accuracy of the predictions.
To capture long memory effects, fractional models, represented by an ARIMA with fractional differencing (ARFIMA), are encouraged to be applied jointly with autoregressive conditional heteroscedasticity (ARCH) volatility models, fitting ARFIMA-GARCH models, which will probably provide better candidate models than other conditional heteroscedastic models for volatility.