INTRODUCTION:

In recent years, agriculture in Brazil has developed and modernized, becoming an activity of high economic and social value with a marked tendency to grow (CAMPOS, 2007). The development of domestic and foreign markets contributed to the dynamic character of agriculture, which incorporated production technologies to meet the requirements of these markets.

For farmers, price analysis is of singular importance as a component of the exchange mechanism. Decision making therefore requires that price behavior be known before the harvest (RIBEIRO et al., 2010).

The main problem in forecasting agricultural prices is seasonality, which arises from weather, market, and conjunctural factors and causes income uncertainty for farmers (MARTINS & MARTINELLI, 2010). Thus, farmers and academics have become increasingly interested in price forecasting, as it reduces uncertainty about price trends.

In the literature (YONENAGA & FIGUEREDO, 1998; BRESSAN, 2004; SOBREIRO et al., 2008; LIMA et al., 2010; RIBEIRO et al., 2010; CERETTA et al., 2010; FERREIRA et al., 2011), artificial neural network (ANN) and classic (autoregressive) models are applied in the context of the Brazilian market to identify seasonal patterns and predict prices of agricultural products. In addition to these applications, some authors (SHAHWAN & ODENING, 2007; LI et al., 2010; JHA & SINHA, 2013) use the ANN model as an alternative to forecast agricultural prices in the international market of agricultural commodities.

However, a gap exists in these studies: they neglect the set of prices of agricultural products, which would help capture the structure of the time series under study (HASSANI & MAHMOUDVAND, 2013).

Thus, the purpose of this study was to apply the methodology proposed by PINHEIRO & SENNA (2015), which combines the ANN model for forecasting prices of agricultural products with the multivariate singular spectrum analysis (MSSA) model. The latter model captures the structure of the time series under study based on the set of other time series, thus supporting the agricultural sector. The efficiency of the methodology was evaluated by means of its prediction performance.

MATERIALS AND METHODS:

This study on time series of prices for agricultural products was based on the commodities identified below as SUG (sugar), COT (cotton), COR (corn), COF (coffee), and SOY (soy). These products were chosen based on the growth in export volume observed in recent years, according to data available from the Brazilian Ministry of Agriculture (MAPA, 2014). Daily prices were obtained from the database of the Center for Advanced Studies in Applied Economics (CEPEA, for *Centro de Estudos Avançados em Economia Aplicada*) and correspond to the period from Jan 13, 2012 to Dec 20, 2013. The time series were then converted to weekly periodicity, totaling 96 weeks.

The methodology is the same as that proposed by PINHEIRO & SENNA (2015) and is referred to from now on as ANN-MSSA. In addition, the ANN model alone was applied to compare predictive performance, as in the studies by YONENAGA & FIGUEREDO (1998); BRESSAN (2004); SHAHWAN & ODENING (2007); SOBREIRO et al. (2008); LIMA et al. (2010); RIBEIRO et al. (2010); CERETTA et al. (2010); LI et al. (2010); FERREIRA et al. (2011); JHA & SINHA (2013).

Artificial neural network (ANN) model

The ANN model is a nonparametric forecasting model that does not require assumptions of normality, stationarity, or linearity of the data. In simplified terms, it is a computational structure inspired by the architecture of the human brain. In situations in which the data-generating law is unknown, the ANN model becomes useful because it can approximate any nonlinear function. Evidence of nonlinearity in time series of agricultural products can be seen in the studies by LIMA et al. (2010), CERETTA et al. (2010), and FERREIRA et al. (2011) for the Brazilian market, and by BECKMANN & CZUDAJ (2014) for the international market. This evidence justifies the use of the ANN model in the present study.

Elements of the artificial neuron are represented by: $m$, the number of neuron input signals; $x_j$, the $j$-th neuron input signal; $w_{gj}$, the weight associated with the $j$-th input signal of neuron $g$; $b_g$, the threshold of each neuron, also called bias; $v_g$, a weighted combination of the input signals and the bias of the $g$-th neuron; and $\varphi(\cdot)$, the activation function of the $g$-th neuron (HAYKIN, 2001).

Replacing the bias $b_g$ with a fixed input $x_0 = 1$ is possible, so that the bias becomes a new synaptic weight $w_{g0} = b_g$. Thus, the neuron $g$ is described as (PASQUOTTO, 2010):

$$v_g(t) = \sum_{j=0}^{m} w_{gj}\, x_j(t)$$

and

$$y_g(t) = \varphi\big(v_g(t)\big)$$

with $v_g$ defined as the induced local field or activation potential, and the activation function $\varphi(\cdot)$ defining the output $y_g(t)$ of the $g$-th neuron at instant $t$.
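The neuron described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `neuron_output` and the choice of the hyperbolic tangent as the activation function are not taken from the study, and the bias is folded in as the weight of a fixed input equal to 1, as in the text.

```python
import math


def neuron_output(weights, inputs, activation=math.tanh):
    """Output of one neuron g with the bias folded in as the weight w_g0.

    weights: [w_g0, w_g1, ..., w_gm], where w_g0 plays the role of the bias b_g.
    inputs:  [x_1, ..., x_m]; the fixed input x_0 = 1 is prepended here.
    activation: assumed activation function (tanh is an illustrative choice).
    """
    if len(weights) != len(inputs) + 1:
        raise ValueError("expected one weight per input plus the bias weight")
    extended = [1.0] + list(inputs)
    # Induced local field v_g: weighted combination of the inputs and the bias.
    v = sum(w * x for w, x in zip(weights, extended))
    return activation(v)


# Two inputs, bias weight 0.5: v = 0.5 + 1.0 * 2.0 - 1.0 * 1.0 = 1.5
y = neuron_output([0.5, 1.0, -1.0], [2.0, 1.0])
```

Folding the bias into the weight vector keeps the computation a single weighted sum, which is the form used when all weights are adjusted together during training.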

The classification given in the literature for the way neurons are distributed in the network considers that they are arranged in layers. According to HAYKIN (2001), the neural network architecture can be single-layer, multilayer, feedforward, or recurrent. Proper use of the ANN model requires training to iteratively adjust the network parameters to a sequence of events.

Supervised training works by indicating the correct answer for each situation at the network output. For this purpose, a set of input data is presented to the neural network as an example. The network generates an output, which is compared with the expected output, thus giving the corresponding error.
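The error-correction idea behind supervised training can be illustrated with a minimal sketch. It uses the delta rule on a single linear neuron, which is a deliberately simpler technique than the backpropagation used in the study; the function name, learning rate, and number of epochs are assumptions chosen for the illustration.

```python
def train_linear_neuron(samples, targets, learning_rate=0.1, epochs=2000):
    """Delta-rule training of one linear neuron (illustrative, not backpropagation).

    samples: list of input vectors; targets: expected output for each sample.
    Returns the weights [w_0, w_1, ..., w_m], where w_0 acts as the bias.
    """
    weights = [0.0] * (len(samples[0]) + 1)
    for _ in range(epochs):
        for x, expected in zip(samples, targets):
            extended = [1.0] + list(x)  # fixed input 1 for the bias weight
            output = sum(w * xi for w, xi in zip(weights, extended))
            error = expected - output   # expected output minus network output
            # Move each weight in proportion to the error and its input.
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, extended)]
    return weights


# Toy example: the expected outputs follow 2 * x + 1 exactly.
learned = train_linear_neuron([[0.0], [0.5], [1.0]], [1.0, 2.0, 3.0])
```

Each pass compares the network output with the expected output and uses the resulting error to adjust the weights, which is the mechanism the paragraph above describes.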

Multivariate singular spectrum analysis (MSSA) model

Like the ANN model, the MSSA model is nonparametric. It consists of two complementary stages: decomposition and reconstruction. The decomposition stage comprises two steps: embedding and singular value decomposition. Embedding is a mapping that transfers a set of $M$ one-dimensional time series to a multidimensional array of lagged vectors in $\mathbb{R}^{L_i}$, where, for $i = 1, \ldots, M$, $L_i$ is the window length, the number of columns is given by $K_i = N_i - L_i + 1$, and $N_i$ is the number of data points in the $i$-th time series.
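The embedding step can be sketched as follows. The sketch assumes the vertical form of the multivariate array, in which the per-series matrices are stacked one below the other and must therefore share the same number of columns; the function names are illustrative and not taken from the study.

```python
def trajectory_matrix(series, window):
    """Map a one-dimensional series of length N to an L x K matrix, K = N - L + 1.

    Column c holds the lagged vector (x_c, ..., x_{c+L-1}).
    """
    n = len(series)
    if not 1 <= window <= n:
        raise ValueError("window length must be between 1 and the series length")
    k = n - window + 1
    return [[series[r + c] for c in range(k)] for r in range(window)]


def stack_vertically(series_list, window):
    """Stack the trajectory matrices of several series on top of each other.

    Assumes all series have the same length, so that every block has K columns.
    """
    blocks = [trajectory_matrix(s, window) for s in series_list]
    if len({len(block[0]) for block in blocks}) != 1:
        raise ValueError("all series must yield the same number of columns")
    return [row for block in blocks for row in block]


# Two toy series of length 5 with window length 2 give a 4 x 4 stacked array.
stacked = stack_vertically([[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]], window=2)
```

With five series of equal length and a common window length, the stacked array has five times as many rows as a single trajectory matrix and the same number of columns.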

Then, singular value decomposition is performed on $X_V X_V^{T}$. The eigenvalues of $X_V X_V^{T}$ are denoted by $\lambda_{V_i}$, in decreasing order of magnitude ($\lambda_{V_1} \geq \lambda_{V_2} \geq \cdots \geq 0$), and the corresponding eigenvectors are denoted by $U_{V_i}$. These elements are important to define the elementary matrices $E_{V_i}$ according to $E_{V_i} = \sqrt{\lambda_{V_i}}\, U_{V_i} V_{V_i}^{T}$, considering $V_{V_i} = X_V^{T} U_{V_i} / \sqrt{\lambda_{V_i}}$.

After the elementary matrices are defined, the two steps of the reconstruction stage are necessary: grouping and diagonal averaging. In the grouping step, the objective is to separate the elementary matrices into two groups: signal and white noise. The separation is based on the weighted correlation, obtained by dividing the inner product of a pair of groups by the product of their norms; the elementary matrices are split where the lowest correlation is obtained.
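A minimal sketch of the decomposition into elementary matrices and of the split into signal and noise is given below. It relies on NumPy's singular value decomposition, whose squared singular values are the eigenvalues of the array multiplied by its transpose; the function names and the tolerance are assumptions made for the illustration, and the cutoff `r` is supplied by the user rather than derived from the weighted correlation.

```python
import numpy as np


def elementary_matrices(X, tol=1e-12):
    """Return the rank-one elementary matrices of X, ordered by decreasing singular value."""
    X = np.asarray(X, dtype=float)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Each elementary matrix is sqrt(eigenvalue) * left vector * right vector^T.
    return [s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)) if s[i] > tol]


def split_signal_noise(matrices, r):
    """Sum the first r elementary matrices as signal and the remainder as noise."""
    if not matrices:
        raise ValueError("no elementary matrices to group")
    zero = np.zeros_like(matrices[0])
    signal = sum(matrices[:r], zero)
    noise = sum(matrices[r:], zero)
    return signal, noise


# Toy 3 x 3 array standing in for the stacked trajectory matrix.
X = np.array([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 6.0]])
mats = elementary_matrices(X)
signal, noise = split_signal_noise(mats, r=1)
```

Because the elementary matrices add up to the original array, the signal and noise groups always sum back to it, whatever cutoff is chosen.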

As each group is arranged in a multidimensional array composed of KY columns, the diagonal average is applied to convert the multidimensional array into a vector that represents the original time series without white noise. According to the ANN-MSSA methodology, this series without white noise is used by the ANN prediction model.
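Diagonal averaging, and the weighted correlation used in the grouping step, can be sketched for a single series block as follows. The function names are illustrative; the weights are the number of times each observation appears in an L x K trajectory matrix, which is the usual definition in singular spectrum analysis. For a vertically stacked array, it is assumed here that the block of rows belonging to each series is averaged separately.

```python
import math


def diagonal_average(matrix):
    """Average a matrix along its anti-diagonals to recover a series of length L + K - 1."""
    rows, cols = len(matrix), len(matrix[0])
    n = rows + cols - 1
    totals = [0.0] * n
    counts = [0] * n
    for r in range(rows):
        for c in range(cols):
            totals[r + c] += matrix[r][c]
            counts[r + c] += 1
    return [total / count for total, count in zip(totals, counts)]


def w_correlation(f1, f2, window):
    """Weighted correlation between two reconstructed series of equal length."""
    n = len(f1)
    k = n - window + 1
    # Weight of observation t: how many times it occurs in the L x K matrix.
    weights = [min(t + 1, window, k, n - t) for t in range(n)]

    def inner(a, b):
        return sum(w * x * y for w, x, y in zip(weights, a, b))

    return inner(f1, f2) / math.sqrt(inner(f1, f1) * inner(f2, f2))


# A Hankel matrix is mapped back to the series that generated it.
recovered = diagonal_average([[1, 2, 3, 4], [2, 3, 4, 5]])
```

A weighted correlation close to zero between two reconstructed groups suggests that they are well separated, which is the criterion used to split signal from noise.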

Evaluation of the predictive performance

In this study, forecasts from ANN-MSSA and ANN are compared over the first 12 weeks following the final week of the sample. To evaluate the performance of the predictive models, the study uses the mean squared error (MSE), defined by:

$$\mathrm{MSE} = \frac{1}{h} \sum_{t=1}^{h} \left( y_t - \hat{y}_t \right)^2 \qquad (5)$$

with $y_t$ representing the value of the original series, $\hat{y}_t$ the forecast value, and $h$ the number of forecast observations selected for evaluation.
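The error measure can be computed directly from the observed and forecast values, as in the following minimal sketch (the function name is illustrative and the numbers are toy values, not data from the study).

```python
def mean_squared_error(actual, forecast):
    """Mean of the squared differences between observed and forecast values."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    h = len(actual)
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / h


# Toy values: only the last forecast is off, by 2 units, so MSE = 4 / 3.
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```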

Application of statistical tests

Before the ANN-MSSA and ANN models were used, statistical tests were applied to characterize the time series. Results of the Anderson-Darling (AD) and Shapiro-Wilk (SW) tests (10% significance level) indicated that the time series were normally distributed. At the same significance level, the TSAY (1986) and MCLEOD & LI (1983) nonlinearity tests indicated that the series cannot be considered linear, which justifies the use of the ANN model in the study. Finally, the Augmented Dickey-Fuller (ADF) and Kwiatkowski-Phillips-Schmidt-Shin (KPSS) tests, used to assess stationarity, confirmed that the time series are not stationary. In addition to the tests, Figure 1 shows that, in the period from Jan 13, 2012 to Dec 20, 2013, the time series are non-stationary because of both trends and some peaks.

Use of the ANN-MSSA methodology and ANN model

Based on the study by HASSANI & MAHMOUDVAND (2013), which indicated the ideal window size for MSSA, a value of 16 was used in this study. Then, 51 elementary matrices were obtained through the singular-value decomposition step. Reconstruction of the time series was the next step. Based on the weighted correlation measure, the 51 elementary matrices were separated, with cutoff values of 14 (SUG), 13 (COT), 12 (COR), 12 (COF), and 15 (SOY) defining the separation between signal and white noise.

This means that, to perform the grouping for a time series, e.g., SUG, the first 14 of the 51 elementary matrices obtained in the singular-value decomposition step correspond to the signal, whereas the others (15-51) correspond to white noise. After the diagonal average is applied to the sum of the signal matrices, the time series without white noise is given by a column vector. The MSSA model is applied in ANN-MSSA because it favors capturing the structure of the time series under study based on the structure of the other time series.

After the white noise was eliminated, the time series for each agricultural product was used in the ANN model, run in the R statistical program with the ARNN package. This package uses a layer composed of several neurons, with a maximum number of iterations equal to 100,000, supervised training, and the backpropagation algorithm. The first 80 of the 96 values available for each time series were used as input, and the remaining 16 values were used in training the neural network.

As the package allows the number of neurons to be changed, models with 5, 10, 15, 20, and 25 neurons were used. The number of neurons was changed during the training phase, as in the studies of SOBREIRO et al. (2008), BRESSAN (2004), and FERREIRA et al. (2011).

After training, the model with 15 neurons showed the lowest value of the Akaike Information Criterion (AIC) and was selected as the most appropriate for all the time series. In this study, the difference between ANN-MSSA and the ANN model lies in the treatment used to separate the noise from the original time series, i.e., for the ANN model, the time series were used without separating the white noise and without multivariate analysis of the set of series.
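Selecting the number of neurons by AIC can be sketched as below. The study does not state which AIC formula was used, so the Gaussian least-squares form, n ln(SSE/n) + 2k, and the parameter count for a network with one hidden layer and one output are assumptions; all function names are hypothetical.

```python
import math


def aic_gaussian(sse, n_obs, n_params):
    """AIC under an assumed Gaussian error model: n * ln(SSE / n) + 2 * k."""
    if sse <= 0 or n_obs <= 0:
        raise ValueError("sse and n_obs must be positive")
    return n_obs * math.log(sse / n_obs) + 2 * n_params


def n_params_single_hidden_layer(n_inputs, n_hidden):
    """Weights and biases of a network with one hidden layer and one output (assumed layout)."""
    return n_hidden * (n_inputs + 1) + (n_hidden + 1)


def select_by_aic(candidates):
    """Return the label with the lowest AIC.

    candidates: dict mapping a label (e.g., number of neurons) to a tuple
    (sse, n_obs, n_params) obtained after training that candidate.
    """
    return min(candidates, key=lambda label: aic_gaussian(*candidates[label]))
```

The penalty term 2k grows with the number of neurons, so a larger network is preferred only if it reduces the training error enough to offset its extra parameters.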

RESULTS AND DISCUSSION:

The forecasts obtained with ANN-MSSA, assessed using the error measure (5), showed the best performance (Table 1: BP) compared with those obtained with the ANN model, as they had the lowest error values.

The test proposed by DIEBOLD & MARIANO (1995) was applied to evaluate the statistical significance of the difference between the MSEs of the best-performing model and the next-best model. The results indicated that the null hypothesis, under which the difference between the error measures of the models under comparison is zero, could be rejected at the 10% significance level for the SUG, COT, COR, and SOY time series. The exception is the COF time series, for which the best forecasting performance is given by the ANN model.
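The comparison of predictive accuracy can be sketched with a basic version of the Diebold-Mariano statistic. The sketch assumes squared-error loss, a long-run variance built from autocovariances up to lag `horizon - 1`, and the asymptotic normal distribution for the p-value; the function name and these settings are illustrative, since the study does not report which variant was used.

```python
import math


def diebold_mariano(errors_a, errors_b, horizon=1):
    """Diebold-Mariano statistic and two-sided p-value under squared-error loss.

    errors_a, errors_b: forecast errors of the two models over the same periods.
    horizon: forecast horizon; autocovariances up to lag horizon - 1 are used.
    """
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two error series of equal length (at least 2)")
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    t = len(d)
    mean_d = sum(d) / t

    def autocov(lag):
        return sum((d[i] - mean_d) * (d[i - lag] - mean_d) for i in range(lag, t)) / t

    long_run_var = autocov(0) + 2 * sum(autocov(k) for k in range(1, horizon))
    if long_run_var <= 0:
        raise ValueError("non-positive variance estimate; the test is not defined")
    stat = mean_d / math.sqrt(long_run_var / t)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value
```

A positive statistic indicates that the first model has the larger squared errors on average; the null hypothesis of equal accuracy is rejected at the 10% level when the p-value is below 0.10.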

In addition to the statistical test, which confirms the predictive superiority of the ANN-MSSA methodology, Figure 2 shows that the ANN-MSSA methodology was able to detect the price trends for SUG, COT, COR, and SOY.

The forecasting superiority of the ANN model can be seen in SOBREIRO et al. (2008). These authors confirmed the forecasting effectiveness for monthly sugar prices, with the closest predictive approximation in the most recent months. In forecasting weekly coffee and soybean prices, BRESSAN (2004) presented evidence favorable to the use of the classic (autoregressive) and ANN models as decision tools in agricultural negotiations.

Forecasting performance favorable to the ANN model is seen again in FERREIRA et al. (2011). These authors used monthly corn prices and concluded that the model showed forecasting capacity and may thus be used in decision making concerning the behavior of prices for agricultural products.

Prior to these studies, YONENAGA & FIGUEREDO (1998) showed that the ANN model is an alternative to statistical methods for forecasting time series of agricultural products, presenting favorable predictive results in addition to properly indicating the trend in the real price. Other studies based on data from the national (LIMA et al., 2010; RIBEIRO et al., 2010; MIRANDA et al., 2013) and international (SHAHWAN & ODENING, 2007; LI et al., 2010; JHA & SINHA, 2013) agricultural markets indicated the ANN model as an alternative in forecasting agricultural prices.

In the international market, prices for agricultural commodities were forecast with a methodology that combines the classical autoregressive model with the ANN model, defined as a hybrid model in the study by SHAHWAN & ODENING (2007). For these authors, the hybrid model showed better forecasting performance than the autoregressive model. LI et al. (2010) also used daily and weekly prices for agricultural products in the international market and concluded that the ANN model has better predictive performance than the autoregressive model in the short term. Justifying the importance of price management for farmers in the Indian market, JHA & SINHA (2013) reached the same conclusion in their study on soybean and mustard prices.

Unlike the present study, those mentioned above did not consider multivariate aspects, namely, the relationship between commodity prices. Nevertheless, the results here do not argue against the ANN model; rather, its use improves when decomposition of the time series is included, because investigations on the national and international agricultural markets do not take into account the set of prices for agricultural products. Thus, owing to the multivariate nature of the MSSA model, which eliminates white noise and takes into account the relationship between prices for agricultural products, ANN-MSSA may be considered an alternative that contributes to forecasting the prices of agricultural products.

CONCLUSION:

In the context of this study, the empirical data showed that the ANN-MSSA methodology performs better than the ANN model, as it yields the best forecasting performance for a greater number of series. Results obtained in the out-of-sample period, using the MSE measure and the statistical test, confirm this conclusion.

This study combined the MSSA model, used to decompose the time series, with the ANN model and added favorable evidence that the combination is another alternative for forecasting prices of agricultural products. Therefore, the results may be useful in formulating and implementing policies directed to the agricultural sector. Thus, price forecasting by the ANN-MSSA methodology is an alternative approach in financial planning, as it was able to detect trends in and forecast the prices of the agricultural products studied.