MODELLING AND FORECAST OF CHARCOAL PRICES USING A NEURO-FUZZY SYSTEM

ABSTRACT: Using a monthly time series of charcoal prices in Minas Gerais from January 2000 to September 2014, this study aimed to evaluate the use of a neuro-fuzzy system to model the series and forecast prices. Five modelling structures were tested, each with a different number of price lags (1, 2, 3, 4 and 5 lags). The most appropriate structure for the neuro-fuzzy system was chosen based on the root mean square error, mean absolute error, mean squared error, mean absolute percentage error and maximum absolute percentage error for the forecast period. The results indicate that a neuro-fuzzy system can adequately predict charcoal prices.


INTRODUCTION
Brazil is the biggest charcoal producer in the world (SILVA et al., 2014). About 17.81 million cubic meters of charcoal were produced in 2012, used as raw material for 24.8% of national steel mill production (ABRAF, 2013), which represented about 4% of the gross national product in the same year (IAB, 2012).
Charcoal prices are important for regulating the national supply chain. They vary as a function of site conditions (REZENDE et al., 2005), such as the distance between producers and consumers, and as a function of fluctuations in pig iron prices (NOCE et al., 2008). Because of this, forecasts of demand, supply and prices are important for planning, mainly because the wood maturation period is relatively long. The time series of charcoal prices needs to be kept up to date so that mills and producers have sufficient information to make decisions and plans (COELHO JÚNIOR et al., 2006).
Forecast models are employed to study the behavior of time series, such as those that represent charcoal prices. These models are widely used in studies of applied economics in forest sciences (Coelho Junior et al., 2006; Rezende et al., 2005), covering charcoal prices (Soares et al., 2010), wood prices (Cordeiro et al., 2010), sawn wood export prices (Castro et al., 2011) and cellulose production, among others.
Most works on time series forecasting have used the Box and Jenkins methodology (CASTRO et al., 2011). However, with the advances in computational resources, artificial intelligence tools have been considered efficient for the same goal, notably artificial neural networks (ANN; COELHO et al., 2007; KOUTROUMANIDIS et al., 2009; COELHO JUNIOR et al., 2013) and neuro-fuzzy systems (HONG and LEE, 2005; LIU et al., 2012).
A neuro-fuzzy system can be considered a hybrid intelligent system that combines ANN with fuzzy logic (FULLÉR, 1995). According to the same author, ANN are good at pattern recognition but not at decision making, whereas fuzzy logic systems can explain decisions by reasoning with imprecise information but cannot automatically adjust their decision rules. In this way, a hybrid system can combine the advantages of both tools and can be applied to situations or problems already solved by either tool separately. Therefore, this work was carried out with the goal of evaluating the efficiency of a neuro-fuzzy inference system in modelling and forecasting charcoal prices.

Data
The data used in this work were obtained from the time series of charcoal prices in Minas Gerais state, Brazil, from the Associação Mineira de Silvicultura (AMS). The data consist of monthly records from August 2000 to September 2013, totalling 158 observations. The values were corrected by the General Prices Index for Internal Availability (GPI-IA), obtained from the website of the Institute of Applied Economic Research (IPEA, 2015).
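The text does not detail how the correction by the price index was carried out. A minimal sketch, assuming the usual deflation of nominal prices to the price level of a chosen base month, is given below; the function name `deflate` and the example numbers are hypothetical and are not taken from the AMS or IPEA data.

```python
def deflate(nominal_prices, price_index, base_period=-1):
    """Convert nominal prices to real prices at the level of a base period.

    Assumption: real_t = nominal_t * index_base / index_t.
    By default the base period is the last observation.
    """
    if len(nominal_prices) != len(price_index):
        raise ValueError("prices and index must have the same length")
    base = price_index[base_period]
    return [p * base / idx for p, idx in zip(nominal_prices, price_index)]


# Hypothetical values, for illustration only.
nominal = [100.0, 110.0]
index = [50.0, 100.0]
real = deflate(nominal, index)
```

With these illustrative numbers the first price is doubled, because the index doubled between the first month and the base month, while the base-month price is unchanged.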

Neuro-fuzzy system
A fuzzy set is defined as a set without a well-defined boundary, with a gradual transition between belonging and not belonging to the set (Jang, 1995). In contrast to classical set theory, in fuzzy theory each element has a membership value in the set that varies between 0 and 1 according to an appropriate continuous function (membership function), such as the triangular, trapezoidal and Gaussian functions, as mentioned by Zadeh (1965). According to this author, fuzzy sets have operations similar to those of traditional sets, such as union, intersection and complement. They also have two operators, AND (T-norm) and OR (T-conorm), which perform product and sum operations, respectively, on the membership values of two fuzzy sets.
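The membership functions and operators described above can be sketched as follows. This is only an illustration: the parameter names (`left`, `peak`, `right`, `center`, `sigma`) are hypothetical, and the OR operator is assumed to be the probabilistic sum, which is the T-conorm usually paired with the product T-norm.

```python
import math


def triangular_mf(x, left, peak, right):
    """Triangular membership function; assumes left < peak < right."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def gaussian_mf(x, center, sigma):
    """Gaussian membership function with its maximum (1.0) at center."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def fuzzy_and(a, b):
    """T-norm: product of two membership values."""
    return a * b


def fuzzy_or(a, b):
    """T-conorm (assumed probabilistic sum): a + b - a*b stays within [0, 1]."""
    return a + b - a * b


def fuzzy_not(a):
    """Complement of a membership value."""
    return 1.0 - a
```

All three operators return values in the interval from 0 to 1 whenever their arguments are membership values in that interval.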
Fuzzy logic uses rules of the IF-THEN type, such as "if x is A THEN y is B" (Jang, 1995), where A and B are linguistic values, "x is A" is called the antecedent and "y is B" is called the consequent.
Different fuzzy inference models can be adopted to convert inputs into outputs, notably the Mamdani, Tsukamoto and Takagi-Sugeno models. The latter has rules such as "if x is A and y is B THEN z = f (x, y)", where A and B are antecedent fuzzy sets and z = f (x, y) is the consequent (Jang 1997). The function f is a polynomial, referred to by its degree; in the first-order case the consequent of each rule is a linear combination of the inputs. This model does not involve a defuzzification process, and the output is a weighted linear combination of the consequents (Ibrahim, 2003). A Takagi-Sugeno system with two inputs (x1 and x2) and two rules (R1, Equation 1, and R2, Equation 2) can be represented following Heddam et al. (2012), where A1 and A2 are fuzzy sets, f1 and f2 are the outputs specified by the rules, and p1, q1, c1, p2, q2 and c2 are parameters set during training. An ANFIS model considering rules (1) and (2) is illustrated in Figure 1.
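For reference, a first-order Takagi-Sugeno rule base with two inputs and two rules is usually written in the form below. This is a hedged reconstruction and not a copy of the source equations: the set names B1 and B2 for the second input are an assumption taken from the layer description, and the consequents use the parameters p, q and c named in the text.

```latex
\begin{align}
R_1:&\ \text{IF } x_1 \text{ is } A_1 \text{ AND } x_2 \text{ is } B_1
      \text{ THEN } f_1 = p_1 x_1 + q_1 x_2 + c_1 \tag{1}\\
R_2:&\ \text{IF } x_1 \text{ is } A_2 \text{ AND } x_2 \text{ is } B_2
      \text{ THEN } f_2 = p_2 x_1 + q_2 x_2 + c_2 \tag{2}\\
f   &= \bar{w}_1 f_1 + \bar{w}_2 f_2,
      \qquad \bar{w}_i = \frac{w_i}{w_1 + w_2} \notag
\end{align}
```

Here w1 and w2 denote the firing strengths of the two rules, so the overall output f is the weighted linear combination of the consequents mentioned above.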
Figure 1b shows the ANFIS structure, with the following description for each layer (HEDDAM et al., 2012):
Layer 1: each node, represented by A1, A2, B1 and B2, has an associated membership function (MF). This layer takes the values x and y as inputs and returns the result of the MF as output.
Layer 2: each node of this layer represents a fuzzy inference rule such as (1) and (2). The output wi is, in this case, the product of the inputs (T-norm).
Layer 3: this layer normalizes the outputs of Layer 2. Its output is the ratio of the strength of the i-th rule to the sum of the strengths of all rules.
Layer 4: this layer holds the polynomial function of the consequent. Its output is the product of the output from Layer 3 and the value of the polynomial, where pi, qi and ci are adjusted parameters.
Layer 5: this layer sums the Layer 4 outputs, giving the overall output of the system.
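The five layers can be sketched as a single forward pass for the two-rule system. The code below is a minimal illustration under stated assumptions: Gaussian membership functions, rule 1 pairing A1 with B1 and rule 2 pairing A2 with B2, and hypothetical names (`anfis_forward`, `premise`, `consequent`) that do not come from the source.

```python
import math


def gaussian_mf(value, center, sigma):
    """Gaussian membership function (an assumed MF shape)."""
    return math.exp(-((value - center) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input, two-rule first-order Sugeno ANFIS.

    premise: {"A": [(center, sigma), ...], "B": [(center, sigma), ...]}
    consequent: [(p_i, q_i, c_i), ...], one tuple per rule.
    Returns the system output and the normalized rule strengths.
    """
    # Layer 1: membership grades of x in A_i and of y in B_i.
    mu_a = [gaussian_mf(x, c, s) for c, s in premise["A"]]
    mu_b = [gaussian_mf(y, c, s) for c, s in premise["B"]]
    # Layer 2: rule strength w_i as the product (T-norm) of the grades.
    w = [a * b for a, b in zip(mu_a, mu_b)]
    # Layer 3: normalization, w_i divided by the sum over all rules.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: normalized strength times the first-order polynomial.
    layer4 = [wb * (p * x + q * y + c)
              for wb, (p, q, c) in zip(w_bar, consequent)]
    # Layer 5: sum of the Layer 4 outputs.
    return sum(layer4), w_bar


# Hypothetical parameters, for illustration only.
premise = {"A": [(0.0, 1.0), (2.0, 1.0)], "B": [(0.0, 1.0), (2.0, 1.0)]}
consequent = [(1.0, 1.0, 0.0), (2.0, -1.0, 1.0)]
output, strengths = anfis_forward(1.0, 1.0, premise, consequent)
```

In this example the inputs lie halfway between the two membership function centers, so both rules receive the same normalized strength.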
Thus, learning consists of changing the parameters of layers 1 and 4 so that the outputs are as close as possible to the values used for training. For this, two methods may be adopted: backpropagation or hybrid. The first is the gradient descent method, in which the gradient vector is determined by the backpropagation algorithm (USBERTI, 2007). The second combines the method of least squares (LS) with the gradient descent method. In it, the system finds the parameters of the consequents by LS with the parameters of the membership functions held fixed, and the antecedent parameters are then updated by backpropagation of the output error (HEDDAM et al., 2012).
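The least-squares half of the hybrid method can be illustrated with a small sketch. It assumes a single input, Gaussian membership functions with fixed parameters, and consequents of the form f_i = p_i * x + c_i; the function names are hypothetical, and the gradient-descent update of the membership function parameters is not shown.

```python
import math


def gaussian_mf(x, center, sigma):
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def normalized_strengths(x, premise):
    """Normalized rule strengths for fixed (center, sigma) pairs."""
    w = [gaussian_mf(x, c, s) for c, s in premise]
    total = sum(w)
    return [wi / total for wi in w]


def solve_linear_system(a, b):
    """Solve a * z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def fit_consequents(xs, targets, premise):
    """Least-squares estimate of [p_1, c_1, p_2, c_2, ...].

    With the premise fixed, the output is linear in the consequent
    parameters, so they solve the normal equations (A^T A) z = A^T y.
    """
    rows = [[term for wb in normalized_strengths(x, premise)
             for term in (wb * x, wb)] for x in xs]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
           for i in range(n)]
    aty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve_linear_system(ata, aty)
```

Because the problem is linear once the membership functions are fixed, this step needs no iteration, which is the main reason the hybrid method is usually preferred to pure backpropagation.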

Neuro-fuzzy for charcoal price
The neuro-fuzzy system proposed by Jang (1993) was used. Five configurations of the neuro-fuzzy system architecture were evaluated, differing in the number of lags in the input data (Table 1), all of them with the same output variable (price without lag).

Architecture   Number of lags   Input variables
ANFIS-1        1                P-1
ANFIS-2        2                P-1, P-2
ANFIS-3        3                P-1, P-2, P-3
ANFIS-4        4                P-1, P-2, P-3, P-4
ANFIS-5        5                P-1, P-2, P-3, P-4, P-5

Three Gaussian membership functions were used for each input variable (ZADEH, 1995). The membership function represents the degree to which a given charcoal price belongs to each defined class (in this case, three classes, which can be linguistically represented by "low", "medium" and "high"). Thus, a value is not considered exclusively "high", but has a degree of membership in the "high" set, which depends on its value and on the associated membership function. The Gaussian function was adopted because it is well known in the fields of probability and statistics and has desirable properties such as invariance under multiplication and Fourier transformation (JANG, 1995). In the Gaussian membership function for the "high" set, μ is the degree of membership of P-1 in the "high" set; P-1i is the i-th value of the price variable with one lag; and α and β are the estimated parameters.
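The input structure of the five architectures can be sketched as follows. The helper names `make_lagged` and `rule_count` are hypothetical, and the rule count assumes grid partitioning, in which every combination of membership functions across the inputs defines one rule.

```python
def make_lagged(series, n_lags):
    """Build (inputs, targets) pairs for an architecture with n_lags lags.

    Each input row is [P-1, P-2, ..., P-n_lags] and the target is the
    price without lag.
    """
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append([series[t - k] for k in range(1, n_lags + 1)])
        targets.append(series[t])
    return inputs, targets


def rule_count(n_lags, n_mfs=3):
    """Number of rules under grid partitioning: n_mfs ** n_lags."""
    return n_mfs ** n_lags


# Hypothetical prices, for illustration only.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
inputs, targets = make_lagged(prices, 2)
rules_by_lags = {n: rule_count(n) for n in range(1, 6)}
```

Each additional lag removes one usable observation from the start of the series and multiplies the number of rules by three, so the rule base grows much faster than the amount of data.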

RESULTS AND DISCUSSION
Evaluating the behavior of the charcoal price time series (Figure 2), a sharp rise in prices at the beginning of 2008 can be noted, with values rising from R$ 144.00 to R$ 245.00 in almost seven months, followed by a drastic fall in the last part of the same year, from R$ 247.00 in August to R$ 109.00 in December. This indicates a possible structural break in the series. This period coincides with the peak of the financial crisis triggered in the North American financial market, which had been building since 2007, according to Prado (2011).
Although structural breaks are a problem in time series modelling based on the Box and Jenkins methodology, no data transformation was necessary to apply the neuro-fuzzy system. This is justified by the fact that, in general, nonlinear models such as the neuro-fuzzy system are better able to distinguish asymmetries caused by the shocks to which a time series is subjected (GALVÃO, 2003). Furthermore, according to Lee and Woong (2007), techniques like neuro-fuzzy circumvent the structural break problems embedded in financial time series data.
There is a high degree of agreement between the observed values and the values estimated by the fuzzy inference system (Figure 2a). When two series overlap, the estimation capability can be considered high (ANDRADE et al., 2010). For all evaluated architectures, the percentage error for the training set did not exceed 25% (Figure 2b), which was also the case for the validation data. The ANFIS architecture with the smallest error for the validation data was ANFIS-2, in which the errors did not exceed 5% (Figure 2b).
In a paper on forecasting the prices of different Pinus products using the Box and Jenkins methodology, Broz and Viego (2014) found percentage errors below 12%. This shows that the results of the best architecture are satisfactory and highlights the need to choose an adequate architecture.
Furthermore, for the ANFIS architecture with 5 lags, the estimated values were extremely close to the data used in the training. However, there is a discrepancy between the results from the neuro-fuzzy system and the validation values. For this architecture, the number of fuzzy rules was equal to 243, in contrast to the 9 rules for the ANFIS with two lags, which may have led to the problem known as over-fitting. This problem occurs when the system becomes specialized in the training data but is unable to describe the validation data (HAYKIN, 2009). According to Usberti (2007), the number of rules should be no larger than necessary, otherwise the representation of the system may degenerate through over-fitting. Thus, the results of ANFIS-5 cannot be considered satisfactory for forecasting.
The grid partition method discussed by Mesiarová-Zemánková and Ahmad (2010) was adopted. Its use is justified by the small number of input variables and membership functions, which avoids the problem of generating a large number of fuzzy rules mentioned by the authors. Partitioning the input data creates groups with a certain similarity and allows inference rules to be formulated.
For the inference system, the Takagi-Sugeno model (JANG, 1995) was used. This model does not have a defuzzification process, and the output is a weighted linear combination of the consequents (IBRAHIM, 2003). The system used for the ANFIS-2 architecture, with two inputs (P-1 and P-2) and nine rules (R1 to R9), can be represented following Heddam et al. (2012), where "low", "medium" and "high" are fuzzy sets for the price variable with any lag, f1 to f9 are the outputs specified by the rules R1 to R9, and ai, bi and ci are the parameters defined during the training, with i from 1 to 9. The training was carried out using the hybrid method, which combines backpropagation, for the parameters of the input membership functions, with least squares, to adjust the coefficients of the output polynomials (SILVA et al., 2014). The goal of the training is to obtain the parameters of the membership functions and of the consequent polynomials so that the outputs are as close as possible to the values used in the training. Training stopped after twenty epochs or when the error reached zero, whichever came first.
The last twelve periods of the dataset were used to evaluate the forecast quality of the trained neuro-fuzzy system. For this, the residual plots and the following statistics were used: root mean square error (RMSE), mean absolute error (MAE), mean squared error (MSE), mean absolute percentage error (MAP) and maximum absolute percentage error (MAPE).
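The formulas for these statistics are not reproduced in the text, so the sketch below assumes their usual definitions. It keeps the abbreviations used in this paper (MAP for the mean and MAPE for the maximum absolute percentage error); the function name `forecast_errors` and the example values are hypothetical.

```python
import math


def forecast_errors(observed, predicted):
    """Error statistics between observed and predicted prices.

    Percentage errors are relative to the observed value, which is
    assumed to be nonzero (prices are positive).
    """
    if len(observed) != len(predicted):
        raise ValueError("series must have the same length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    pct = [abs(e) / abs(o) * 100.0 for e, o in zip(errors, observed)]
    mse = sum(e * e for e in errors) / n
    return {
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "MAP": sum(pct) / n,   # mean absolute percentage error
        "MAPE": max(pct),      # maximum absolute percentage error
    }


# Hypothetical values, for illustration only.
stats = forecast_errors([100.0, 200.0], [110.0, 180.0])
```

In the setting described above, `observed` would hold the last twelve prices of the series and `predicted` the corresponding forecasts from each architecture.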
Considering the statistics obtained for the evaluation of modelling (training step) and forecast (validation step) (Table 2), the values of RMSE, MAE and MAP decreased for the training data as the number of input variables (number of lags used) increased. The opposite occurred for the validation data. Adding variables (or lags) made the model fit the observed data better, but the model did not gain precision when forecasting periods not used in the training step. This results in a larger discrepancy between the forecast and the actual time series.
The architecture that presented the best forecasting results was ANFIS-1 (with one lag). This architecture had the smallest values of MAE, RMSE and MSE. Rezende et al. (2005) found MAPE values of 38.56%, 38.28% and 26.24% when forecasting with SARIMA models, and considered these "good results". Coelho Júnior et al. (2006) reported 21.66% for MAPE and 35.18% for MAP, also with SARIMA models applied to charcoal prices, and considered the forecasts "satisfactory". Comparing these values with the MAPE for the ANFIS-1 architecture (5.75%), it is possible to infer that the neuro-fuzzy system can be considered an excellent tool for forecasting the charcoal price time series.
The existence of thresholds creates different behavior dynamics in a time series (GALVÃO, 2003). Fuzzy logic, through its membership functions, is able to model these different dynamics, resulting in better performance when compared with other methods, such as artificial neural networks and the Box and Jenkins method. This was observed by Mohaddes and Fahimifard (2015) in a work comparing a neuro-fuzzy system and the Box and Jenkins approach for forecasting agricultural product export revenues.
The neuro-fuzzy system can map seasonal cycles and reproduce them in the prediction (CAMPOS, 2008). This behavior goes against Samuelson's theory of Geometric Brownian Motion (SAMUELSON, 1965), in which the variable assumes only positive values and the emphasis is on its percentage changes. However, in the forest sector this is not a problem, because commodity prices, which is the case for most forest products, generally follow a pattern of mean reversion (SLADE, 2001; ZHANG et al., 2015; ZHAO and GU, 2015).
In a forest industry, decisions about the optimal level of production are made taking marginal cost and marginal revenue into consideration. For this, understanding the behavior of costs and their future trend is essential. In the case of pig iron, charcoal prices have a large influence (MONTEIRO, 2006). The neuro-fuzzy system, as demonstrated in this study, can efficiently support decision-making about the production level because of its accuracy in forecasting the behavior of the charcoal price time series.

CONCLUSIONS
A neuro-fuzzy inference system can be used to model and forecast charcoal prices accurately, although different ANFIS structures need to be evaluated before the system is used to model and forecast the time series. For the case studied, the larger the number of input variables in the fuzzy system, the better the modelling but the worse the forecast.

TABLE 1 Architectures evaluated as a function of the number of lags in the input data.