# ABSTRACT:

Forecasting the prices of agricultural goods benefits farmers, marketing agents, consumers, and policymakers. Today, managing the security of these products requires price forecasting models that are both efficient and reliable for a country’s imports and exports. Over the last few decades, the Autoregressive Integrated Moving Average (ARIMA) model has been widely used in economic time series forecasting. Recently, however, many economic time series have been clearly shown to be nonlinear. Machine learning (ML) modelling offers a more flexible price forecasting technique, given the limited data available in most countries’ economies. In this research, a hybrid price forecasting model is built from a novel clustering technique, a new cluster selection algorithm, and a multilayer perceptron neural network (MLPNN), and is applied to the monthly time series of the Thai rice FOB price from November 1987 to October 2017. The empirical results show a root mean square error (RMSE) of 14.37 and a mean absolute percentage error (MAPE) of 4.09% for the hybrid model. A comparison with four benchmark models on the same series shows that the proposed method outperforms them.

Key words:
price forecasting; agricultural commodity; artificial neural network (ANN); hybrid model; data cluster

# RESUMO:

Prever o preço dos produtos agrícolas é uma ação benéfica para agricultores, agentes de marketing, consumidores e legisladores. Hoje, o gerenciamento da segurança desse produto requer modelos de previsão de preços eficientes e confiáveis ​​para a importação e exportação de um país. Nas últimas décadas, o modelo Autoregressive Integrated Moving Average (ARIMA) tem sido amplamente utilizado na previsão de séries temporais da economia. Recentemente, muitas das observações de séries temporais apresentadas em economia têm se mostrado claramente não lineares. A modelagem de aprendizado de máquina (ML), por outro lado, oferece uma técnica de previsão de preços potencial que é mais flexível, apresentados os dados limitados disponíveis na maioria dos países. Nesta pesquisa, um modelo híbrido de previsão de preços foi usado, por meio de uma nova técnica de agrupamento, um novo algoritmo de seleção de agrupamento e uma rede neural perceptron multicamadas (MLPNN), que teve muitas vantagens, e usando séries temporais mensais de preços FOB do arroz tailandês de novembro 1987 a outubro de 2017. Os resultados empíricos deste estudo mostraram que o valor da raiz do erro quadrático médio (RMSE) é igual a 14,37 e o erro percentual absoluto médio (MAPE) é igual a 4,09% para o modelo híbrido. Os resultados da avaliação do método proposto e a comparação de seu desempenho com quatro modelos de benchmark, por séries temporais mensais de preço FOB do arroz tailandês de novembro de 1987 a outubro de 2017, mostram o desempenho superior do método proposto.

Palavras-chave:
previsão de preço; commodity agrícola; rede neural artificial (ANN); modelo híbrido; cluster de dados

# INTRODUCTION:

The study of price and production forecasting for agricultural products dates back almost a century (JAYARAMU, 2015). Economic forecasting in the agricultural sector shares similarities with business and macroeconomic forecasting, but over time it has grown into a field of its own (ALLEN, 1994). Agricultural commodity prices, like the prices of other products, have been on an increasing trend since 2002 (GULERCE & UNAL, 2017). In addition, recent years have seen greater price fluctuations in most agricultural commodities, especially strategic commodities such as rice. This can have a direct and major impact on food security at both the micro and macro levels of society. Price variation over time is usually caused by seasonal variation, annual price behavior, long-run trends, cyclical price behavior, and government intervention in the market (TOMEK & ROBINSON, 2003). This variation has increased the risk faced by farmers, marketing agents, and consumers. Agricultural commodity price forecasting studies allow farmers, marketing agents, and consumers to make better-informed decisions and manage different risks. They also help in determining the impact of new technologies, time, and other economic factors on commodity production and prices (JAYARAMU, 2015). Because food production and food prices have a special place in the food security and satisfaction of a nation’s citizens, government officials are both the main users and the main suppliers of agricultural forecasts. They are tasked with providing technical and market support for consumers and the agricultural sector through specific policies, and to do so they rely on domestic, regional, and international forecasts (ALLEN, 1994). Agricultural forecasting draws on a wide variety of models and methods that can be applied in many different situations.

Simple price forecasting models, such as naive or distributed-lag models, have been relatively successful in predicting the prices of agricultural goods (HUDSON, 2007). Conversely, Deferred Future Plus Historical Basis (DFHB) models, ARIMA models, multivariate time series models such as VAR and VECM, and composite models generate more accurate predictions (KASTENS et al., 1998; TOMEK & MYERS, 1993; BURARK & SHARMA, 2012; DAREKAR et al., 2016). Like any other method, these techniques do not guarantee perfect forecasts. Nevertheless, such models are handy and have been used successfully for forecasting (DAREKAR & REDDY, 2017). Greater forecast accuracy comes at the cost of greater statistical complexity (HUDSON, 2007). As fluctuations in agricultural commodity prices increase, simple forecasting methods become less reliable and accurate, so more complex statistical forecasting methods become necessary. ML models offer an alternative way to build such complex forecasting models (GHAYEKHLOO et al., 2015a; GHAYEKHLOO et al., 2015b). ML theory is based on cognitive patterns and statistical inference, so that a model can learn to improve its performance from its previous experience (MJOLSNESS & DECOSTE, 2001). Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Relevance Vector Machines (RVMs), and hybrid models are examples of ML models. ML models have been widely applied in financial economics modeling and agricultural commodity price prediction (ENKE & THAWORNWONG, 2005; CO & BOOSARAWONGSE, 2007; SHAHWAN & ODENING, 2007; TICLAVILCA et al., 2010).

Studies on forecasting the prices of agricultural goods differ widely: they cover different countries, different products, and different market structures. In general, however, their main goal has been price stability at desired levels, which supports economic growth and increases the general welfare of consumers (WICKRAMARACHCHI et al., 2017). Rice is one of the major and most consumed products in Iran. Nearly 615,000 hectares of fertile land in Iran are dedicated to rice cultivation, and more than 1.4 million tons of rice are produced annually (CHIZARI et al., 2013). This production level does not meet domestic demand, and in recent years more than 1 million tons of rice have been imported, mainly from Thailand, India, and Pakistan. Historically, Iran was a rice exporter until 1940. Over the past decades, rice acreage, production, and imports have changed considerably. Before Iran’s Islamic Revolution, rising oil income from the early 1950s and urbanization changed the rice consumption pattern and increased demand considerably. After the revolution, population growth, heavy government intervention in the rice market, and the distribution of coupons were the main factors that turned Iran into one of the most important rice importers in recent years. Hence, it is very important for Iranian policymakers to forecast the FOB prices of the main rice-exporting countries as well as the prices of domestic rice varieties.

Moreover, accurate prediction of agricultural commodity prices is a major challenge because this type of time series is volatile and chaotic. The irregular nature of agricultural commodity price data disrupts the neural network learning process and leads to forecasts with high errors.

These facts motivated us to develop a hybrid forecasting system for agricultural commodity prices, with a focus on rice. To this end, a new time-series-based K-means clustering method is proposed in this study to group the rice price time series into clusters that better characterize its irregular nature. The proposed clustering approach classifies the rice price data into separate groups of similar patterns. The most appropriate group is selected and then preprocessed by both the HANTS (harmonic analysis of time series) method and time-series analysis.

The complexity of the neural network used for large rice price data sets requires a deep learning method to provide reliable forecast results. Therefore, the HANTS method is used to prepare the most appropriate training data for better MLPNN learning.

The HANTS method filters the rice price data to better characterize price behavior and provides more appropriate training data for the multilayer perceptron neural network (MLPNN), which enhances the accuracy of the forecast results.

The proposed method is compared with four benchmark models using the monthly time series of the Thailand rice FOB price from November 1987 to October 2017, and the results show that it outperforms them.

The next section describes the proposed data clustering method and the proposed hybrid forecasting technique.

# MATERIALS AND METHODS:

The proposed data clustering method

The main purpose of clustering is to generate smaller groups of data that share specific patterns within the same cluster and to separate them from data with different properties. For rice price data, clustering makes it possible to classify the data into separate groups, which gives a deeper understanding of the information they contain and ultimately leads to more accurate and realistic forecasts.

We used an improved version of the K-means algorithm to cluster the rice price data at different dates, which helps to better describe its irregular nature. The algorithm uses a new method to choose the initial cluster centroids, which removes a shortcoming of existing K-means algorithms: their sensitivity to random initialization.

Consider a data set of n data vectors. The K initial centroids are selected as follows:

1) Store the distinct data vectors, each with its repetition count $r_i$, in a new dataset $X'=\left[\left(x'_1,r_1\right),\dots,\left(x'_m,r_m\right)\right]$ ($1\le i\le m\le n$).

2) Sort the data vectors in the $X'$ dataset in increasing order of their Euclidean length. Sorting places similar data next to each other, which is intended to let the K-means algorithm reach the cluster centers in as few iterations as possible. The Euclidean length of a vector $V$ in a d-dimensional space is calculated by:

$\|V\|=\sqrt{V_1^2+V_2^2+\dots+V_d^2}$ (1)

3) Now split the data set into several parts. With $m$ data vectors and $K$ subsets, each subset holds at most $P=\lceil m/K\rceil$ vectors, so that every element of the $X'$ data set is assigned to one of the sub-data sets $X'_1$ to $X'_K$, as given in eq. (2):

$X'_1=\left[\left(x'_1,r_1\right),\dots,\left(x'_P,r_P\right)\right],$

$X'_2=\left[\left(x'_{P+1},r_{P+1}\right),\dots,\left(x'_{2P},r_{2P}\right)\right],$

...

$X'_K=\left[\left(x'_{(K-1)P+1},r_{(K-1)P+1}\right),\dots,\left(x'_{KP},r_{KP}\right)\right],$

$X'=\bigcup_{k=1}^{K}X'_k$ (2)

4) The $K$ sub-data sets now yield the $K$ initial centroids, one per subset. Eq. (3) gives the initial centroid extracted from the $l$-th sub-data set $X'_l$:

$\mathit{init}C_l=\frac{\sum_{x'_i\in X'_l}\left(x'_i\cdot r_i\right)}{\sum_{x'_i\in X'_l}r_i}$ , $\left(1\le l\le K\right)$ (3)

where $\mathit{init}C_l$ is computed from the data in the $l$-th sub-data set, the initial centroids together form the set $\left\{c_1,\dots,c_K\right\}$, and $r_i$ is the repetition count of the $i$-th data vector.

5) Calculate the Euclidean distance by:

$d\left(x_i,c_j\right)=\sum_{i=1}^{n}\sum_{j=1}^{K}\left\|x_i^{(j)}-c_j\right\|^2$ (4)

6) Form the K clusters by assigning every point to its nearest centroid, using the calculated Euclidean distances.

7) By calculating the average of all the points in each cluster, new centroids are obtained.

8) Repeat steps 5, 6 and 7 until the centroids remain unchanged or the number of repetitions exceeds the user-specified limit.
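The eight steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `initial_centroids` and `kmeans`, the use of NumPy, and the toy data in the usage note are hypothetical. Note also that, because of the ceiling in $P=\lceil m/K\rceil$, the split can produce fewer than $K$ non-empty subsets when $m$ is small relative to $K$.

```python
import numpy as np


def initial_centroids(data, k):
    """Steps 1-4: deduplicate, sort by Euclidean length, split, weighted mean."""
    data = np.asarray(data, dtype=float)
    # Step 1: distinct vectors x'_i with their repetition counts r_i.
    unique, counts = np.unique(data, axis=0, return_counts=True)
    # Step 2: sort the distinct vectors by Euclidean length (eq. 1).
    order = np.argsort(np.linalg.norm(unique, axis=1), kind="stable")
    unique, counts = unique[order], counts[order]
    # Step 3: consecutive subsets of at most P = ceil(m / K) vectors (eq. 2).
    m = len(unique)
    p = -(-m // k)  # integer ceiling of m / k
    centroids = []
    for start in range(0, m, p):
        x = unique[start:start + p]
        r = counts[start:start + p]
        # Step 4: repetition-weighted mean of the subset (eq. 3).
        centroids.append((x * r[:, None]).sum(axis=0) / r.sum())
    return np.array(centroids)


def kmeans(data, k, max_iter=100):
    """Steps 5-8: standard assignment/update iterations from the sorted init."""
    data = np.asarray(data, dtype=float)
    centroids = initial_centroids(data, k)
    for _ in range(max_iter):
        # Step 5: squared Euclidean distance of every point to every centroid.
        dist = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        # Step 6: assign each point to its nearest centroid.
        labels = dist.argmin(axis=1)
        # Step 7: new centroid = mean of the points in each cluster
        # (an empty cluster keeps its previous centroid).
        updated = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        # Step 8: stop when the centroids no longer change.
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, labels
```

For example, calling `kmeans(np.array([[1.0], [1.0], [2.0], [10.0], [11.0], [12.0]]), 2)` starts from the weighted initial centroids 3.5 and 11.5 and separates the three small values from the three large ones. The initialization is deterministic, so repeated runs on the same data give the same clusters.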

Figure 1 shows the flowchart of our proposed data clustering method.

Figure 1
Proposed data clustering method. Source: author’s proposed algorithm.

The Proposed Hybrid Forecasting Method

A novel hybrid rice price forecasting method is developed in this research. Figure 2 shows the flowchart of the proposed forecasting model. Its steps are described below:

Figure 2
Harmonic (Fourier) analysis of time series. Source: author’s introduced algorithm.

1) Split the rice price data, allocating 80% for training and the remaining 20% for testing. In general, more clean training data lead to a better-trained neural network and better predictions, but also slow down the forecasting process. For this reason, the 80%/20% split is adopted here by convention.

2) Use the elbow method [1] to estimate the number of clusters k.

3) Use the proposed data clustering method to group the training rice price data into k clusters.

4) To prepare the first efficient input for the MLPNN, select as the best cluster the one whose Training-Input data have the minimum distance to, and the maximum number of members correlated with, the Testing-Input data.

5) Apply the harmonic (Fourier) analysis of time series (HANTS) method [2, 3] and time-series analysis to the rice price data in the best cluster to obtain further inputs for the MLPNN.

6) To enhance learning, add the Numerical Weather Prediction (NWP) data and the rice crop production data corresponding to the best cluster to the other efficient MLPNN inputs.
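Steps 1 and 2 above can be illustrated with a short Python sketch. It is only one possible reading: the source does not say how the elbow is located, so the heuristic used here (the point of the error curve farthest from the straight line joining its two ends) is an assumption, as are the function names `chronological_split` and `elbow_k` and the example numbers. The split keeps the data in time order, which is the usual choice for time series but is also an assumption.

```python
import numpy as np


def chronological_split(series, train_fraction=0.8):
    """Step 1: first 80% of the series for training, last 20% for testing."""
    series = np.asarray(series, dtype=float)
    cut = int(round(len(series) * train_fraction))
    return series[:cut], series[cut:]


def elbow_k(sse):
    """Step 2 (assumed heuristic): pick k at the point of the SSE curve that
    lies farthest from the chord joining its first and last points.

    sse[i] is the within-cluster sum of squared errors obtained with k = i + 1
    clusters; the values are assumed to decrease and to contain at least three
    entries that are not all equal.
    """
    sse = np.asarray(sse, dtype=float)
    ks = np.arange(1, len(sse) + 1, dtype=float)
    # Normalise both axes to [0, 1] so that neither dominates the distance.
    x = (ks - ks[0]) / (ks[-1] - ks[0])
    y = (sse - sse.min()) / (sse.max() - sse.min())
    # Perpendicular distance of every point to the chord between the ends.
    dx, dy = x[-1] - x[0], y[-1] - y[0]
    dist = np.abs(dy * (x - x[0]) - dx * (y - y[0])) / np.hypot(dx, dy)
    return int(ks[dist.argmax()])
```

With 360 monthly observations, `chronological_split` returns 288 training and 72 testing values. For a hypothetical error curve `[100, 40, 15, 12, 10, 9]`, `elbow_k` selects k = 3, the point after which adding clusters brings little further reduction.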

A main advantage of the HANTS method is that it identifies the most significant components of a time series and fits a curve built from harmonic components to the data by least squares. The HANTS method is based on the discrete Fourier transform (DFT), as given by:

$\tilde{x}\left(t_j\right)=a_0+\sum_{i=1}^{nf}\left[a_i\cos\left(2\pi f_i t_j\right)+b_i\sin\left(2\pi f_i t_j\right)\right]$

$x\left(t_j\right)=\tilde{x}\left(t_j\right)+\epsilon\left(t_j\right)$ (5)

where $x$, $\tilde{x}$ and $\epsilon$ are the original series, the reconstructed series and the error, respectively; $t_j$ is the time at which $x$ is observed, with $j=1,2,\dots,N$ and $N$ the number of samples in the time series; $nf$ is the number of frequencies (harmonics) used; and $a_i$ and $b_i$ are the coefficients of the $i$-th harmonic, whose frequency is $f_i$. The curve is fitted by optimizing (5) with the linear least-squares method. The HANTS method is useful for noise suppression in volatile time series such as rice prices (ZHOU et al., 2012).
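The least-squares fit behind eq. (5) can be sketched with NumPy as below. This covers only the harmonic curve fit; the full HANTS procedure also handles gaps and outlying samples iteratively, which is not shown. The function name `harmonic_fit`, the chosen frequencies (annual and semi-annual cycles for monthly data) and the synthetic series are assumptions made for illustration.

```python
import numpy as np


def harmonic_fit(t, x, freqs):
    """Fit x(t) ~ a0 + sum_i [a_i cos(2 pi f_i t) + b_i sin(2 pi f_i t)]
    by linear least squares.

    Returns the coefficient vector [a0, a1, b1, a2, b2, ...] and the
    reconstructed (smoothed) series.
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    # Design matrix: a constant column plus one cos/sin pair per frequency.
    columns = [np.ones_like(t)]
    for f in freqs:
        columns.append(np.cos(2.0 * np.pi * f * t))
        columns.append(np.sin(2.0 * np.pi * f * t))
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return coef, design @ coef


# Synthetic monthly series with a known annual and semi-annual component.
t = np.arange(48)
x = (5.0
     + 2.0 * np.cos(2 * np.pi * t / 12) - 1.5 * np.sin(2 * np.pi * t / 12)
     + 0.5 * np.cos(2 * np.pi * t / 6))
coef, smooth = harmonic_fit(t, x, freqs=[1 / 12, 1 / 6])
```

On this noise-free example the fit recovers the generating coefficients, and the residual `x - smooth` plays the role of $\epsilon(t_j)$ in eq. (5). On real price data the reconstructed series would be a smoothed version of the input rather than an exact copy.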

Time-series Analysis

Figure 3 presents the training and testing time series data in matrix form. Time series analysis is critical for managing and collecting the historical data on rice production that may later be used as inputs to the neural networks in the forecasting process.

Figure 3 -
Input-output time series structure for the Neural Networks. Source: author’s introduced structure.

This structure contains $R$ rows of training data, whose most recent data point is $W(t)$. The testing input consists of $R'$ rows, whose outputs are the forecasts. $N$ is the length of the lagged time window for both the training and testing inputs. Each set of $N$ points in the training and testing inputs has its own corresponding output. To form sequential, associated data, each set is shifted back by one step. To balance forecast accuracy against computational burden, an iterative experiment was used to determine the most appropriate value of $N$: the forecast accuracy is calculated for increasing values of $N$, and the knee point is adopted as the length of the sliding window. Increasing $N$ beyond the knee point does not necessarily improve the forecast accuracy. In this paper, this iterative procedure gives a rice price lag of 4 ($N = 4$). In addition, the inputs used for the NN are the NWP data, the rice production amount and the HANTS data. The training parameters are as follows:

1) Number of inputs: 4 (lagged rice prices) + 1 (NWP data) + 1 (rice yield) + 1 (HANTS);

2) Number of hidden layers: 1; number of hidden neurons: 4; number of output layers: 1;

3) Transfer functions: tansig for the hidden layer and purelin for the output layer;

4) Learning algorithm: multilayer perceptron;

5) Comparison functions: mean absolute percentage error (MAPE) and root mean square error (RMSE);

6) Data distribution (train - test): 80% train, 20% test.
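The input structure and network shape listed above can be sketched as follows. This is an illustration under assumptions rather than the authors' code: the function names, the alignment of the three extra inputs with the target month, and the placeholder weights are hypothetical, and no training is shown. The hidden layer uses `tanh`, which is mathematically equivalent to the tansig transfer function, and the output layer is linear, as purelin is.

```python
import numpy as np


def lagged_inputs(price, n_lags=4):
    """Sliding window: each row holds n_lags consecutive prices and the
    target is the price that immediately follows them."""
    price = np.asarray(price, dtype=float)
    rows = len(price) - n_lags
    X = np.column_stack([price[i:i + rows] for i in range(n_lags)])
    y = price[n_lags:]
    return X, y


def build_inputs(price, nwp, rice_yield, hants, n_lags=4):
    """4 lagged prices + NWP + rice yield + HANTS value = 7 inputs per row.
    Assumption: the three extra series are aligned with the target month."""
    X_lag, y = lagged_inputs(price, n_lags)
    extras = np.column_stack([nwp, rice_yield, hants])[n_lags:]
    return np.hstack([X_lag, extras]), y


def mlp_forward(X, W1, b1, W2, b2):
    """One hidden layer (tanh, i.e. tansig) and a linear (purelin) output."""
    hidden = np.tanh(X @ W1 + b1)
    return hidden @ W2 + b2


# Shapes for 7 inputs, 4 hidden neurons and 1 output; weights are random
# placeholders standing in for trained parameters.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(7, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
```

With a price series of length 10 and $N = 4$, `lagged_inputs` returns six rows; the first row holds the first four prices and its target is the fifth. Shifting the window by one step per row matches the sequential structure described for Figure 3.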

# RESULTS:

Evaluation of the proposed hybrid price of rice forecasting method

In this section, the proposed hybrid model is evaluated on rice price forecasting using several criteria. For a comprehensive analysis, the Thai rice FOB price data set was used. The data are monthly rice prices from November 1987 to October 2017, extracted from <www.indexmundi.com>. Figure 4 plots this data set.

Figure 4 -
Description: Rice (Thailand), 5% broken, white rice (WR), milled, indicative price. Source: weekly surveys of export transactions, government standard, f.o.b. Bangkok.

80% of the rice price data is used for training and 20% for testing. To forecast the price of rice from 1 month to a year ahead, a multilayer perceptron (MLP) with 3 layers is used. The tansig and purelin functions are used for the hidden and output layers, respectively. The number of inputs of each MLP is set by the length of the lagged window, M = 6 is the number of neurons in the hidden layer of the MLPNN, and the outputs complete the network structure. The performance indicators used in this study are MAPE and RMSE:

$\mathit{MAPE}\left(\%\right)=\frac{100}{N}\sum_{n=1}^{N}\frac{\left|\hat{P}\left(n\right)-P_{\mathit{Actual}}\left(n\right)\right|}{P_{\mathit{Actual}}\left(n\right)}$ (6)

$\mathit{RMSE}=\sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(\hat{P}\left(n\right)-P_{\mathit{Actual}}\left(n\right)\right)^{2}}$ (7)

where $N$ is the total number of forecast periods (months in this study), and $\hat{P}(n)$ and $P_{\mathit{Actual}}(n)$ are the forecast and actual rice prices for period $n$.
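As a small worked illustration of eqs. (6) and (7), the two indicators can be computed as below. The function names and the three example prices are hypothetical; the MAPE is multiplied by 100 so that it is expressed as a percentage, and the actual prices are assumed to be non-zero.

```python
import numpy as np


def mape(forecast, actual):
    """Mean absolute percentage error, in percent (eq. 6)."""
    forecast = np.asarray(forecast, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return 100.0 * np.mean(np.abs(forecast - actual) / actual)


def rmse(forecast, actual):
    """Root mean square error, in the units of the price (eq. 7)."""
    forecast = np.asarray(forecast, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.sqrt(np.mean((forecast - actual) ** 2))


actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(mape(forecast, actual))  # relative errors 10%, 5%, 0% average to 5.0
print(rmse(forecast, actual))  # sqrt((100 + 100 + 0) / 3), about 8.165
```

The example shows why both indicators are reported: the two non-zero errors have the same absolute size, so they weigh equally in the RMSE, while the MAPE penalizes the error on the cheaper observation twice as much.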

Figure 5 shows the performance of the proposed forecasting method for Thailand over the interval from Jan 2012 to Oct 2016. The figure demonstrates the efficiency of the proposed method in forecasting the price of rice. Table 1 summarizes the evaluation of the benchmark forecasting models alongside the forecasting model introduced in this article.

Figure 5
Forecast results for the proposed approach & Persistence method (GHAYEKHLOO et al., 2019).

Table 1
A summary of the forecast results for several benchmark forecasting methodologies, including ARIMA (HASSAN et al., 2011), EMD-ARIMA (ABADAN et al., 2015), ANFIS (FAHIMIFARD et al., 2009), the Persistence method (GHAYEKHLOO et al., 2019), and the proposed method.

The four methods used as benchmarks are the ARIMA (autoregressive integrated moving average) model (HASSAN et al., 2011), an integration of EMD and ARIMA (EMD-ARIMA) (ABADAN et al., 2015), ANFIS (adaptive-network-based fuzzy inference system) (FAHIMIFARD et al., 2009), and the persistence forecasting method, which treats the data closest in time to the forecasting point as the most important data for predicting the variable of interest at a specific hour (GHAYEKHLOO et al., 2019). The results indicate that the hybrid model has a higher forecasting capability than the other models.
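
As a rough illustration of how a persistence benchmark can be scored, the sketch below uses the simplest form of persistence (the forecast for each month is the previous month's observed value) and the textbook definitions of RMSE and MAPE. The price values and the function names are hypothetical and chosen only for illustration; they are not the Thai rice FOB series, and the exact persistence variant used in the study may differ.

```python
import math

def persistence_forecast(series):
    """Naive one-step-ahead persistence: forecast for period t is the value at t-1."""
    return series[:-1]

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no actual value is zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical monthly prices, made up for illustration only.
prices = [400.0, 410.0, 405.0, 420.0, 430.0]

actual = prices[1:]                       # months that have a previous observation
predicted = persistence_forecast(prices)  # previous month's price for each of them

print("RMSE:", rmse(actual, predicted))
print("MAPE (%):", mape(actual, predicted))
```

Any other benchmark can be scored the same way by replacing `predicted` with that model's forecasts for the same months.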

# DISCUSSION:

In general, using artificial intelligence systems to predict time series data brings different points of view. For example, WU (1995) predicted the exchange rate of Taiwan in a comparative study between a neural network and ARIMA. Results indicated that the neural network models outperformed ARIMA in both one-step-ahead and six-step-ahead forecasting. Similarly, ZHANG and HU (1998) reported that neural networks outperform linear models, especially when the forecast horizon is short. In a similar attempt, INCE and TRAFALIS (2006) proposed a two-stage forecasting model which incorporates parametric techniques such as ARIMA and vector autoregression (VAR) together with support vector regression (SVR) and ANN for exchange rate prediction. Their results showed that input selection is very important and that the SVR technique outperforms the ANN for two input selection methods.

Very sharp price fluctuations, such as those in Iran, pose many challenges for predictive algorithms, which is why this issue has been raised. HAOFEI et al. (2007) proposed a multi-stage optimization approach (MSOA) that uses a back-propagation (BP) algorithm to train a neural network for short-term food price forecasting in China. Their results showed that the forecasting power of the MSOA model is much higher and more accurate than many other models such as ARIMA, BP and MAE. In another study, FAHIMIFARD et al. (2004) compared ANFIS, a nonlinear model, with ARIMA, a linear model, for forecasting time series of agricultural variables. In a case study in Iran, the price of chicken in the Iranian market was forecast over three horizons (1, 2 and 4 weeks ahead) using the two models. The results of this study showed that the ANFIS model outperforms the traditional ARIMA at all three horizons. A study by HASSAN et al. (2011) focused on deterministic models to find the model that best describes the coarse rice price pattern in Bangladesh. Among the five growth models tested in that study, the cubic model may be selected for describing the pattern of the wholesale price of coarse rice with minimum forecast error. In another study, ABADAN et al. (2015) compared the forecasting performance of a hybrid of empirical mode decomposition (EMD) and ARIMA with the classical ARIMA using monthly Malaysian Ringgit (MYR/US\$) exchange rates. This comparison, based on RMSE and MAE, showed that EMD-ARIMA outperformed the single ARIMA model.

GHAYEKHLOO et al. (2019) used a hybrid model to forecast Iranian electricity prices over a one-day horizon from time series data. The method introduced had many advantages, including a new clustering method, the adoption of a two-dimensional embedded input, and a new and advanced method that focuses on the endurance approach to select clusters. The results of that study revealed that the introduced clustering algorithm has higher efficiency and forecasting power than the K-means, neural gas and self-organizing map methods.

The need for agricultural price information in decision making at all levels is increasing, due to the increasing demand for agricultural products. In response to sharp population growth in Iran and the sustainable level of rice production, the government has used a range of policy instruments to tame soaring domestic rice prices. As a domestic stabilization policy, the government has become involved in commercial rice imports. Generally, it is argued that before embarking on any intervention in the domestic market, a better understanding of price formation (past and future prices) is crucial for policy makers to make informed decisions for the betterment of the welfare of producers, traders and consumers. Furthermore, the dynamic market structure in which demand and supply operate necessitates a good understanding of price discovery. Therefore, commodity modeling may provide valuable information to assist policy decision-makers.

The innovations of this article are: using a hybrid system with time series data to forecast the price of agricultural goods, adopting a new clustering method for agricultural commodity price data, and providing a new cluster selection algorithm combined with an MLPNN, with the main focus on forecasting the price of rice. Moreover, a novel algorithm for selecting which of the resulting clusters to use as input to the neural network is also proposed. The proposed data clustering and cluster selection methods provide the most suitable input to the neural network, giving greater attention to the most important data and discarding noisy data, and this leads to more accurate predictions according to the evaluation results provided in Table 1.
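
The general idea of clustering the historical data and passing only the most relevant cluster to the network can be sketched as follows. This is not the clustering or cluster selection algorithm proposed in this article, which is not specified in this section: plain k-means is swapped in for the clustering step, and a nearest-centre rule is swapped in for the cluster selection step. The window width, the number of clusters, the price values and all function names are assumptions made for illustration.

```python
def make_windows(series, width):
    """Pair each lagged window of `width` prices with the value that follows it."""
    return [(series[i:i + width], series[i + width])
            for i in range(len(series) - width)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centres):
    """Index of the nearest centre for every point."""
    return [min(range(len(centres)), key=lambda c: sq_dist(p, centres[c]))
            for p in points]

def kmeans(points, k, iterations=20):
    """Plain k-means (stand-in for the article's clustering method).
    The first k points are used as initial centres so the run is deterministic."""
    centres = [list(p) for p in points[:k]]
    for _ in range(iterations):
        labels = assign(points, centres)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster becomes empty
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return centres, assign(points, centres)

def select_training_pairs(series, width, k):
    """Keep only the (window, target) pairs whose cluster is closest to the
    most recent window (stand-in for the article's cluster selection step)."""
    pairs = make_windows(series, width)
    centres, labels = kmeans([w for w, _ in pairs], k)
    latest = series[-width:]
    chosen = assign([latest], centres)[0]
    return [pair for pair, lab in zip(pairs, labels) if lab == chosen]

# Hypothetical prices with two regimes, made up for illustration only.
prices = [100, 102, 101, 103, 200, 202, 201, 203, 202, 204]
selected = select_training_pairs(prices, width=2, k=2)
print(selected)
```

In a full pipeline, the selected pairs would be the training set for the regressor (an MLPNN in this article), so that windows from the dissimilar regime never reach the network.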

Based on the calculated error indexes, including the RMSE and MAPE values, the hybrid technique performed better than the four benchmark models, as reported in Table 1. The findings showed that the nonlinearity of time series data plays a decisive role in the reliability and accuracy of post-sample forecasts. Rice price data are inherently irregular and chaotic, and this irregularity disrupts the neural network process, which ultimately leads to many errors in the forecasting process.

Given such challenges, it seems necessary to use unique data points (discarded data) and irregular patterns to describe the individual characteristics of the collected data. It is also very important to discover the new features that affect these challenges. In future work, we will try to propose a novel deep learning model for addressing the above-mentioned challenges. Using the proposed model as a consistent function estimator, together with two overfitting detection techniques, diminishes the overfitting issue when historical spikes are included in the training data set to forecast price spikes.

# CONCLUSION:

In this study, a clustering algorithm was introduced and a hybrid model was used to forecast and evaluate the monthly FOB Thai rice price time series from November 1987 to October 2017. The results demonstrated the enhanced efficiency of the proposed model as compared to four benchmark methods, namely the ARIMA, EMD-ARIMA, ANFIS and persistence models. According to the results of this study, the introduced hybrid model exceeded expectations and provided better results than the benchmark price forecasting models considered. The accuracy and predictive power of this model were also higher than those of the other models.

# ACKNOWLEDGEMENTS

The authors would like to thank the anonymous reviewers and the journal editors for their careful reading of our manuscript and their many insightful comments and suggestions.

# REFERENCES

• ABADAN, S. et al. Hybrid empirical mode decomposition-ARIMA for forecasting exchange rates. AIP Conference Proceedings, v.1643, p.256-263, 2015. Available from: <https://aip.scitation.org/doi/10.1063/1.4907453>. Accessed: Oct. 19, 2019. doi: 10.1063/1.4907453.
• FAHIMIFARD, S. M. et al. Application of ANFIS to agricultural economic variables forecasting case study: poultry retail price. Journal of Artificial Intelligence, v.2, n.2, p.65-72, 2009. Available from: <https://scialert.net/abstract/?doi=jai.2009.65.72>. Accessed: Oct. 19, 2019. doi: 10.3923/jai.2009.65.72.
• GHAYEKHLOO, M. et al. A combination approach based on a novel data clustering method and Bayesian recurrent neural network for day-ahead price forecasting of electricity markets. Electric Power Systems Research, v.168, p.184-199, 2019. Available from: <https://www.sciencedirect.com/science/article/abs/pii/S0378779618303961>. Accessed: Oct. 19, 2019. doi: 10.1016/j.epsr.2018.11.021.
• HAOFEI, Z. et al. A neural network model based on the multi-stage optimization approach for short-term food price forecasting in China. Expert Systems with Applications, v.33, n.2, p.347-356, 2007. Available from: <https://www.sciencedirect.com/science/article/abs/pii/S0957417406001503>. Accessed: Oct. 19, 2019. doi: 10.1016/j.eswa.2006.05.021.
• HASSAN, M. F. et al. Forecasting coarse rice prices in Bangladesh. Progressive Agriculture, v.22, p.193-201, 2011. Available from: <https://www.semanticscholar.org/paper/Forecasting-Coarse-Rice-Prices-in-Bangladesh-Hassan-Islam/ecbe8a9cb03029f1247f1680c760b69c7b822183>. Accessed: Oct. 19, 2019. doi: 10.3329/PA.V22I1-2.16480.
• INCE, H.; TRAFALIS, T. B. A hybrid model for exchange rate prediction. Decision Support Systems, v.42, n.2, p.1054-1062, 2006. Available from: <https://www.sciencedirect.com/science/article/abs/pii/S0167923605001223>. Accessed: Oct. 19, 2019. doi: 10.1016/j.dss.2005.09.001.
• WU, B. Model-free forecasting for nonlinear time series (with application to exchange rates). Computational Statistics and Data Analysis, v.19, n.4, p.433-459, 1995. Available from: <https://www.sciencedirect.com/science/article/abs/pii/0167947394000087>. Accessed: Oct. 19, 2019. doi: 10.1016/0167-9473(94)00008-7.

• CR-2020-1128.R1

# Publication Dates

• Publication in this collection: 14 Mar 2022
• Date of issue: 2022