Economic and climatic models for estimating coffee supply

The objective of this work was to estimate the coffee supply by calibrating statistical models with economic and climatic variables for the main producing regions of the state of São Paulo, Brazil. The regions were Batatais, Caconde, Cássia dos Coqueiros, Cristais Paulista, Espírito Santo do Pinhal, Marília, Mococa, and Osvaldo Cruz. Data on coffee supply, economic variables (rural credit, rural agricultural credit, and production value), and climatic variables (air temperature, rainfall, potential evapotranspiration, water deficit, and water surplus) for each region, from 2000 to 2014, were used. The models were calibrated using multiple linear regression, and all possible combinations were tested for selecting the variables. Coffee supply was the dependent variable, and the others were considered independent. The accuracy and precision of the models were assessed by the mean absolute percentage error and the adjusted coefficient of determination, respectively. The variables that most affect coffee supply are production value and air temperature. Coffee supply can be estimated with multiple linear regressions using economic and climatic variables. The most accurate models are those calibrated to estimate coffee supply for the regions of Cássia dos Coqueiros and Osvaldo Cruz.


Introduction
Coffee (Coffea arabica L.) is a commodity that generates employment and income in several Brazilian regions (Aparecido et al., 2015) and is one of the country's main agricultural exports (Resende et al., 2009; Barbosa et al., 2012). Worldwide, it is the most consumed beverage after water (Zelber-Sagi et al., 2015). Brazilian production is 51.94 million 60-kg sacks (Acompanhamento…, 2016). Considering the importance of this commodity, a method for estimating supply would be a useful strategic marketing tool.
Previous studies have used econometric analyses. Shikida et al. (2007), for example, evaluated the variation of sugar and alcohol supplies in the state of Paraná from 1980 to 2015 and found that the supply of sugar varied inversely with the price of alcohol; the quantity of sugar on the market decreased by 2% with a 1% increase in the average price of alcohol. Satolo & Bacchi (2009) evaluated the role of sudden changes in supply and demand in the recent evolution of sugarcane production for the state of São Paulo and concluded that the variation of the price of sugarcane had an impact of >40% on the productivity of the supply of sugarcane. Few studies, however, have estimated coffee supply based on economic and climatic conditions.
The objective of this work was to estimate coffee supply by calibrating statistical models with economic and climatic variables for the main producing regions of the state of São Paulo, Brazil.

Materials and Methods
Multiple linear regression (MLR) models were used to model coffee supply. In the present study, the equation used was:

$$Y = aX_1 + bX_2 + cX_3 + dX_4 + CL + \varepsilon$$

in which Y is the coffee supply of the municipalities (bags of 60 kg); a, b, c, and d are the parameters of the model (weights); X_1, X_2, X_3, and X_4 are the economic and climatic variables selected; CL is the linear coefficient (constant term); and ε is the random error.
Economic and climatic variables were used as independent variables in the construction of the models. The economic variables were the total rural credit of the municipality (real), the rural credit of the municipality's agriculture (real), and the value of coffee production, which consisted of the unit price multiplied by the quantity produced (current thousand real). The climatic variables were air temperature (°C), rainfall (mm per year), potential evapotranspiration (mm per year), water deficit (mm per year), and water surplus (mm per year).
The coffee-supply and economic data were for 2000-2014 and were obtained from the Instituto de Economia Agrícola (IEA, 2017) and Fundação Sistema Estadual de Análise de Dados (São Paulo, 2015). The series of climatic data was obtained from the National Institute of Meteorology.
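As an illustration of how the parameters of such a model can be obtained, the following minimal sketch fits the weights and the constant term by ordinary least squares. It solves the normal equations directly, which is a substitute for, not a reproduction of, the generalised reduced gradient optimisation in VBA used in this study. The function names and the small data set are hypothetical and serve only to show the mechanics.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so that the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:  # illustrative tolerance
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Return (weights, constant) minimising the sum of squared errors."""
    # The last design column of ones carries the constant term (CL).
    design = [list(r) + [1.0] for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    params = solve_linear_system(xtx, xty)
    return params[:-1], params[-1]


# Hypothetical, noise-free data generated from Y = 2*X1 - 1*X2 + 10.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
y = [2.0 * x1 - 1.0 * x2 + 10.0 for x1, x2 in rows]

weights, constant = fit_mlr(rows, y)
print(weights, constant)
```

Because the example data contain no noise, the fitted weights and constant should match the generating values up to rounding error. In practice, a numerical library routine such as numpy.linalg.lstsq would usually be preferred over a hand-written solver.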
All possible combinations of up to five independent variables were tested to avoid problems of stabilization of local errors and to obtain highly consistent analyses (Walpole et al., 2012). The method of estimation was ordinary least squares (OLS) regression with generalised reduced gradient optimisation. Model fit was assessed by analysing the possible effect of collinearity between the explanatory variables (multicollinearity) and by testing the assumptions of normality of the errors and homoscedasticity (Gujarati & Porter, 2011). Pearson's correlation analysis was used to identify any multicollinearity among the explanatory variables; those with a Pearson's correlation coefficient of |r| ≥ 0.7 were removed. Collinearity between explanatory variables can be a problem in models when analysing the weights of the coefficients (elasticity/sensitivity) (Gujarati & Porter, 2011). Normality was verified by the Kolmogorov-Smirnov (K-S) test, and homoscedasticity was verified by the White test (1980), also known as a covariance-matrix test.
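The selection procedure described above can be sketched as follows. The code enumerates every subset of one to five variables and flags pairs of explanatory variables with |r| ≥ 0.7. The variable abbreviations follow the Figure 1 caption; the function names, the short example series, and the choice to discard any subset that contains a flagged pair are assumptions made for illustration and may differ from how the screening was applied in this study.

```python
from itertools import combinations
from math import sqrt

# Eight candidate variables: three economic and five climatic.
VARIABLES = ["RC", "RCA", "VP", "T", "R", "ETo", "WD", "SUR"]


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def candidate_subsets(names, max_size=5):
    """Yield every combination of 1..max_size variable names."""
    for size in range(1, max_size + 1):
        yield from combinations(names, size)


def collinear_pairs(data, threshold=0.7):
    """Return pairs of variables whose |r| reaches the threshold.

    data maps a variable name to its list of yearly values.
    """
    return [
        (a, b)
        for a, b in combinations(data, 2)
        if abs(pearson_r(data[a], data[b])) >= threshold
    ]


def screened_subsets(data, max_size=5, threshold=0.7):
    """Yield the subsets that contain no collinear pair (assumed rule)."""
    flagged = set(collinear_pairs(data, threshold))
    for subset in candidate_subsets(list(data), max_size):
        if not any(pair in flagged for pair in combinations(subset, 2)):
            yield subset


# Hypothetical series: ETo is an exact linear function of T here.
example_series = {
    "T": [20.0, 21.0, 22.0, 23.0],
    "ETo": [900.0, 950.0, 1000.0, 1050.0],
    "R": [1500.0, 1300.0, 1600.0, 1400.0],
}

print(len(list(candidate_subsets(VARIABLES))))
print(collinear_pairs(example_series))
print(list(screened_subsets(example_series, max_size=3)))
```

Each retained subset would then be fitted by OLS and ranked by the selection criteria.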
Pesq. agropec. bras., Brasília, v.52, n.12, p.1158-1166, dez. 2017. DOI: 10.1590/S0100-204X2017001200004
Correlations (r) were estimated between coffee supply and the economic and climatic variables, in order to identify those with the largest effect on the coffee supplies of the municipalities. The climatic and economic variables with the highest Pearson's correlation coefficients (r) were also subjected to a simple linear regression analysis to explore the relationship between these variables and coffee supply.
A sensitivity analysis, also known as "elasticity", was applied to the calibrated models (Gujarati & Porter, 2011). In this analysis, the angular coefficients (weights) of the independent variables are compared, and the higher the weight, the greater the effect of that variable on coffee supply.
The models were calibrated using Visual Basic for Applications (VBA). The models were selected by an evaluation of the accuracy, indicated by the mean absolute percentage error (MAPE, %), and of the precision, indicated by the adjusted coefficient of determination (adjusted R²), according to the equations:

$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{Y_{\mathrm{est},i} - Y_{\mathrm{obs},i}}{Y_{\mathrm{obs},i}}\right|$$

in which Y_est,i is an estimated value; Y_obs,i is an observed value; and n is the number of datapoints (years); and

$$R^2_{\mathrm{adjusted}} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$

in which n is the number of datapoints (years), and k is the number of independent variables in the regression.
Only the significant regressions by the F-test, at 5% probability, were selected.
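The two selection criteria, MAPE and adjusted R², can be computed as in the following minimal sketch. The function names and the four observed and estimated values are hypothetical; the formulas are the standard definitions of these indices.

```python
def mape(observed, estimated):
    """Mean absolute percentage error, in percent."""
    n = len(observed)
    return 100.0 / n * sum(abs((e - o) / o) for o, e in zip(observed, estimated))


def r_squared(observed, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Adjust R² for n datapoints and k independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical observed and estimated supplies (60-kg sacks).
observed = [100.0, 200.0, 300.0, 400.0]
estimated = [110.0, 190.0, 310.0, 390.0]

r2 = r_squared(observed, estimated)
print(mape(observed, estimated))
print(r2)
print(adjusted_r_squared(r2, n=len(observed), k=1))
```

The adjustment penalises models with more independent variables, which matters when subsets of different sizes are compared against the same 15 years of data.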

Results and Discussion
The economic variables evaluated had a direct relationship with coffee supply in the state of São Paulo (Table 1). Coffee supply increased with increases in total rural credit (RC), rural agricultural credit (RCA), and the value of coffee production (VP), because the increases in credit and in the price of the product allowed coffee growers to invest in their crops, which led to increased production (Tosi et al., 2007).
The climatic variables were variously correlated with coffee supply (Table 1). Coffee supply decreased with increases in mean air temperature (T), potential evapotranspiration (ETo), and water deficit (WD). Other studies, such as that of Martins et al. (2015), have also reported that high T and intense WD negatively affected coffee production.
The economic variables were also variously correlated with the climatic conditions. Rainfall (P) and WD were directly and inversely correlated, respectively, with RC, RCA, and VP, showing that the financing of crops by public agencies was higher in years with high water availability and product prices. These effects of climatic conditions on credit have already been highlighted in other studies, such as those of Rossetti (2001) and Guanziroli & Guanziroli (2015), who also found that the financing sources varied their capital investments as a function of meteorological conditions. Coffee supply was more highly correlated with the economic than with the climatic variables (Figure 1 A).
Coffee supply was most strongly correlated with VP, followed by T, with coefficients of 0.808 and -0.552, respectively. Identifying the variables that were highly correlated with coffee supply allowed these relationships to be evaluated more thoroughly. VP had a direct relationship that explained about 65% of the variation in coffee supply (r² = 0.808² ≈ 0.65). The effect of VP was strong: production increased by 2,820 60-kg sacks for every 1,000 real increase in VP (Figure 1 B).
Testing all possible combinations of up to five variables produced a total of 1,744 candidate models (Figure 2 A). This method of selecting variables by testing all combinations was efficient; p decreased as the precision increased (adjusted R² ≈ 1.00) and the mean absolute percentage error (MAPE) gradually decreased, tending towards zero (Figure 2 B).
This classification was essential for selecting the best models (more accurate and less biased), because only those with the lowest MAPEs and the highest precision (adjusted R²) with p<0.05 were selected for estimating coffee supply.
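The total of 1,744 combinations is consistent with testing every subset of one to five variables drawn from the eight candidate variables (three economic and five climatic) in each of the eight regions. A quick check, assuming that the same eight variables were available in every region:

```latex
\sum_{k=1}^{5} \binom{8}{k} = 8 + 28 + 56 + 70 + 56 = 218
\qquad\text{and}\qquad
218 \times 8 = 1{,}744
```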
The calibrated models were efficient; the accuracy and precision for all regions averaged 5.04% (MAPE) and 0.90 (adjusted R²) (Table 2). These values are considered adequate by crop modellers (Santos & Camargo, 2006; Savin et al., 2007; Rosa et al., 2010). Chipanshi et al. (2015) used accurate estimation models to show that producers could improve their decision-making and strategic planning. The calibrated model for estimating the coffee supply of the Cássia dos Coqueiros (CDC) region (model 3) had a low MAPE and a high adjusted R² of 2.30% and 0.97, respectively. An error of only 2.30% in the estimated supply of a region is considered very low (Aparecido et al., 2017); the estimate for a municipality with an average supply of 25,000 sacks of coffee would vary by only 575 sacks. RC, VP, T, and annual rainfall had the greatest effect on the quantification of the coffee supply in CDC (Table 2). Of these, T showed the greatest effect and was inversely proportional to the supply.
The largest difficulty in estimating coffee supply was the high bienniality of coffee between harvests, which is due to the physiology of the crop (Rodrigues et al., 2013; Melke & Fetene, 2014). The calibrated models for the coffee regions followed the variability of coffee supply. Notably, the models calibrated with economic and climatic variables, in addition to accompanying the bienniality, also followed the tendencies of increases and decreases in coffee supply, as in Batatais (BTT) and Osvaldo Cruz (OVC), respectively (Figure 3).
Figure 1. Effect of the economic and climatic variables on coffee (Coffea arabica) supply (A) and linear regression between the production value and the coffee supply of the state of São Paulo, Brazil (B). RC, total rural credit by municipality (real); RCA, rural credit of the municipality's agriculture (real); VP, value of the coffee production (real); T, air temperature (°C); R, rainfall (mm per year); ETo, potential evapotranspiration (mm per year); WD, water deficit (mm per year); and SUR, water surplus (mm per year).
The highest performances were in the models calibrated for BTT, CDC, and OVC, with adjusted R² of 0.97, 0.93, and 0.91, respectively (Figure 4), which are considered high in the calibration of estimation models (Savin et al., 2007). Moreto & Rolim (2015) also reported precisions >0.95 (adjusted R²) in calibrated models for estimating orange [Citrus sinensis (L.) Osbeck] production for the state of São Paulo. The model developed for Espírito Santo do Pinhal (ESP) performed the worst, with an accuracy of 0.30 (Figure 4 E). ESP had the largest deviations in the estimates of the coffee supply, due to the effect of climatic factors not used in the calibration of the models, e.g., low minimum air temperature, cold winds, and frost.

Conclusions
1. Coffee (Coffea arabica) supply is controlled by economic and climatic variables.
2. The value of coffee production and air temperature have the greatest effect on the variation of coffee supply in the state of São Paulo, Brazil.
3. Coffee supply increases with an increase in production value.
4. Coffee supply can be estimated with multiple linear regressions of economic and climatic variables.
5. The models calibrated with economic and climatic variables are accurate and precise, and the models calibrated for estimating the coffee supply of the regions of Cássia dos Coqueiros and Osvaldo Cruz are the most accurate.

Figure 2. Multiple regression models calibrated (A) and example of the classification of the models according to the criteria of accuracy (lower MAPE), precision (higher adjusted R²), and reliability (p-value) (B) for Batatais, in the state of São Paulo, Brazil. MAPE, mean absolute percentage error.

Table 1. Pearson's correlation coefficients (r) between the economic and climatic variables for coffee (Coffea arabica) supply of the state of São Paulo, Brazil(1).

Table 2. Calibrated models to estimate the coffee (Coffea arabica) supply of the state of São Paulo, Brazil, as affected by economic and climatic parameters(1).