Forecasting of the annual yield of Arabic coffee using water deficiency

The objective of this work was to develop agrometeorological models for the forecasting of the annual yields of Arabic coffee (Coffea arabica), using monthly water deficits (DEFs) during the coffee cycle, in important locations in the state of Minas Gerais, Brazil. For the construction of the models, a meteorological data set spanning of 18 years and multiple linear regressions were used. The models were calibrated in highand low-yield seasons due to the high-biennial yields in Brazil. All calibrated models for highand low-yield seasons were accurate and significant at 5% probability, with mean absolute percentage errors ≤2.9%. The minimum forecasting period for yield is six months for southern Minas Gerais and Cerrado Mineiro. In high‐yield seasons, water deficits affect more the reproductive stage of coffee and, in low‐yield seasons, they affect more the vegetative stage of the crop.


Introduction
Brazil is the world's largest coffee producer with 45.563,2 thousand sacks per year (Amarasinghe et al., 2015;Acompanhamento…, 2017), and few studies have forecasted yields as functions of the climatic conditions (Aparecido et al., 2015).Coffee production and bean quality are highly sensitive to changes in microclimatic parameters (Craparo et al., 2015), mainly those related to water availability.Water deficits [DEFs, the lack of evapotranspiration in relation to a maximum (potential) theoretical value] reduce crop yields (Khamssi et al., 2011) by affecting the plant vegetative growth (DaMatta, 2004).
DEFs affect the amount of moisture extracted by roots, the spatial distribution of the root system, canopy size, and fruit growth of coffee plants (Amarasinghe et al., 2015).Reductions of coffee yields are mostly due to DEFs because the water stress after fertilisation affect fruit growth (Camargo, 2010).The effects of climate on crop yields can be evaluated with crop models (Shao et al., 2015).These models can help farmers to make decisions on achieving a sustainable agriculture (Syvertsen & Garcia-Sanchez, 2014).
Models collect information from agrometeorology, remote sensing, plant physiology, plant science, soil science, and economics, in an interdisciplinary way to predict yields (Gouranga & Ashwani, 2014).The models can also assist on strategies of production by simulating the dynamics of crop growth (Wang et al., 2015).
Several models have been published on coffee yield estimation. Santos & Camargo (2006)

developed a
This is an open-access article distributed under the Creative Commons Attribution 4.0 International License Pesq. agropec. bras.,Brasília,v.53,n.12, p.1299-1310, Dec. 2018 DOI: 10.1590/S0100-204X2018001200002 mathematical model to estimate yield losses of arabica coffee in the state of São Paulo.Carvalho et al. (2004) developed a model for estimating coffee yield in Minas Gerais, but the model was inconsistent, with highly discrepant estimate errors, highlighting the complexity of coffee modelling.
Studies of forecasting, however, are rare.Some studies have a probabilistic basis in the statistical models.Keong & Keng (2012) forecasted the yield of palm oil in Malaysia, nine months before the harvest, using multiple linear regression models and stepwise variable selection.Gouranga & Ashwani (2014) forecasted rice yields in India 30 days before the harvest.Finally, Moreto & Rolim (2015) forecasted the yield and quality of 'Valência' oranges six months before harvesting.No model, however, has yet been developed to forecast the yield of arabica coffee, in order to understand the influence of climate on the "bienniality" of coffee.
Various authors have sought to understand the bienniality of coffee.The bienniality is the exhibition of coffee high and low yields in alternate years, which is commonly attributed to the restriction of photoassimilates produced by the plants.The photoassimilates in one year are allocated to the growth of branches, with little generation of productive germplasm.The photoassimilates of the next year are allocated to the germplasm, with little growth of the branches, generating a higher yield (Silva et al., 2008;Rodrigues et al., 2013).
The physiological cause of bienniality of coffee is complex (Valadares et al., 2013;Melke & Fetene, 2014).When coffee develops plagiotropic branches, photoassimilates are allocated to this growth, promoting a large vegetative flux in the plant.This growth simultaneously promotes the multiplication of the number of buds that would mature for production only in the following year (Pereira et al., 2011).In the first year, coffee will thus show a high-vegetative growth and a low yield.In the following year, the crop will produce a high yield due to the high-bud formation in the previous year, but with low-vegetative growth (Melke & Fetene, 2014).Photoassimilates in years with high yields are allocated to bean filling, so vegetative growth is reduced, providing a lower yield again in the following year (Rodrigues et al., 2013).The maturity of plagiotropic branches is reached in three or four years in these regions.
Estimation in science is the use of actual data to simulate an actual process, and forecasting is the use of actual data to simulate a future process or event.The forecast is the vanguard in the modelling of crops.Few papers have forecast coffee production to understand the influence of water deficit on the biennial coffee.
The objective of this work was to develop agrometeorological models for the forecasting of the annual yields of Arabic coffee (Coffea arabica), using monthly water deficits (DEFs) during the coffee cycle, in important locations in the state of Minas Gerais, Brazil.

Materials and Methods
We used a historical series of climatic and phytotechnical data for coffee crops in the state of Minas Gerais, Brazil.The representative locations of coffee production used for modelling were Guaxupé (GXP), Monte Santo de Minas (MSM), and São Sebastião do Paraíso (SSP) -in the southern region of the state (SO MG ) -, and Coromandel (CRD), Serra do Salitre (SDS), and Tiros (TRS), in the Cerrado Mineiro region (CE MG ) (Table 1 and Figure 1).These data are statistics of all analysed municipalities, and were provided by the Cooxupé (Cooperativa Regional de Cafeicultores em Guaxupé).The crops were not irrigated and were of adult age.
Daily climatic data for these locations were obtained for 1997-2014 by Campbell CR21X automatic weather stations (Campbell Scientific Inc., Logan, Utah, USA), containing sensors for rainfall (TE525WS-L) and air temperature (HMP50-L).Daily rainfall data and minimum, average, and maximum temperatures were arranged on a monthly basis to calculate the potential evapotranspiration (PET), using the model proposed by Camargo (1971).This model is similar to that of Thornthwaite (1948) for tropical regions (Camargo & Sentelhas, 1997) in which: Qo is the daily solar atmospheric irradiance (MJ m -2 day -1 ); T is the mean air temperature (°C); and ND is the number of days.
The actual evapotranspiration (AET), which is estimated as a function of soil available water, was calculated following the model of Thornthwaite & Mather (1955) at a monthly scale, with a soil available water capacity of 100 mm, as follows: ALT STO STO , and  (1) C and T are calibration and test, respectively. ( 2 Following Thornthwaite's climatic classification (1948).
Pesq. agropec.bras., Brasília, v.53, n.12, p.1299-1310, Dec. 2018 DOI: 10.1590/S0100-204X2018001200002 in which: STO is the soil-water storage; R is the rainfall (mm); PET is measured in millimeters; AET is measured in millimeters; I represents the data from the current period; and, I-1 represents the data from the previous period.
The coffee cycle begins with the development of the root system induced by dormancy of the aerial part of the plant.The development of the root system is intense during this period (Figure 2).Roots grow throughout the plant cycle, but the growth is intensified when water is abundant (Larcher, 2004).Water deficits in tropical climates usually occur in the winter seasons.
The phenological cycle of Coffee arabica requires two years to be completed (Figure 2).The growing season occurs in the first year, and the reproduction occurs in the second year.Coffee phenological cycle starts with the development of root system.The root system grows during periods of DEF that typically occur between June and September, coinciding with the dormancy period (Maestri & Barros, 1975).DEF was defined by Thornthwaite & Mather (1955) as the lack of evapotranspiration relatively to a maximum (potential) theoretical value, and it is determined as In the year of vegetative development (y-1), the formation and growth of vegetative branches, and the number of axillary buds at the nodes occur, stimulated by long photoperiods (Camargo & Camargo, 2001).Then, the axillary vegetative buds develop into reproductive buds under shorter photoperiods.The productive year (y) begins with the flowering, after which the fertilized flowers lose their petals forming small round structures called "pinheads" that begin to expand into beans from November until March, and the fruit then mature and are harvested until July (Figure 2).
A schematic synthesising of the available information for the bienniality of coffee cultivation is presented in Figure 3.In addition to the strong physiological component, the weather directly influences the viability and quantity of the vegetative growth and the germplasm that will become the beans (Craparo et al., 2015).
We used multiple linear regressions to model coffee yield for forecasting.Monthly DEF data for the two years of the coffee phenological cycle were the independent meteorological variables, used in the construction of the models as Y = a × X + b× X +c× X + L + CL 1 2 3 , in which: Y is the yield (sacks ha -1 ); a, b, c, ..., are adjusted coefficients (weight); X 1 , X 2 , X 3 , ..., are selected monthly DEFs from the two phenological years of the coffee crop; and CL is the linear coefficient.A Brazilian sack of coffee traditionally weighs 60 kg.
The models were calibrated and tested for the locations SSP and TRS because we had 18 years of climatological data available for these places.We used 1997-2007 data for calibration, and 2008-2014 data for testing.The models for the locations CRD, MSM, GXP, and SDS were only calibrated due to an insufficient amount of data (12 years).The average annual yields of coffee were provided by cooperatives and producers of the regions.The biennial development of coffee is due to the physiology of the plants, so the models were calibrated separately for years of high and low yield to improve the accuracy.The calibrated models are specific to each location.
Various methods are available for selecting the independent variables for multiple linear and  Camargo & Camargo, 2001).Indexes: y-1, first phenological year, y, second phenological year; i-2, first chronological year; i-1, second chronological year; and i, third chronological year.nonlinear models, such as forward selection, backward elimination, stepwise, leap and bound regressions, orthogonal descriptors, genetic algorithms, genetic populations, choosing operators, and fitness of evaluation.These methods seek to minimize the errors associated with the insertion, or removal, of variables in the models (Xu & Zhang, 2001).These authors also stated that all possible combinations of the independent variables, in the various equations for selecting the best model, are possible when the number of independent variables is small.
Our study contained a large number of independent variables, but we tested all possible combinations (APC) of up to four independent variables in the models, to avoid stability problems in local errors and consistency of the analyses.The independent variables were the monthly DEFs during the phenological cycle of the crop, which produced an average of 16, 800 tested equations for each location.This total number of possible equations results from the sum of all the possible equations combining 1-4 independent variables.The total of possible combinations can be easily calculated using Newton's binomial equation (Mansour & Schork, 2011).
We removed models with multicollinearity between the independent variables (monthly DEFs), as it is a problem in models analyzing angular coefficients (weights) because it causes a bias in the coefficients (Gujarati & Porter, 2011).An analysis of the coefficients thus allowed us to infer which meteorological elements and at what time (month) influenced the crop yield.
An average reduction of coffee yield was observed as a function of the annual DEF at the locations, by analyzing the elasticity of the coefficients, as described by Gujarati & Porter (2011).The influence of monthly DEF on yield during the coffee phenological cycle was observed for the top 20 selected models by an analysis of the correlations and the frequency of appearance of the variables.
The significance of the coefficients (t<0.05) and the regression (F<0.05), a low-mean absolute percentage error (MAPE, %), and a high-adjusted coefficient of determination (R 2 adj) were used for selecting the variables.The models were selected by an evaluation of the accuracy indicated by the MAPE, and of the precision indicated by the R 2 adj (Cornell & Berger, 1987), as follows: in which: Yest i is the estimated variable; Yobs i is the observed variable; n is the number of datapoints (years); and k is the number of independent variables in the regression.R

Results and Discussion
The DEFs were more intense in CE MG , ranging from 269 to 613 mm per cycle.The cycle DEFs in SO MG varied from 108 to 466 mm per cycle.The yields were more sensitive to DEF in CE MG than in SO MG .The average sensitivities as functions of DEF, in SO MG and CE MG , were -0.010 and -0.031 sacks ha -1 mm -1 of DEF per cycle, respectively (Figures 4 A and B).A sensitivity of -0.031 sacks ha -1 mm -1 of DEF per cycle of coffee, with an average yield of 45.00 sacks ha -1 , represents a reduction of 3.10 sacks ha -1 for every 100 mm of cycle DEF.Assuming a DEF of zero during the cycle, an average yield in CE MG would be 51.24 sacks ha -1 , which is 20% higher than in SO MG (Figure 4 C).These values are similar to those reported by Fernandes et al. (2012), who found that the yield of coffee was usually higher in CE MG than in SO MG .
Efforts were made to remove the equations with collinear variables, during the variable selection in the models, in order to forecast coffee yield by applying the APC method.We tested 16, 830 equations for each location, and removed an average of 12, 950 equations with collinear variables, leaving only 3,880 viable equations (Figure 5 A).These viable equations were assessed for a low MAPE, a high R 2 adj, and p < 0.05 (Figure 5 B).
To synthesize the information of the effect of DEFs on coffee yield, we identified the monthly DEFs of higher frequency in the yield forecasts of the 20 top designs for each location (Figure 6 A and B).A monthly DEF can be positively or negatively correlated with yield, depending on the coffee phenological stage (Syvertsen & Garcia-Sanchez, 2014).DEFs in years of high yield (Figure 6 A) had more influence on flowering, bean formation, and yield during the phenological year (y).The monthly DEF mostly influenced the phases of vegetative development and final maturation in years of low yield (Figure 6 B); this result is in accordance with that by Pereira et al. (2011), who asserted that the growth of reproductive branches was the priority of development of coffee in years with low yield.
DEFs in years of high yield had more influence during the production year (y), with higher frequencies in the models from April to July.The DEF in April of the second chronological year (DEFapr (i-1) ), in high-yield season, was the most influential DEF (Figure 6 A).DEFjul (i-1) occurred at the end of the growing period, and had a smaller influence on coffee yield.
DEFsep (i-1) and DEFoct (i-1) showed great influence both in high-and low-yield seasons.DEFsep (i-1) was very influential in the cultivation, negatively affecting coffee high yield (Figure 6 A).This stage is at the end of the dormancy -when water restriction is desirable (Amarasinghe et al., 2015) -together with strong and uniform flowering (Ronchi et al., 2015).DEFoct (i-1), month of flowering, was negatively correlated with yield in the municipalities GXP, SSP, SDS, and TRS (Figure 6), evidencing that DEF in the coffee flowering period is not desirable.
The agrometeorological models were developed by associating coffee yield with monthly DEFs.The models were calibrated and tested for years with high-(Figure 7) and low-yield seasons (Figure 8), to understand the limiting factors and to incorporate the bienniality of coffee production.The adjusted angular coefficients indicated the sensitivity of yield in relation to the monthly DEF, during the two years of the phenological cycle, for each location.
The models developed for the years with high yields were accurate, with a minimum MAPE of 0.39% (calibration), and an average forecasting period of up to nine months.For instance, the model for the forecasting of crop yield in MSM had a forecasting period of six months, and the calibration was highly accurate, with a MAPE of 1.59% and an R 2 adj of 0.95.The MAPE of 1.59% in this model was considered low because the average yield of 53.31 sacks ha -1 varied by ±0.847 sacks ha -1 .October of the second phenological year (i-1) was the last month used in the model, and the forecasting period for yield was six months, with harvesting beginning in May (i) (Figure 2), as follows: -0.585× DEF + 59.14, Oct.

( ) ( )
in which: i-1 is the second chronological year; and i is the third chronological year.An analysis of the angular coefficients in this model identified DEFmay (i-1) as the most important month, and DEFoct (i-1) as the least important one.DeFjul (i-1) and DeFoct (i-1) were negatively correlated with yield, probably because they were at the end of bud growth and flowering, respectively, when water restriction is usually not desirable.DEF during the vegetative period (bud growth) of a high-yield season.in SO MG , was positively correlated with coffee yield.DEFjul in SDS, in CE MG , during coffee vegetative and reproductive periods, caused a reduction of the final yield (Figure 7).
The agrometeorological models developed for the years of low yield were accurate, with a minimum MAPE of 0.14%, and an average forecasting period of up to seven months.For instance, the model for predicting the yield in GXP, for a low-yield season, used DEFoct (i-1) as the last month, and the forecasting period for yield forecasting was six months.The calibration was highly accurate, with a MAPE of 0.17%, and an R 2 adj of 0.99.The MAPE of 0.17% in this model was considered low because the average yield of 32.11 sacks ha -1 varied by ±0.05 sacks ha DeFapr (i-1) , which occurs during bud growth, generally had a large influence on coffee yield in lowyield seasons.DEFsep (i-1) , at the end of dormancy, was positively correlated with coffee yield in GXP and MSM, in SO MG .DEF in June of the first chronological year [DEFjun (i-2) ] had a positive influence on yield in SSP.Coffee plants prepare to begin vegetative growth during this period, and roots grow intensively, which is favoured by DEF (Larcher, 2004).
DEFoct (i-2) caused a reduction in the final yield in CRD and SDS, in CE MG .DEFapr (i-1) had a negative influence on coffee yield in CRD and TRS because the buds normally grow in this month, a time when the coffee crop needs evapotranspiration to form the reproductive buds.DEFs during dormancy (i-1) had a positive influence on yield in SDS (Figure 8).Ronchi et al. (2015) suggested that DEFs were desirable in August (i-1) , which favours uniformity and yield of the coffee crop.

Conclusions
1.The agrometeorological models developed as functions of water deficits are accurate for all regions, and the minimum forecasting period for yield is six months for southern Minas Gerais state and Cerrado Mineiro.
2. Coffee yield in years of high yield was more influenced by water deficit in the reproductive phase, corresponding to the second phenological year.
3. Water deficits in years with low yield have a larger influence in the vegetative phase of the crop, corresponding to the first phenological year.

Figure 1 .
Figure 1.Main regions for coffee production in the state of Minas Gerais, Brazil.Municipalities in southern Minas Gerais state: GXP, Guaxupé; MSM, Monte Santo de Minas; and SSP, São Sebastião do Paraíso.Municipalities in the Cerrado Mineiro region: CRD, Coromandel; SDS Serra do Salitre; and TRS, Tiros.

Table 1 .
Geographic and climatic description of the regions.