Cacao Crop Management Zones Determination Based on Soil Properties and Crop Yield

The use of management zones has ensured yield success for numerous agricultural crops. In spite of this potential, studies applying precision agricultural techniques to cacao plantations are scarce or almost nonexistent. The aim of the present study was to delineate management zones for cacao crop, create maps combining soil physical properties and cacao tree yield, and identify what combinations best fit within the soil chemical properties. The study was conducted in 2014 on a cacao plantation in a Nitossolo Háplico Eutrófico (Rhodic Paleudult) in Bahia, Brazil. Soil samples were collected in a regular sampling grid with 120 sampling points in the 0.00-0.20 m soil layer, and pH(H2O), P, K, Ca, Mg, Na, H+Al, Fe, Zn, Cu, Mn, SB, V, TOC, effective CEC, CEC at pH 7.0, coarse sand, fine sand, clay, and silt were determined. Yield was measured in all the 120 points every month and stratified into annual, harvest, and early-harvest cacao yields. Data were subjected to geostatistical analysis, followed by ordinary kriging interpolation. The management zones were defined through a Fuzzy K-Means algorithm for combinations between soil physical properties and cacao tree yield. Concordance analysis was carried out between the delineated zones and soil chemical properties using Kappa coefficients. The zones that best classified the soil chemical properties were defined from the early-harvest cacao yield map associated with the clay or sand fractions. Silt content proved to be an inadequate variable for defining management zones for cacao production. The delineated management zones described the spatial variability of the soil chemical properties, and are therefore important for site-specific management in the cacao crop.


INTRODUCTION
In recent years, cacao (Theobroma cacao) plantations located in the south of the state of Bahia have faced a serious yield crisis, largely due to the appearance and dissemination of witches' broom disease (Souza Jr. et al., 2011), which is caused by the Moniliophthora perniciosa fungus. The disease has reduced dry pod production in the region by 60 % (Sodré et al., 2007).
The impact of witches' broom disease has been partially overcome through the use of clonal and seminal cacao varieties, leading to an expected gradual recovery of cacao production in Brazil (Arévalo et al., 2012). Nevertheless, Chepote et al. (2013) state that the sustainability of cacao cultivation will only be attained through specific production systems including the use of soil amendment and fertilizers.
Soils typically exhibit large variations in physicochemical properties over the whole of the production field, so fertilization practices based only on average values lead to errors that may jeopardize the entire management system (Silva et al., 2010). Determining spatial variability, on the other hand, allows different sites to be identified within the same field that exhibit homogeneous features related to chemical and physical properties of soil and crop yield, and this will further enable the adoption of specific management strategies (Davatgar et al., 2012).
In precision agriculture, the use of agronomic practices such as fertilization, weed control, and pest and disease control is performed in a spatially variable manner depending on information collected in the field (Ortega and Santibáñez, 2007). However, the right application rate of agricultural inputs in terms of space and time using precision agriculture has proven to be a challenge (Booltink et al., 2001; Anselin et al., 2004).
An alternative for resolving this problem is the formation of specific management zones based on characteristics and variables that are essential for certain production patterns of agricultural crops (Fraisse et al., 2001). In the case of cacao cultivation, defining management zones could be a key factor for improving management practices, and thus, for regaining large yield indexes. The division of production fields into sub-regions including a combination of yield-limiting and quality factors allows specific and independent strategies to be implemented for each sub-region, thereby improving the efficiency of cacao crop management.
Several studies have been conducted for the purpose of defining management zones for different agricultural crops (Fraisse et al., 2001; Whelan and McBratney, 2001; Li et al., 2007; Valente et al., 2012). The main variables used and the benefits of localized management practices for production systems have been highlighted in these studies. However, there are no studies on the use of precision agriculture techniques for a production management system for cacao crop.
This study investigated whether management zones can be used in cacao plantations and whether the same variables applied to other crops can be used to generate these zones. The aim of the present study was to delineate management zones and create maps of combinations between soil physical properties and cacao yield. The combinations that best fit the chemical properties of the soils were also identified.

Description of experimental area
This study was conducted in an experimental area of approximately 1 ha belonging to the Cacao Research Center (Centro de Pesquisas do Cacau - CEPEC). The area is located at km 22 of the Jorge Amado highway in the municipality of Ilhéus in the south of the state of Bahia, Brazil, at latitude 14° 47' S and longitude 39° 16' W.
Rev Bras Cienc Solo 2016;40:e0150520
Climate in the region is classified as Af, humid tropical climate (Köppen and Geiger, 1928), with average annual rainfall of 1,830 mm, relative humidity of 80 %, and average annual temperature between 21.5 and 25.5 °C. The experimental area contained 31 cacao progenies planted at a spacing of 3.0 × 1.5 m. The soil was classified as a Nitossolo Háplico Eutrófico (Rhodic Paleudult), according to the Brazilian System of Soil Classification (Santos et al., 2013).
The experimental area has been cultivated since 2003 with a regular cacao cultivar in an agroforestry system using erythrina at a spacing of 24 × 24 m. Shading of the area is regular, except in the northwest and central regions, where there is a higher incidence of sunlight due to lower density of plant cover.

Sampling methodology
A regular sampling grid composed of 120 points was delineated with adoption of a local coordinate system with minimum and maximum spacing between points of 6.6 m and 9.5 m, respectively (Figure 1). Each sampling point was composed of a single cacao tree.
Soil collection was performed at each sampling point under the projection of the cacao tree canopy at a distance of 0.40 m from the trunk. Soil was collected from the 0.00-0.20 m layer with the aid of a Dutch auger. Four subsamples (one per quadrant) were combined into one composite sample per point. Each composite sample was processed to obtain air-dried fine earth (ADFE).

Physical and chemical analyses
Particle size analysis was performed on soil samples to quantify the coarse sand, fine sand, total sand, clay, and silt fractions. In addition, chemical analyses were performed to determine the properties: P, K⁺, Ca²⁺, Mg²⁺, Na⁺, Fe, Cu, Mn, Zn, Al³⁺, pH in water, potential acidity (H+Al), and total organic carbon (TOC). The sum of exchangeable bases (SB), effective cation exchange capacity (effective CEC), cation exchange capacity at pH 7.0 (CEC pH 7.0), base saturation index (V), and Al³⁺ saturation index (m) were calculated. Both chemical and physical analyses were performed following the methods proposed by Donagema et al. (2011).
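The calculated properties listed above can be illustrated with a short sketch. It assumes the conventional definitions used in Brazilian soil fertility work (SB as the sum of the exchangeable bases, effective CEC as SB + Al³⁺, CEC at pH 7.0 as SB + (H+Al)) and inputs expressed in cmolc dm⁻³; the text does not spell out the formulas or units, and the function name is hypothetical.

```python
def derived_fertility_indices(ca, mg, k, na, al, h_al):
    """Derive SB, CEC, V and m from exchangeable cations.

    Assumption: all inputs share the same unit (e.g. cmolc dm-3) and the
    conventional definitions apply; the paper only names the indices.
    """
    sb = ca + mg + k + na            # sum of exchangeable bases
    cec_effective = sb + al          # effective CEC
    cec_ph7 = sb + h_al              # CEC at pH 7.0
    v = 100.0 * sb / cec_ph7         # base saturation index (%)
    m = 100.0 * al / cec_effective   # Al3+ saturation index (%)
    return {
        "SB": sb,
        "CEC_effective": cec_effective,
        "CEC_pH7": cec_ph7,
        "V": v,
        "m": m,
    }


# Illustrative values only, not data from the study.
example = derived_fertility_indices(ca=4.0, mg=1.5, k=0.3, na=0.2, al=0.5, h_al=4.0)
```

Computing the indices per sampling point keeps them aligned with the 120-point grid, so they can be mapped in the same way as the measured properties.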

Determination of cacao yield
The yield of cacao trees was determined monthly throughout 2014. The yield was stratified into early-harvest cacao (March to August), harvest cacao (September to February), and annual production (production in all months). Production of healthy fruits was evaluated per sampling point by determining the weight of dry pods per plant. The results of dry pod production per plant were extrapolated to yield (kg ha⁻¹) per sampling point.
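As a rough illustration of the stratification and extrapolation described above, the sketch below sums monthly production per plant into the two seasons and scales it by planting density. It assumes that the extrapolation simply multiplies production per plant by the number of plants per hectare implied by the 3.0 × 1.5 m spacing; the text does not state the exact procedure, and the names are hypothetical.

```python
EARLY_HARVEST_MONTHS = (3, 4, 5, 6, 7, 8)   # March to August
HARVEST_MONTHS = (9, 10, 11, 12, 1, 2)      # September to February


def stratify_yield(monthly_kg_per_plant, row_spacing_m=3.0, plant_spacing_m=1.5):
    """Convert monthly production of one tree (kg per plant) to kg per hectare.

    monthly_kg_per_plant: dict mapping month number (1-12) to kg per plant.
    Assumption: yield per hectare = production per plant x plants per hectare.
    """
    plants_per_ha = 10_000.0 / (row_spacing_m * plant_spacing_m)
    early = sum(monthly_kg_per_plant[m] for m in EARLY_HARVEST_MONTHS) * plants_per_ha
    harvest = sum(monthly_kg_per_plant[m] for m in HARVEST_MONTHS) * plants_per_ha
    return {"early_harvest": early, "harvest": harvest, "annual": early + harvest}


# Illustrative values only: 0.1 kg of dry product per plant in every month.
yields = stratify_yield({month: 0.1 for month in range(1, 13)})
```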

Data analysis
Geostatistical analysis was performed on the data to identify and quantify the degree of spatial dependence of the soil physical properties and cacao yield (harvest, early-harvest, and annual). This analysis is the first step in delineating management zones and was carried out by fitting theoretical functions to the experimental variogram computed with Matheron's estimator, given in equation 1:

$$\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ Z(x_i) - Z(x_i + h) \right]^2 \qquad (1)$$

where N(h) is the number of pairs of measured values Z(x_i), Z(x_i + h) separated by a vector h, and x_i is a spatial position of the variable Z.
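A minimal sketch of Matheron's estimator for the isotropic case may help make the calculation concrete. It is not the Vesper implementation; the binning by equal-width distance classes and the function and parameter names are assumptions made for illustration.

```python
import numpy as np


def experimental_semivariogram(coords, values, lag_width, n_lags):
    """Isotropic experimental semivariogram (Matheron's estimator).

    coords: (n, 2) array of point coordinates; values: (n,) measurements.
    Pairs are grouped into distance classes of width lag_width (assumed scheme).
    Returns mean lag distance, semivariance and pair count per non-empty class.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)          # every pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2
    lag_index = np.floor(dist / lag_width).astype(int)

    lags, gammas, counts = [], [], []
    for b in range(n_lags):
        mask = lag_index == b
        n_pairs = int(mask.sum())
        if n_pairs == 0:
            continue                                   # skip empty classes
        lags.append(dist[mask].mean())
        gammas.append(sq_diff[mask].sum() / (2.0 * n_pairs))
        counts.append(n_pairs)
    return np.array(lags), np.array(gammas), np.array(counts)
```

With a sampling grid spaced 6.6 to 9.5 m apart, a lag width close to the minimum spacing would be a natural starting point, although the lag settings used in the study are not reported here.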
Experimental semivariance was calculated assuming isotropic spatial dependence. In fitting theoretical functions to the variograms, the following theoretical models were tested: linear with sill, spherical, Gaussian, and exponential. The model that best fit the experimental data was selected on the basis of the smallest residual sum of squares (RSS) and the highest coefficient of determination (R²). The cross-validation coefficient (R²-VC) was also considered. Geostatistical analyses were performed with Vesper 1.62 software (Minasny et al., 2006).
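The selection criterion can be sketched as follows. The example scores already-fitted models by RSS and R² and picks the one with the smallest RSS. The model parameterizations (sill taken as C0 + C, and the factor 3 that turns the range into a practical range for the exponential and Gaussian models) are common conventions assumed here and may differ from those used in Vesper; the linear-with-sill model is omitted for brevity, and all names are hypothetical.

```python
import numpy as np


def spherical(h, nugget, sill, a):
    r = np.minimum(np.asarray(h, dtype=float) / a, 1.0)   # flat beyond the range
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def exponential(h, nugget, sill, a):
    h = np.asarray(h, dtype=float)
    return nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / a))


def gaussian(h, nugget, sill, a):
    h = np.asarray(h, dtype=float)
    return nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * (h / a) ** 2))


MODELS = {"spherical": spherical, "exponential": exponential, "gaussian": gaussian}


def rss_and_r2(gamma_obs, gamma_fit):
    gamma_obs = np.asarray(gamma_obs, dtype=float)
    rss = float(((gamma_obs - gamma_fit) ** 2).sum())
    tss = float(((gamma_obs - gamma_obs.mean()) ** 2).sum())
    return rss, 1.0 - rss / tss


def select_model(lags, gammas, fitted_params):
    """fitted_params: dict model name -> (nugget, sill, range), fitted beforehand."""
    scores = {
        name: rss_and_r2(gammas, MODELS[name](lags, *params))
        for name, params in fitted_params.items()
    }
    best = min(scores, key=lambda name: scores[name][0])   # smallest RSS wins
    return best, scores
```

Parameter estimation itself (for example by weighted least squares) is left out; only the comparison step named in the text is shown.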

Delineation of management zones
The management zones were defined from values interpolated by ordinary kriging on a regular grid by applying the Fuzzy K-Means algorithm. This method is based on the minimization of equation 2, in accord with Guastaferro et al. (2010):

$$J = \sum_{j=1}^{k} \sum_{i=1}^{n} \left\| x_i^{(j)} - c_j \right\|^2 \qquad (2)$$

where ‖x_i^(j) − c_j‖² is the squared distance between the data point x_i^(j) and the center of the cluster c_j; n is the number of data; and k is the number of clusters.
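For readers who want to see the clustering step in code, here is a minimal sketch of a standard fuzzy k-means (fuzzy c-means) iteration, in which each grid node receives a degree of membership in every class. The fuzziness exponent m = 2, the random initialization, and the stopping rule are assumptions; the settings actually used in the study are not given in the text.

```python
import numpy as np


def fuzzy_k_means(X, k, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on an (n, d) data matrix X.

    Returns U, the (n, k) membership matrix (rows sum to 1), and the
    (k, d) array of cluster centers. m is the fuzziness exponent (assumed 2).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    U = rng.random((X.shape[0], k))
    U /= U.sum(axis=1, keepdims=True)                  # random valid memberships
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]   # membership-weighted means
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)                     # avoid division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)   # membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers
```

When variables with different units are combined (for example yield in kg ha⁻¹ and clay in g kg⁻¹), standardizing each column before clustering is a common precaution; whether this was done in the study is not stated.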
The optimal number of classes was determined based on two indexes (Guastaferro et al., 2010): the Fuzzy Performance Index (FPI) and the Normalized Classification Entropy (NCE). The FPI and NCE indexes vary from 0 to 1. Values close to 0 (zero) indicate distinct classes, with few samples that have a low degree of membership (Equation 3), and values close to 1 (one) indicate no distinct classes, with a high number of samples that have a low degree of membership (Equation 4). Accordingly, the optimal number of classes is the number that minimizes both indexes.
In these equations, MF_B is a membership function that takes the value 0 or 1; z is an individual that may or may not be a member of a group; and b1 and b2 are the set limits.
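The two indexes can be computed from a membership matrix as sketched below. The formulas are assumptions: FPI follows a common definition based on the partition coefficient, and NCE divides the classification entropy by log(c) so that it ranges from 0 to 1, in line with the range stated above. Other normalizations exist in the literature, and the text does not show which form was used.

```python
import numpy as np


def fpi_nce(U):
    """Fuzzy Performance Index and Normalized Classification Entropy.

    U: (n, c) membership matrix with rows summing to 1 and c >= 2.
    Assumed definitions: FPI = 1 - (c*F - 1)/(c - 1), with F the partition
    coefficient; NCE = H / log(c), with H the classification entropy.
    """
    U = np.asarray(U, dtype=float)
    n, c = U.shape
    F = (U ** 2).sum() / n
    fpi = 1.0 - (c * F - 1.0) / (c - 1.0)
    H = -(U * np.log(np.clip(U, 1e-12, None))).sum() / n   # 0*log(0) treated as 0
    nce = H / np.log(c)
    return float(fpi), float(nce)
```

In practice the clustering would be repeated for each candidate number of classes and the number that minimizes both values would be retained.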
The maps of management zones were created using 20 different combinations between soil physical properties and cacao tree yield. The classification for generating management zones was carried out with two, three, and five classes. Soil physical properties were used due to their low temporal variability, which allows management zones with higher repeatability over the years to be obtained. Yield was used because it is the main response of an agricultural crop to management and represents the central purpose of agriculture.
The best management zone was defined by evaluating the correlation levels between the zones and the classificatory variables, the soil chemical properties in this case. Kappa indexes were calculated in accordance with Kitchen et al. (2005). According to these authors, the Kappa coefficient indicates agreement among classifications; the larger the Kappa index, the higher the correlation between the zone and the classificatory variables.
According to the methodology proposed by Congalton and Mead (1986) and used by Valente et al. (2012) and Alves et al. (2013), the following intervals were considered: Kappa < 0, the correlation is poor and not significant (D); 0 ≤ Kappa < 0.20, the correlation is significant but poor (C); 0.20 ≤ Kappa ≤ 0.40, significant but reasonable correlation (B); and Kappa > 0.40, significant and good correlation (A).
The management zones that did not minimize the FPI and NCE indexes for the same number of classes were evaluated with different input variables to determine the most important class (Fridgen et al., 2004). The criterion used to define the best class was in accord with the method adopted by Alves et al. (2013).
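The concordance analysis can be sketched as a cross-tabulation of two classified maps followed by the interval rule given above. The example assumes the usual Cohen's Kappa formula and that both maps have been classified into the same set of classes on the same grid; the significance test at 95 % mentioned in the table notes is not reproduced, and the function names are hypothetical.

```python
import numpy as np


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa between two classifications of the same grid nodes."""
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    classes = np.union1d(a, b)
    ia = np.searchsorted(classes, a)
    ib = np.searchsorted(classes, b)
    cm = np.zeros((len(classes), len(classes)))
    np.add.at(cm, (ia, ib), 1)                     # cross-tabulation
    n = cm.sum()
    p_observed = np.trace(cm) / n
    p_expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2
    return float((p_observed - p_expected) / (1.0 - p_expected))


def kappa_class(kappa):
    """Letter classes following the intervals listed in the text."""
    if kappa < 0:
        return "D"
    if kappa < 0.20:
        return "C"
    if kappa <= 0.40:
        return "B"
    return "A"
```

A zone map would be compared in turn with the classified map of each chemical property, and the combination of input variables with the most A and B results would be preferred.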

RESULTS AND DISCUSSION
In geostatistical analysis (Figure 2), all size fractions, and harvest and early-harvest cacao yield showed spatial dependence. The theoretical models that best fit the experimental variograms were the spherical and Gaussian models. The spherical model is considered to be the model most frequently fitted to soil and plant data (Dalchiavon et al., 2012; Lima et al., 2013).
The fit of the pure nugget effect model (PNE) indicates that the annual cacao yield did not exhibit spatial dependence among samples. The absence of spatial dependence does not signify lack of variability in the phenomenon. However, this variability is not related to space, but occurs randomly (Lima et al., 2013). According to these authors, properties fitted to the PNE model should be represented by average values obtained by classical statistical analysis and, therefore, they should not be considered in the delineation of management zones.
The highest spatial variability was observed for cacao yield, with ranges of 12 and 12.5 m for harvest and early-harvest yields, respectively. Large genetic and, consequently, high yield variabilities were expected since the cacao plantation was formed by progenies.
Among the size fractions, silt had the lowest range value (62 m). Souza et al. (2004) state that silt incorporates parts of variability from other size fractions during its determination process, which contributes to an increase in its variability.
Spatial dependence, as measured by the spatial dependence index (SDI), was strong for total sand, clay, and early-harvest and harvest cacao yield, and moderate for silt, following the classification of Cambardella et al. (1994). For these authors, a strong SDI for physical properties is related to intrinsic factors of the soil, such as its formation and mineralogy. Oliveira et al. (2013) found moderate SDI values for clay and weak values for silt and sand in a Cambisol under an agroforestry system.
In Table 1, only the zone defined for the early-harvest cacao yield (T) minimized the indexes with a different number of classes (FPI with 2 classes and NCE with 5 classes). Fridgen et al. (2004) recommend additional analysis for the cases in which the indexes do not match a single number of classes, evaluating the correlation among different zones with classificatory variables.
When management zones are defined from a single variable, the FPI and NCE indexes tend to be minimized in two classes, as can be observed for Total Sand, Clay, Silt, and T. These results corroborate those previously reported by Peralta et al. (2015), where the highest relative efficiency for zones defined from individual variables was obtained for two classes. For combinations involving more than one variable, combinations were fitted with two, three, and four classes; Valente et al. (2012) observed minimized FPI and NCE indexes in three classes for management zones of coffee crop defined from variable combinations. Similar results were observed by Morari et al. (2009).
Taking into account that the largest number of combinations (16) had minimum FPI and NCE indexes for two and three classes, these are the most recommended numbers of classes to delineate management zones for the cacao crop. Peralta et al. (2015) state that a lower number of classes in the definition of zones makes application of localized management practices more economically viable, mainly due to greater simplicity in subdividing the production field. Davatgar et al. (2012) found four to be the best number of classes for delineating fertility management zones in a rice field. Bazzi et al. (2013) observed that more than three classes is effective for fertility management of an area planted to soybean.
From the selection of an optimal number of classes for each combination of variables, management zone maps for the cacao crop were created (Figures 3, 4, and 5). The average centroid values and respective standard errors of the combinations of variables in each management zone are summarized in Tables 2 to 4. These tables have the purpose of supporting interpretation of the maps.
The maps of management zones defined for individual variables were similar to the original spatial variability maps, particularly for total sand and clay, in which the highest spatial distribution values were noticed in the upper-left and lower-right regions of the area, respectively. These regions contain the lowest total sand class (class 1) and the highest clay class (class 2), with values of 309.34 and 375.90 g kg⁻¹, respectively (Table 2). Pedroso et al. (2010) report that this similarity in behavior is characteristic of the Fuzzy K-Means algorithm, which is dependent on the quality of the spatial variability map of the original variable.
Except for the combinations Total Sand, Clay, Total Sand + Clay, Total Sand + Silt, Clay + Silt, Total Sand + Clay + Silt and Early-Harvest + Harvest + Total Sand + Clay + Silt, the other combinations, regardless of the number of zones defined, exhibited more disperse classes and, consequently, lower continuity within the management zones. Tisseyre and McBratney (2008) report that small isolated divisions within the zones impair management of activities, especially in automated systems. In the case of areas where cultivation and fertilization are performed manually, the discontinuities within the management zones are not relevant (Silva et al., 2010). According to these authors, the ideal scenario is that in which individual areas are not considerably reduced, because reduction could hinder the procedures.
The discontinuity of management zones increased along with an increasing number of classes. This behavior was expected since the variability among variables used in the combinations and the stratification of clusters are also increased. Salami et al. (2011) report that management zones defined from variables comprising widely divergent spatial distributions tend to produce higher discontinuities, even with a small number of classes.
Evaluating the contribution of the size fractions within the combinations, the behavior of Clay and Total Sand is similar up to three classes. Silt contributes to increasing the discontinuity among zones and generates maps with discontinuous classes. This response is explained by the larger variation in this fraction, since its standard deviation was high for all classes (Tables 2 and 3).
Figure 5 shows the combinations for which the FPI and NCE indexes were minimized with four classes; the resulting maps are similar to one another. Management zones divided into four classes that include yield data (early-harvest and harvest) can be defined in combination with any size fraction without significantly altering the final result, not only in regard to the spatial distribution of classes, but also the average centroid values (Table 4).
The management zone maps were cross-tabulated with the maps of classificatory variables (Table 5), respecting the number of classes necessary to obtain an efficient tabulation (Kitchen et al., 2005). This correlation analysis was performed for the purpose of comparing the different input variables used to define management zones, and to identify the most important variable for delineating specific sites of fertility management for cacao cultivation.
The management zones presented Kappa coefficient values ranging from poor to good. The Kappa coefficients were significant for most classificatory variables (Table 5).
Calcium (Ca), total organic carbon (TOC), iron (Fe), and base saturation index (V) were the only properties that had Kappa coefficients higher than 0.40. The Ca and Fe maps were the maps best classified by the zones (highest Kappa values), especially for Early-harvest + Total Sand, Early-harvest + Clay and Early-harvest + Silt. Sousa et al. (2007) report that calcium (Ca²⁺) favors the development of the cacao tree root system, increasing resistance of the plants against water deficit and improving phosphorus uptake. Deficiency of Fe, for its part, causes severe damage to cacao tree growth and, consequently, to yield (Chepote et al., 2013).
For P and K, the largest correlations were observed for Early-harvest + Harvest + Total Sand. To increase the representability of the zones, Valente et al. (2012) report that selection of the best zones should take the largest absolute Kappa values into account.
As the management zone for early-harvest cacao (T) did not minimize the FPI and NCE indexes with the same number of classes, correlations with 2 and 5 classes were evaluated (Table 6), in accordance with Alves et al. (2013). The results obtained with the Kappa coefficient indicate that the optimal number of classes for management zones of early-harvest cacao was 2 because of the higher number of well-classified properties.
Based on the findings of this study, it can be affirmed that the combinations Early-harvest + Total Sand, Early-harvest + Clay, Early-harvest + Harvest + Total Sand, Early-harvest + Harvest + Clay and Early-harvest + Harvest + Silt displayed the best performance for delineating management zones for cacao cultivation. As the highest correlation values were identified for Early-harvest + Total Sand and Early-harvest + Clay, and no differences were observed among the maps created from these combinations, the variables recommended to delineate management zones for cacao cultivation, in the area under investigation, were the early-harvest cacao yield and the total sand or total clay fractions.
The delineation of management zones using variable clusters leads to more satisfactory results (Kitchen et al., 2005); however, the best zones are those that make the procedure for recommendation of soil amendment and fertilizers in specific sites more feasible from technical, operational, and economic points of view. Morari et al. (2009) state that the most economically viable management zones are those delineated with a smaller number of combinations and classes.

CONCLUSIONS
The methodology applied to delineate management zones for cacao cultivation was satisfactory, allowing the establishment of cluster patterns between soil properties and cacao yield.
The site-specific application of inputs in cacao cultivation must be performed in two or three specific management units that have different fertility levels.
For the cacao crop, the variables recommended to define management zones are early-harvest cacao yield and the total sand or total clay fractions.

Figure 1. Distribution of sampling points in the experimental area.

Figure 2. Models and parameters of variograms for the soil physical properties and for the cacao yield. C0: nugget effect; C0+C: sill; a: range (m); R²: determination coefficient; SDI: spatial dependence index; SQR: residual sum of squares.

Figure 4. Management zones with three classes, defined by mapping the spatial variability of the combination of Harvest + Total Sand, Harvest + Clay, Harvest + Silt, Harvest, Early-Harvest + Total Sand, Early-Harvest + Clay, Early-Harvest + Silt and Early-Harvest + Harvest.

Table 1. Fuzzy Performance Index (FPI) and Normalized Classification Entropy (NCE) indexes for each class of each management zone. Values in bold indicate minimization of the indexes for the management zones of classes.

The zones for combinations between Early-harvest + Harvest + Total Sand, Early-harvest + Harvest + Clay and Early-harvest + Harvest + Silt were the only combinations that showed significant correlation with all soil properties. However, the indexes of these combinations were not considered good (> 0.40). This fact was only observed for the combinations between Total Sand, Early-harvest + Total Sand, Early-harvest + Clay, Early-harvest + Silt, Harvest + Total Sand and Harvest + Clay.

Figure 3. Management zones with two classes, defined by the mapping of spatial variability of Total Sand, Clay, Silt, Total Sand + Clay, Total Sand + Silt, Clay + Silt, Total Sand + Clay + Silt, Early-Harvest, Early-Harvest + Harvest + Total Sand + Clay + Silt.

Table 2. The centroids of the best management zones with their respective classes

report that selection of the best zones should take the largest absolute Kappa values into account.

Figure 5. Management zones with four classes, defined by mapping the spatial variability of the combination of Early-Harvest + Harvest + Total Sand, Early-Harvest + Harvest + Clay and Early-Harvest + Harvest + Silt.

Table 3. The centroids of the best management zones with their respective classes

Table 4. The centroids of the best management zones with their respective classes. (1) s: standard deviation.

Table 5. Management zones and soil chemical properties. A, B, C, and D: Kappa index good, fair, bad, and very bad, respectively, at 95 % significance. The same letters in the same row and different columns indicate equal values between management zones.

Ca²⁺/Mg²⁺ ratio, and TOC were the only chemical properties that showed significant correlations with all combinations. According to Alves et al. (2013), organic matter and pH are best classified in terms of management zone. The base saturation index (V) showed correlation with all combinations, except for Harvest Yield and Early-harvest + Harvest. Arévalo et al. (2012) report that cacao cultivation requires fertile soils and good availability of Ca²⁺, Mg²⁺, K⁺, and Na⁺, which are quantified by the V index. Overall, base saturation is related to the availability of exchangeable bases, which affect the exchange capacity on the surface of colloids, especially in the presence of H⁺ and Al³⁺ ions (Raij, 2011).

Table 6. Kappa Index of management zone and early-harvest cacao and chemical properties. A, B, C, and D: Kappa index good, fair, bad, and very bad, respectively, at 95 % significance. Equal letters in the same row and different columns indicate equal values between management zones.

the most economically viable management zones are those delineated with a smaller number of combinations and classes.