Fuzzy Classification in the Determination of Input Application Zones

Correctly interpreting soil fertility and its spatial distribution within an area helps to reduce the losses and environmental effects associated with agriculture and to optimize fertilization and liming practices. This study aimed to apply concepts and methods from spatial and temporal analyses to soil fertility and to develop a fuzzy classification methodology for defining input application zones over three conilon coffee harvests. An irregular network of georeferenced points was established in the central region of the farm. Soil samples were collected at 0.00-0.20 m depth within the projections of the tree canopies. Geostatistical analysis was used to map the variables. From these maps, fuzzy input and output sets were created, and the inference rules and logical operators were defined. Fuzzy classification of the area was performed for the three harvests to determine whether or not inputs were needed. The main findings show that the N-P-K requirement was spatially dependent in all harvests. Classifying the area using fuzzy logic made it possible to analyze soil fertility and to indicate the regions with the smallest and greatest needs for N-P-K and liming.


INTRODUCTION
The application of fertilizers in conilon coffee cultivation relies on productivity and nutrient classes, in which only the lowest or highest limit of a class is treated as a constraint of practical importance, whereas intermediate classes are considered secondary constraints (Meirelles et al., 2007). This has led to the incorrect management of soil fertility, as natural phenomena do not have well-defined limits but instead change gradually, forming transition zones that are often imprecise.
One way of representing soil fertility is to use a theoretical framework capable of modeling the uncertainties associated with nutrient classes (Silva et al., 2010). Fuzzy logic, in particular, stands out as an efficient tool for handling phenomena in which class boundaries are not clearly defined, providing results closer to human interpretation (Chiang and Hsu, 2002).
As opposed to Boolean logic, which considers that the real world comprises only two classes, true and false, fuzzy logic assigns to real variables sets of classes associated with linguistic terms, allowing elements to belong to different sets concurrently with varying degrees of membership (Sousa, 2007). Thus, fuzzy set theory allows the treatment of imprecise or vague information (Silva et al., 2010), which may increase the efficiency of agricultural management and, as a consequence, of the application of inputs. It can also improve productivity, reduce production costs, and lessen the environmental impact caused by the excessive use of inputs (Farias et al., 2003).
Fuzzy logic has been applied in several areas of science (Kavdir and Guyer, 2003; Peixoto et al., 2004; Bressan et al., 2006). Silva and Lima (2009) used fuzzy logic to map the fertility of a Latossolo Vermelho-Amarelo Húmico (Oxisol) cultivated with arabica coffee (catucai variety) and reported that the technique gave a better visualization of gradual changes in the soil fertility classes, as well as an improved definition of gradual transition zones, than information on discrete classes.
A fuzzy classifier has four components: an input processor, a set of linguistic rules, a fuzzy inference method, and an output processor that returns a real number as output (Pedrycz and Gomide, 1998). Defining these components can help to analyze the gradual change of soil fertility classes, an application that was not found in the literature.
This study aimed to use concepts and methods from spatial and temporal analyses to study soil fertility and to develop a fuzzy classification methodology in order to define input application zones in three conilon coffee harvests.

MATERIALS AND METHODS
The research was performed in an experimental area located in Cachoeiro de Itapemirim, Espírito Santo state, Brazil, cultivated with conilon coffee (Coffea canephora Pierre ex Froehner) under the following conditions: robust tropical variety; 2.9 × 0.90 m spacing; altitude ranging from 100 to 150 m above sea level; geographical coordinates 20° 45' 17" S and 41° 17' 8" W.
According to the Köppen classification system, the region's climate is classified as Cwa, i.e., rainfall unevenly distributed throughout the year, with a dry winter and a hot, rainy summer. The average temperature of the coldest month is lower than 20 °C, while that of the warmest month is higher than 27 °C. The soil at the experimental area is classified as Latossolo Vermelho-Amarelo Distrófico (Santos et al., 2013), a Typic Hapludox (Soil Survey Staff, 2014). It has a clayey texture, with average clay, silt and total sand contents (in the 0.00-0.20 m layer) of 414.7, 190.5 and 393.9 g kg⁻¹, respectively.
Rev Bras Cienc Solo 2016;40:e0150104
A georeferenced irregular grid of 109 sampling points, spaced approximately 10 m apart along the coffee rows, was established in the selected 1.0 ha area. Each sampling point consisted of five coffee plants in a 13.00 m² area. Soil samples were collected at 0.00-0.20 m depth within the projections of the tree canopies using a stainless steel probe. The sampling points were georeferenced with a topographic GPS, model GTR-1.
The data were collected during three agricultural harvests: 2004/05, 2005/06 and 2006/07. In this study, the attributes pH, P, K, CEC and V of the three consecutive harvests were used. Soil samples were characterized for pH in water (1:2.5 ratio); potential acidity (H+Al), extracted with 0.5 mol L⁻¹ Ca(OAc)₂ at pH 7.0; Ca²⁺, Mg²⁺ and Al³⁺, extracted with 1 mol L⁻¹ KCl; and P and K, extracted by Mehlich-1 (Donagema et al., 2011). From the results of the chemical analyses, the sum of bases (SB) was calculated by adding up the K⁺, Ca²⁺ and Mg²⁺ contents; the cation exchange capacity at pH 7 (T, CEC) by adding up the H+Al and SB contents; and the base saturation (V, %) through the equation (SB/T) × 100. The three crops were manually harvested using sieves.
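The derived attributes SB, T and V follow directly from the definitions above. The following is a minimal sketch, assuming that all cation contents and the potential acidity have already been expressed in cmolc dm⁻³; the function name and the sample values are hypothetical and serve only as an illustration.

```python
def base_saturation(k, ca, mg, h_al):
    """Return (SB, T, V) from exchangeable cations and potential acidity.

    Assumption: all inputs are in cmolc dm^-3; V is returned in percent.
    """
    sb = k + ca + mg        # sum of bases (SB)
    t = sb + h_al           # cation exchange capacity at pH 7 (T)
    v = 100.0 * sb / t      # base saturation (V, %)
    return sb, t, v


# Hypothetical sample values, for illustration only
sb, t, v = base_saturation(k=0.5, ca=2.0, mg=0.5, h_al=3.0)
print(f"SB = {sb:.2f}, T = {t:.2f}, V = {v:.1f} %")
```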
The productivity of dried coffee for each harvest was converted to processed coffee using the conversion ratios suggested by Incaper. The dried-to-processed coffee ratio was 1.46:1 for 2004/05, 1.95:1 for 2005/06, and 1.94:1 for 2006/07. These values were used to calculate the productivity of processed coffee.
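As an illustration of this conversion, the sketch below divides the dried coffee productivity by the ratio of the corresponding harvest. The ratios are those reported above; the function name and the example productivity are hypothetical.

```python
# Dried-to-processed coffee ratios reported for each harvest
DRIED_TO_PROCESSED = {"2004/05": 1.46, "2005/06": 1.95, "2006/07": 1.94}


def processed_productivity(dried, harvest):
    """Convert dried coffee productivity to processed coffee productivity.

    The result is in the same unit as the input (e.g., sc ha^-1).
    """
    return dried / DRIED_TO_PROCESSED[harvest]


# Hypothetical dried coffee productivity, for illustration only
print(round(processed_productivity(146.0, "2004/05"), 1))
```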
Based on the fertilization recommendations for conilon coffee by Prezotti et al. (2007), linear regressions were fitted for each nutrient, relating the recommended ranges to the obtained productivity, in order to determine variable doses. The regression equations fitted for each nutrient, as well as the significant estimated coefficients, are presented in table 1.
In the regression analyses, negative recommendation values, which indicate no need for input application, were set to zero so that the spatial analysis could be performed and thematic maps subsequently produced. To test the significance of the regression coefficients and the quality of fit of the equations, the results were submitted to Student's t-test (p<0.05). Then, the coefficient of determination (R²) was examined.
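The fitting and zeroing steps can be sketched as follows. This is only an illustration under stated assumptions: the coefficients of table 1 are not reproduced here, so the data points below are hypothetical, and an ordinary least-squares line is assumed as the regression form.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x, with R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


def recommended_dose(prod, intercept, slope):
    """Dose predicted by the fitted line; negative values are set to zero."""
    return max(0.0, intercept + slope * prod)


# Hypothetical (productivity, recommended dose) pairs, for illustration only
prod = [20.0, 40.0, 60.0, 80.0]
dose = [50.0, 150.0, 250.0, 350.0]
intercept, slope, r2 = fit_line(prod, dose)
print(intercept, slope, r2)
print(recommended_dose(5.0, intercept, slope))   # negative prediction, clipped
```

Clipping at zero keeps the recommendation maps physically meaningful, since a negative dose has no agronomic interpretation.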
The need for lime in the area was calculated considering a base saturation (V2) of 60 %, which is suitable for conilon coffee cultivation, the V and T values from each sampling point, and a PRNT of 80 %, according to equation 1:

$$NC = \frac{T\,(V_2 - V_1)}{PRNT}\; p \qquad (1)$$

where NC is the amount of lime (Mg ha⁻¹), T is the CEC at pH 7 (cmolc dm⁻³), V2 is the base saturation suitable for the crop (%), V1 is the base saturation at each sampling point (%), p is equal to 0.5 for surface application, and PRNT is the relative power of total neutralization (%).
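The lime calculation can be illustrated with a short function. The sketch assumes the common form of the base saturation method, NC = T (V2 − V1) p / PRNT, with the defaults stated above (V2 = 60 %, p = 0.5, PRNT = 80 %). Setting negative needs to zero is also an assumption, made by analogy with the treatment of the fertilizer recommendations.

```python
def lime_requirement(t, v1, v2=60.0, p=0.5, prnt=80.0):
    """Lime need (Mg ha^-1) by the base saturation method.

    t: CEC at pH 7 (cmolc dm^-3); v1, v2, prnt: percentages.
    Assumption: points with V1 >= V2 need no lime, so the result is
    clipped at zero.
    """
    nc = t * (v2 - v1) * p / prnt
    return max(nc, 0.0)


# Hypothetical sampling points, for illustration only
print(lime_requirement(t=8.0, v1=40.0))   # below the target saturation
print(lime_requirement(t=8.0, v1=70.0))   # already above 60 %
```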
The spatial distribution maps for N-P-K and lime were obtained using geostatistical techniques, which made it possible to quantify the degree of spatial dependence among the samples, adopting the intrinsic stationarity hypothesis and using the variogram (Equation 2):

$$\gamma^{*}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ Z(x_i) - Z(x_i + h) \right]^2 \qquad (2)$$

where γ*(h) is the estimated semivariance and N(h) is the number of pairs of measured values Z(xi) and Z(xi + h) separated by a vector h.
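A minimal sketch of the experimental semivariance is given below. It assumes the classical estimator, in which half the mean squared difference is taken over all N(h) pairs in a distance class, and an omnidirectional binning with a fixed lag width. The function name, the binning scheme and the sample data are hypothetical.

```python
import math


def empirical_semivariogram(points, values, lag, n_lags):
    """Return (lag centre, semivariance, pair count) for each non-empty bin.

    Pairs are grouped into bins [k*lag, (k+1)*lag) by separation distance.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])
            k = int(h // lag)
            if k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    return [((k + 0.5) * lag, sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_lags) if counts[k] > 0]


# Hypothetical transect of three points, for illustration only
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
z = [0.0, 1.0, 2.0]
result = empirical_semivariogram(pts, z, lag=1.0, n_lags=3)
print(result)
```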
As part of the spatial analysis, spherical, exponential and Gaussian theoretical models were tested to determine the following parameters: nugget effect (C₀), sill (C₀+C), and range (a) of the spatial dependence. The highest R² in the variogram fit and the smallest residual sum of squares (RSS) were adopted as the criteria for choosing a suitable model. However, as the definitive criterion for each attribute, models with a significant correlation between the observed values and those estimated by cross-validation were chosen.
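The three theoretical models and the RSS criterion can be sketched as below. The parameterization is an assumption: the exponential and Gaussian models are written with the practical-range convention (factor 3), which may differ from the software used in the study. The grid search over the range a, with C₀ and C held fixed, is a deliberately crude stand-in for a proper least-squares fit, and the semivariances are synthetic.

```python
import math


def spherical(h, c0, c, a):
    if h == 0:
        return 0.0
    if h >= a:
        return c0 + c
    r = h / a
    return c0 + c * (1.5 * r - 0.5 * r ** 3)


def exponential(h, c0, c, a):
    return 0.0 if h == 0 else c0 + c * (1.0 - math.exp(-3.0 * h / a))


def gaussian(h, c0, c, a):
    return 0.0 if h == 0 else c0 + c * (1.0 - math.exp(-3.0 * (h / a) ** 2))


MODELS = {"spherical": spherical, "exponential": exponential, "gaussian": gaussian}


def rss(model, params, lags, gammas):
    """Residual sum of squares between model and experimental values."""
    return sum((g - model(h, *params)) ** 2 for h, g in zip(lags, gammas))


def fit_range(model, lags, gammas, c0, c, candidates):
    """Grid-search the range a that minimises RSS for fixed c0 and c."""
    return min((rss(model, (c0, c, a), lags, gammas), a) for a in candidates)


# Synthetic semivariances generated from a spherical model (illustration only)
lags = [5.0, 10.0, 15.0, 20.0, 25.0]
gammas = [spherical(h, 0.2, 0.8, 20.0) for h in lags]
fits = {name: fit_range(m, lags, gammas, 0.2, 0.8, [10.0, 20.0, 30.0, 40.0])
        for name, m in MODELS.items()}
best = min(fits, key=lambda name: fits[name][0])
print(best, fits[best])
```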
The degree of spatial dependence (DSD) was calculated in accordance with Cambardella et al. (1994) by means of the relation [C₀/(C₀+C)] × 100. According to these authors, DSD values of up to 25 %, from 25 to 75 % and above 75 % denote strong, moderate and weak spatial dependence, respectively. When the attributes showed spatial dependence, values at unmeasured locations were estimated by ordinary kriging.
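The classification of the degree of spatial dependence can be written as a small helper. The thresholds follow the description above; the function name and the example parameters are hypothetical, and assigning a value of exactly 75 % to the moderate class is an assumption.

```python
def spatial_dependence(c0, c):
    """Return (DSD in %, class) from the nugget c0 and the contribution c."""
    dsd = 100.0 * c0 / (c0 + c)
    if dsd <= 25.0:
        label = "strong"
    elif dsd <= 75.0:
        label = "moderate"
    else:
        label = "weak"
    return dsd, label


# Hypothetical variogram parameters, for illustration only
for c0, c in [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1)]:
    print(c0, c, spatial_dependence(c0, c))
```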
Fuzzy classification was performed in Matlab (version 7.1, 2010) after the spatial analysis. The inputs to the system were the maps of the needs for N, P, K and liming in the three harvests, determined by the regression equations (Table 1) and equation 1.
The function used in this study to model the uncertainty of the fuzzy input and output subsets was of the trapezoidal type (Hines, 1997), as shown in equation 3.
The fuzzy input subsets were established based on the N, P and K recommendations and the needs for liming. These were divided into three categories of need for input application within the area, namely: low, medium, and high.
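A trapezoidal function with four parameters a ≤ b ≤ c ≤ d can be sketched as follows. This is a minimal illustration, not the Matlab routine used in the study. Treating a = b or c = d as a "shoulder" (full membership up to the edge of the universe) is an assumption, and the example break points are hypothetical.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership degree of x for parameters a <= b <= c <= d."""
    if b <= x <= c:
        return 1.0                      # plateau: full membership
    if a < x < b:
        return (x - a) / (b - a)        # rising edge
    if c < x < d:
        return (d - x) / (d - c)        # falling edge
    return 0.0                          # outside the support


# Hypothetical "medium" set on an arbitrary scale, for illustration only
for x in (5.0, 15.0, 25.0, 35.0):
    print(x, trapezoid(x, 0.0, 10.0, 20.0, 30.0))
```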
To construct the fuzzy input set for the needs for N, the data from the nitrogen recommendations (Table 2) were used. Low, medium and high needs for N application were established considering the first, second and third quartiles of the N recommendations (NN).
As for the needs for liming (NC), fuzzy subsets were obtained from a quartile classification of needs for liming for each of the three studied harvests, thereby establishing values for low, medium, and high applications.
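One possible way of turning quartiles into low, medium and high subsets is sketched below. The exact break points used in the study are not stated in the text, so the placement chosen here (low plateau up to the first quartile, medium peaking at the second, high plateau from the third) is an assumption, as are the function names and the sample data.

```python
import statistics


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership degree of x for parameters a <= b <= c <= d."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


def quartile_sets(recommendations):
    """Build (a, b, c, d) parameters for low/medium/high from the quartiles."""
    q1, q2, q3 = statistics.quantiles(recommendations, n=4)
    lo, hi = min(recommendations), max(recommendations)
    return {
        "low": (lo, lo, q1, q2),        # left shoulder
        "medium": (q1, q2, q2, q3),     # triangle as a degenerate trapezoid
        "high": (q2, q3, hi, hi),       # right shoulder
    }


def memberships(x, sets):
    """Membership degree of x in every subset."""
    return {name: trapezoid(x, *abcd) for name, abcd in sets.items()}


# Hypothetical recommendations, for illustration only
sets = quartile_sets([1, 2, 3, 4, 5, 6, 7])
mu = memberships(3.0, sets)
print(sets)
print(mu)
```

With this construction, a value halfway between the first and second quartiles belongs to the low and medium subsets with the same degree, which is the kind of transition zone the classification is meant to capture.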
Regarding the needs for P (NP), the productivity was first verified and then each productivity level was defined on the basis of P content in soil: low (P < 5 mg dm⁻³), medium (average P content from 5 to 20 mg dm⁻³) or high application (P > 20 mg dm⁻³) (Table 2). The same procedure was used for K: low (K < 60 mg dm⁻³), medium (average K content from 60 to 200 mg dm⁻³) or high application (K > 200 mg dm⁻³) (Table 2).
The "Fuzzy Output" system is a diagnosis of the needs for inputs and lime in the area (Figure 1). This variable is associated with three application levels: high (at least two input variables belonging to the high application category), medium (at least two input variables belonging to the low application category), and low application of inputs in the area (at least three input variables belonging to the low application category).
• IF NN is medium AND NP is medium AND NK is medium AND NC is medium THEN the application is medium.
• IF NN is high AND NP is high AND NK is high AND NC is high THEN the application is high.
• IF NN is low AND NP is low AND NK is low AND NC is medium THEN the application is low.
• IF NN is medium AND NP is low AND NK is low AND NC is low THEN the application is low.
• IF NN is low AND NP is medium AND NK is low AND NC is low THEN the application is low.
• IF NN is low AND NP is low AND NK is medium AND NC is low THEN the application is low.
• IF NN is medium AND NP is medium AND NK is low AND NC is low THEN the application is medium.
• IF NN is low AND NP is low AND NK is medium AND NC is medium THEN the application is medium.
• IF NN is low AND NP is medium AND NK is medium AND NC is low THEN the application is medium.
• IF NN is medium AND NP is low AND NK is low AND NC is medium THEN the application is medium.
• IF NN is medium AND NP is low AND NK is medium AND NC is low THEN the application is medium.
• IF NN is low AND NP is medium AND NK is low AND NC is medium THEN the application is medium.
• IF NN is high AND NP is high AND NK is high AND NC is low THEN the application is high.
• IF NN is high AND NP is high AND NK is high AND NC is medium THEN the application is high.
• IF NN is high AND NP is high AND NK is low AND NC is high THEN the application is high.
• IF NN is high AND NP is high AND NK is medium AND NC is high THEN the application is high.
• IF NN is high AND NP is low AND NK is high AND NC is high THEN the application is high.
• IF NN is high AND NP is medium AND NK is high AND NC is high THEN the application is high.
• IF NN is low AND NP is high AND NK is high AND NC is high THEN the application is high.
• IF NN is medium AND NP is high AND NK is medium AND NC is high THEN the application is high.
Once the fuzzy input and output sets were defined, the inference rules used by the fuzzy output set were defined. The combination of rules (r_i) used in this study was based on the Mamdani model (Driankov et al., 1993; Hines, 1997). Such inference rules obey the following format, illustrated here for the high category: IF (NN is high) AND (NP is high) AND (NK is high) AND (NC is high) THEN (APPLICATION is high), where the high classifications of the needs are fuzzy input sets, while the high classification of APPLICATION is the fuzzy output set associated with the given fuzzy rule. The premises (NN is high), (NP is high), (NK is high) and (NC is high) comprise the antecedent of the rule. Each premise leads to a numeric value that is extracted from the corresponding membership function, i.e., μNN(high), μNP(high), μNK(high), and μNC(high).
The number of suitable fuzzy rules depends on the problem. The fuzzy sets were aggregated by means of the union operator (maximum). The set of inference rules used is shown in table 2.
Given the fuzzy input and output sets, the inference rules and the logical operators, the fuzzy classification was carried out. In this classification, grade maps of input application for the three consecutive harvests were obtained by means of the centroid defuzzification method (Driankov et al., 1993).
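The inference and defuzzification steps can be sketched as a small Mamdani-type classifier. Several points are assumptions rather than details taken from the study: the AND connective is taken as the minimum, the output sets on a 0-10 grade scale use hypothetical break points (ordered so that high application falls at low grades, following the reading of the grade ranges in the results), only five of the rules are included, and the centroid is approximated on a regular grid.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership degree of x for parameters a <= b <= c <= d."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


# Hypothetical output sets on a 0-10 grade scale (assumed break points)
OUTPUT_SETS = {
    "high": (0.0, 0.0, 2.5, 4.0),
    "medium": (2.5, 4.0, 6.0, 7.5),
    "low": (6.0, 7.5, 10.0, 10.0),
}

# Subset of the rule base: (NN, NP, NK, NC) -> application
RULES = [
    (("low", "low", "low", "low"), "low"),
    (("medium", "medium", "medium", "medium"), "medium"),
    (("high", "high", "high", "high"), "high"),
    (("medium", "medium", "low", "low"), "medium"),
    (("low", "low", "low", "medium"), "low"),
]


def mamdani_grade(mu_nn, mu_np, mu_nk, mu_nc, steps=1000):
    """Return the defuzzified grade, or None if no rule fires.

    Each argument maps a linguistic term to its membership degree.
    """
    inputs = (mu_nn, mu_np, mu_nk, mu_nc)
    strength = {name: 0.0 for name in OUTPUT_SETS}
    for antecedent, consequent in RULES:
        # AND between premises: minimum (assumption)
        w = min(mu.get(term, 0.0) for mu, term in zip(inputs, antecedent))
        # rules sharing a consequent are combined by the maximum
        strength[consequent] = max(strength[consequent], w)
    num = den = 0.0
    for i in range(steps + 1):
        y = 10.0 * i / steps
        # clip each output set at its firing strength, aggregate by maximum
        mu = max(min(strength[n], trapezoid(y, *OUTPUT_SETS[n]))
                 for n in OUTPUT_SETS)
        num += y * mu
        den += mu
    return num / den if den > 0 else None


all_low = {"low": 1.0}
all_medium = {"medium": 1.0}
print(mamdani_grade(all_low, all_low, all_low, all_low))
print(mamdani_grade(all_medium, all_medium, all_low, all_low))
```

Applying such a function to the kriged values of NN, NP, NK and NC at every grid node would yield one grade per node, which is how a grade map could be assembled.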

RESULTS AND DISCUSSION
The descriptive analysis of the liming and N, P and K fertilization needs is presented in table 3. The values were calculated using the equations provided in table 1.
Only the NN3, NP3, NK2 and NK3 attributes did not present a normal distribution according to the Kolmogorov-Smirnov test (p<0.05). All attributes showed right-skewed distributions, except for NP1, NP2, NK1, NK3, NC1, NC2, and NC3. Considering the coefficient of variation (CV) limits adopted by Wilding and Drees (1983), the results indicate low variability (CV < 15 %) for NN1 and NN2, moderate variability (15 % < CV < 35 %) for NN3, NP1, and NC3, and high variability (CV > 35 %) for NP2, NP3, NK1, NK2, NK3, NC1, and NC2. In this analysis, high CVs are related to two factors: low productivity at some points, which leads to no need for fertilization, and V values greater than 60 %, which lead to no need for liming.
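The variability classes can be computed as in the sketch below. It assumes the sample standard deviation and treats the class limits of 15 % and 35 % as inclusive for the moderate class; the data are hypothetical.

```python
import statistics


def cv_class(values):
    """Return (CV in %, variability class) using the 15 % and 35 % limits."""
    cv = 100.0 * statistics.stdev(values) / statistics.mean(values)
    if cv < 15.0:
        label = "low"
    elif cv <= 35.0:
        label = "moderate"
    else:
        label = "high"
    return cv, label


# Hypothetical recommendation values, for illustration only
print(cv_class([8.0, 10.0, 12.0]))
print(cv_class([0.0, 0.0, 10.0, 30.0]))   # zeros inflate the CV
```

The second example illustrates why points with no need for application (zero values) push the coefficient of variation into the high class.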
Spatial dependence analysis of the studied attributes defined the theoretical semivariograms, which were scaled by the variance of the data (Table 4). The experimental semivariograms for most of the input application needs were well fitted by the exponential model. Only NN1, NK1 and NK2 were fitted by the spherical model.
The range values can be separated into two distinct groups: the first group comprises the needs with a range lower than 30 m, namely NN2, NP2, NP3, NK2, NK3, NC1, NC2, and NC3; the second group comprises the needs with a range greater than 30 m, namely NN1, NN3, NP1, and NK1. The range indicates the limit distance up to which points are correlated with each other. Points separated by distances greater than the range are independent, and their analysis should be performed by classical statistics (Vieira, 2000).
The NN2, NN3, NP3, NK3, NC2 and NC3 attributes presented strong spatial dependence (DSD < 25 %), while the others presented moderate dependence (25 % < DSD < 75 %), according to the classification proposed by Cambardella et al. (1994). Thus, high precision was obtained for the ordinary kriging estimates of all input needs. This was corroborated by the cross-validation analysis, in which the observed and estimated values presented significant correlation (p<0.05).
Using the parameters obtained in the fits from the variogram models as well as the ordinary kriging, contour maps were obtained so as to describe the variability within the area in terms of the needs for N (Figure 2), P (Figure 3), K (Figure 4) and lime (Figure 5) for all harvests.
Given the spatial variability of input application needs across the harvests, if N-P-K fertilization and liming are applied considering only the mean values, only the mean needs will be met. This approach does not take into account the specific needs of each part of the field. Thus, analyzing the spatial distribution of the variables involved in production enables the distinction of regions with lesser and greater variability and also allows differentiated application maps to be generated for the agricultural inputs. Therefore, one should take into account the amount of nutrients required for the satisfactory development of the crop, as well as the amount available at different spots within the area (spatial variability), thereby optimizing the production system.
M: mean; Md: median; CV: coefficient of variation. NNi: need for nitrogen (in years 1, 2 and 3); NPi: need for P (in years 1, 2 and 3); NKi: need for K (in years 1, 2 and 3); NCi: need for liming (in years 1, 2 and 3).
The fuzzy input sets for the needs for lime and N during the three harvests are shown in figure 6. For the remaining fuzzy sets, points of uncertainty were observed (membership values of 0.5) between the definitions of low and medium as well as medium and high application. This behavior was expected because it is what differentiates fuzzy from classical logic: such a value may belong to two subsets simultaneously.
There are situations in which a set A, defined in a universe X, does not have well-defined limits (Zadeh, 1965). Thus, elements for which it is not possible to state whether or not they belong to set A are assigned an intermediate value.
For lime, the values that express this high degree of uncertainty fall between 600-1,400, 1,000-1,400 and 1,400-2,200 kg ha⁻¹ of lime for the first, second and third harvests, respectively (Figure 6). However, it can be asserted that values of 0-500 and 1,000 kg ha⁻¹ belong to the sets of low and medium input application, respectively, for the first and second harvests, while those of 0-1,400 and 1,700 kg ha⁻¹ belong to the sets of low and medium input application for the third harvest.
For the N input set, applications of 350 and 500 kg ha⁻¹ of N carry a degree of uncertainty, as each belongs simultaneously to two sets (Figure 6). However, values of 200 and 700 kg ha⁻¹ of N belong to the low and high sets of input application, respectively.
The fuzzy input sets for the needs for P are shown in figure 7. The range with a membership degree equal to 1 for the low, medium and high fuzzy subsets of P₂O₅ application varied depending on the expected productivity. This means that the expected productivity indicates which set will be used in the fuzzy classification. Additionally, increased productivity affects the amount of P applied in each fuzzy subset (low, medium, and high applications), as well as the amount of K (Figure 8).
When a given value had a membership degree equal to 1 within a certain subset, this indicates that the value raised no doubt when specifying the low, medium or high application of inputs. For all the presented fuzzy sets, overlaps between adjacent subsets were generated; e.g., it is to be expected that a given potassium application simultaneously belongs to the subsets of low and medium application. However, a given potassium application is not expected to belong simultaneously to the subsets of low and high application.
Once the input and output fuzzy sets had been constructed, the inference rules created and the logical operators determined, the fuzzy classification of the area during the three harvests was performed in grades (Figure 9) so as to diagnose the needs for input within the area. The fuzzy classification revealed spatial variability of the grades among the different harvests. The percentages of area related to the grades of input application in each harvest are shown in table 5.
Spatializing the grades from the fuzzy classification showed that more than half of the area presented values between 3.4 and 6.3 during harvests 1, 2 and 3. This observation shows that at least two of the inputs to be applied belong to the low application category. However, during the second harvest, a greater percentage of the area had grades between 0 and 3.3 than during the third harvest, indicating that at least three of the inputs to be applied belong to the high application category.
A substantial portion of the area (14.43 %) with grades between 6.4 and 10 was observed only for the first of the three studied harvests. This allows us to state that, in these places, there is a need for the low application of three inputs concurrently.
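The area percentages per grade class can be obtained by counting the nodes of the classified grid, as in the sketch below. It assumes equally sized grid cells, so that counts are proportional to area, and rounds grades to one decimal place so that the three reported ranges (0-3.3, 3.4-6.3 and 6.4-10) cover the scale without gaps. The grades are hypothetical.

```python
def grade_class(grade):
    """Assign a grade on the 0-10 scale to one of the three reported ranges."""
    g = round(grade, 1)
    if g <= 3.3:
        return "0-3.3"
    if g <= 6.3:
        return "3.4-6.3"
    return "6.4-10"


def area_percentages(grades):
    """Percentage of grid nodes (equal-area cells assumed) in each range."""
    counts = {"0-3.3": 0, "3.4-6.3": 0, "6.4-10": 0}
    for g in grades:
        counts[grade_class(g)] += 1
    return {name: 100.0 * n / len(grades) for name, n in counts.items()}


# Hypothetical grades at four grid nodes, for illustration only
pct = area_percentages([1.0, 5.0, 5.0, 8.0])
print(pct)
```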
Finally, it should be highlighted that the use of fuzzy logic in this study indicated regions with the greatest need for the application of inputs.The identification of the quantity and of the to-be-applied input depends on the evaluation of individual need maps for each input.
Equation 3:

$$\mu(x) = \begin{cases} 0, & x < a \ \text{or}\ x \ge d \\ \dfrac{x - a}{b - a}, & a \le x < b \\ 1, & b \le x < c \\ \dfrac{d - x}{d - c}, & c \le x < d \end{cases} \qquad (3)$$

where the trapezoidal membership function depends on a set of four parameters, a, b, c, and d: a and d determine the range in which the membership function assumes nonzero values, while b and c determine the range in which the membership function is maximum and equal to 1.

Figure 4. Spatial distribution of potassium needs in the three studied crops.

Figure 3. Spatial distribution of phosphorus needs in the three studied crops.

Figure 5.

Figure 6. Fuzzy sets for the input variables of lime and nitrogen application needs in the three crops.

Figure 7. Fuzzy sets for the input variable of phosphorus application need (NP).

Figure 8.

Table 1. Equations used in the recommendation for nitrogen, phosphorus and potassium rates. NN: need for nitrogen; NP: need for phosphorus; NK: need for potassium; Prod: productivity in sc ha

Table 2. Fuzzy inference rules
• IF NN is low AND NP is low AND NK is low AND NC is low THEN the application is low.

Table 3. Descriptive analysis of the corrective and fertilization needs using regression equations

Table 4. Estimated models and parameters of the variograms scaled to the input application needs for the three conilon coffee crops

Figure 2. Spatial distribution of nitrogen needs in the three studied crops.

Figure 9. Fuzzy classification maps based on the application needs of nitrogen, phosphorus, potassium and liming in the three crops.

Table 5. Percentage determination of the areas with input application grades