MANAGEMENT CLASS DELIMITATION IN A SOYBEAN CROP USING ORBITAL IMAGES

The delimitation of management classes is critical for successful precision agriculture. This process involves choosing the variables to use and analyzing the spatial variability of the variables. Thus, the objective of this study was to analyze the correlation between management class maps generated from orbital images and yield maps. A 95-hectare area of rainfed grain was evaluated. Yield maps were obtained for the 2015/2016 and 2016/2017 soybean crops. Orbital images were used from two dates for each crop to generate vegetation index maps. The spatial correlation between the vegetation indices and the yield maps was obtained using a bivariate Moran index. The delineated management classes were compared using the Kappa index. This study demonstrated that the Kappa values resulting from the comparison between the management class maps generated from the soybean yield and the vegetation index ranged from 5% to 67% depending on the number of delineated classes. The highest Kappa values were observed when the area was delineated into three classes.


INTRODUCTION
Among the recommended management strategies in precision agriculture, management class delimitation (CM) is one of the most frequently used approaches. The use of CMs allows for significant advances in the management of field variability by farmers and agronomy specialists, and it facilitates operation by agricultural machines (Leroux et al., 2017). Once defined, the number of soil samples needed to characterize the variables in the production system is reduced (Valente et al., 2014). The crop yield, topographic data, soil apparent electrical conductivity and remote sensing multispectral index are the most frequently used variables when defining the CMs by cluster analysis (Martínez-Casasnovas et al., 2018).
Orbital images are a source of information used in the delimitation of CMs because of their easy acquisition, wide coverage area and broad range of available spectra. One of the advantages of this type of information is that a historical series of data can be used, providing greater robustness to the analysis process. Through the Landsat 7 and Landsat 8 platforms, the Landsat program provides the largest collection of remote sensing images at moderate resolution at no cost to the user. These platforms provide high quality multispectral images at a 30-meter resolution. Haghverdi et al. (2015) were successful at using Landsat 8 imagery and the apparent soil electrical conductivity for generating irrigation management classes.
Another program that provides free orbital images is the Copernicus Program. Multispectral images are provided by two satellites, Sentinel-2A and Sentinel-2B, which are equipped with identical multispectral instruments (MSI), and it is possible to acquire data in 13 bands at different spatial resolutions (between 10 m and 60 m) using this program. Martínez-Casasnovas et al. (2018) used an NDVI (Normalized Difference Vegetation Index) derived from Sentinel-2A images, soil apparent conductivity data, and a digital elevation model for delimiting the management classes in an irrigated maize crop. The authors established a model to complement the automated process of delimiting CMs by aggregating farmer knowledge and thus obtaining better results.
The spectral bands of orbital images can be combined to create vegetation indices. A vegetation index (VI) is a number generated by some combination of bands from a remote and/or quantitative sensing image about vegetation in a given image pixel (Ortega-Blu & Molina-Roco, 2016). Kuiawski et al. (2017) used the combination of terrain elevation data and vegetation indices to delimit the management classes in a soybean crop, and those classes differed in terms of their phosphorus, clay and silt contents.
Engenharia Agrícola, Jaboticabal, v.39, n.5, p.676-683, sep./oct. 2019 Producing yield maps can be considered as the first step to adopting precision agriculture. These maps represent the crop response to management practices and to biotic and abiotic factors, and they provide valuable insights about spatial and temporal variabilities. Thus, remote sensing data collected at different stages of plant growth that correlate with crop yields may be a feasible way to facilitate management class delimitation and the adoption of precision farming techniques. Thus, the objective of this study was to evaluate the vegetation indices obtained using orbital data to delimit management classes in soybeans cultivated without irrigation.

MATERIAL AND METHODS
The study area was part of a farm located in the municipality of Iepê, in the southwestern part of the state of São Paulo, Brazil, which is situated between the geographic coordinates at a latitude from 22 ° 39'28.33 "S to 22 ° 40'3.5" S and a longitude from 51 ° 02'21.15 " W to 51 ° 02'35.23" W. The studied field area measures 95 hectares, and it is used for grain production under no-tillage, without any irrigation system. This field is cultivated with a succession of corn and soybean crops. The study was performed using data from the soybean crop. FIGURE 1. Geographic location of the soybean field used in this study, which was located in Iêpe, SP, Brazil.
The soybean yield data were taken during the 2015/2016 and 2016/2017 crop seasons and were georeferenced using a yield monitor that had a GNSS (Global Navigation Satellite System) that recorded one measured point every second. Yield data processing was performed using the R 3.3.4 computer program (R Core Team, 2015). The identification and removal of outliers from the soybean yield data were performed by applying a three-step filtering process (Taylor et al. 2007).
The first step of the filtering process was the removal of yield data with extreme values far above or below the mean soybean yield value, in which the threshold used for the acceptable minimum yield value was 100 kg ha -1 and the acceptable maximum yield value was 8400 kg ha -1 . During the second stage of the filtering process, the global outliers were identified. During the third and last stages of the filtering process, the univariate Moran spatial correlation index (Ii) was used as a tool to identify local outliers, in accordance with recommendations by Córdoba et al. (2016).
The interpolation of the filtered yield data was performed using the block kriging method and local semivariograms; a minimum of 90 and a maximum of 100 neighboring yield data points were used to interpolate each point in the grid. This process was performed with VESPER (Variogram Estimation and Spatial Prediction plus Error) software (Minasny et al., 2005). A grid measuring 30 by 30 m was used for the interpolation process. Images from the Sentinel-2 platform were also resampled to this scale (30 m x 30 m) using the nearest neighbor method.
Images were acquired from different platforms, with images selected from the middle to near the end of the crop cycle. The presence of clouds during the cultivation period was the primary impediment to obtaining images from a single sensor. Images from the Landsat satellites were taken from the USGS (United States Geological Survey) EarthExplorer image catalog and were part of the Landsat Collection 1 Level-2 dataset. At this level of processing, the images are orthorectified and subjected to a 6S atmospheric correction. The Sentinel-2A satellite images (product: Level-1C) were taken from the European Space Agency (ESA) Copernicus Open Access Hub (SciHub) platform, and these images were subsequently corrected atmospherically (level-2A) using the Sen2Cor algorithm (http://step.esa.int/main/third-party-plugins-2/sen2cor).
Each year, a different soybean cultivar was used, with one early cultivar (with a cycle of 120 days) and one late cultivar (with a cycle of 140 days). Table 1 presents the sowing and harvest information for each crop and the imaging dates.
Engenharia Agrícola, Jaboticabal, v.39, n.5, p.676-683, sep./oct. 2019 The near infrared, blue, green, red and SWIR bands were used to generate the vegetation indices selected for the management class delimitation. Band 8A of Sentinel-2A was selected instead of band 8, because the spectral response for band 8A is similar to that of band 5 in Landsat-8 (Skakun et al., 2017). To obtain the vegetation indices, Sentinel-2A band 8A was used instead of bands 5, 6 7 and 8 because it was found to be more stable, even when 8A was replaced by band 5 in Landsat 8. Table 2 presents the characteristics of each band for the satellites used in this study. These bands allow for calculations of the most commonly used vegetation indices in crop mapping. * The values in parentheses (in nanometers) correspond to the spectral width of each sensor band.
The vegetation indices used here are shown in Table  3. They were chosen because they are the most common in studies involving spatial variability analyses of crop development. The use of six indices allows us to explore the potential use of orbital images for identifying the spatial variability of vegetation, because each of them may be sensitive to certain factors that can interfere with the crop yield. Within the delimitation of the management classes, the vegetation indices were used separately for each analyzed date to uncover the date when the indices had the highest correlation with the crop yield. Table 3 presents the indices used in this study and the equations that define each vegetation index used here.
Engenharia Agrícola, Jaboticabal, v.39, n.5, p.676-683, sep./oct. 2019 The pixels located at the border of the field were removed from the analysis to avoid edge effects that could contaminate the spectral signatures. Then, to evaluate the spatial correlation between each vegetation index and the soybean yield maps, the Moran Global Bivariate Index (Ixy) was used. The significance analysis of the index was obtained using a pseudo-significance test, in which random permutations of the point values were performed to obtain an empirical distribution. Thirty random samples containing 50 data points were taken from the dataset to test the significance of the Moran index.
The significance of the spatial correlation test can be influenced by the sample size, since each analyzed pixel corresponds to a sample. Thus, in large samples, the p-value approaches zero rapidly (Lin et al., 2013). For that reason, the small mathematical differences between the value assumed in the null hypothesis (an autocorrelation index equal to zero) and the estimated value of the Moran index will become significant. When using samples with smaller sizes (50 data points), only the expressive differences were significant. The vegetation indices that showed a significant spatial correlation in 90% of the tests were used in the cluster analysis.
The vegetation indices that presented significant spatial correlation on each date with the soybean yield from each analyzed crop season were used in the fuzzy k-means cluster analysis to delimit the management classes. Because the NDVI is one of the most frequently used indices in agronomic research and technical studies, a cluster analysis using only this index was also performed using only this vegetation index. Management classes using vegetation indices were compared to those generated only with the yield maps. This comparison was made using Kappa statistics (Cohen, 1960) to calculate the degree of agreement of all the pixels in the maps. The Kappa coefficient value determines the similarity between a reference map and another map. Management classes using data from both crop seasons were also evaluated using the Fuzziness Performance Index (FPI) and the Modified Partition Entropy (MPE). The ideal number of classes is the one in which both indices are minimal.
To verify if the classes were able to discriminate among the areas of different yield potentials, an analysis of variance (ANOVA) was performed to compare the soil clay content in each cluster. For this purpose, 30 points were sampled in the field for the soil clay content determination.

RESULTS AND DISCUSSION
The Modified Simple Ratio (MSR) and Green Chlorophyll Vegetation Index (GCVI) obtained 52 days after sowing during the 2015/16 crop year showed a significant correlation with the soybean yield. For 92 days after sowing, only the SIWSI (Shortwave Infrared Water Stress Index) showed a significant correlation with the crop yield. Regarding the indices analyzed for the 2016/17 crop, only the SIWSI obtained at 111 days after sowing showed a significant correlation with the crop yield.
For the data from the middle of the soybean cycle, there were indices that presented significant correlations with the crop yield only for the first crop year; they were the GCVI and MSR indices. The MSR index is related to the plant biophysical characteristics (Vescovo et al., 2012), and the GCVI is more resistant to saturation (Lobell & Azzari, 2017). The different stages of plant development for each crop year may have caused the lack of significant correlation on one of the analyzed dates. Although the imaging dates are from the same period, close to the middle of the cycle for each crop year, the cultivars were different and thus may display differences in some agronomic characteristics. Burke & Lobell (2017) achieved good results by applying the GCVI to smallholder farmer areas in Africa. The authors used different spatial resolutions and observed that the green band reflectance responded better to variations in the leaf chlorophyll concentration compared to the red band reflectance, which was used in the NDVI and EVI (Enhanced Vegetation Index) indices. Therefore, the GCVI is likely to capture differences in nutrient deficiencies that correlate with the yield. In the present study, vegetation indices that did not use the red band were able to better detect the crop yield variability. The reflectance of the red band becomes insensitive in the presence of moderate to high chlorophyll contents, while the green band is more sensitive due to the lower chlorophyll absorption coefficient, which prevents reflectance saturation and allows for the greater penetration of light inside the leaves and canopy (Kira et al., 2017).
The SIWSI vegetation index differs from the NDVI in that it uses the shortwave infrared band (SWIR) instead of the red band. The most significant absorption of the SWIR band occurs when water is present in the leaves of the plants (Fensholt & Sandholt, 2003). This index showed a significant spatial correlation with the crop yield for the two crop seasons on the dates closest to the harvest. Despite the variation in the rainfall, the accumulated precipitation for the first crop season was 2096.7 mm and 836.4 mm during the second crop season, and there was no significant correlation between the SIWI and the soybean yield, which may indicate that there was enough rainfall for the development of the soybean plants. According to Embrapa (2013), the total water required to reach the maximum yield of the soybean crop varies from 450 to 800 mm per crop cycle. However, this need may vary according to the climatic conditions, cycle length and crop management.
The senescence process of the plants may have influenced the variability detected in the SIWSI index, and, consequently, the significant correlation of this index to the crop yield. Thus, in areas where the soybean yield was lower, it is possible that the plants reached the end of the production cycle earlier, probably due to some limitations in these areas. This result may justify the significant correlation between the soybean yield and the SIWSI. Xiao et al. (2014) used modeling to study the sensitivity of the reflectance activity on variations in biochemical and biophysical plant variables at different scales, and thus they concluded that at the canopy scale, the variation in the reflectance in the visible region, NIR (Near infrared) and SWIR (Short Wave Infrared) is controlled by chlorophyll a + b, dry matter and water content, respectively.
After the spatial correlation was analyzed between the vegetation indices and the soybean yield, the indices with significant correlations (significant at pseudo p-value = 0.05) were selected on each analyzed date for each crop to delimit the management classes. Table 4 shows the results of the Kappa agreement analysis between the management classes delineated with the crop yield maps (reference map) and the delimited classes using the vegetation indices that presented a significant correlation with the crop yield and the management classes delimited using the NDVI.
Engenharia Agrícola, Jaboticabal, v.39, n.5, p.676-683, sep./oct. 2019 For the management classes delimited using the NDVI, the values with the highest agreement were found for the classes obtained on the second date. In general, the classes obtained using the NDVI provided lower Kappa values compared to the classes delimited with the index that presented significant correlation indices with the crop yield. The NDVI has limitations at moderate and high leaf area index values due to the high chlorophyll absorption coefficient, which impairs the red band reflectance sensitivity (Nguy-Robertson et al., 2012).
The Kappa index values obtained for the comparison between the vegetation index class maps and the crop yield class maps (Table 4) were higher when the vegetation indices were classified into three classes. With four or more classes, it is very likely that they have begun to lose continuity. The highest Kappa index value (67.3%) was obtained during the delimitation of two management classes using data from the first crop season. When the two crop seasons were used in the cluster analysis, the best results were obtained for the delimitation using three management classes. Bazzi et al. (2013) used the chemical and physical soil attributes to delineate the management classes of a soybean area, and they found that a subdivision into two management classes presented better results due to the greater number of attributes that differed between classes.
When using data from the two crop seasons analyzed here, the highest Kappa index values were observed in the delineated classes using vegetation indices that presented a significant spatial correlation with the crop yield when the field was divided in two or three classes. The Kappa index obtained between the NDVI class map and the crop yield class map was higher when the four class division was used. However, subdividing the field into a larger number of classes makes the application of crop management practices difficult. Table 5 shows the values of the Fuzziness Performance Index and the Modified Partition Entropy Index, and the ideal number of classes is the one in which the two indices presented the minimum values. The management classes were delimited using data from both crop seasons as analyzed using the maps of soybean yield and vegetation indices that showed significant spatial correlations with the crop yield and NDVI. When using only the NDVI maps, the minimum FPI and MPE values occurred for two and three management classes (Table 5), respectively. For the other cluster analysis, the recommended number of classes was the same according to both indices, with three management classes when using the yield map and two management classes when using the vegetation index maps that had a significant correlation with the crop yield. Martínez-Casasnovas et al. (2018) obtained similar results when using the accumulated NDVI to generate management classes using the fuzzy cmeans algorithm. Figure 2 shows the resulting management class maps for the different input variable combinations. A map of three management classes for the crop yield is shown, and there is a map of two management classes for the vegetation indices that presented a significant correlation with the crop yield, and a map of two and three classes for the NDVI. While yield maps can highlight areas of high or low yield potential in the same field, vegetation indices, which are sensitive to different crop characteristics, can incorporate relevant information for crop management and can therefore be used to delimit management classes. Rabello et al. (2017) highlighted the potential use of remote sensing to delimit management classes, since the variability observed in an orbital image for maize demonstrated the same pattern observed for the apparent electrical conductivity and other soil attributes in a field that was explored using an integration between agriculture and crop-raising systems. The authors also noted that these results are consistent with the claim that the crop variability reflects the variability in soil properties in the area where the study was conducted. The descriptive statistics and ANOVA results for the clay content according to the different delimited management classes ( Figure 2) are shown in Table 6.
The comparison between the means of the soil clay content used to validate the cluster analysis showed that there was a significant difference among the delimited management classes when using vegetation indices that presented a significant correlation with the crop yield and with the management classes delimited using only the NDVI. The validation showed that a subdivision of the field into three classes did not produce significant differences in the soil clay content between them. Kuiawski et al. (2017) detected significant differences among the management classes for phosphorous, silt and clay when the management classes were delimited based on the terrain elevation and on the vegetation indices calculated using the reflectance measured using a spectroradiometer.
The use of multiple data layers is necessary to adequately describe the spatial variability in a field (Córdoba et al., 2016). In this study, the delimitation of management classes was performed using the vegetation index data. These data were easily accessible and are spatially correlated with the soybean yield. There are still gaps that require further investigation in relation to factors that affect the crop yield and that can be detected by vegetations indices, with the aim of improving the management class delimitation in soybean crops.

CONCLUSIONS
The MSR, GCVI and SIWSI vegetation indices showed a significant correlation with the yield, and thus, we were able to detect crop variability that is spatially correlated with the soybean yield. The Kappa index values resulting from the comparison of the management class maps generated from the soybean yield with management class maps delimited using vegetation indices ranged from 5% to 67%, depending on the number of classes. The subdivision of the plot into two management classes using vegetation indices as input variables was effective for finding areas with similar soil clay contents.