SPATIAL AUTOCORRELATION OF NDVI AND GVI INDICES DERIVED FROM LANDSAT / TM IMAGES FOR SOYBEAN CROPS IN THE WESTERN OF THE STATE OF PARANÁ IN 2004 / 2005 CROP SEASON

1 Matemático, Mestre em Eng. Agrícola, Prof. Assistente, UTFPR, Toledo-PR, Fone: (45) 3379 6800; gustavodalposso@utfpr.edu.br. 2 Estatístico, Dr. em Estatística, Pesquisador de Produtividade do CNPq, Prof. Associado do PGEAGRI, UNIOESTE, CascavelPR, Fone: (45) 3220-3228; Miguel.opazo@unioeste.br. 3 Dr. Engenharia Agrícola, Pesquisador de Produtividade CNPq, Prof. Adjunto do PGEAGRI, UNIOESTE, Cascavel-PR, erivelto.mercante@unioeste.br. 4 Dr. em Geoprocessamento, Pesquisador de Produtividade do CNPq, Pesquisador, UNICAMP, Campina-SP. lamparel@unicamp.br. Recebido pelo Conselho Editorial em: 16-12-2011 Aprovado pelo Conselho Editorial em: 6-8-2012 Eng. Agríc., Jaboticabal, v.33, n.3, p.525-537, maio/jun. 2013 SPATIAL AUTOCORRELATION OF NDVI AND GVI INDICES DERIVED FROM LANDSAT/TM IMAGES FOR SOYBEAN CROPS IN THE WESTERN OF THE STATE OF PARANÁ IN 2004/2005 CROP SEASON


INTRODUCTION
Brazil occupies a prominent position among the soybean producers, second behind the United States.Soybeans, as a commercial crop, came to the State of Paraná in the mid-1950s.In 2010/2011 crop season, it was estimated a production of 13 million tons.In 2004/2005 crop season, Paraná produced 9.7 million tons while Mato Grosso, the major national producer, contributed with 17.9 million tons (CONAB, 2010).
One of the alternatives to estimate agricultural production at regional and national levels is the use of remote sensing technology.According to CARVALHO et al. (2008), remotely sensed data is a regular source of information and it is important for the systematic monitoring of the vegetation dynamics.
Many algorithms have been developed for the remote estimation of biophysical characteristics of vegetation and the most widespread type used is the mathematical combination of visible and near-infrared reflectance bands, in the form of spectral vegetation indices.Applications of such vegetation indices have ranged from leaves to the entire globe, but in many instances, their applicability is specific to species, vegetation types or local conditions (VIÑA et al., 2011).
The Normalized Difference Vegetation Index-NDVI, computed as (NIR -Red)/(NIR + Red) where NIR and Red are the amount of near-infrared and red light, respectively, reflected by a surface and measured by satellite sensors, is the most widely utilized index (PARISE & VETTORAZZI, 2005).According to KIM et al. (2010), vegetation indices such as NDVI, are widely used in long-term measurement studies of vegetation changes, including seasonal vegetation activity and interannual vegetation-climate interactions.The utility of the NDVI index has been demonstrated in various areas, such as precision agriculture (PARISE & VETTORAZZI, 2005), study of the vegetation phenology (CARVALHO et al., 2008) and monitoring rice cultivation areas (CHEN et al., 2011).
The tasseled cap transformation is another well-known vegetation index, originally generated by using Landsat data set (SONG & WOODCOOK, 2003).The tasseled cap transformation turns original, highly covariant data into three uncorrelated indices called brightness, greenness, and wetness (SHENG et al., 2011).The greenness image or green vegetation index (GVI) presents high values in targets with high density of green vegetation, being obtained according to Equation (1), in which B j represents the j-th band of the Thematic Mapper sensor of the Landsat-5 satellite.
The utility of this transformation can be reported in studies to distinguish differences in the vegetation, in the turgidity conditions between the different types of vegetation, and in the recognition of the various stages of development of agricultural cultures in relation to other elements (BRITES et al., 1996).
Techniques of spatial statistics present three basic elements: spatial proximity matrix (W) nxn , deviation vector (Z) nx1 and weighted mean vector (Wz) nx1 .The spatial proximity matrix or neighborhood matrix is a useful tool to describe the spatial arrangement of objects (SANTOS & RAIA JUNIOR, 2006).According to a set of n areas {A 1 ,...,A n }, the matrix W = [(w ij )] nxn is built, where each w ij element represents a measure of proximity between A i and A j , so that w ij = 1 if A i shares a common side with A j ; if otherwise, w ij = 0.
The use of spatial proximity matrices that consider contiguity is important to define the form of neighborhood, which can be rook, queen or bishop.These criteria are based on the movements of the pieces of chess game.Thus, considering any A i area, all the areas that have non-null intersection with the A i area, will be considered its neighbor according to the queen criterion.Considering the rook criterion, only the ones having a common side with A i will be its neighbors.For the bishop criterion, the A i neighboring areas will be those which ones are in diagonal with A i .In literature, it is possible to find variations of these contiguity schemes, for example, the union of the rook and queen criteria, which receives different values according to the criterion (SHORTRIDGE, 2007).
The global Moran I index is used to assess the correlation among neighboring observations and to identify patterns and levels of spatial clustering in neighboring districts.Local Indicators of Spatial Autocorrelation (LISA) statistics (ANSELIN, 1995) provides information related to the location of spatial clusters and outliers and the types of spatial correlation.Local statistics are important because the magnitude of spatial autocorrelation is not necessarily regular over the study area (TSAI & PERNG, 2011).
The use of Global Moran's I and LISA indices is crucial in agricultural studies involving the relationship between municipalities because, as reported by MARCONATO et al. (2012), they can analyze the configuration of the geographical distribution of agricultural fields.In Brazil, because of its continental dimension, it is extremely important to identify areas of low or high technological developments, and their relations with other spatial configurations, such as transportation infrastructure or size distribution of farms.
In this context, the objective of this research is to use the spatial statistics to study the spectral traits of soybean biomass in 36 municipalities of western Paraná from the analysis of NDVI and GVI derived from TM/Landsat-5 data converted into surface reflectance.In order to obtain better understanding of the spatial dynamics of the NDVI and GVI in the studied region, resources of the explanatory analysis of spatial data (ESDA) were used.The ESDA is defined as a collection of techniques that describe and visualize spatial distributions, identify unique situations, find spatial association standards, group similar values (clusters) and suggest spatial regimes or other forms of spatial heterogeneity (ANSELIN & BAO, 1997).In this study, the criterion used in the construction of the neighborhood matrix was the queen neighborhood, because municipalities are non-uniform shapes and often share borders in all possible directions.

MATERIAL AND METHODS
Each i area is associated to a real number (x i ), which represents the value of the vegetation index in the i area.For the calculation of the vector of the Z deviations, it is necessary to calculate, first, the average (μ) of the vegetation indices, considering the n areas.Each i element of the vector of the deviations, called z i , is obtained by subtracting the average value, from the corresponding attribute value, that is, The vector of weighted averages (Wz) is obtained multiplying the transposed vector of the deviations by the spatial proximity matrix with normalized lines, where each element of any I line, originally sets to 1, is divided by the number of non-null elements of the same line.As result, each wz i element indicates a value corresponding to the average of the neighboring values of the i area, characterizing a spatial mobile average (DRUCK et al., 2004).
The verification of the spatial dependence was carried out by Moran's I index (Equation 2), which is a global indicator of the spatial autocorrelation and it shows how the values are spatially correlated, where w ij indicates the ij element of the spatial proximity matrix with normalized lines, x i represents the value of the vegetation index in the i area and μ represents the average of the vegetation indices.

∑ ∑∑
Moran's I index leads itself to a statistical test of hypotheses whose null hypothesis is the spatial independence (H 0 : I = 0 vs H 1 : I ≠ 0).The positive values of Moran's I index (between 0 and +1) indicate positive correlation, in other words, areas which present high (low) values of an attribute are surrounded by areas with high (low) values and the negative values of Moran's I index (between -1 and 0) represent negative spatial correlation, that is, areas with high (low) values of an attribute are surrounded by areas with low (high) values.
To test the significance of Moran's I index, an empirical reference distribution is generated, randomizing the values of the vegetation indices in the areas and calculating a new value of the index for each permutation made.As just one of the arrangements corresponds to the observed situation, if the value of the measured index corresponds to an "extreme" of this distribution, then it deals with a value with statistical significance.
To complement the studies of the spatial autocorrelation, Moran scatter plots were created, which are charts that help the interpretation of Moran's I index and also studies were made at the local level by utilizing the LISA statistics (ANSELIN, 1995).As Moran's I index represents the angular coefficient of the regression straight line of Wz (neighbors' average) variable by virtue of the Z deviations, a graph can be built, called Moran scatter plot, to visualize the spatial association between the deviations and the average of the values of their neighbors.This chart is divided into four quadrants, AA (Z positive and Wz positive) and BB (Z negative and Wz negative), which indicate positive spatial association points, in the sense that a localization has neighbors with similar values and AB (Z positive and Wz negative) and BA (Z negative and Wz positive), which indicate negative spatial association points (DRUCK et al., 2004).
When one works with a large number of areas, it is very likely that there occur different spatial association regimes, which makes convenient the use of techniques which disaggregate the global statistics according to their local constituents.The LISA, presented in Equation (3), is a statistics that produces a specific value for each area, allowing evaluating the individual contribution of each observation for the global statistics. (3) Values of I i statistically different from 0, indicate that the i area is spatially associated to their neighbors.The statistical significance of Moran's local index is calculated similarly to the case of the global index.For each area, the local index is calculated, and afterwards the value of the other areas is randomly permuted, until obtaining a pseudo-distribution for which the significance parameters can be computed.
Once the statistical significance of the LISA index is determined, it is important to generate the LISA map, indicating the regions that present local correlation significantly different from the rest of the areas.The combination of the Moran scatter plot with the LISA map permits the creation of LISA cluster map.In this way, the regions detected as significant at the LISA map are classified according to their location in the Moran scatter plot.It is possible to observe differences in the planting dates, since some municipalities have their cycles later than the others.Figure 3 represents the temporal evolution of the GVI index, in which the classes were calculated similarly to the NDVI index.FIGURE 3. Temporal evolution of the GVI index by using equal amplitude.

RESULTS AND DISCUSSION
The temporal evolution of the GVI index presents a similar behavior in relation to the evolution of the NDVI index.The GVI values are low in the beginning of crop season and high in the end of crop season.A certain insensitivity of the index on 11/23/04 was noticed, since 97.2% of the municipalities presented GVI values in the 0.0513-0.1170interval.Therefore, one cannot identify which municipalities present an advanced plantation.
Table 1 presents Moran's I index of the NDVI and GVI indices with its respective descriptive level (p-value) referring to the hypothesis of lack of spatial correlation.For each date, it was generated empirical distributions by 99 independent chosen permutations and with an equal probability among all the possible permutations of the indices of studied municipalities.Regarding the NDVI, for all the dates considered, the Moran's I index was higher than zero, indicating a positive correlation, that is, the studied region presents municipalities with high and/or low NDVI indices surrounded by municipalities with the same situation.Regarding the GVI index, for all the dates, the calculated indices were positive and significant, indicating the presence of clusters with similar value.The descriptive levels (p-values) referring to all the dates under study were lower in relation to the pre-defined levels of significances (0.05 of probability), showing that the positive spatial correlations of the NDVI and GVI indices on the monitored dates are significant.The comparison of Moran's I index of the NDVI and GVI indices indicates that the behavior of both is similar.However, in the three first monitored dates, the NDVI values were higher than those from GVI.The opposite happens for the last two dates, in which the Moran's I index of the GVI was higher.Figure 4  In Figure 4, each point represents a municipality of the studied region.Considering the date 11/23/2004, most of the municipalities are in the AA and BB quadrants, regardless of vegetation index, which indicates positive spatial correlation.Considering the date 12/25/04, the municipalities previously located in the AA and BB quadrants moved to the other quadrants, evidencing a transition period.This behavior is due to the fact that, in regions where the indices are high, some municipalities start presenting lower indices, because of the passage of the vegetative peak.In opposition, municipalities with late planting dates presented high vegetation values, by virtue of being reaching the stage of bigger biomass.For a more detailed investigation, it is quantified the percentage of municipalities allocated in each quadrant (Table 2).Points in the AB and BA quadrants of the Moran scatter plot represent municipalities with negative spatial association, that is, they are understood as municipalities with a high (low) vegetation index surrounded by municipalities with a low (high) vegetation index.Observing the NDVI, it is possible to verify that the date that presents the greatest percentage of municipalities allocated in the AB and BA quadrants is 12/25/04, with 27.78% of the municipalities.The same occurred for the GVI, however, the percentage of municipalities allocated in the quadrants that represent an inverted correlation is 33.34%.Therefore, the GVI index is more effective in detecting the transition period.Considering the last date (02/11/2005), for the NDVI index, the percentage of municipalities found in the AA quadrant was higher than the percentage of municipalities found in the BB quadrant.Thus, there is a greater amount of municipalities with a high NDVI surrounded by municipalities with a high NDVI than municipalities with a low NDVI surrounded by municipalities with a low NDVI.This behavior is inverted when we consider the GVI index, since for the same date the percentage of municipalities allocated in AA is smaller than the percentage of municipalities allocated in BB.This behavior can be explained by the fact that the GVI index considers more TM bands, for example, band 5 (1.55 -1.75 μm), which is sensitive to the plant water content, and band 7 (2.08 -2.35 μm), which is sensitive to the morphology of the land.
Moran's I index is a spatial association measurement for all studied regions (36 municipalities).Therefore, it is possible that there occur different spatial association regimes.In this case, it is necessary to complement the analysis with the use of an index that may produce a specific value for each municipality, which will permit the identification of groups.Table 3 presents the LISA of the NDVI index for all monitored dates accompanied by their classification as the statistical significance.ANSELIN (1995), the interpretation of the I i local index is similar to the interpretation of the global index.Positive values of I i indicate a spatial aggregation of similar values, while negative values of this statistic indicate a spatial aggregation of dissimilar values.According to Table 3, in the date 12/25/04, only seven indices were classified as significant.This can be explained by the fact that soy is in different development phases in the region and that this is a transition period, considering that, in the regions in which plants initially have presented the leaves in development, the pods have started to present the color of a mature pod, and, in the regions with late plantations, the plants have started to present a full filling of the pods.Table 4 presents the I i indices referring to the GVI vegetation indices.The date 12/25/04 presents a great amount of non-significant indices.The municipalities of Anahy, Goioerê, Iguatu, Janiópolis and Tuneiras do Oeste showed significant spatial correlation indices.For a better understanding of the local index of spatial association I i , it is convenient to present it in a thematic map that indicates the municipalities presenting significant spatial association.According to DRUCK et al. (2004), these regions can be seen as non-stationarity concentrations.They are areas with spatial dynamics of their own and deserve a more detailed analysis.
Figure 5 represents the LISA significance map of the NDVI index of soy in 2004/2005 crop in the monitored region.The legend of this map is made by three classes.The first indicates the municipalities in which the I i index is not significant, the second indicates the municipalities in which the index is significant at the level 5% of probability and the last class indicates the municipalities in which the index is significant at the level 1% of probability.
Observing the Figure 5, it is possible to identify the behavior of the NDVI index during the analyzed period.On 11/23/2004, the presence of a group of municipalities which starts around Cascavel and extends itself to the northwestern region until it reaches the northeastern region.This group indicates the municipalities which present a significant spatial correlation.There is a decreasing trend until almost disappears on 12/25/04, the period in which the development of the crop is under transition.On the last monitored date (02/11/05), the existence of two groups of municipalities the NDVI vegetation index is spatially correlated is evident.Figure 6  In Figure 6, the low number of municipalities with significant indices on 12/25/04 and the presence of two groups of municipalities with significant indices on the other monitored dates are highlighted.In Figure 7, the LISA significance map of the NDVI index now classified according to the legend of the Moran scatter plot is highlighted.This map is known as LISA cluster map and permits to identify which type of correlation there is between the municipalities identified as significant.The date 11/23/04 in Figure 7 shows that the cluster of municipalities, which presents a significant spatial correlation, is made up of two groups with different characteristics.The first group, located in the region of Cascavel and Toledo, indicates the municipalities in which the NDVI is high and surrounded by municipalities the values are also high.The other group, in the Araruna and Tuneiras do Oeste regions, indicates the municipalities in which the NDVI index is high and surrounded by low-index municipalities.This behavior is inverted on the last monitored date, which makes it clear that soy is not cultivated in the same period in the western region of the State of Paraná.While the crop development in the Cascavel region is in the maturation stage, the soy in the northern region is still in the pod filling stage, reaching its vegetative peak.
Figure 8 presents the LISA cluster map elaborated on the basis of the GVI index.The behavior of the GVI vegetation index is similar to those from the NDVI index, since, observing Figure 8, we can notice two regions with soy plantation occurring on different dates.In both maps (Figures 7 and  8), there is little appearance of municipalities classified in the categories referring to the inverted correlations.However, when this occurs, it is interesting to investigate in more detail.For example, considering the date 01/26/05 on Figure 7, we can observe that, in spite of the northern region presents a cluster of municipalities with low NDVI values, the municipality of Peabiru presents high value, indicating that, in this municipality, soy is in a more advanced stage.A similar analysis is made considering Figure 8 on the date 01/26/05, in spite of the region of Cascavel presents high GVI values, one verifies that the city of Toledo presents a low value, indicating that, in this municipality, soy is in a less developed stage that in the others.
As noted previously, we detected a relationship between vegetation indices and the different stages of the soybean phenological cycle.These results corroborate the results found by ESQUERDO et al. (2011)

CONCLUSION
From the analyses, we found that spatial analysis of vegetation indices provide knowledge for farmers needing to monitor crop growth and identify regions with similar characteristics, which, on its turn, information can be used for the participants of the productive chain of soy, permitting that producers, cooperatives, consumers and other are more informed, so they can better prepare their plans for the strengthening of the culture.
Nonetheless, more study is needed to determine the best strategy for correlating yield maps with vegetation indices, for example, the use of new vegetation indices and new methodologies for image calibration to improve the reliability of interpretations for technology applications in precision farming.

Figure 1
Figure 1 presents the 36 municipalities considered in this study.They are located in the region of the biggest production of soy within the scene of the Landsat-5/TM satellite, path/row 223/77.A Spatial database was built with these municipalities based on the digital map of the municipal network (IBGE, 2009) and the set of NDVI and GVI indices of the 2004/2005 crop season obtained by MERCANTE (2007) on 11/23/2004, 12/09/2004, 12/25/2004, 01/26/2005, and 02/11/2005.The values of these indices were extracted using a soybean mask elaborated by rating images.Further details of this procedure are shown in LAMPARELLI et al. (2008).

FIGURE 1 .
FIGURE 1. Location of the municipalities studied in the western region of the State of Paraná.

Figure 2
Figure 2 represents the temporal evolution of the NDVI index.The intervals were defined by grouping the NDVI values of all dates to obtain the maximum (0.8750), the minimum (0.2609) and the general amplitude (0.61403) of the indices.Five equal interval classes were considered.The temporal evolution of the NDVI index is related to the different phases of the phenological cycle of the crop.Lowest values of NDVI were observed on 11/23/04, where the crop is in the beginning of development.Highest values were found between 01/26/05 and 02/11/05, the period of greater development of the plants.

FIGURE 2 .
FIGURE 2. Temporal evolution of the NDVI index by using equal amplitude.
FIGURE 5. LISA significance map of the NDVI index of soy in the 2004/2005 crop.Only the values with level of significance less than 5% are shown.

FIGURE 6 .
FIGURE 6. LISA significance map of the GVI index of soy in the 2004/2005 crop.Only the values with level of significance less than 5% are shown.

FIGURE 7 .
FIGURE 7. LISA cluster map of the NDVI index of soy in the 2004/2005 crop.

FIGURE 8 .
FIGURE 8. LISA cluster map of the GVI index of soy in the 2004/2005 crop.
to monitor the soybean production in municipal districts of the western Paraná in the 2002/2003 and 2003/2004 harvests.They concluded that monitoring by NDVI index could explain most of the variability of soybeans at the municipal level.WALL et al. (2008) also worked with the index NDVI to model the wheat productivity in 40 regions in the Canadian prairies, and this research highlighted the explanatory power of this index.

TABLE 1 .
Moran's I index of the NDVI and GVI indices.
*Hypothesis of lack of spatial correlation is rejected at the level of 5% of probabilities.

TABLE 2
. Percentages of municipalities in each quadrant of the Moran scatter plot of the NDVI and GVI indices.

TABLE 3 .
Local index of spatial association of the NDVI.

TABLE 4 .
Local index of spatial association of the GVI on the monitored dates.