CROP AREA ESTIMATE FROM ORIGINAL AND SIMULATED SPATIAL RESOLUTION DATA AND LANDSCAPE METRICS

Images acquired on the same day by the ETM+/Landsat-7 (30 m spatial resolution) and MODIS/Terra (250 m) sensors were used to estimate the areas of three major crops (soybean, sugarcane, and corn) with different landscape patterns in southeastern Brazil. Majority filtering of the ETM+ classification results was applied to describe the behavior of 15 selected landscape metrics at distinct simulated spatial resolutions (90, 150, 210 and 270 m). Regression models were used to analyze how adequately MODIS area estimates and derived metrics predict crop area, taking ETM+ data as the reference. Results showed that the MODIS instrument overestimated the areas of soybean (15%) and sugarcane (1%), and underestimated the area of corn (12%). Multiple regression results indicated that coarse spatial resolution sensors can adequately predict the area viewed by 30 m spatial resolution instruments only for crops with a low fragmentation pattern, such as soybean. These sensors cannot be used to predict the area of corn because of pixel aggregation effects of the less fragmented crops (soybean and sugarcane) over the most fragmented one (corn), as demonstrated by the spatial resolution simulation using majority filtering of the ETM+ image. Landscape metrics improved MODIS area estimates only for sugarcane, as indicated by higher R² values for multiple than for simple regression. Only a small set of metrics was selected to compose the multiple regression models because most of them were not preserved across different spatial resolutions (30 m and 250 m).


Sci. Agric. (Piracicaba, Braz.), v.65, n.5, p.459-467, September/October 2008

INTRODUCTION
Remote sensing data can be used to obtain up-to-date information on different crops (Epiphanio et al., 2002; Luiz et al., 2002; Simões et al., 2005; Xavier et al., 2006). Although sensors with 30 m spatial resolution (e.g., Enhanced Thematic Mapper (ETM+)/Landsat-7) are adequate to estimate cropped areas, they usually have a long scene revisit time (e.g., 16 days) that limits the acquisition of cloud-free images in tropical regions on critical dates of crop development. On the other hand, coarse spatial resolution sensors such as MODIS (Moderate Resolution Imaging Spectroradiometer)/Terra, with 250 m resolution (bands 1 and 2), usually have near-daily global coverage. However, depending on the spatial pattern of the land cover under analysis, the coarse spatial resolution introduces differences in area estimates in comparison with the 30 m resolution (Nelson & Holben, 1986; Moody, 1998). Errors of area estimation with coarse resolution sensors result from aggregation effects whose magnitude and scale-dependence are related to the proportions of the classes and their landscape pattern (Moody & Woodcock, 1994). The aggregation effects lead to changes in the size and shape of land cover patches and to the disappearance of small objects at critical thresholds of resolution (Mayaux & Lambin, 1995, 1997).
To quantify landscape composition from remote sensing data, a great number of metrics have been proposed. The effects of spatial resolution on the performance of these metrics have been studied by Turner et al. (1989), Saura (2002, 2004), Shen et al. (2004), Wu (2004), Wu et al. (2000, 2002) and Frohn & Hao (2006). According to these authors, many landscape metrics are highly correlated and others may not be suitable for direct comparison across different spatial resolutions.
Most of the studies that use coarse spatial resolution data to estimate land cover areas have analyzed deforested regions (Malingreau & Belward, 1992; Mayaux & Lambin, 1995, 1997; Moody & Woodcock, 1995; Moody, 1998; Ponzoni et al., 2002; Millington et al., 2003; Frohn & Hao, 2006). Only a few investigations have addressed the potential use of these data to estimate agricultural areas. Pax-Lenney & Woodcock (1997) degraded Thematic Mapper (TM)/Landsat-5 images to simulate different spatial resolutions and observed that agricultural lands in Egypt were slightly underestimated at spatial resolutions of 120 m and 240 m when compared to the original 30 m TM/Landsat-5 area estimates.
The objective of this study was to analyze the sensitivity of coarse spatial resolution data acquired by MODIS (250 m) for estimating the areas, viewed on the same day by the ETM+/Landsat-7 (30 m), of three major crops (soybean, sugarcane, and corn) with different landscape patterns. Majority filtering of the ETM+ classification results was applied to describe the behavior of 15 selected landscape metrics with varying simulated spatial resolutions (30 m, 90 m, 150 m, 210 m, and 270 m). The performance of the 250 m MODIS images and derived metrics in predicting the crop area viewed by the 30 m ETM+ data was analyzed using regression models.

Study Area and Image Classification
The study area of 46 × 66 km (20º27' S and 47º56' W) is located in the north of São Paulo state (southeastern Brazil) and includes the cities of Ipuã, Guará and São Joaquim da Barra (Figure 1). It was selected due to the occurrence of three important crops with distinct landscape patterns in farm size and agricultural activities: soybean (less fragmented), sugarcane (intermediate), and corn (more fragmented).
ETM+/Landsat-7 and MODIS/Terra (Product MOD09) data were acquired on January 5, 2002. The MODIS Reprojection Tool (MRT) was used to convert the sinusoidal projection into planar coordinates. Landsat data were acquired at Level 1G, with geometric correction, but Ground Control Points (GCPs) collected by GPS were also used for georeferencing the data.
Unsupervised classification was used in order to reduce the effects of subjectivity on data analysis. The k-means technique (Mather, 1999) was applied only to the red and near-infrared bands of both sensors to reduce the effects of spectral resolution differences on classification results. The number of classes was set to 15 and the maximum number of iterations to 20, as suggested by Frohn & Hao (2006). At the end, classification results were grouped into four classes: soybean, sugarcane, corn and "other" (not considered in the statistical analysis). The existence of previous works in the study area (Epiphanio et al., 2002; Luiz et al., 2002; Sanches, 2004) also facilitated the evaluation of classification results through their comparison with available ground truth information.

Landscape Metrics and Spatial Resolution
A great number of metrics have been proposed, but most of them can be reduced to a few general measures of landscape pattern and structure (Ritters et al., 1995). Based on Saura (2004), Wu et al. (2002) and Frohn & Hao (2006), the 15 most frequently used landscape metrics were selected for analysis. Table 1 lists them and indicates the type of measurement of each. A detailed description of these metrics can be found in McGarigal & Marks (1995) and Coppedge et al. (2001). [...] are related to class adjacency or intermixing effects. Finally, the shape metric indicates the relationship between area and perimeter of the class polygons, whereas the connectivity metric expresses the level of cohesion between them. All metrics were calculated using the Fragstats software (McGarigal & Marks, 1995) over the unsupervised k-means classification of the original and simulated ETM+ and MODIS images. For the calculation of core area metrics, an edge depth of two pixels was chosen.
A very useful method to analyze the behavior of landscape metrics with changes in spatial resolution is to degrade maps by using a majority rule filter. In this study, majority filtering, a common procedure used to scale landscape maps to coarser spatial resolutions (e.g., Saura, 2004; Frohn & Hao, 2006), was applied to the unsupervised classification results of the ETM+/Landsat-7 image using window sizes of 3 × 3, 5 × 5, 7 × 7 and 9 × 9 pixels. In practice, this procedure was equivalent to progressively degrading the thematic classification map from 30 m of spatial resolution to 90, 150, 210, and 270 m, respectively. After calculating the 15 metrics with the Fragstats software, results were plotted and analyzed as a function of the four simulated spatial resolutions.
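The majority rule can be illustrated with a minimal sketch. It assumes that the filter is applied over non-overlapping windows, so that each window becomes one coarser pixel; the study does not describe the implementation, and the function name, the toy map, the class codes and the tie-breaking rule below are all hypothetical.

```python
from collections import Counter

def majority_aggregate(class_map, window):
    """Degrade a classified map by assigning to each non-overlapping
    window x window block the most frequent class inside it.
    Partial blocks at the right and bottom edges are dropped."""
    n_rows = len(class_map) // window
    n_cols = len(class_map[0]) // window
    coarse = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            block = [
                class_map[r * window + i][c * window + j]
                for i in range(window)
                for j in range(window)
            ]
            counts = Counter(block)
            top = max(counts.values())
            # Tie-break (assumption): smallest label among the most frequent.
            row.append(min(label for label, k in counts.items() if k == top))
        coarse.append(row)
    return coarse

# Hypothetical 6 x 6 map of 30 m pixels: 1 = soybean, 2 = sugarcane, 3 = corn
fine = [
    [1, 1, 1, 2, 2, 2],
    [1, 3, 1, 2, 2, 2],
    [1, 1, 1, 2, 3, 2],
    [2, 2, 2, 1, 1, 1],
    [2, 3, 3, 1, 1, 1],
    [2, 2, 2, 1, 1, 3],
]
coarse = majority_aggregate(fine, 3)  # 3 x 3 window: 30 m -> 90 m
```

In this toy map the scattered corn pixels (class 3) never form the majority of a window, so the class is absent from the 90 m map, which is the kind of aggregation effect the simulation is meant to expose.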

Regression Analysis of Original ETM+ and MODIS Data
Regression analysis provides a means for assessing the relationships between landscape pattern and errors in the estimates of land cover areas as land cover data are aggregated to coarser scales (Moody & Woodcock, 1995). In this study, multiple regression analysis was used to predict the crop area viewed by ETM+ (30 m) from MODIS (250 m) data and to improve area estimates from coarse resolution data. Area and the remaining 14 metrics (independent or explanatory variables) calculated from MODIS data were plotted against area values obtained from ETM+ (response or dependent variable). The following criteria were used to select the best subset of metrics to compose a multiple regression model: Mallows' Cp, R²p (R-squared) and R²a (adjusted R-squared) (Neter et al., 1996). Before the selection of variables, an exploratory analysis was carried out to test data normality (Shapiro & Wilk, 1965) and to detect outliers (DFFITS) and multicollinearity (Variance Inflation Factor, VIF) between the variables. Based on this exploratory analysis, the need for variable transformations (e.g., square root of values) was considered in the regression procedure. Analysis of Variance (ANOVA) was performed and the Levene test was used to analyze the variance of the residuals.
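For reference, the selection and multicollinearity criteria named above can be written in their standard textbook forms. The sketch below is only illustrative: the function names are hypothetical, p denotes the number of model parameters including the intercept, and nothing here reproduces the statistical software used in the study.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for a model with p parameters (intercept
    included) fitted to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p)

def mallows_cp(sse_p, mse_full, n, p):
    """Mallows' Cp for a subset model with p parameters. sse_p is the
    error sum of squares of the subset model; mse_full is the mean
    squared error of the model with all candidate predictors."""
    return sse_p / mse_full - (n - 2 * p)

def vif(r2_j):
    """Variance Inflation Factor of predictor j, where r2_j is the
    R-squared from regressing predictor j on the other predictors."""
    return 1.0 / (1.0 - r2_j)
```

A subset model with little bias has Cp close to p, and the adjusted R-squared only increases when an added metric reduces the error mean square, which is why the two criteria tend to favor small subsets of metrics.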
For the calculation of the metrics, including area, the MODIS and ETM+ images were divided into a regular grid (15 × 15 pixels for MODIS; 125 × 125 pixels for ETM+) of 648 cells (216 cells per crop), following the procedure described by Mayaux & Lambin (1995). From the total of 648 cells, 528 cells were randomly selected to obtain the general (all crops together) and specific (176 cells per crop) regression models. The remaining 120 cells (40 per crop) were used to validate the models. At the end, a t-test was applied to evaluate the statistical significance of the models (0.95 confidence level).
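The grid geometry and the calibration/validation split can be checked with a short sketch. The cell identifiers, the seeds and the helper function are hypothetical; only the pixel sizes and cell counts come from the text.

```python
import random

MODIS_PIXEL_M, ETM_PIXEL_M = 250, 30
# Both grids describe the same 3.75 km x 3.75 km cell on the ground.
assert 15 * MODIS_PIXEL_M == 125 * ETM_PIXEL_M == 3750

def split_cells(cell_ids, n_validation, seed):
    """Randomly split cell identifiers into calibration and validation sets."""
    shuffled = list(cell_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_validation:], shuffled[:n_validation]

crops = ("soybean", "sugarcane", "corn")
# 216 cells per crop: 176 for model fitting and 40 for validation.
splits = {
    crop: split_cells(range(216), 40, seed=i) for i, crop in enumerate(crops)
}
n_calibration = sum(len(cal) for cal, _ in splits.values())
n_validation = sum(len(val) for _, val in splits.values())
```

Splitting within each crop, rather than over the pooled 648 cells, keeps the 176/40 proportion identical for the three crop-specific models.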

Unsupervised Classification and Landscape Metrics
Unsupervised k-means classification results of the ETM+ image for the three crops under analysis are shown in Figure 1. The class "Other" is dominated by pasture, bare soils and natural vegetation cover. Soybean, sugarcane and corn represented 87% of the cultivated lands of the study area on the date of image acquisition. In comparison with sugarcane and soybean, corn occurs in very small fields (Figure 1). Average measures of landscape composition and dominance (TCA and LPI), aggregation (PLADJ), homogeneity (MESH) and connectedness (COHE) decreased from soybean/sugarcane to corn, since corn was the most fragmented class (Table 2). The largest SPLIT values were observed for corn. The least fragmented crop was soybean, whereas corn was the most fragmented one (Table 2). Sugarcane is the dominant crop in the area, presenting larger average area values in the cells than the other two crops.

Landscape Metrics and Spatial Resolution
To facilitate the graphic representation of pixel aggregation effects with degraded spatial resolution, Figure 2 presents majority filtering results of the ETM+ unsupervised classification map only for the small inset area (red rectangle) indicated in Figure 1. From 30 m to 270 m, some large crop polygons can be seen to absorb neighboring ones and become less fragmented. In general, small polygons of corn became gradually smaller and disappeared completely at coarser spatial resolutions. Such results are consistent with previous investigations (e.g., Mayaux & Lambin, 1997; Moody, 1998; Frohn & Hao, 2006), which have shown that the less fragmented classes (soybean and sugarcane in this study) tend to aggregate the more fragmented ones (corn).
The behavior of five metrics representative of different categories (Table 1) with degraded spatial resolution of the ETM+ unsupervised classification map is illustrated in Figure 3. From 30 m to 270 m, aggregation effects were stronger for sugarcane and soybean, which encompassed a large part of the area of corn at the spatial resolution of 270 m. TCA values, the sum of all core area polygons of a given class, were abruptly reduced at 90 m for all crops, especially for the most fragmented one (corn). Adjacency effects between pixels of the same crop, expressed by PLADJ values, were stronger for soybean than for corn at all spatial resolutions. PAFRAC results indicated a higher complexity of polygon shape for sugarcane than for soybean, which presented an overall decline in the degree of complexity with degraded spatial resolution. Finally, the measure of class connectivity (COHE) presented its lowest value for corn at 30 m of spatial resolution and increased towards the 270 m resolution due to the incorporation of small corn polygons into areas of the other crops.

Regression Analysis of ETM+ and MODIS Data
The comparison between crop area estimates from ETM+ and MODIS data is shown in Figure 4. MODIS overestimated the areas of soybean (15%) and sugarcane (1%) viewed by ETM+ and underestimated the area of corn (12%) due to pixel aggregation effects of soybean and sugarcane over corn.
The relationships between area estimates from both sensors are illustrated in Figure 5a for all crops and in Figures 5b, 5c and 5d for corn, sugarcane and soybean, respectively. Results refer to the 528 cells selected to obtain the regression models. Since the Shapiro-Wilk test indicated non-normality of the variable area, results were expressed as the square root of area, which presented a normal distribution after this transformation. The overall relationship of Figure 5a is in fact a superposition of distinct relationships for each crop (Figures 5b, 5c and 5d). Thus, each crop is affected differently by scaling up, as indicated by the three clusters of symbols in Figure 5a. The coefficients of determination (R²) decreased from the least fragmented (soybean; Figure 5d) to the most fragmented crop (corn; Figure 5b), which anticipated the difficulties of MODIS (250 m) in adequately estimating corn areas in relation to the ETM+ (30 m) performance.
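The simple regression between square-root-transformed areas can be sketched as follows. The area values are hypothetical and are not data from the study, the function name is an assumption, and the normality test itself is not reproduced here.

```python
import math

def simple_regression(x, y):
    """Ordinary least squares fit of y = a + b * x.
    Returns the intercept a, the slope b and the coefficient of
    determination R-squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return intercept, slope, r2

# Hypothetical per-cell crop areas (ha); NOT values from the study
area_modis = [100.0, 225.0, 400.0, 625.0, 900.0]
area_etm = [81.0, 256.0, 361.0, 676.0, 841.0]

# Square-root transformation applied to both variables before fitting
sqrt_modis = [math.sqrt(a) for a in area_modis]
sqrt_etm = [math.sqrt(a) for a in area_etm]
intercept, slope, r2 = simple_regression(sqrt_modis, sqrt_etm)
```

Fitting the model separately for each crop, as in Figures 5b to 5d, avoids mixing the three distinct relationships that are superposed in the all-crops plot.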
Statistical parameters for the general (all crops) and specific (per crop) multiple regression models are presented in Table 3. Besides the square root of area, the following MODIS-derived metrics were selected based on the Mallows' Cp, R²p (R-squared) and R²a (adjusted R-squared) criteria: PLADJ (general model); PAFRAC and COHE (soybean); CLUMP, IJI and NLSI (sugarcane); IJI and NLSI (corn). Thus, from the 15 metrics considered in the analysis, besides area, only six were selected to compose the models. An inspection of the correlation coefficients between MODIS- and ETM+-derived metrics showed values lower than 0.5 for the metrics that did not enter the models. Thus, in general, they were not preserved across different spatial resolutions. The models of Table 3 were statistically significant, and neither influential outliers nor inconstancy in variance were observed. The validation process with 120 cells also indicated the reliability of the models, with most of the points close to the 1:1 line.
Figure 6 allows a better comparison of the role played by the landscape metrics in improving area-based regression models. For the least fragmented crop (soybean), the R² values improved slightly from simple (Area_ETM+ and Area_MODIS) to multiple (Area_ETM+ and Area_MODIS plus Metrics) regression models. Among the three crops, soybean presented the largest R² values (0.865 and 0.878). This result indicated that MODIS (250 m of spatial resolution) can be used to adequately estimate soybean areas when compared with ETM+ (30 m) without the need for metrics other than area. The best results from the use of metrics (CLUMP, IJI and NLSI) were observed for sugarcane, which presented an improvement of R² values from 0.593 (simple regression model) to 0.671 (multiple regression model). For the most fragmented crop (corn), R² values were still low for both the simple and multiple regression models. This indicated the low performance of MODIS and its landscape metrics in improving the area estimation of corn compared with ETM+.

FINAL REMARKS
In relation to the ETM+ area estimates, the MODIS instrument overestimated the areas of soybean and sugarcane, and underestimated that of corn. Coarse spatial resolution sensors (e.g., MODIS/Terra, 250 m) can be used to predict agricultural areas with similar precision to the 30 m spatial resolution instruments (e.g., ETM+/Landsat-7) only for crops with a low fragmentation pattern, such as soybean. This crop pre- [...] The use of landscape metrics improved MODIS area estimates using multiple regression only for sugarcane (intermediate fragmentation pattern), as expressed by an increase in R² values from 0.593 (Area_ETM+ versus Area_MODIS in simple regression) to 0.671 (Area_ETM+ versus Area_MODIS plus Metrics in multiple regression). The performance of the metrics was also poor for corn, the most fragmented crop, whose area cannot be adequately estimated by coarse spatial resolution sensors. Less fragmented crops (soybean and sugarcane) aggregated the most fragmented one (corn), as also indicated by the results of the spatial resolution simulation with majority filtering of ETM+ images.
From the 15 metrics under analysis, besides area, only six were statistically selected to compose the regression models. Some metrics were highly correlated and others were not preserved across different spatial resolutions, as indicated by low correlation values between MODIS- and ETM+-derived metrics, the most important factor of variable exclusion in the regression models.

Figure 1 - Unsupervised k-means classification of the ETM+/Landsat-7 image for the three major crops under analysis. The red inset area was enlarged in Figure 2 to show in more detail the aggregation effects due to degraded spatial resolution.

Figure 3 - Variations in metrics of distinct categories with different spatial resolutions simulated using majority filtering of the unsupervised classification map of ETM+/Landsat-7 data.

Figure 5 - Relationships between the square root of the area calculated from ETM+/Landsat-7 and MODIS/Terra images for (a) all crops; (b) corn; (c) sugarcane; and (d) soybean. The relationships are significant for p < 0.001.

Figure 6 - Variations of the coefficient of determination (R²) for simple (Area_ETM+ versus Area_MODIS) and multiple (Area_ETM+ versus Area_MODIS plus Metrics) regression models for the three crops under analysis.

Table 1 - List of the 15 landscape metrics evaluated in this study.

Table 2 - Average and standard deviation of the 15 metrics calculated from MODIS/Terra data, 216 cells per crop.

Table 3 - Regression parameters for the general (all crops) and specific (per crop) multiple regression models.