DETECTION OF SOYBEAN PLANTED AREAS THROUGH ORBITAL IMAGES BASED ON CULTURE SPECTRAL DYNAMICS

Soybean is important to the Brazilian economy, so estimating planted area and production earlier and more reliably is essential. Remote sensing techniques can help obtain this information at lower cost and with less subjectivity than traditional surveys. The aim of this study was to estimate the area planted with soybean in the 2008/2009 crop season in cities in the west of the state of Paraná, Brazil, based on the spectral dynamics of the crop and on a specific system of analysis for Landsat 5/TM satellite images. The results were satisfactory: supervised classification by Maximum Likelihood (MaxVer), combined with the techniques of the specific system of analysis for satellite images, produced an estimate of the soybean planted area (soybean mask) with an average Global Accuracy of 79.05% and a Kappa Index above 63.50% in all cities. Monitoring a reference area was of great importance for determining the vegetative phase in which the crop differs most from the other targets, which facilitated the choice of training samples (ROIs) and helped avoid misclassifications.


INTRODUCTION
Soybean is a fundamental part of the development of several regions of Brazil, and the crop plays a significant role in the supply of and demand for products of the agroindustrial complex. Brazil is the world's second largest soybean producer, with a volume of 69 million tons, which corresponds to 26.5% of world production; the country exports 28.45 million tons of soybeans, ranking second in the world in export volume (USDA, 2010). According to IBGE (2011), the state of Paraná is the second largest soybean producer in the country, with 14 million tons in the 2009/2010 crop season, and the West and Midwest regions of the state account for 35% of this production.
Among the major oilseed producers, Brazil is one of those best placed to expand production. Besides the large areas that can still be used for the crop, growth is also driven by the increasing competitiveness of Brazilian soybeans, by scientific advances and by the availability of technologies in the industry. In Brazil, the main producing areas are in the South and Midwest of the country (LAZZAROTTO & HIRAKURI, 2010).
Because of the importance of the crop to the country's economy, frequent and accurate information about planted area and productivity is indispensable for the supply and trade of the grain, and for producers to decide more safely about the planting season and the commercialization of their production. Planted area is thus one of the main pieces of information involved in crop forecasting.
Monitoring and forecasting of soybean and other crops have traditionally been supported by empirical surveys conducted by entities involved in agricultural production. The data collected for cities and states are subsequently aggregated for the entire country. Despite the importance of these data for the economy, the subjectivity of the interview-based evaluations often introduces a degree of uncertainty into the information generated (EPIPHANIO et al., 2002).
In recent years, crop forecasting in Brazil has been changing in order to become less subjective. To this end, statistical instruments for sampling and crop monitoring have been incorporated. Remote sensing is one of the instruments that have been used for crop monitoring. Several papers report the use of satellite images to provide a synoptic view of planted areas, generating information on planting sites and estimates of planted area and production (ESQUERDO, 2007; YI et al., 2007; LAMPARELLI et al., 2008; ARAUJO et al., 2011). The final aim is to obtain estimates of planted area and production earlier and more reliably, preferably at lower cost and with less subjectivity than traditional surveys.
However, there are difficulties associated with the use of satellite images, especially in estimating the area planted with agricultural crops. Besides the occasional presence of clouds and the spatial resolution of the data, which in some cases is on the order of kilometers, there are also variations in the agricultural calendar: even in areas considered climatologically homogeneous, the planting dates of a crop can be months apart (KASTENS et al., 2005). Nevertheless, a detailed exploration of the techniques of a specific system of analysis for satellite images, available in geographic information systems (GIS), combined with knowledge of the spectral behavior of soybean, which can characterize the different phenological phases of the crop (MERCANTE et al., 2009), may help to minimize some of these difficulties.
In this context, the aim of this study was to estimate the area planted with soybean during the 2008/2009 crop season in cities in the west of Paraná state, based on the spectral dynamics of the crop and on the specific system of analysis for Landsat 5/TM satellite images.

MATERIAL AND METHODS
The study was conducted in nine cities located between latitudes 24º18'S and 25º22'S and longitudes 52º55'W and 54º02'W. Concurrently, a commercial agricultural area of 300 ha located in Cascavel, State of Paraná, Brazil, was monitored (Figure 1). The cartographic information on the city boundaries of the study area was obtained from the Brazilian Institute of Geography and Statistics (IBGE), with data referring to the year 2001 in DXF format.
The choice of these cities was based on two factors. First, they lie within scene (orbit/point) 223/77 of the Landsat 5/TM satellite used for the research. Second, these cities belong to the same grain-producing region, near the town of Cascavel-PR, where the 300 ha agricultural area used as the basis of the study is located. The monitored cities therefore tend to follow the same pattern of summer cultivation with regard to planting and harvesting dates, which is essential for using the monitored area as a methodological reference for the spectral dynamics of the crop during its vegetative cycle. In this region, soybean planting for the summer season occurs mostly from mid-October until the first half of December. However, for the best development conditions, planting is recommended between the first half of November and the first half of December, according to the agricultural zoning of AGRITEMPO (2011).
According to the Köppen classification, the climate is subtropical, type Cfa, with average temperature in the coldest month below 18 ºC (mesothermal) and average temperature in the warmest month above 22 ºC, with hot summers, infrequent frosts and a tendency for rainfall to concentrate in the summer months, but without a defined dry season (IAPAR, 2011). The predominant soil types in the region are Red Latosol and Red Nitosol (IBGE, 2011).
Landsat 5/TM images were used, with the following characteristics: 16-day temporal resolution; radiometric resolution of 8 bits (256 gray levels); three bands covering the visible region of the spectrum (bands 1, 2 and 3), one band covering the near infrared, NIR (band 4), and two bands covering the mid infrared, MIR (bands 5 and 7), all with 30 m spatial resolution; and a thermal band (band 6) with 60 m spatial resolution. The satellite orbit is descending, polar and sun-synchronous. The images were acquired from the Brazilian National Institute for Space Research (INPE) in GeoTIFF format. A false-color RGB-453 composition was used throughout the study, in which bands 3, 4 and 5 correspond to the spectral ranges 0.63 to 0.69 µm (red), 0.76 to 0.90 µm (near infrared) and 1.55 to 1.75 µm (mid infrared), respectively.
The images were obtained during the 2008/2009 crop year. Images with a high percentage of cloud cover were discarded, because clouds prevent the study of the target. The images finally used in the study were those of 12/04/2008, 12/20/2008 and 02/22/2009.
The images were processed in ENVI 4.7: they were converted from GeoTIFF to the ENVI Standard format and saved in UTM cartographic projection, Zone 22 South, Datum WGS-84.
The first image-processing step was the radiometric transformation, in which the gray-level values of the image are converted to physical values of apparent reflectance at the top of the atmosphere, according to the procedures described by CHANDER et al. (2009). Next, geometric correction (georeferencing) was performed so that each pixel corresponds to the same geographical location in all images of the analyzed period, and so that the images match the IBGE (2001) municipal grid and the cartographic projection of the cities. The method consists of identifying control points on a precisely georeferenced image (from the GeoCover Technical Guide) and on the Landsat images used in the study. A projection system and geographical coordinates are thus associated with the image through a mathematical model.
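The radiometric transformation can be illustrated with a minimal sketch, assuming the usual two-step conversion: digital number to at-sensor spectral radiance, then radiance to top-of-atmosphere reflectance. The function names are hypothetical, and the calibration values in the usage lines are placeholders for illustration only; the actual per-band constants should be taken from CHANDER et al. (2009) and from the scene metadata.

```python
import math


def dn_to_radiance(dn, lmin, lmax, qcal_min=1, qcal_max=255):
    """Convert a digital number to at-sensor spectral radiance.

    lmin and lmax are the band radiances scaled to qcal_min and qcal_max.
    The default qcal_min of 1 is an assumption about the product type.
    """
    gain = (lmax - lmin) / (qcal_max - qcal_min)
    return gain * (dn - qcal_min) + lmin


def radiance_to_toa_reflectance(radiance, esun, earth_sun_distance_au,
                                sun_elevation_deg):
    """Convert at-sensor radiance to unitless top-of-atmosphere reflectance."""
    solar_zenith = math.radians(90.0 - sun_elevation_deg)
    return (math.pi * radiance * earth_sun_distance_au ** 2) / (
        esun * math.cos(solar_zenith)
    )


# Placeholder values, NOT the published calibration constants.
LMIN, LMAX, ESUN = -1.5, 220.0, 1000.0
radiance = dn_to_radiance(120, LMIN, LMAX)
reflectance = radiance_to_toa_reflectance(radiance, ESUN, 0.985, 60.0)
```

In practice the same two functions would be applied to every pixel of bands 3, 4 and 5 before any samples are collected.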
From the color compositions of the near infrared, mid infrared and red bands (RGB-453 false color), samples were collected from the regions of interest. Supervised digital classifications were then run, and their superposition generated a thematic map of the areas occupied by soybean, here called the Soybean Mask.
For the classifications, soybean samples were selected based on the monitored agricultural area of 300 ha, which lies within the Landsat scene studied. The regions of interest were defined according to the development of the soybean crop during the 2008/2009 season in the monitored agricultural area, which is situated near the city of Cascavel-PR.
Monitoring this agricultural area made it possible to characterize most areas of western Paraná, because soybean and maize are cultivated there at the same time in different large planting fields (bounded plots that are rotated from one crop to another). Since the area contains large planting fields of soybean and maize (Figure 2), the entire spectral evolution of these crops throughout their development cycles can be followed in the images (Figure 3). This assisted in defining the samples of soybean pixels and, consequently, in mapping the soybean areas over the entire extent of the scene in all images.
Figure 2 shows the spatial arrangement of the large planting fields of soybean and maize in the monitored agricultural area, as well as the development of soybean sown on different dates in each large planting field. Large planting field S1 represents the area of soybean sown between the second half of October and the first half of November, and large planting field S2 the soybean sown in the first half of December. The other large planting fields are maize (M1), soil (large planting fields with crop residues for direct seeding), other crops (pasture) and rural constructions. This field monitoring served as "terrestrial reference" for the three images classified during the season, determining the spectral dynamics of soybean during its development. In the image of 12/04/2008, the soybean in large planting field S1 of Figure 2, shown in orange, is about two months after sowing. In the image of 12/20/2008, large planting field S1 is still shown in orange, about two and a half months after sowing, while large planting field S2 (yellowish color) shows the start of the soybean cycle, less than one month after planting. Finally, in the image of 02/22/2009, large planting field S2, which is about two and a half to three months after sowing, is orange, indicating that it is at its vegetative peak. Comparing soybean and maize, large planting field M1 (maize in Figure 2) appears red in both the 12/04/2008 and 12/20/2008 images and whitish-blue (soil + straw) in the image of 02/22/2009, meaning that it had already been harvested.
Based on the monitored area and its large planting fields of soybean, samples of soybean pixels were collected over the entire study area for the subsequent supervised classifications. The classification algorithm used was Maximum Likelihood (MaxVer). Spectral bands 3, 4 and 5 were used because they show the lowest correlation among their spectral information, which increases the chance of identifying objects. The MaxVer result improves as the number of pixels in the training sample (in this case, samples of soybean) used to build the covariance matrix increases (SULSOFT, 2008). The method allows a probability threshold, in percent, to be set for a pixel to belong to a specific class (the Soybean class). In this paper it was set to 60%, which is considered relatively low; however, if the samples are truly representative, it avoids the inclusion of non-soybean pixels in the Soybean class, since the attribute space of soybean defined by the training samples tends to become smaller (CRÓSTA, 1992).
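The idea behind the classifier can be sketched as follows, under several simplifying assumptions that are not taken from the paper: only two bands (NIR and MIR) are used so that the covariance algebra stays closed-form, the training reflectances are hypothetical, and the probability threshold is interpreted as the chi-square tail probability of the squared Mahalanobis distance (for two bands, exp(-d²/2)). This is an illustration of a Gaussian class model with a threshold, not the ENVI implementation.

```python
import math


def fit_class(samples):
    """Estimate the mean and 2x2 covariance of (nir, mir) training pixels."""
    n = len(samples)
    mean_x = sum(s[0] for s in samples) / n
    mean_y = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mean_x) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - mean_y) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mean_x) * (s[1] - mean_y) for s in samples) / (n - 1)
    return (mean_x, mean_y), (sxx, sxy, syy)


def mahalanobis_sq(pixel, mean, cov):
    """Squared Mahalanobis distance using the explicit 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = pixel[0] - mean[0], pixel[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def is_soybean(pixel, mean, cov, threshold=0.60):
    """Accept the pixel if its tail probability reaches the threshold.

    Assumption: for two bands the chi-square (2 dof) tail probability
    of the squared distance d2 is exp(-d2 / 2).
    """
    tail_probability = math.exp(-0.5 * mahalanobis_sq(pixel, mean, cov))
    return tail_probability >= threshold


# Hypothetical training reflectances (NIR, MIR) for the Soybean class.
training = [(0.40, 0.20), (0.42, 0.22), (0.38, 0.19), (0.41, 0.20), (0.39, 0.21)]
soy_mean, soy_cov = fit_class(training)
```

Under this reading, raising the threshold shrinks the accepted region around the class mean, which matches the remark that the attribute space of soybean tends to become smaller.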
Soybean samples were created for each of the three images, and the classifier was run on each of them. The results were three images classified into two classes, Soybean and Non-Soybean, one for each image date, so that soybean was always classified at its vegetative peak, reducing the classification error. The three classified images were then superimposed using the Boolean OR logical operator in the ENVI 4.8 software. Thus, all pixels classified as Soybean in any of the three images, which together cover the development cycle of the crop in the studied area, formed the "Soybean Mask." To reduce the uncertainties of the classifier in the classification result (soybean mask), we applied the specific system of analysis for images, which consists of filtering and reclassification techniques, here applied to the soybean mask image. After the soybean mask was generated, the area planted with soybean in the 2008/2009 season was calculated for the cities of the region under study.
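The superposition and the area calculation can be sketched in a few lines, assuming each classified image is a grid of 0 (Non-Soybean) and 1 (Soybean) values of the same size. The function names and the tiny example grids are hypothetical; the 30 m pixel size follows the Landsat 5/TM resolution given above.

```python
def combine_masks(*masks):
    """Pixel-wise Boolean OR of equally sized binary masks."""
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [int(any(mask[r][c] for mask in masks)) for c in range(cols)]
        for r in range(rows)
    ]


def soybean_area_ha(mask, pixel_size_m=30.0):
    """Planted area in hectares: soybean pixel count times pixel area."""
    pixel_area_ha = pixel_size_m ** 2 / 10_000.0  # 30 m x 30 m = 0.09 ha
    return sum(sum(row) for row in mask) * pixel_area_ha


# Hypothetical 2x3 classified images, one per image date.
date_1 = [[1, 0, 0], [0, 0, 0]]
date_2 = [[0, 1, 0], [0, 0, 0]]
date_3 = [[0, 0, 0], [0, 0, 1]]
soybean_mask = combine_masks(date_1, date_2, date_3)
area = soybean_area_ha(soybean_mask)
```

A pixel enters the mask as soon as one date classifies it as soybean, which is why fields sown at different times can all be captured.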
The accuracy of the information generated by the digital classification of the images was assessed using the Global Accuracy and Kappa Coefficient metrics. The Kappa Coefficient indicates the quality of the classification, ranging from 0 to 1; the closer it is to 1, the closer the classification is to ideal (CONGALTON & GREEN, 2009).
To calculate the Kappa Index and Global Accuracy, 100 points were randomly distributed over each city, overlaying the mask. Whether each point belonged to the Soybean class was determined by visual inspection of the RGB-453 image. The accuracy metrics were then determined from these points.
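Both metrics follow from a confusion matrix of the sampled points. The sketch below assumes a two-class matrix whose rows are the reference classes (visual inspection) and whose columns are the classes in the mask; the counts are hypothetical and serve only to show the arithmetic.

```python
def accuracy_and_kappa(confusion):
    """Return (global accuracy, kappa) for a square confusion matrix.

    confusion[i][j] is the number of points with reference class i
    that the mask assigned to class j.
    """
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Agreement expected by chance, from row and column totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts for 100 points: [Soybean, Non-Soybean].
example = [[40, 10],
           [10, 40]]
global_accuracy, kappa = accuracy_and_kappa(example)
```

With these counts the agreement is 80 of 100 points, chance agreement is 0.5, and kappa is 0.6.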
To compare the soybean area estimate with the official data released by the Secretary of Agriculture and Supply of Paraná/Department of Rural Economy (SEAB/DERAL), the scale of PIMENTEL-GOMES (2000) was adopted, in which the Relative Error (RE) is considered low when it is less than 10%, average from 10% to 20%, high from 20% to 30% and very high when greater than 30%.
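The relative error and its rating can be written as a short sketch. Two points are assumptions rather than statements from the paper: the sign convention (mask estimate minus official estimate, so underestimation is negative) and the handling of the shared boundaries at exactly 20% and 30%, which are assigned here to the lower category. The function names are hypothetical.

```python
def relative_error(estimated_ha, official_ha):
    """Relative error in percent of the estimate against the official area."""
    return (estimated_ha - official_ha) / official_ha * 100.0


def classify_relative_error(re_percent):
    """Rate the magnitude of the RE on the four-level scale."""
    magnitude = abs(re_percent)
    if magnitude < 10.0:
        return "low"
    if magnitude <= 20.0:
        return "average"
    if magnitude <= 30.0:
        return "high"
    return "very high"


# Hypothetical areas: 90 ha mapped against 100 ha in the official data.
re_example = relative_error(90.0, 100.0)
rating = classify_relative_error(re_example)
```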

RESULTS AND DISCUSSION
The concept of attribute space is fundamental to understanding digital image classification. Figure 4 therefore presents graphs illustrating the attribute space of the RGB image composition. They plot the values of apparent reflectance at the top of the atmosphere in bands 3, 4 and 5 for the targets identified in Figure 3, the large planting fields of soybean and maize, on two of the three dates of the Landsat 5/TM images used in the paper (12/04/2008 and 12/20/2008), for a better understanding of the spectral dynamics and of the spectral differences between these two targets. Using the agricultural area as a basis for monitoring the spectral dynamics of soybean, the greatest difference between the soybean and maize targets in the RGB-453 color composite was found when the soybean areas appear orange, meaning the vegetative peak. Depending on the sowing date of the monitored large planting field, this occurred approximately between the first and third month of crop development.
Since soybean development can be divided into two phases, vegetative (V) and reproductive (R), we sought to characterize the crop by its developmental stages during the above-mentioned period.
One month after sowing, soybean is near the vegetative stage V5, characterized by five nodes on the main stem with fully developed leaves; the spectral response is then that of vegetation alone, because the crop completely covers the soil. During the third month after sowing, soybean is in the vegetative stage Vn, defined as having "n" nodes on the main stem with fully developed leaves. In this same period, the crop is already near the reproductive stage R6, characterized by seed filling, in which the pods at one of the last four nodes of the main stem contain green seeds (ALVARES FILHO, 1988). These aspects were the basis for collecting training samples of soybean in the images: for each of the three image dates during the monitored season, training samples were collected from soybean shown in orange (period of soybean development between 30 and 90 days). This was intended to ensure that the final soybean mask for the cities studied accounted for the variation in sowing dates in the region, which stretches from mid-October to mid-December.
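The 30 to 90 day window can be checked against the three image dates with a small sketch. The sowing dates below are hypothetical examples chosen within the regional planting period; only the image dates and the window come from the text.

```python
from datetime import date

# Acquisition dates of the three Landsat 5/TM images used in the study.
IMAGE_DATES = [date(2008, 12, 4), date(2008, 12, 20), date(2009, 2, 22)]


def images_at_peak(sowing_date, first_day=30, last_day=90):
    """Return the image dates that fall 30 to 90 days after sowing."""
    return [
        image_date
        for image_date in IMAGE_DATES
        if first_day <= (image_date - sowing_date).days <= last_day
    ]


# Hypothetical sowing dates (early, late and intermediate).
early = images_at_peak(date(2008, 10, 15))
late = images_at_peak(date(2008, 12, 10))
intermediate = images_at_peak(date(2008, 11, 22))
```

In this example the early field is captured by the two December images and the late field by the February image, whereas the intermediate sowing date falls outside the window on all three dates. This is consistent with the limitation of having only three cloud-free images.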
Figure 5 shows that the superposition of the three images classified during crop development was efficient, because it succeeded in classifying soybean areas planted on different dates.
The "soybean mask" for the 2008/2009 season was then created using the MaxVer supervised classification and the superposition of the three classified images. In the final superposition of the three classified images in Figure 5, there are isolated pixels, i.e., pixels classified as Non-Soybean surrounded by pixels classified as Soybean and vice versa. Visual analysis of the soybean mask showed that many of these pixels had been wrongly classified. To minimize this, the specific system of analysis for images (filtering and reclassification techniques) was adopted. Filtering is widely used to clean isolated pixels in an image (CRÓSTA, 1992). To eliminate the isolated pixels of the soybean mask, a 5x5 mean filter was used: for each window of 5x5 pixels, the average of the values was computed and assigned as the new value of the central pixel. After filtering, the image was reclassified: pixels with values above 0.2 were classified as Soybean (value 1) and pixels with values equal to or below 0.2 as Non-Soybean (value 0), as shown in the diagram of Figure 6.
FIGURE 6. Digital processing of used images, filter and reclassification.
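The filtering and reclassification steps can be sketched as below, assuming the mask is a grid of 0 and 1 values. The function names are hypothetical, and the border handling (windows truncated at the image edge) is an assumption, since the text does not describe it.

```python
def mean_filter(mask, size=5):
    """Replace each pixel by the mean of its size x size neighbourhood.

    Assumption: windows are truncated at the image border.
    """
    half = size // 2
    rows, cols = len(mask), len(mask[0])
    filtered = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                mask[i][j]
                for i in range(max(0, r - half), min(rows, r + half + 1))
                for j in range(max(0, c - half), min(cols, c + half + 1))
            ]
            filtered[r][c] = sum(window) / len(window)
    return filtered


def reclassify(filtered, threshold=0.2):
    """Values above the threshold become Soybean (1), the rest Non-Soybean (0)."""
    return [[1 if value > threshold else 0 for value in row] for row in filtered]


# A single isolated Soybean pixel in a 7x7 Non-Soybean neighbourhood.
isolated = [[0] * 7 for _ in range(7)]
isolated[3][3] = 1
cleaned = reclassify(mean_filter(isolated))
```

With a full 5x5 window, a pixel stays in the Soybean class only when at least 6 of the 25 pixels around it were classified as soybean (6/25 = 0.24), so a single isolated pixel is removed while a lone gap inside a soybean field is filled.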
Figure 7 shows how the specific system of analysis for images satisfactorily adjusted the classification in the monitored agricultural area.
FIGURE 7. Example of the specific system of analysis for images adopted in the soybean mask of the monitored agricultural area.
Figure 8 shows the final soybean mask for the 2008/2009 season in the nine cities studied, after applying the specific system of analysis for images. The performance of the classifier and of the methodology used to determine the area cultivated with soybean was evaluated with the Global Accuracy and Kappa Index metrics, as previously described. Table 1 shows the values of these metrics for each of the nine cities. The classifier showed limitations in recognizing the crop, even though the soybean pattern had been well defined in the training samples. Global Accuracy, which represents the share of the 100 sampling points in each city whose class in the mask (Soybean or Non-Soybean) matched the reference, had an average value of 79.05%. The Kappa Index values for all cities are above 63.50%, considered a very good rating according to LANDIS & KOCH (1977).
Table 2 shows the areas calculated by counting pixels in the supervised classification and the official data released by SEAB/DERAL (2011) for the 2008/2009 season, along with the RE between the soybean area estimated by the mask and the official area. The RE is calculated by taking the difference between the first estimate and the second, dividing it by the second estimate and multiplying by 100 to express it as a percentage. Since the analysis compares two paired area estimates, the RE is treated as equivalent to the coefficient of variation, following PIMENTEL-GOMES (2000). The soybean mask underestimated the area in eight of the nine cities relative to the official data and overestimated it only in Cascavel-PR, and even there by a small margin. However, the values obtained from the soybean mask behaved similarly to the official data, as shown by the correlation index of 0.99. As the p-value was less than 0.05 for all comparisons, the association between them was statistically significant at the 95% confidence level. This indicates that the two data sets are strongly correlated and follow the same trend, varying almost proportionally in the same direction.
The soybean area estimates for the cities of Ouro Verde do Oeste-PR and Cafelândia-PR were below the SEAB/DERAL estimates by 34.87% and 28.14%, respectively. For the other cities the RE was within the accepted range, the lowest being 2.54% for the city of Cascavel-PR.
Although the attribute space of soybean was well defined, making the training samples representative, some cities showed a significant difference between the estimated soybean area and the official area. One possible cause is that only three cloud-free images were available for classification in the 2008/2009 season, so areas that did not show the vegetative peak on 12/04/2008, 12/20/2008 or 02/22/2009 would not be classified as soybean, decreasing the precision of the mask. However, only two cities had RE values above 20%, the level from which the scale of PIMENTEL-GOMES (2000) considers the error high.
We also emphasize that the specific system of analysis for images was efficient in producing the soybean area estimates. Another factor to consider is that the SEAB estimates, despite having been produced for a long time, rely on a methodology that takes subjective information into account, because it uses data from agricultural surveys of the commercialization of seeds and inputs in the cities.


CONCLUSIONS
The results were satisfactory, since MaxVer supervised classification, together with the specific system of analysis for images, allowed the area planted with soybean to be estimated (soybean mask).
Monitoring a reference agricultural area with soybean was of great importance in determining the vegetative phase in which the crop differs most from other agricultural targets, facilitating the choice of samples and avoiding classification errors.
The accuracy metrics indicated very good levels, and the soybean areas estimated for the cities were relatively close to the official estimates.

FIGURE 1. Cities and monitored agricultural area.

FIGURE 2. Sketch of the monitored agricultural area, with soybean at different vegetative phases, maize, and soil with different covers.

Figure 3 illustrates the characteristic colors of soybean and maize on each of the image acquisition dates in the RGB-453 color composition. In a certain phase of development, soybean differs from the other targets (other crops): soybean appears in an orange tone, whereas maize appears in a red tone.

FIGURE 4. Attribute space of soybean and maize from Landsat 5/TM images, extracted from the large planting fields of the monitored agricultural area.

FIGURE 5. False-color RGB-453 images of the monitored agricultural area and illustration of the classification and superposition technique used.

FIGURE 8. Final Soybean Mask for the 2008/2009 crop season in the cities studied.

TABLE 1. Global Accuracy and Kappa Index metrics.

TABLE 2. Planted area estimated by the MaxVer classifier and SEAB data.