USE OF ARTIFICIAL NEURAL NETWORKS AND GEOGRAPHIC OBJECTS FOR CLASSIFYING REMOTE SENSING IMAGERY
Silva, P. R. et al.

The aim of this study was to develop a methodology for mapping land use and land cover in the northern region of Minas Gerais state, where, in addition to agricultural land, the landscape is dominated by native cerrado, deciduous forests, and extensive areas of vereda. Using forest inventory data, as well as RapidEye, Landsat TM and MODIS imagery, three specific objectives were defined: 1) to test use of image segmentation techniques for an object-based classification encompassing spectral, spatial and temporal information, 2) to test use of high spatial resolution RapidEye imagery combined with Landsat TM time series imagery for capturing the effects of seasonality, and 3) to classify data using Artificial Neural Networks. Using MODIS time series and forest inventory data, time signatures were extracted from the dominant vegetation formations, enabling selection of the best periods of the year to be represented in the classification process. Objects created with the segmentation of RapidEye images, along with the Landsat TM time series images, were classified by ten different Multilayer Perceptron network architectures. Results showed that the methodology in question meets both the purposes of this study and the characteristics of the local plant life. With excellent accuracy values for native classes, the study showed the importance of a well-structured database for classification and the importance of suitable image segmentation to meet specific purposes.


INTRODUCTION
Mapping of land use and land cover is crucial for understanding, monitoring and predicting the effects of the complex human-nature interaction at local, regional and global scales (CLARK et al., 2010). Yet using field inventories alone to map extensive areas of vegetation is prohibitively expensive, which has led to increasingly frequent use of aerial and satellite imagery for the same purposes (BRADTER et al., 2011).
Having both the capability and potential for making systematic observations at various scales, remote sensing can provide data spanning entire decades (XIE et al., 2008), and the technology has been successfully applied to mapping through image classification techniques. Classification of land cover ideally requires use of multisource data so as to enable extracting as much information as possible about the area of interest (GISLASON et al., 2006). Use of time series imagery is extremely important to capture the effects of seasonality in forests, particularly in regions where rainy periods are well defined. However, acquiring data at suitable spatial and temporal scales is key to achieving the accuracy required for mapping (CARVALHO et al., 2004).
Object-based analysis relies on image segmentation algorithms to create clusters of spectrally similar pixels (objects) and treats each object as an atomic unit, thereby enabling spatial analysis and classification of image data (SMITH, 2010). Unlike pixels in a pixel-based classification, which carry only spectral characteristics, image objects are aware of their neighbors even before being classified and carry additional information such as texture, shape and correlations with subobjects (ANDERSEN et al., 2004). Navulur (2006) argues that, by working with mean values and standard deviations of reflectance within objects, the object-based approach can offer advantages in terms of spectral, spatial, morphological, contextual and temporal information.
Several classification algorithms are used in remote sensing. Many studies have demonstrated the effectiveness of Artificial Neural Networks (ANN) in remote sensing classification (PRATOLA et al., 2011). ANNs are suitable for analysis of virtually every data type, regardless of their statistical properties (XIE et al., 2008). The Backpropagation training algorithm and Multilayer Perceptron (MLP) networks are largely responsible for the popularization of this technique in various fields of knowledge. MLP networks are architectures in which each node receives inputs from previous layers and information flows in one direction to the output layer (PRATOLA et al., 2011). The number of nodes in the intermediate layer(s) defines both the complexity of a neural network model and its power to describe underlying relationships and structures inherent in a training data set (generalization power) (KAVZOGLU, 2009), and more nodes in such layers may be required for classification of more complex, grainy satellite images (JARVIS; STUART, 1996).
Bearing that in mind, the overall purpose of this study was to develop a methodology for classifying land cover in northern Minas Gerais state. Technically speaking, the region poses a challenge to mapping due to its highly heterogeneous landscape that combines deciduous forests, cerrado, transitional areas, agricultural land, pastureland and degraded areas. The following specific objectives were thus defined: 1) to test use of image segmentation techniques for an object-based classification encompassing spectral, spatial and temporal information, 2) to test use of high spatial resolution RapidEye imagery combined with Landsat TM time series imagery for capturing effects of seasonality, and 3) to test data classification using Artificial Neural Networks, with different MLP network architectures.

MATERIAL AND METHODS
To help understand the proposed methodology, all stages and processes are summarized in a flowchart (Figure 1).

Study Site
The study site is located in northern Minas Gerais state, Brazil, and is delimited by three mosaics, each consisting of four RapidEye satellite images (Figure 2), with an area of around 690,000 hectares. According to the mapping and inventory report Mapeamento e Inventário da Flora Nativa e dos Reflorestamentos de Minas Gerais (SCOLFORO; CARVALHO, 2006), the middle mosaic is crossed by the São Francisco river and represents a transitional zone between the mosaic to the right, where deciduous forests predominate, and the mosaic to the left, where the cerrado predominates. With peculiar vegetation formations that include the veredas of Veredas do Peruaçu state park, the region today holds rainforest remnants and the largest fragments of native vegetation in the state, and is home to a wealth of fauna and flora species threatened with extinction.

RapidEye Mosaics
The study used 12 RapidEye images covering the three regions intended for classification (Figure 2). The RapidEye sensor produces 5-meter resolution imagery and has five broad bands with wavelengths that range from the visible to the near-infrared region.

Time Series
The seasonal pattern of the dominant local vegetation formations was observed using a time series with 12 NDVI images from the MODIS sensor taken in 2010, along with the RapidEye mosaics and the plots used in the forest inventory of Minas Gerais state (SCOLFORO; CARVALHO, 2006).
With the above data at hand, the next step was to segment the RapidEye images. The objects created where the forest inventory plots were located were then labeled according to the vegetation formation they belong to. The NDVI values of these objects were plotted, and time signatures were generated for the local physiognomies cerrado and deciduous forest (Figure 3). Because deciduous forests shed over 50% of their leaves in the dry season, their minimum NDVI values and NDVI amplitude distinguish them from other types of land cover, providing valuable information for classification of land use and land cover (SILVEIRA et al., 2008).
Based on that information, four Landsat TM images were selected to best represent the variability in NDVI values, covering maximum, minimum and mean values (Figure 3). In other words, the idea was to represent the annual cycle of the vegetation with images from different dates.
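As an illustration of how a time signature can guide the choice of dates, the sketch below picks the months with maximum, minimum and closest-to-mean NDVI from a 12-value series and reports the amplitude. The function name and the monthly values are hypothetical; they are not taken from the study's MODIS data, and the study itself selected four Landsat TM scenes by inspecting Figure 3.

```python
import numpy as np

def pick_representative_months(ndvi_by_month):
    """Return indices of the months with max, min and closest-to-mean NDVI."""
    ndvi = np.asarray(ndvi_by_month, dtype=float)
    i_max = int(np.argmax(ndvi))
    i_min = int(np.argmin(ndvi))
    # Month whose NDVI is nearest to the annual mean of the series.
    i_mean = int(np.argmin(np.abs(ndvi - ndvi.mean())))
    amplitude = float(ndvi.max() - ndvi.min())
    return {"max": i_max, "min": i_min, "mean": i_mean, "amplitude": amplitude}

# Hypothetical monthly NDVI signature (Jan..Dec) for a deciduous forest object.
deciduous = [0.82, 0.84, 0.80, 0.72, 0.60, 0.48,
             0.38, 0.30, 0.33, 0.50, 0.68, 0.78]
summary = pick_representative_months(deciduous)
print(summary)
```

A large amplitude combined with a low dry-season minimum is the kind of pattern the text associates with deciduous forest.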
In order to compare images taken at different dates or using different sensors, the images must all be registered to the same coordinate system (ERBEK et al., 2004). To ensure a perfect data overlap, the registration of all Landsat images was based on a new mosaic containing forty RapidEye images that included the twelve images constituting the three mosaics over the study site. During the registration process, meticulous care was taken in collecting and identifying ground control points.
Following registration, the Landsat images had their NDVI values extracted. To try and reduce the computational costs involved in the process, a subset was created in each NDVI scene so as to leave them with the dimensions of the study site mosaics. As with the RapidEye images, no atmospheric corrections were made to the Landsat images.
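For reference, NDVI is the normalized difference between near-infrared and red reflectance, (NIR - Red) / (NIR + Red); for Landsat TM these are bands 4 and 3. The sketch below is a minimal NumPy version; the function name and the zero-fill for pixels where both bands are zero are assumptions of this example, not choices documented in the study.

```python
import numpy as np

def ndvi(red, nir):
    """Compute NDVI = (NIR - Red) / (NIR + Red), element by element."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = nir + red
    out = np.zeros_like(denom)
    # Pixels with a zero denominator keep the fill value 0.0.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Tiny hypothetical 2x2 scene (reflectance values).
red_band = np.array([[0.10, 0.20], [0.00, 0.30]])
nir_band = np.array([[0.50, 0.20], [0.00, 0.10]])
print(ndvi(red_band, nir_band))
```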

Image Segmentation
For creation of objects, the Multiresolution Segmentation algorithm was used, as it enables extracting segments based both on pixel value (reflectance) and on object shape. Because of their better spatial resolution, weights in the segmentation were assigned only to the bands of the RapidEye images, while the four Landsat TM NDVI images were used only for extraction of attributes (temporal information) for the classification process.
A major prerequisite for classification of remote sensing data using object-based conceptions is that the segmented objects should have descriptive force and should contain only pixels from one semantic class in the same group (BAATZ; SCHÄPE, 2000). Different values of shape and smoothness were tested in a segmentation with multiple scales (Table 1) for better representativeness of the landscape.
Segmentation was assessed for quality by visual analysis of the different input parameters, comparing shape and size of formed objects as well as their representativeness.
The 17 attributes extracted for each object included mean reflectance of objects in each of the five bands, total brightness, contribution ratio of a given band to the overall brightness (bands 4 and 5), maximum difference between the mean intensities of each band, maximum difference between the pixel values of objects (bands 4 and 5), RapidEye NDVI and standard deviation (bands 4 and 5).
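Two of the object attributes named here, the per-object mean and standard deviation of a band, can be sketched with NumPy as follows. This is an illustrative computation only: the function name is hypothetical, the labels are assumed to be consecutive integers starting at 0 with every label present, and the population standard deviation is used. The segmentation software used in the study may define these attributes differently.

```python
import numpy as np

def object_stats(band, labels):
    """Per-object mean and (population) standard deviation of one band.

    band   : 2-D array of pixel values
    labels : 2-D integer array of object ids (0..n-1, all present)
    """
    values = np.asarray(band, dtype=float).ravel()
    ids = np.asarray(labels).ravel()
    n = int(ids.max()) + 1
    counts = np.bincount(ids, minlength=n)
    sums = np.bincount(ids, weights=values, minlength=n)
    sq_sums = np.bincount(ids, weights=values ** 2, minlength=n)
    means = sums / counts
    # Clip tiny negative values caused by floating-point rounding.
    variances = np.maximum(sq_sums / counts - means ** 2, 0.0)
    return means, np.sqrt(variances)

# Hypothetical 2x2 band with two objects (one per row).
band = np.array([[1.0, 3.0], [10.0, 10.0]])
labels = np.array([[0, 0], [1, 1]])
means, stds = object_stats(band, labels)
print(means, stds)
```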

Artificial Neural Networks
For data classification with ANN, the software application JavaNNS was used (available at http://www.ra.cs.uni-tuebingen.de/software/JavaNNS/welcome_e.html). Multilayer Perceptron (MLP) networks with a sigmoid activation function were used, trained by the Backpropagation algorithm. To attenuate effects of the saturation zone of the sigmoid function, all attribute values were normalized between 0.1 and 0.9 by Equation 1, which, according to Gorgens et al. (2009), also equalizes data, thus helping improve neural network predictability.
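Equation 1 is not reproduced in this text. A common way to map each attribute linearly onto the interval [0.1, 0.9] is x' = 0.1 + 0.8 (x - min) / (max - min), and the sketch below assumes that form. The function name and the handling of constant columns are choices made for this example.

```python
import numpy as np

def scale_attributes(x, low=0.1, high=0.9):
    """Linearly rescale each column of x to the interval [low, high]."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    # Constant columns get a span of 1 so they map to `low` instead of NaN.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return low + (high - low) * (x - x_min) / span

# Hypothetical attribute table: 3 samples x 2 attributes.
raw = np.array([[0.0, 200.0], [5.0, 300.0], [10.0, 400.0]])
scaled = scale_attributes(raw)
print(scaled)
```

Keeping the inputs away from 0 and 1 avoids the flat tails of the sigmoid, where gradients used by Backpropagation become very small.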
The number of neurons in the input layer was defined according to the number of attributes to be used per sample, each neuron receiving one input only.
The number of neurons in the output layer was determined by the number of classes to be recognized. As the classes were agricultural land, water, cerrado, eucalyptus, deciduous forest, other, pastureland and vereda, the response vector presented to the output layer had eight binary positions (1 = belongs and 0 = does not belong). Even though the output data of each training sample contain only binary values, the classification results produced by the trained network are values between zero and one, within the codomain [0, 1] of the sigmoid function being used. As in Heinlet et al. (2009), such values are adopted as the membership value of each sample in the respective class, and as in Chini et al. (2008) and Erbek et al. (2004), a competition model (winner takes all) was adopted in deciding the final classification.
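The input and output encoding described above can be sketched as follows: an eight-position binary target vector per sample, a forward pass through sigmoid layers, and a winner-takes-all decision on the outputs. The hidden layer size (10), the random weights and the helper names are hypothetical and serve only to make the example runnable; they do not reproduce any of the tested networks or the JavaNNS implementation.

```python
import numpy as np

CLASSES = ["agricultural land", "water", "cerrado", "eucalyptus",
           "deciduous forest", "other", "pastureland", "vereda"]

def one_hot(class_name):
    """Binary response vector: 1 = belongs, 0 = does not belong."""
    target = np.zeros(len(CLASSES))
    target[CLASSES.index(class_name)] = 1.0
    return target

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Propagate one sample through fully connected sigmoid layers."""
    a = np.asarray(x, dtype=float)
    for w, b in zip(weights, biases):
        a = sigmoid(a @ w + b)
    return a

def winner_takes_all(outputs):
    """Assign the class whose output neuron has the highest value."""
    return CLASSES[int(np.argmax(outputs))]

# Untrained network with 17 inputs, one hidden layer of 10 neurons (assumed)
# and 8 outputs, applied to one random sample scaled to [0.1, 0.9].
rng = np.random.default_rng(0)
layer_sizes = [17, 10, 8]
weights = [rng.normal(scale=0.5, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]
outputs = forward(rng.uniform(0.1, 0.9, size=17), weights, biases)
predicted = winner_takes_all(outputs)
print(outputs.round(3), predicted)
```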
The number of hidden layers, as well as the number of neurons in each layer, is determined empirically through test runs. The networks were thus tested with one, two and three hidden layer(s) and with a different number of neurons in each layer (

Classification
Classification was based on the local physiognomies, according to an official mapping of the state proposed by Scolforo and Carvalho (2006), and included cerrado, deciduous forest, vereda, eucalyptus, water and other (agricultural land, pastureland, bare soil, urbanized areas). Looking to improve the existing mapping and seeking information gains, agricultural land and pastureland were mapped as individual classes, leaving a total of eight classes.
Following segmentation, a total of 724 significant samples were collected from the existing classes across all three RapidEye mosaics, 30% of which were reserved for validation of the training results.
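A 70/30 partition of the 724 samples can be sketched as below. The text does not say how the validation samples were drawn, so the simple random permutation, the seed and the function name are assumptions of this example.

```python
import numpy as np

def split_samples(n_samples, validation_fraction=0.3, seed=0):
    """Return (training indices, validation indices) from a random shuffle."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_val = int(round(n_samples * validation_fraction))
    return order[n_val:], order[:n_val]

train_idx, val_idx = split_samples(724)
print(len(train_idx), len(val_idx))
```

If some classes have few samples, a split drawn separately within each class would keep every class represented in both sets.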
Other than the four NDVI values from the Landsat TM images, another 13 attributes relating to the RapidEye images were selected to describe each object, for a total of 17 attributes.
Figure 4 shows that even highly fragmented areas had very well defined objects that were in conformity with the landscape. The same occurred in other areas of the study site, demonstrating the success and effectiveness of the segmentation process.
[...] rate of 0.1. In all trainings, a set of samples was introduced randomly at each iteration, and the stopping criterion adopted was cross-validation.
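Stopping by cross-validation is commonly implemented as early stopping: the error on the held-out samples is monitored after each epoch, and training ends once it stops improving. The sketch below assumes that interpretation; the patience parameter, the function name and the error values are hypothetical and are not taken from the study's JavaNNS runs.

```python
def early_stopping_epoch(validation_errors, patience=3):
    """Return the epoch with the lowest validation error seen before the
    error fails to improve for `patience` consecutive epochs."""
    best_err = float("inf")
    best_epoch = -1
    for epoch, err in enumerate(validation_errors):
        if err < best_err:
            best_err, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch

# Hypothetical validation error curve: improves, then starts to rise.
errors = [0.50, 0.40, 0.30, 0.31, 0.32, 0.33, 0.20]
stop_at = early_stopping_epoch(errors)
print(stop_at)
```

The weights saved at the returned epoch would be the ones kept for classification.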

Accuracy Measures
For mapping assessment, a third set of independent samples was collected. The set was based on the mapping of land use and land cover created by Scolforo and Carvalho (2006) for sample stratification. A total of 450 random points were generated, distributed across all vegetation types (physiognomies) present in the study site. All points generated, in each class of land use and land cover, were checked for veracity with the help of RapidEye images and Google Earth software so as to prevent inaccuracies in the results.

RESULTS AND DISCUSSIONS
Optimal segmentation parameters may differ according to the region. Presence of a very fragmented, heterogeneous area leads to use of lower scale values so that each formed object will contain only pixels of one same class (Figure 4).
Since the area of interest has a diverse landscape that includes plains, mountains and valleys with different types of land cover, the choice was to select the parameters that produced the best segmentation regardless of region; that is, the same Shape, Compactness and Scale parameters were adopted for each of the three mosaics. The best overall results were obtained with 0.4 for Shape and Compactness and 350 for Scale.

ANN Training
All tested architectures reached a satisfactory convergence of the summed mean squared error (SMSE), and the training data are illustrated in Table 3.
Accuracy values were extremely similar for all tested networks, only really differing in the number of iterations (epochs) required to reach convergence and error stability.

The number of iterations set for each network training is closely related to the number of hidden layers and neurons in the architectures. Networks 1, 2 and 3, due to having only one hidden layer and thus greater difficulty in learning complex patterns, required a larger number of iterations for the training.
By intersecting the stratified sampling points with the maps generated by each network, a confusion matrix was computed for each network, along with the respective values of Kappa Index, Overall Accuracy, User's Accuracy and Producer's Accuracy (Table 4).
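The accuracy measures named above follow directly from a confusion matrix. The sketch below uses the standard definitions: Overall Accuracy is the diagonal sum over the total, Kappa corrects it for chance agreement, Producer's Accuracy divides each diagonal cell by its reference total and User's Accuracy by its mapped total. The matrix orientation (rows = reference, columns = map), the function name and the 2x2 example are assumptions of this example and do not reproduce Table 4.

```python
import numpy as np

def accuracy_measures(cm):
    """cm[i, j] = number of reference samples of class i mapped as class j.

    Assumes every class has at least one reference and one mapped sample.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    diag = np.diag(cm)
    reference_totals = cm.sum(axis=1)
    mapped_totals = cm.sum(axis=0)
    overall = diag.sum() / total
    # Agreement expected by chance from the marginal totals.
    expected = (reference_totals * mapped_totals).sum() / total ** 2
    kappa = (overall - expected) / (1.0 - expected)
    producers = diag / reference_totals
    users = diag / mapped_totals
    return overall, kappa, users, producers

# Hypothetical two-class matrix with 100 check points.
cm = [[40, 10],
      [5, 45]]
overall, kappa, users, producers = accuracy_measures(cm)
print(overall, kappa, users, producers)
```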
Network 7 had the highest values of Overall Accuracy (0.8310) and Kappa Index (0.7963), and the classification result is illustrated in Figure 5. The classification quality expressed by the accuracy measures can also be noted visually, since no postclassification or editing technique was adopted. The result obtained from image segmentation was crucial in determining the level of detail of the final maps, as illustrated in Figure 6.
As with network validation, all maps had very close accuracy values, showing no clear correlation with the number of hidden layers and neurons in them.
With good accuracy values in the remaining classes, the drop in Overall Accuracy and Kappa Index values can be attributed to the classes Other and Pastureland.
The classes Cerrado, Deciduous Forest and Vereda were in good conformity with the validation samples.
With accuracy values close to 90% and even 100% in the case of Veredas, the study methodology proved effective in recognizing the seasonal and spectral pattern of the local native physiognomies.
Several studies were conducted in the same area as this study, including the works of Acerbi-Junior et al. (2006), who classified the cerrado using merging techniques with MODIS and Landsat TM imagery, Silveira et al. (2008), who used decision tree algorithms and MODIS imagery to characterize and classify cerrado and seasonal forests, and Oliveira et al. (2010), who classified cerrado and seasonal forests with MODIS and Landsat TM imagery merged by ANN, all of which produced similar accuracy results. However, the scale factor should be considered when comparing them, since imagery of better spatial resolution increases image granularity, which intensifies the problems associated with pixel-based classification.
In a study to classify savannas in Namibia using MODIS imagery, Hüttich et al. (2009) suggested that, given the highly heterogeneous structure of the savannas, the RapidEye satellite seems promising for accurate mapping of the vegetation. The excellent accuracy results found here for the Cerrado class (Brazilian savanna) support this hypothesis.
With 70% of its land cover being native forest, the study site holds the state's largest forest fragments. Although they represent a minute portion of the land cover, the Veredas were very well delimited and mapped; they are very representative of the region and have great environmental importance. Typically formed by valley-side marshes and palm groves with upwelling groundwater and water springs, the Veredas provide shelter to a wealth of local fauna and flora species and should therefore be well understood in order to ensure their preservation and maintenance.

CONCLUSIONS
The proposed methodology proved highly effective in mapping land use and land cover in a region with high diversity of flora and occupancy classes. Using multisource data in combination with object-based analysis is therefore both desirable and recommended.
Time signatures of the local vegetation and Landsat TM imagery provided valuable information about spectral-temporal variations in the local plant physiognomies.
While integrating the spatial and spectral information of RapidEye imagery with the spectral and temporal information of Landsat TM imagery, the image segmentation technique enabled combining the best of both worlds with no loss or compromise of data.
The number of hidden layers in an MLP network, as well as the number of neurons in each layer, seems to have no direct correlation with mapping accuracy: all tested architectures proved capable of classifying the data with great accuracy and differed only in the number of iterations required for each network to learn the input patterns.