PALSAR-2 / ALOS-2 AND OLI / LANDSAT-8 DATA INTEGRATION FOR LAND USE AND LAND COVER MAPPING IN NORTHERN BRAZILIAN AMAZON

In the northern Brazilian Amazon, crops, savannahs and rainforests form a complex landscape where land use and land cover (LULC) mapping is difficult. Here, data from the Operational Land Imager (OLI)/Landsat-8 and the Phased Array type L-band Synthetic Aperture Radar (PALSAR-2)/ALOS-2 were combined for mapping 17 LULC classes using Random Forest (RF) during the dry season. The potential thematic accuracy of each dataset was assessed and compared with the results of the hybrid classification from both datasets. The results showed that the combination of PALSAR-2 HH/HV amplitudes with the reflectance of the six OLI bands produced an overall accuracy of 83% and a Kappa of 0.81, which represented an improvement of 6% in relation to the RF classification derived solely from OLI data. The RF models using OLI multispectral metrics performed better than the RF models using PALSAR-2 L-band dual polarization attributes. Nevertheless, the major contribution of PALSAR-2 was to discriminate low biomass savannah classes such as savannah grassland and wooded savannah.


Introduction
Optical remote sensing has generally been used to map land use and land cover (LULC) changes (Silva et al. 2014). Recent global and regional LULC mapping programs based on remote sensing imagery have emerged in the scientific literature. For instance, the Finer Resolution Observation and Monitoring of Global Land Cover (FROM-GLC) (Gong et al. 2013) and the GlobeLand30 (Chen et al. 2015) are examples of high-resolution global LULC projects. However, they had limited results in tropical landscapes, especially in the Amazon, with Kappa values of 0.262 and 0.677, respectively. Another example is the TerraClass project in Brazil, which mapped the entire Legal Amazon with a Kappa of 0.67 and an overall accuracy of 76.64% (Almeida et al. 2016). Thus, depending on the method and type of satellite data used in the analysis, there are uncertainties in detecting the magnitude and extent of LULC changes and in correctly classifying the classes in tropical areas. In the Brazilian Amazon, most of the uncertainties are related to the difficulties inherent to optical remote sensing in persistently cloud-covered regions (Lu et al. 2007). The fragmentation of tropical landscapes and the subtle transitions between vegetation types are also sources of uncertainty for LULC mapping using optical remote sensing (Laurin et al. 2013).
In this context, orbital Synthetic Aperture Radar (SAR) sensors have become increasingly important in LULC studies. Furthermore, they are sensitive to the geometry of the surface and to the vegetation canopy structure (Lu et al. 2007). For instance, L-band SAR data have been used to detect deforested sites in the Brazilian Amazon (Santos et al. 2008). In addition, SAR data can be integrated with optical data whenever possible to obtain information not only associated with the biophysical attributes of vegetation, but also with the structural characteristics of the surface (Lu et al. 2011). Among the several approaches for integrating SAR and optical data, two strategies are commonly used: image fusion and hybrid approaches combining more than one method. Image fusion is often employed by means of Principal Component Analysis or Wavelet Transformations (Pereira et al. 2013; Otukei et al. 2015). Hybrid approaches generally require feature selection or radiometric transformations that may affect the quality of the information retrieved from data integration or the interpretation of results (Lu et al. 2011; Hong et al. 2014).
Another emerging method to integrate SAR and optical data for LULC mapping is the Random Forest (RF) algorithm, due to its robustness and capacity to handle a great number of variables (Jhonnerie et al. 2015). RF ranks the variables according to their importance for classification (Breiman 2001). The use of RF for SAR and optical data integration has provided more accurate maps than other classifiers (Forkuor et al. 2014). Furthermore, RF is a non-parametric classifier, which supports its suitability for the classification and integration of both datasets (van Beijma et al. 2014). Compared to support vector machines, RF presents better classification accuracy while requiring fewer user-defined parameters, as shown in previous studies using Enhanced Thematic Mapper Plus (ETM+) and RapidEye data (Adam et al. 2014).
Two orbital sensors that represent the state of the art of the new generation of SAR and optical instruments are the PALSAR-2/ALOS-2 and the Operational Land Imager (OLI)/Landsat-8, respectively. OLI was launched in 2013 to ensure Landsat data continuity (Roy et al. 2014). PALSAR-2 was launched in 2014 to provide L-band data to complement the PALSAR-1 mission, which operated from 2007 to 2011 (Rosenqvist et al. 2014). PALSAR-1 was employed successfully for LULC mapping in combination with other sensors over tropical forests in the Amazon (Liesenberg and Gloaguen 2013; Liesenberg et al. 2016) and West Africa (Laurin et al. 2013), with accuracy close to 90% for the classification of seven LULC classes in each study. Compared to PALSAR-1, the PALSAR-2 instrument has better Noise Equivalent Sigma Zero, better Signal to Ambiguity ratio and higher spatial resolution in selected modes (Kankaku et al. 2009). Despite these advantages, as far as we know, PALSAR-2 has not been used for LULC mapping in the Amazon. In other parts of the world, PALSAR-2, combined with other sensors, had good classification performance in Myanmar (Torbick et al. 2017) and in the Bolivian lowlands (Reiche et al. 2018). In this context, we hypothesised that the combined use of PALSAR-2 and OLI data in the Amazon region is important because this combination can potentially provide more accurate LULC maps than the use of each dataset separately.
The objective of this study is to analyse the thematic mapping products provided by the integration of PALSAR-2 and OLI data for LULC classification using RF in the northern Amazon, Brazil. Unlike other studies, we evaluated the classification of a great number of LULC classes (17). The region comprises a landscape of ecological tension in the transition zone between tropical semi-deciduous forests and savannahs. Due to the complexity of this landscape and the persistent cloud cover, three specific goals were defined: (1) to assess the potential of each dataset (PALSAR-2 or OLI) and derived metrics for LULC mapping; (2) to compare the LULC classification from each dataset with that derived from the integration of PALSAR-2 and OLI data; and (3) to identify the LULC classes with the highest gains in classification accuracy from the synergistic use of both datasets.

Study area
Located in the northern Brazilian Amazon, in the state of Roraima, the study area comprises 1260 km² (approximately 33 km in length and 38 km in width) in parts of the municipalities of Mucajaí, Alto Alegre and Boa Vista. The region is characterized by the contact between semi-deciduous forests and savannahs, which are separated from each other by the Mucajaí River (Figure 1). According to the Köppen classification, the climate is tropical Awi with a dry season lasting five months (Barbosa and Fearnside 2005). Most of the precipitation in the rainy season occurs from May to August. The mean annual accumulated rainfall and temperature are 2000 mm and 28 °C, respectively.
The predominant vegetation type is a sub-montane semi-deciduous forest (Santos et al. 2008). The most important savannah physiognomies are woodland savannah (15-40% of canopy cover), wooded savannah (4-15%), shrub savannah (<4%) and savannah grassland (no trees). The Campinarana is a site-specific vegetation type (grasslands and shrubs) that occurs associated with alluvial fans. Due to deforestation/fire and subsequent land abandonment, areas of initial (less than 5 years of vegetation regrowth) and intermediate (5-15 years) secondary succession are usually observed in the study area. The contact between tropical rainforests and savannah forms a region of ecological tension. In addition to the natural complexity of the vegetation physiognomies, the study area also presents a mosaic of anthropogenic land use by both smallholder and large-scale agribusiness farms.
Seventeen thematic classes were chosen during two fieldwork campaigns in May 2014 and January/February 2015. Their geographic coordinates were registered using a Global Positioning System (GPS) receiver. The classes and their respective numbers of sample plots (polygons), from which pixels were extracted to avoid autocorrelation effects, were as follows: (1) [...]

Fieldwork and remote sensing data
The PALSAR-2 and OLI datasets were pre-processed for the extraction of several metrics, which were used for LULC RF classification (Figure 2). The main attributes of the optical and SAR datasets are shown in Table 1. Images were acquired in the dry season. At a gauge station located in Boa Vista, only 7.1 mm of precipitation was registered in the 30 days before image acquisition by both sensors. The OLI/Landsat-8 image was acquired on February 6, 2015, with 30 m spatial resolution. The Landsat Climate Data Record (CDR) surface reflectance product was obtained from the United States Geological Survey (USGS) database. This product is automatically generated by the Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS) and is atmospherically corrected following a procedure similar to that used for the Moderate Resolution Imaging Spectroradiometer (MODIS). The procedure is based on the Second Simulation of a Satellite Signal in the Solar Spectrum (6S) algorithm (Masek et al. 2006). The CDR product has been used in several studies in the Amazon, such as those performed for monitoring secondary succession (Galvão et al. 2015) or for LULC change detection in lowland floodplains (Fragal et al. 2016).
Six OLI bands were used in the data analysis: 2 (450-515 nm), 3 (525-600 nm), 4 (630-680 nm), 5 (845-885 nm), 6 (1560-1660 nm) and 7 (2110-2290 nm). Clouds and their shadows accounted for 5.2% of the OLI scene. They were masked out using the mask information provided by the Landsat CDR product. In addition to the band reflectance, we calculated the Normalized Difference Vegetation Index (NDVI) and the Enhanced Vegetation Index (EVI), which have also been used in LULC studies (Akar and Güngör 2015; Jhonnerie et al. 2015).
The PALSAR-2 product was provided in ground range and was georeferenced in the UTM coordinate system (zone 20N, WGS84), which was the same projection as that of the OLI image. The PALSAR-2 image was processed in the open source Sentinel-1 Toolbox 1.1.1 (STB1), developed by the European Space Agency (ESA). The STB1 was used for speckle filtering and for extracting texture attributes. The Lee and Frost speckle reduction filters were tested with window sizes (WS) of 3x3, 5x5, 7x7, 9x9, 11x11 and 13x13 pixels. The Lee filter with a WS of 3x3 pixels was selected based on the analysis of the coefficient of variation, on the equivalent number of looks over a homogeneous tropical forest area and on the inspection of edge degradation, given the size of several patches of LULC classes.
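The two vegetation indices mentioned above can be illustrated with a short sketch. The paper does not state the formulas or coefficients it used; the code below assumes the standard NDVI definition and the commonly used EVI coefficients (gain 2.5, C1 = 6, C2 = 7.5, L = 1), and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from surface reflectance (0-1 scale)."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, soil=1.0):
    """Enhanced Vegetation Index; the default coefficients are an assumption
    (the widely used MODIS-style values), not taken from the paper."""
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + soil)


# Hypothetical reflectances for one vegetated pixel (OLI bands 5, 4 and 2).
pixel = {"nir": 0.40, "red": 0.05, "blue": 0.03}

print(round(ndvi(pixel["nir"], pixel["red"]), 3))                 # 0.778
print(round(evi(pixel["nir"], pixel["red"], pixel["blue"]), 3))   # 0.593
```

In a real workflow the same expressions would be applied band-wise to the masked reflectance rasters, with a guard for pixels where the denominator is zero.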
After filtering the PALSAR-2 images, the gray-level co-occurrence matrix (GLCM) was obtained to retrieve several texture metrics from the HH and HV images. The window size for the GLCM extraction was determined with the variogram method (Szantoi et al. 2013), which indicated 5x5 pixels for both the HH and HV polarizations. Using the STB1, ten texture metrics were computed for each image: mean, variance, contrast, entropy, energy, dissimilarity, correlation, homogeneity, angular second moment and maximum correlation coefficient. In addition to the texture metrics, synthetic bands that contributed to improving classification in other LULC studies were calculated: HH+HV (Lehmann et al. 2012); HH-HV (Dong et al. 2012); HH/HV and HV/HH (Avtar et al. 2012); and the SAR Index = (HH*HV)/(HH+HV) (Lu et al. 2011).
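The five synthetic bands listed above are simple per-pixel combinations of the HH and HV amplitudes. As a minimal sketch (the function name and the amplitude values are hypothetical, and the authors computed these bands in their own processing chain), they could be derived as follows:

```python
def synthetic_bands(hh, hv):
    """Return the five synthetic bands for one pixel from HH and HV amplitudes.

    Assumes hh and hv are positive amplitudes, so the ratios are defined.
    """
    return {
        "HH+HV": hh + hv,
        "HH-HV": hh - hv,
        "HH/HV": hh / hv,
        "HV/HH": hv / hh,
        "SAR_index": (hh * hv) / (hh + hv),
    }


# Hypothetical amplitudes for a single pixel.
bands = synthetic_bands(hh=0.30, hv=0.10)
for name, value in bands.items():
    print(name, round(value, 3))
```

Note that HH/HV and HV/HH are reciprocals of each other, so a tree-based classifier such as RF receives largely redundant information from the pair.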
The OLI and PALSAR-2 images were co-registered and the optical data were resampled to 10 m using the nearest neighbour algorithm in order to preserve the fine spatial resolution of the SAR data. Thus, a total of 35 metrics derived from PALSAR-2 (27 metrics) and OLI (8 metrics) were used for RF classification. They were as follows: the reflectance of six OLI spectral bands and two vegetation indices (NDVI and EVI) for OLI; HH and HV polarizations, twenty GLCM metrics (ten for each polarization) and five synthetic bands for PALSAR-2.
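Resampling the 30 m optical data to 10 m with the nearest neighbour algorithm amounts, for perfectly aligned grids, to replicating each 30 m cell into a 3 x 3 block of 10 m cells without altering the reflectance values. The sketch below illustrates only this idea; it assumes aligned grids and an integer factor, whereas the actual co-registration also involves georeferencing that is not shown here.

```python
def upsample_nearest(grid, factor=3):
    """Replicate each cell factor x factor times (nearest neighbour upsampling)."""
    out = []
    for row in grid:
        # Repeat every value along the row, then repeat the expanded row.
        expanded = [value for value in row for _ in range(factor)]
        out.extend([list(expanded) for _ in range(factor)])
    return out


# Hypothetical 2 x 2 block of 30 m reflectance values.
coarse = [[0.10, 0.20],
          [0.30, 0.40]]
fine = upsample_nearest(coarse, factor=3)
print(len(fine), len(fine[0]))  # 6 6
```

Because no interpolation takes place, the set of reflectance values is unchanged, which is why this method is often preferred before classification.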

Random Forest (RF) classification and validation
The RF algorithm adopts a bootstrap approach for building decision trees. The RF classification was performed in the R environment with the packages "raster", "maptools", "GISTools" and "randomForest". In the classification, the original set of training samples is randomly divided into subsets, with 65% of the original samples being used to create the trees. The remaining samples are used for cross-validation of the model through the determination of the out-of-bag error estimate (OOBE). At each split of the tree, a subset of m attributes is randomly selected and evaluated by the OOBE. The most accurate attributes divide the nodes of the trees. The variable importance (VI) is estimated by randomly permuting the values of a variable X_j in the OOB samples, thus producing a new estimate of the error. The sum over all trees of the differences between the error of the disturbed (permuted) sample and the observed error provides a measure of the mean decrease in accuracy of that variable for the model. Variables with high importance produce a great disturbance in the accuracy of the model. The VI is determined by Equation 1:
$$VI(X_j) = \frac{1}{ntree} \sum_{t=1}^{ntree} \left( \widetilde{OOBE}_t^{\,j} - OOBE_t \right) \qquad (1)$$
where $VI(X_j)$ is the variable importance of $X_j$; $ntree$ is the number of trees in the forest; the sum is over all $t$ trees; $\widetilde{OOBE}_t^{\,j}$ is the OOB error of tree $t$ estimated after permuting the variable $X_j$; and $OOBE_t$ is the OOB error of the same tree before the permutation.
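The permutation-based importance described above can be read as the average, over all trees, of the increase in OOB error after one variable is permuted. The following sketch assumes that the per-tree OOB errors are already available and that the sum of differences is divided by ntree; the error values and variable names are hypothetical.

```python
def variable_importance(oob_error, oob_error_permuted):
    """Mean increase in OOB error over all trees after permuting one variable.

    oob_error[t]          : OOB error of tree t with the original data
    oob_error_permuted[t] : OOB error of tree t after permuting the variable
    """
    if len(oob_error) != len(oob_error_permuted):
        raise ValueError("both lists need one entry per tree")
    ntree = len(oob_error)
    return sum(p - e for p, e in zip(oob_error_permuted, oob_error)) / ntree


# Hypothetical per-tree OOB errors for a forest of four trees.
oob = [0.20, 0.25, 0.22, 0.18]
oob_perm_nir = [0.45, 0.50, 0.41, 0.40]  # permuting an influential variable
oob_perm_hh = [0.22, 0.26, 0.22, 0.20]   # permuting a weak variable

vi_nir = variable_importance(oob, oob_perm_nir)
vi_hh = variable_importance(oob, oob_perm_hh)
print(round(vi_nir, 4), round(vi_hh, 4))
```

A variable whose permutation barely changes the OOB error receives a value close to zero, which is how the ranking of metrics is obtained.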
Users must define two parameters in RF: the number of trees in the forest (ntree) and the number of variables at each node of the tree (mtry). Default values for these parameters are 500 trees and the square root of the number of input variables (√N), which are commonly used for LULC mapping using remote sensing (Forkuor et al. 2014). There is no consensus on the minimum number of trees to reach an optimal RF performance (Akar and Güngör 2015). Therefore, many previous works assumed 1000 trees (van Beijma et al. 2014) because this quantity did not affect the performance of the model (Breiman 2001). Some efforts have been made to optimize both the ntree and mtry values, aiming to reduce the OOBE and to yield more accurate results (Jhonnerie et al. 2015). The RF calibration approach adopted in this research was adapted from Eisavi et al. (2015). The approach consisted of processing RF for each of the established models with 100, 200, 300 and up to 1000 trees and 1, 2, 3, ..., N mtry values, where N is the number of input variables in each RF model. Thus, the number of forests in each model was 10 x N. The forest whose combination of ntree and mtry parameters resulted in the lowest OOBE was used for classifying the specific remote sensing dataset of a certain model (Table 2), and the different models were compared by means of independent validation.
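The calibration described above is an exhaustive search over 10 ntree values and N mtry values. A minimal sketch of that loop is given below; `toy_oobe` is a hypothetical stand-in for training a forest and reading its OOB error (in the study this step was carried out with the R package randomForest), so only the search logic should be taken from this example.

```python
def calibrate(n_variables, oob_error_for):
    """Search ntree in 100..1000 (step 100) and mtry in 1..N for the lowest OOBE.

    oob_error_for(ntree, mtry) must return the OOB error of the fitted forest.
    Returns ((ntree, mtry, oobe), number_of_forests_evaluated).
    """
    best = None
    n_forests = 0
    for ntree in range(100, 1001, 100):
        for mtry in range(1, n_variables + 1):
            n_forests += 1
            oobe = oob_error_for(ntree, mtry)
            if best is None or oobe < best[2]:
                best = (ntree, mtry, oobe)
    return best, n_forests


def toy_oobe(ntree, mtry):
    """Hypothetical error surface with its minimum at ntree=600, mtry=3."""
    return 0.20 + abs(ntree - 600) / 10000 + abs(mtry - 3) / 100


best, n_forests = calibrate(n_variables=8, oob_error_for=toy_oobe)
print(best[0], best[1], n_forests)  # 600 3 80
```

For a model with N = 8 input variables the loop evaluates 10 x 8 = 80 forests, which matches the 10 x N count given in the text.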
Following the approach by van Beijma et al. (2014), different RF classification models were obtained using either the PALSAR-2 metrics or the OLI attributes, or the combination of them (hybrid models) (Table 2). The objective was to allow comparison of the RF classification accuracy as a function of the spectral range (optical or microwave) and the number and type of metrics. In addition, the idea was to better understand the role played by the polarization-derived metrics in LULC discrimination. Because of the well-known SAR limitations for LULC mapping, demonstrated by Li et al. (2011), Li et al. (2012), Laurin et al. (2013) and Liesenberg et al. (2016), a greater number of SAR metrics (27 metrics) was adopted in the data analysis, when compared to the optical metrics (8 metrics). In the literature, SAR texture is commonly used for adding spatially dependent information to improve classification (Sheoran and Haack 2013). In our study, the texture attributes were calculated for both HH and HV SAR polarizations, resulting in a much larger number of input variables for the SAR-derived RF models than for the optical-derived RF models.
For validation of the RF classification, we selected pixels over the field-visited areas of each class and divided the pixels randomly into training (70%) and validation (30%) datasets. Multiple sites were selected per class to retrieve the training and validation pixels, to represent the heterogeneity of the classes and to avoid autocorrelation effects. The performance of classification using the validation set of pixels was evaluated by obtaining the overall classification accuracy and the Kappa statistics from the confusion matrices. Following previous studies, Kappa values were compared with paired Z tests to verify statistical differences between the models at the 0.01 significance level (Silva and Santos 2011).
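The overall accuracy and Kappa mentioned above are both derived from the confusion matrix. The sketch below uses a hypothetical two-class matrix; the Z statistic shown is the commonly used form for comparing two independent Kappa values and is an assumption here, since the paper does not spell out its formula, and the Kappa variances passed to it are invented for illustration.

```python
import math


def overall_accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's Kappa from a square confusion matrix."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


def kappa_z(kappa_a, var_a, kappa_b, var_b):
    """Z statistic for two independent Kappa values (assumed test form)."""
    return abs(kappa_a - kappa_b) / math.sqrt(var_a + var_b)


# Hypothetical confusion matrix (rows: reference, columns: classified).
matrix = [[45, 5],
          [10, 40]]
oa, kappa = overall_accuracy_and_kappa(matrix)
print(round(oa, 2), round(kappa, 2))  # 0.85 0.7

# Hypothetical Kappa values and variances; |Z| > 2.576 would indicate a
# significant difference at the 0.01 level (two-sided).
z = kappa_z(0.70, 0.0004, 0.60, 0.0005)
print(round(z, 2))
```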

Random Forest classification models
In general, classification accuracy and Kappa values were lower for PALSAR-2 models than for OLI models and improved with the combination of optical and SAR metrics in hybrid models (Table 3).
The parameters of the best PALSAR-2 RF classification (model 4) were 900 trees and 10 variables at each split (N = 22). For the OLI dataset, the best classification (model 10) was obtained with 600 trees and 2 variables (N = 6). The hybrid model number 15 presented the highest accuracy with 600 trees and 3 variables (N = 8). None of the models reached the lowest OOBE with the default parameters of the R software. This issue had already been reported in the literature, indicating the importance of RF parameter calibration before LULC classification with remote sensing data (Odindi et al. 2014). In the PALSAR-2 model 4, the classification accuracy and Kappa were 44.60% and 0.390, respectively, after adding the GLCM textures and SAR metrics to the RF model. The classes with the highest producer's accuracy (PA) were silviculture (95%), water (83.72%) and savannah grassland (82.53%). The most important metrics in model 4 were the GLCM mean HV (23%) and GLCM variance HV (20%), followed by the GLCM mean HH (17%) and GLCM variance HH (15%). In the OLI model 10, after the inclusion of the six multispectral bands (2, 3, 4, 5, 6 and 7), the classification accuracy and Kappa reached 76.88% and 0.744, respectively. The most important variables were the reflectance of the near-infrared (NIR) band 5 (54%), followed by the reflectance of the shortwave infrared (SWIR) band 7 (50%) and red band 4 (45%).
The best hybrid model (number 15 in Table 3) produced a classification accuracy and Kappa of 82.96% and 0.810, respectively. In addition to the six OLI multispectral bands, the PALSAR-2 HH and HV amplitudes were added into the model. In fact, paired comparisons of Kappa values by the Z test showed no statistically significant differences at the 0.01 level between the hybrid models 15, 16, 17 and 18. However, model 15 had the advantage of including a smaller number of attributes (8) than the other hybrid models (28, 31 and 35 attributes, respectively).

RF classification maps using hybrid metrics
The resulting RF LULC classification map of model 15, which had the highest classification accuracy in Table 3, is shown in Figure 3. From the analysis of the confusion matrix of this model, the classes that had more than 90% of pixels correctly classified were as follows: agriculture, water bodies, silviculture, campinarana, clear-cut silviculture and mature forest (Table 4). The first three ranked metrics in order of importance were the reflectance of the NIR, SWIR and red OLI bands 5, 7 and 4, respectively, with more than 40% of the mean decrease in accuracy (Figure 4). Compared to the optical attributes, the PALSAR-2 HH (8%) and HV (18%) amplitudes had lower importance for classifying the LULC classes. On the other hand, they increased the overall classification accuracy by 6%. From the OLI RF model 10 to the hybrid model 15, the Kappa increased from 0.744 to 0.810 (Table 3). The main contribution of the PALSAR-2 HH and HV amplitudes was to improve the discrimination of the lowest biomass classes, as deduced from the analysis of gains and losses in producer's (PA) and user's (UA) accuracies after the inclusion of these metrics with the OLI spectral bands (Figure 5).
For instance, savannah grassland and wooded savannah showed more than 20% of improvement in PA, while overgrown pasture had 44% of improvement, compared to the classification using only the reflectance of the OLI bands. On the other hand, woodland savannah and palm swamp had more than 5% of loss in PA, while SS2 showed a marked decrease (13%) in UA.
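The gains and losses reported above are differences in per-class producer's and user's accuracies between two confusion matrices. As a minimal sketch, assuming rows hold the reference classes and columns the classified ones, the two measures for a single matrix can be computed as shown below; the class names and pixel counts are hypothetical and do not reproduce Table 4.

```python
def producers_users_accuracy(matrix, labels):
    """Per-class producer's (PA) and user's (UA) accuracy.

    Assumes matrix[i][j] counts pixels of reference class i classified as j.
    PA = correct / reference total (row); UA = correct / classified total (column).
    """
    k = len(matrix)
    result = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        col_total = sum(matrix[r][i] for r in range(k))
        result[label] = {
            "PA": matrix[i][i] / row_total,
            "UA": matrix[i][i] / col_total,
        }
    return result


# Hypothetical three-class confusion matrix.
labels = ["savannah grassland", "wooded savannah", "pasture"]
matrix = [[40, 8, 2],
          [10, 35, 5],
          [0, 5, 45]]
acc = producers_users_accuracy(matrix, labels)
for label in labels:
    print(label, round(acc[label]["PA"], 3), round(acc[label]["UA"], 3))
```

Running the same function on the matrices of the optical-only and hybrid models and subtracting the results class by class would give the kind of gain and loss values plotted in a figure such as Figure 5.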
Scatterplots of the relationships between the reflectance of the OLI bands 5 and 7 confirmed the spectral confusion between the savannah grassland and wooded savannah using the optical dataset (Figure 6a). When the HV amplitude replaced the OLI band 7 in the scatterplot, a better discrimination between the two land cover types was observed, showing the contribution of the SAR signal to improve classification accuracy (Figure 6b).
Bulletin of Geodetic Sciences, 24(2): 250-269, Apr-Jun, 2018

Discussion
The current approach makes three important contributions to other LULC studies in the Amazon.
First, it demonstrated the potential of the new generation of polarimetric PALSAR-2 data to classify LULC classes in a specific large area of ecological tension between forest and savannah in the Amazon during the dry season. It also indicates the ability of RF to compose hybrid models using a small number of metrics retrieved from SAR and optical data.
As a second contribution, the use of the hybrid RF models combining optical and SAR data demonstrates that it is possible to classify correctly a great number of classes (17 classes) without the need to merge them a priori to improve classification accuracy. In addition, the classification accuracy (83%) obtained here was notably high, given that some of the 17 classes were relatively similar land covers from a spectral point of view. For comparison, Liesenberg et al. (2016) obtained a Kappa of 0.79 using ALOS-1/Landsat data with a much smaller number of classes (7). The protocol established here can therefore be tested in broader approaches/scenarios in the Amazon rainforest and in other complex fragmented landscapes in savannah and tropical forest environments.
The last contribution of our work is to indicate clearly the LULC classes that benefit from the integration of multispectral optical and dual polarization L-band SAR data. According to Joshi et al. (2016), in a literature review on the topic, this is a critical aspect of most LULC studies addressing optical and SAR data fusion. The findings showed that the main contribution of adding SAR to optical data, especially the HH and HV amplitudes, was to classify correctly classes with low biomass such as savannah grassland and wooded savannah, which OLI alone did not separate well.
Our findings showed that the use of five PALSAR-2 synthetic bands (HH+HV, HH-HV, HH/HV, HV/HH and SAR Index) in model 6, in these specific conditions of incidence angle, orbit pass, dual polarization and water content, did not result in significant improvement in classification accuracy, compared to the classification using the HH and HV amplitudes and the related GLCM texture metrics (model 4). Zhu et al. (2012) reported an enhancement from 31% to 72% in classification accuracy by adding texture metrics to the polarizations in an urban study. In our study area, the addition of these metrics to the polarizations increased the classification accuracy from 30.0% to 44.6%, but the number of classes selected for classification (17) was much higher than in the other studies.
The discrimination of silviculture plantations from croplands and savannahs using RF was facilitated in the study area by the interaction of the L-band HV polarization with the canopy structure (Li et al. 2012). The signal depolarization due to volumetric backscattering, related to the multiple scattering process of the incident radar signal, also explains the confusion between silviculture plantations and mature forests, which are similar in vertical structure, volume and aboveground biomass content. The discrimination of water bodies and savannah grasslands from the other scene components was related to the small amounts of backscattering of the microwave signal (Sano et al. 2005).
Despite the complexity and large number of LULC classes in our study area, PALSAR-2 allowed separation of the vegetation physiognomies according to their structure and biomass, such as the savannahs and forests. Similar results in L band were found in Central Africa using the JERS-1 amplitude, in which savannah grassland, forests and flooded vegetation were correctly classified (Simard et al. 2000). In the absence of optical data due to frequent cloud cover in tropical regions, the monitoring of the transition zone between savannahs and forests in the study area is important, because deforestation and fire frequently occur in this zone (Barbosa and Fearnside 2005; Laurin et al. 2013). In this context, further studies with SAR polarimetric decomposition can reveal disturbances in forests affected by fire (Santos et al. 2008). For instance, Martins et al. (2016) showed that polarimetric L-band PALSAR data were sensitive to variations in forest structure and biomass caused by forest fire.
When compared to the SAR classification, results using RF applied to the reflectance of the OLI bands showed higher classification accuracy. The most important OLI metrics were the NIR band, closely followed by the SWIR band. In contrast, studies in forests from Belgium and Costa Rica reported the SWIR bands as more important than the NIR bands for LULC classification (Chan and Paelinckx 2008).
The hybrid dataset (PALSAR-2 plus OLI) produced better RF results than the use of OLI or PALSAR-2 metrics separately, which was consistent with previous studies integrating optical and SAR data (Pereira et al. 2013; Laurin et al. 2013). In the hybrid dataset, the optical bands were more important than the SAR attributes. From the PALSAR-2 metrics tested in our study, the HV and the HH amplitudes were selected in the RF classification of the hybrid dataset. Similar findings were reported by Braun and Hochschild (2015) when working with OLI and C-band Sentinel-1 data. In our study area, the main contribution of SAR was to classify correctly classes with low biomass such as savannah grassland and wooded savannah. They are not easily discriminated using optical data. On the other hand, the addition of PALSAR-2 attributes to the OLI metrics affected the RF classification of SS2, woodland savannah and palm swamps. The confusion between woodland savannah and mature forests is caused by similar branches and trunk arrangements of these classes (Hess et al. 1998). Secondary successions are well discriminated with surface reflectance even after thirty years of vegetation regrowth (Galvão et al. 2015), but signal saturation in the SAR L-band due to high biomass and branches/trunk structure generally leads to misclassification (Araújo et al. 1999).
Palm swamps also present high biomass, mainly due to the height of the trees (9-38 m), but have a less organized branch structure (Goodman et al. 2013). In the differentiation between palm swamps and mature forest, seasonality has an important role in SAR backscattering. These classes are better delineated in the wet season than in the dry season (Einzmann et al. 2012) because of the greater amounts of water in the soil profiles and streams where the palms are located. In the wet season, the waters of the streams favour double bounce scattering mechanisms between the water surface and the smooth palm trunks (Horritt et al. 2003). Because our investigation was performed in the dry season, the PALSAR-2 metrics did not allow correct discrimination between palm swamps and forests. Further studies are therefore necessary to evaluate gains from the use of PALSAR-2 images acquired in both the rainy and dry seasons in the classification approach.

Conclusions
The potential of the PALSAR-2 and OLI sensors was evaluated, as well as the combination of different metrics from them, for LULC classification with RF in the ecological tension zone of the northern Amazon. The results showed that the combination of the PALSAR-2 HH and HV amplitudes with the reflectance of six VNIR-SWIR OLI spectral bands (2 to 7) produced an overall classification accuracy of 83% and a Kappa of 0.81. This result represents an improvement in classification of 6% in relation to the classification derived solely from the OLI bands. These results highlight the importance of hybrid models for classifying a great number of LULC classes, as in the case of the current study (17 classes).
In general, the addition of some metrics into the RF models, such as vegetation indices and texture attributes, did not improve the accuracy of the combined OLI and PALSAR-2 classifications. All RF models using OLI metrics performed better than the RF models using PALSAR-2 attributes. While the inclusion of the NDVI and EVI did not improve the classification performed with the reflectance of the OLI bands, the texture attributes improved the classification performed with the PALSAR-2 HH and HV amplitudes.
Although less accurate than OLI, PALSAR-2 was able to distinguish the major LULC classes because of the differences in canopy structure (e.g. semi-deciduous forest and savannah physiognomies). This is useful for monitoring this landscape in the absence of optical data due to cloud cover. More importantly, the inclusion of the PALSAR-2 HH and HV amplitudes into the OLI reflectance dataset improved the discrimination of low biomass classes such as savannah grassland and wooded savannah.

Figure 1. Location of the study area in Roraima state, Brazil. The colour composite (February 6, 2015) includes the OLI/Landsat-8 bands 6, 5 and 7 in red, green and blue, respectively.

Figure 2. Flow diagram for Random Forest (RF) classification of the LULC classes using optical (OLI/Landsat-8) and SAR (PALSAR-2/ALOS-2) attributes and the combination of metrics of both datasets.
The PALSAR-2/ALOS-2 Stripmap Fine mode, dual polarization (HH and HV) image was acquired on February 23, 2015, from the Japan Aerospace Exploration Agency (JAXA) by means of the Base Aerofoto company. The PALSAR-2 amplitude image has a pixel size of 6.25 m (2 looks), range resolution of 9.1 m, azimuth resolution of 5.3 m and a spatial resolution of 10 m.

Figure 3. LULC map derived from the best Random Forest classification (model 15) using the HH and HV PALSAR-2/ALOS-2 amplitudes and the surface reflectance of OLI/Landsat-8 bands 2 to 7. From top to bottom in the map, red rectangles are zoomed below showing the result of the RF classification, the OLI colour composite (R6G5B7) with examples of training polygons and the PALSAR-2 HH amplitude band with examples of validation polygons.

Figure 4. The most important variables in the hybrid Random Forest model 15, expressed by the normalized mean decrease in classification accuracy, which ranges from zero (low importance) to one (high importance).

Figure 5. Producer's accuracy (PA) and User's accuracy (UA), indicating gains (positive values) and losses (negative values) in Random Forest classification from the inclusion of the PALSAR-2 HH and HV polarizations into the OLI dataset.

Figure 6. (a) Spectral confusion between the savannah grassland and the wooded savannah in the OLI/Landsat-8 bands 5 and 7. (b) Increase of discrimination between the two physiognomies with the inclusion of the PALSAR-2/ALOS-2 HV amplitude. Values of OLI data are in reflectance and PALSAR-2 in amplitude, as used for inputs in the RF classifier.

Table 1: Main attributes from optical and SAR datasets.

Table 2: Combination of attributes (marked with X) used in the different Random Forest (RF) models. Models 1 to 6 were processed only with PALSAR-2 metrics, while models 7 to 12 were generated only with OLI attributes. Models 13 to 18 were hybrid, combining optical and SAR attributes.

Table 3: Accuracy assessment of each of the 18 Random Forest (RF) classification models. Values in bold highlight the best models using either PALSAR-2 or OLI data, or the combination of both datasets (hybrid models).

Table 4: Continuation.
For instance, the Mucajaí municipality presents high deforestation rates in the Roraima state in Brazil,