Wood Volume Estimation in a Semidecidual Seasonal Forest Using MSI and SRTM Data

The objective of this study was to evaluate the use of MSI Sentinel-2 and SRTM data to estimate the volume of wood in a Semidecidual Seasonal Forest. Regression equations were fitted to the remote sensing data, considering the individual bands and vegetation indices of the MSI, the elevation values and their derivatives obtained from the SRTM mission, and the combination of the MSI and SRTM data. RMSE and graphical analysis of residuals were used to assess the accuracy of the fitted equations. The best model, which used the combined MSI and SRTM data as predictors, had an R² of 0.6508 and RMSE of 20.41% in the fit, and an R² of 0.5680 and RMSE of 26.61% in the validation. The volume estimation using spectral data showed satisfactory results, highlighting the importance of topography in the prediction of the volume of wood for the area under investigation.


INTRODUCTION
Forest inventory uses sampling or census procedures to quantify the qualitative and quantitative characteristics of forests, thus supporting decision-making at the environmental and strategic levels (Fridman et al., 2014; Silva & Santana, 2014; Vibrans et al., 2010; Mello et al., 2009).
Forest surveys are considered complex, particularly in native forests, because of their heterogeneity (Andrade et al., 2015). Determining the characteristics of natural forests with traditional forest inventory techniques is laborious and often time-consuming. However, predicting forest development is of paramount importance for forest planning and management (Mura et al., 2018). It is therefore vital to identify alternatives that complement the information obtained with the traditional forest survey techniques.
In this sense, remote sensing techniques enable easier and faster acquisition and organization of the information relevant to the inventory (Alba et al., 2017). Remote sensing techniques have been applied in forest studies to facilitate the characterization of forest formations in terms of the quantification of forest stocks (Watzlawick et al., 2009). To optimize forest activities, new techniques such as satellite images have been employed in models that predict dendrometric variables such as height, basal area and wood volume (Almeida et al., 2014), and they have supported the acquisition of information from areas that are difficult to access. In addition, the financial viability of the method is an important advantage (Watzlawick et al., 2009; Miguel et al., 2015).
Surveys that use orbital data to estimate the dendrometric variables of native or planted forests generally do not consider the topography of the area. However, it has been suggested that in regions of uneven relief this factor needs to be included in the modeling, because relief features condition vegetation development and its productive behavior and create differentiated environments related to the topographic effects of the region (Bispo et al., 2009). Thus, the Shuttle Radar Topography Mission (SRTM) data and the geomorphometric variables derived from them are preferred for vegetation studies.
In this context, studies have been successfully developed using remote sensing data associated with relief data (Bispo et al., 2016) or using the data provided by remote sensing to estimate the wood volume in different formations (Maack et al., 2016; Magnussen et al., 2018; Matasci et al., 2018), to predict the biomass and carbon (Fassnacht et al., 2018; Knapp et al., 2018; Rajashekar et al., 2018), to estimate the tree height (Cabo et al., 2018; Plowright et al., 2017), as well as the leaf area index (Varvia et al., 2018; Wang et al., 2017).
The aim of this study was to assess the use of spectral information from the satellite MSI Sentinel-2 sensor and data on relief derived from the SRTM mission in the construction of models that could estimate the wood volume in a fragment of a Seasonal Semideciduous Forest.

Location and characterization of the study area
The area where the study was conducted (Figure 1) is the Private Natural Heritage Reserve (RPPN) Cafundó (20°43' S and 41°13' W), a part of the Boa Esperança farm, located in the municipality of Cachoeiro de Itapemirim, Espírito Santo. This area includes about 517 hectares of native forest vegetation (Archanjo et al., 2012).
The RPPN Cafundó vegetation is classified as a Semidecidual Submontane Seasonal Forest (IBGE, 2012). According to Köppen, the local climate is classified as Aw, tropical with dry winter (Alvares et al., 2013) and precipitation from 1200 to 1300 mm.
The soil is classified as Dystrophic Yellow Red Latosol. This region shows areas with sparse and soft elevations, the appearance of rocky outcrops in different places, and the relief shows wavy to strongly undulated features called the "Sea of hills" (IBGE, 1987).

Forest inventory
Sampling was performed between May and August 2017. A total of 25 plots (Figure 1) were used, which had been demarcated in 2007 in the study of Archanjo et al. (2012), whose goal was to analyze the floristic and phytosociological structure of the reserve.
Systematic sampling was done with fixed-area plots of 20 m × 50 m (1000 m²) each, spaced 350 m apart, totaling 2.5 ha sampled. In each plot, the diameter at 1.30 m above the ground (DBH) was measured for all trees with a diameter equal to or greater than 5 cm. Bifurcated trees had their branches included, as long as they were alive and satisfied the inclusion criterion defined above.
The DBH was measured with a diameter tape, while total heights were assessed with a telescopic ruler (stems up to 15 m tall) and a Suunto hypsometer (stems above 15 m).
To estimate the wood volume, the six species with the highest importance value index (24.36%) were scaled. The wood volume was obtained by rigorous scaling with the Huber method in a non-destructive manner: stem diameters were measured with a caliper at heights of 0.50 m, 1.00 m, 1.30 m and 1.80 m and, above 1.80 m, at every 1.00 m with the Criterion RD1000 dendrometer, up to the base of the crown. The Schumacher-Hall model was then fitted for each of the six species, and a general equation was fitted combining all six species.
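As a rough illustration of the two steps named above, the sketch below computes a stem volume with Huber's formula (section volume = cross-sectional area at the section midpoint × section length) and evaluates a Schumacher-Hall equation of the usual form V = b0 · DBH^b1 · H^b2. The function names, the stem measurements and the coefficients are hypothetical and are not the values obtained in this study.

```python
import math

def huber_volume(sections):
    """Stem volume (m^3) by Huber's formula.

    `sections` is a list of (mid_diameter_cm, length_m) pairs, one per stem
    section, with the diameter taken at the midpoint of the section.
    """
    total = 0.0
    for mid_diameter_cm, length_m in sections:
        d_m = mid_diameter_cm / 100.0           # cm -> m
        cross_section = math.pi * d_m ** 2 / 4  # area at the midpoint (m^2)
        total += cross_section * length_m
    return total

def schumacher_hall(dbh_cm, height_m, b0, b1, b2):
    """Schumacher-Hall volume model: V = b0 * DBH^b1 * H^b2."""
    return b0 * dbh_cm ** b1 * height_m ** b2

# Hypothetical stem: three sections with made-up midpoint diameters.
stem = [(30.0, 1.0), (28.0, 1.0), (25.0, 2.0)]
print(huber_volume(stem))

# Hypothetical coefficients, for illustration only (not fitted values).
print(schumacher_hall(dbh_cm=30.0, height_m=18.0, b0=5e-5, b1=1.9, b2=1.0))
```

In practice the Schumacher-Hall coefficients would be estimated by regression on the scaled trees, often after a logarithmic transformation; that step is omitted here.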

Obtaining remote sensing data
The MSI Sentinel-2 satellite images were acquired from the United States Geological Survey (USGS) website, containing band 2 (blue), 448 to 546 nm; band 3 (green), 538 to 583 nm; band 4 (red), 646 to 684 nm; and band 8 (near infrared), 763 to 908 nm, all with 10 m spatial resolution and 12-bit radiometric resolution (gray levels from 0 to 4095). The selected image was cloud-free and dated May 2017, coinciding with the start of the field sampling. The spectral bands were then georeferenced to SIRGAS 2000, in the Universal Transverse Mercator (UTM) projection, zone 24 South. For the radiometric and atmospheric correction of the images, the Dark Object Subtraction (DOS) algorithm was used. The digital numbers (DN) were first converted to spectral radiance and then to reflectance (ρ). The radiometric and atmospheric calibration parameters were obtained from the MTD_MSIL1C file available in the image download.
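The core idea of Dark Object Subtraction can be sketched in a few lines: the signal of the darkest pixels in a band is attributed to atmospheric scattering and subtracted from every pixel. The version below is a deliberately simplified sketch that works on reflectance values already computed and uses the band minimum as the dark object. It is not the QGIS implementation used in the study, and the function name and pixel values are hypothetical.

```python
def dark_object_subtraction(band, dark_value=None):
    """Subtract a dark-object value from every pixel, clipping at zero.

    `band` is a flat list of reflectance values for one spectral band.
    If `dark_value` is not given, the band minimum is taken as the dark object.
    """
    if dark_value is None:
        dark_value = min(band)
    return [max(value - dark_value, 0.0) for value in band]

# Made-up reflectance values for a handful of pixels in one band.
blue = [0.062, 0.071, 0.058, 0.090]
print(dark_object_subtraction(blue))
```

Operational DOS variants usually estimate the dark value from the image histogram and may assume a small residual reflectance for the dark object, which this sketch leaves out.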
The SRTM data were obtained from the USGS, which provides a Digital Elevation Model file with altitude (in meters) at a spatial resolution of 90 m. The SRTM data were resampled from 90 m to 10 m using the bilinear resampling method. Neighborhood operations were then applied to derive the following geomorphometric variables: slope orientation (aspect), slope and roughness, all with a final resolution of 10 m.
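Neighborhood operations of this kind can be sketched on a small elevation grid. The example below computes slope (%) from plain central differences and roughness as the elevation range within the 3 × 3 window around a cell. Both definitions are assumptions made for illustration, since GIS packages may use other estimators (for example, weighted 3 × 3 kernels), and the grid values are made up.

```python
import math

def slope_percent(dem, row, col, cell_size):
    """Slope (%) at an interior cell, from central differences."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2 * cell_size)
    return 100.0 * math.hypot(dz_dx, dz_dy)

def roughness(dem, row, col):
    """Maximum minus minimum elevation in the 3x3 window around a cell."""
    window = [dem[r][c]
              for r in range(row - 1, row + 2)
              for c in range(col - 1, col + 2)]
    return max(window) - min(window)

# Made-up 3x3 elevation grid (m) on 10 m cells, rising 1 m per cell eastward.
dem = [[100.0, 101.0, 102.0],
       [100.0, 101.0, 102.0],
       [100.0, 101.0, 102.0]]
print(slope_percent(dem, 1, 1, cell_size=10.0))
print(roughness(dem, 1, 1))
```

Note that resampling a 90 m model to 10 m by bilinear interpolation smooths the surface between the original cells, so derivatives computed at 10 m still reflect the 90 m source detail.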
All preprocessing and data processing steps were performed using the QGIS software 2.18.

Explanatory variables used
After the data were pre-processed, we extracted the explanatory variables of each of the spectral bands and the data derived from the SRTM using R. The information extraction was done from the polygons of the 25 plots in the forest inventory.
Three perspectives were considered when fitting the regression models: (1) only the data from the MSI sensor as explanatory variables; (2) only the SRTM mission data; and (3) the combined data from both sources. In the first, the explanatory variables were the MSI spectral bands 2, 3, 4 and 8 and six vegetation indices: normalized difference vegetation index (NDVI) (Rouse et al., 1973), simple ratio vegetation index (SR) (Jordan, 1969), soil-adjusted vegetation index (SAVI) (Huete, 1988), modified soil-adjusted vegetation index (MSAVI) (Qi et al., 1994), enhanced vegetation index (EVI) (Justice et al., 1998), and transformed vegetation index (TVI) (Rouse et al., 1973). In the second, the explanatory variables were drawn from the SRTM: slope (%), altitude (m), roughness and slope orientation (°). Finally, in the third modeling perspective, the MSI and SRTM data were combined and used as the explanatory variables.
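For reference, the six vegetation indices can be computed from band reflectances as in the sketch below. The formulas are the commonly cited forms of each index (SAVI with a soil factor of 0.5, the closed-form MSAVI, and TVI as the square root of NDVI + 0.5). The exact variants and constants used in the study are not stated in the text, so these are assumptions, as are the function name and the reflectance values.

```python
import math

def vegetation_indices(blue, red, nir, soil_factor=0.5):
    """Return common vegetation indices from surface reflectances (0 to 1)."""
    ndvi = (nir - red) / (nir + red)
    return {
        "NDVI": ndvi,
        "SR": nir / red,
        "SAVI": (1 + soil_factor) * (nir - red) / (nir + red + soil_factor),
        "MSAVI": (2 * nir + 1
                  - math.sqrt((2 * nir + 1) ** 2 - 8 * (nir - red))) / 2,
        "EVI": 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1),
        "TVI": math.sqrt(ndvi + 0.5),
    }

# Made-up reflectances for a vegetated pixel (bands 2, 4 and 8 of the MSI).
for name, value in vegetation_indices(blue=0.05, red=0.10, nir=0.50).items():
    print(name, round(value, 4))
```

For plot-level modeling, each index would typically be averaged over the pixels that fall inside the plot polygon.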
The field sampling data were linked to the remote sensing data using the geographic coordinates of each plot's vertices, recorded with a Garmin GPSMAP 76CSx GPS receiver with an error of around 10 m.

Estimation of the volume of wood by regression analysis
The explanatory variables were selected and the data were modeled with the software R, version 3.3.3. The independent variables of the model were selected using the leaps package. The exhaustive search method, which tests and compares all possible combinations of the explanatory variables, was used for the selection.
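The idea behind an exhaustive search can be sketched as follows: fit an ordinary least squares model for every combination of candidate predictors and keep the combination with the best criterion. This is a brute-force illustration in plain Python, not the leaps package, which relies on a more efficient search. The choice of adjusted R² as the criterion, the function names and the data are all assumptions.

```python
from itertools import combinations

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def adjusted_r2(columns, y):
    """Fit y on the predictor columns plus an intercept; return adjusted R^2."""
    n, p = len(y), len(columns)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    width = p + 1
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(width)]
           for j in range(width)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(width)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - (ss_res / ss_tot) * (n - 1) / (n - p - 1)

def best_subset(predictors, y, max_size):
    """Try every predictor combination up to max_size; keep the best score."""
    best = (float("-inf"), ())
    for size in range(1, max_size + 1):
        for names in combinations(sorted(predictors), size):
            score = adjusted_r2([predictors[name] for name in names], y)
            best = max(best, (score, names))
    return best

# Made-up plot-level data: three candidate predictors and a volume response.
predictors = {
    "ndvi": [0.61, 0.72, 0.55, 0.80, 0.67, 0.74, 0.59, 0.77],
    "slope": [12.0, 30.0, 8.0, 25.0, 18.0, 35.0, 10.0, 22.0],
    "altitude": [110.0, 150.0, 95.0, 170.0, 130.0, 160.0, 100.0, 155.0],
}
volume = [210.0, 290.0, 180.0, 340.0, 250.0, 310.0, 200.0, 320.0]
print(best_subset(predictors, volume, max_size=2))
```

With k candidate predictors the number of models grows as 2^k, which is why dedicated packages prune the search rather than fitting every model explicitly.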
After selecting the explanatory variables, the model was fitted (Expression 1). For each of the three modeling perspectives, all possible combinations were considered and the equation with the best statistical results was selected.
To validate the models, leave-one-out cross-validation, a special case of the k-fold technique, was used. At this stage of data processing, the Statistics and Machine Learning Toolbox™ of MATLAB version R2017a was used.
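Leave-one-out cross-validation can be sketched as follows: each plot is held out in turn, the model is refitted on the remaining plots, and the held-out plot is predicted. For brevity the sketch uses a single predictor and a straight-line fit instead of the multiple regression and MATLAB toolbox of the study. The function names and data are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def loocv_rmse_percent(x, y):
    """Leave-one-out RMSE, expressed as a percentage of the observed mean."""
    squared_errors = []
    for i in range(len(x)):
        # Refit without observation i, then predict it.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        prediction = intercept + slope * x[i]
        squared_errors.append((y[i] - prediction) ** 2)
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    return 100.0 * rmse / (sum(y) / len(y))

# Made-up data: one predictor and observed volumes for six plots.
ndvi = [0.55, 0.61, 0.67, 0.72, 0.77, 0.80]
volume = [180.0, 215.0, 240.0, 300.0, 310.0, 345.0]
print(loocv_rmse_percent(ndvi, volume))
```

With n plots the model is fitted n times, which is affordable for small samples such as the 25 plots used here.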

Evaluation of the fit and validation of equations
To assess the performance of the fitted equations chosen to estimate the wood volume, the adjusted coefficient of determination (R²) (Expression 2) and the root mean square error (RMSE%) (Expression 3) were used.
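For reference, the textbook forms of these two statistics are given below. They are an assumption about what the cited expressions contain, since the expressions themselves are not reproduced in this text and the original may differ (for instance, in the denominator used for the RMSE). Here n is the number of plots, p the number of explanatory variables, y_i the observed volume, ŷ_i the estimated volume and ȳ the mean observed volume.

```latex
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1},
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}
               {\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}

\mathrm{RMSE}\% = \frac{100}{\bar{y}}
\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}
```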

RESULTS
The traditional inventory performed by systematic sampling using the 25 fixed-area plots (0.1 ha each) produced a mean volume of 276.46 m³ ha⁻¹ and a total volume of 98,971.03 m³ in 358 ha. The relative sampling error obtained was 14.30%. Table 1 displays the fitted equations used to estimate the wood volume, the explanatory variables selected and the estimated parameters for each equation. All parameters were significant (p < 0.05) by the t test, demonstrating that the variables selected explained the variations observed in the wood volume in the Submontane Semidecidual Seasonal Forest of RPPN Cafundó. Table 2 shows the fit and cross-validation statistics of the equations selected to estimate the wood volume. Notably, the cross-validation results of the selected equations were less accurate than those of the fit, indicating that the models generalize poorly.
From the perspective of accuracy, the equation that combined the MSI and SRTM variables showed the best performance. Figure 2 shows the graphical analyses of the estimated versus observed wood volumes and the residual dispersion for the fit and cross-validation data.
The graphs clearly show no bias in the estimates. In the residual plots for the cross-validation, a substantial dispersion of the residuals around the zero-error line is evident.

DISCUSSION
According to Pandit et al. (2018), the data from the Sentinel-2 satellite are readily accessible and have a greater number of high-resolution spectral bands than commercial data of moderate resolution, for example, ASTER data. This makes them feasible for forestry studies, particularly for large areas and where financial resources are a constraint, because these data are available at no cost.
Therefore, the use of spectral information from the MSI Sentinel-2 satellite sensor has increased significantly in forest studies, for example, in the works developed by Fernández-Manso et al. Earlier studies that combine remote sensing techniques with forest inventory data to predict wood volume are available, in which each sensor, combined with the intrinsic features of the forest stands studied, produces different types of responses. Examples include the works of Maack et al. (2016), Magnussen et al. (2018), Matasci et al. (2018), Saarela et al. (2015), and Takagi et al. (2015).
A few studies demonstrate the influence of topographic conditions on vegetation behavior. Together with soil, climate, geology and anthropogenic interventions, these conditions produce flora and fauna with characteristics unique to a specific ecosystem (Bispo, 2007), directly influencing vegetation development. Castillo et al. (2017) quantified the biomass in a mangrove region by combining the red spectral bands (bands 4, 5 and 7) from Sentinel-2 with the SRTM elevation data, and obtained an RMSE value of 28.47 Mg ha⁻¹.
In a study conducted in the Tapajós National Forest using the SRTM variables (altitude, slope, slope orientation, as well as the horizontal and vertical curvatures) to estimate the basal area, canopy opening and tree height, it was evident that altitude was the variable which gave the greatest explanatory capacity for the vegetation structure (Bispo et al., 2016).
Using the SRTM data combined with data from the Hyperion sensor of the EO-1 satellite to estimate wood volume and to analyze whether the shading caused by the relief influences this quantification, Canavesi et al. (2010) found R² values varying from 0.589 to 0.709 and, in the validation process, an average error of 200 m³ per hectare. According to the authors, these results confirm that the topographic conditions of the study area influence the quantification of the volume of wood and, therefore, the relations between the biophysical and radiometric parameters. Bispo (2012) emphasized the importance of using geomorphometric variables as an additional data source in biomass modeling in Central Amazonia, alongside the data drawn from remote sensing. Among the three fitted models, a considerable improvement was obtained by using the combined sensor and SRTM data. The model with the isolated polarimetric data obtained an R² value of 0.35, the second model with the isolated geomorphometric data (elevation and slope) resulted in an R² value of 0.57, and the third, with the combined data sources, produced an R² value of 0.74 and an error of 33.15 t ha⁻¹ or 15.78%.
On testing the use of the data drawn from the Sentinel-2 MSI sensor for predicting the volume of wood, Chrysafis et al. (2017) found an R² value of 0.13 and RMSE of 97.95 m³ ha⁻¹ with the use of the NDVI index, and for EVI the R² value was 0.31 and RMSE was 87.25 m³ ha⁻¹. In two Italian forest areas, located in Tuscany and Lazio, using the Sentinel-2 satellite images to analyze the growth in volume of wood, a mean RMSE of less than 19% was obtained for the areas studied (Mura et al., 2018).
Studies using Landsat TM data to estimate the wood volume for different forest formations reported an R² value of 0.31 and RMSE of 56% (Hyyppä et al., 2000) and an RMSE of 47.6% (Mäkelä & Pekkarinen, 2004). Using the reflectance values from Landsat ETM+ images, an R² of 0.71 and RMSE of 74.7 m³ ha⁻¹ were obtained by Hall et al. (2006), and an R² of 0.43 and RMSE of 97.4 m³ ha⁻¹ by Mohammadi et al. (2010).
For an area of Cerrado stricto sensu, using the multispectral images of the Landsat 8 OLI sensor to estimate the volume of wood, an R² value of 0.49 was obtained (Santos et al., 2017). In studies on Eucalyptus sp. stands using the TM images from Landsat 5, Berra et al. (2012) found R² values ranging from 0.61 to 0.68, while Barros et al. (2015) found R² values ranging from 0.12 to 0.38.
The precision measures of the selected fitted equations, the RMSE in particular, were not very favorable, with an average error greater than 20% in the fit and 26% in the validation. This statement needs to be read carefully because, compared with several results reported in the similar works mentioned above, this result would not be considered poor. On the other hand, in forest measurement an error of 20% is still considered high, and the larger errors obtained in the validation indicate that the model does not generalize well.
The traditional inventory performed, despite being systematic, considered the population as homogeneous, that is, with no stratification. In this sense, post-stratifying the population and considering this variable in the construction of the model can, from the perspective of precision, raise the performance of the estimates produced. This is because, as evidenced by the results, the fitted equation that combined the MSI and SRTM data was more accurate than the equations fitted separately to the MSI and SRTM data.
Gonçalves AFA, Fernandes MRM, Silva JPM, Silva GF, Almeida AQ, Cordeiro NG et al.
In general, whenever the need to perform forest inventories arises, two inescapable questions emerge: cost and accuracy. The remote sensing technique presented in this work can offer great assistance because of its ability to cut costs and produce results in less time when dealing with larger areas. Still, uncertainty persists regarding the accuracy and the degree to which the results obtained can be trusted.
As this methodology is relatively new and has great potential for cost reduction, starting with an average error in the 20% range is not necessarily bad. In fact, this research has been a challenge that has inspired many researchers to build on the advantages of the remote sensing techniques mentioned earlier, thus increasing their accuracy levels.
In this context, some of the research alternatives to achieve this goal include the evaluation of other sensors, active or passive, to obtain better relationships. Among the active sensors, LiDAR, widely used for conducting forest surveys, is worthy of mention, its high cost being the main limitation. Another prospect is to test lower-cost technologies, such as Unmanned Aerial Vehicles, or possibly hyperspectral images with higher spatial and radiometric resolutions, searching within each of these alternatives for a satisfactory combination of lower cost and greater accuracy.

CONCLUSION
The combined data from the MSI Sentinel-2 satellite sensor with the SRTM data revealed the best prediction of the volume of wood per hectare for the Atlantic Forest vegetation analyzed.
The geomorphometric characteristics of the study site should not be disregarded during the spectral characterization of the forest dendrometric variables.
The blue band, vegetation indexes EVI, NDVI and MSAVI, as well as the slope and altitude variables, possess the potential for use in the construction of forest inventory in native forests possessing characteristics similar to the one investigated in this study.