Airborne laser scanning applied to eucalyptus stand inventory at individual tree level

Abstract: The objective of this work was to evaluate the application of airborne laser scanning (ALS) to a large-scale eucalyptus stand inventory by the method of individual trees, as well as to propose a new method to estimate tree diameter as a function of the height obtained from point clouds. The study was carried out in a forest area of 1,681 ha, consisting of eight eucalyptus stands with ages varying from four to seven years. After scanning, tree heights were obtained using the local maxima algorithm, and total wood stock by summing up individual volumes. To determine tree diameters, regressions fit using data measured in the inventory plots were used. The results were compared with the estimates obtained from field sampling. The equation system proposed is adequate to be applied to the tree height data derived from ALS point clouds. The tree individualization approach by local maxima filters is efficient to estimate number of trees and wood stock from ALS data, as long as the results are previously calibrated with field data.


Introduction
Studies involving forest airborne laser scanning (ALS) have increased in recent decades due to their potential to extract information on terrain and biomass (Giongo et al., 2010).However, despite being an advanced technology, errors or bias in estimates are often reported and there is no consensus as to the best methodology to be employed when using ALS data.
In general, these methodologies can be classified into two widely applied approaches, based on: measurements at stand level, such as the area-based approach (ABA); and tree individualization (TI).
The ABA is based on the metrics extracted from the point clouds generated by ALS, which are correlated with forest characteristics (Bouvier et al., 2015;Guerra-Hernández et al., 2016).In this methodology, plots distributed in the field are scanned, and their data are used to fit models to estimate characteristics of interest of the entire forest (Naesset, 2002).In contrast, the TI approach is based on algorithms that identify each individual of the stand by using segmentation methods or local maxima filters, allowing to extract data related to tree height and canopy area from canopy digital surface models (Hyyppä et al., 2001(Hyyppä et al., , 2003;;Koch et al., 2006;Kaartinen et al., 2012;Vauhkonen et al., 2012;Zawawi et al., 2015).The tree data must then be individually processed to produce informationusually number of trees and wood stocks of the stands -for forest managers.
Several studies have shown high accuracy for the ABA in estimating forest parameters, with errors lower than 15% (Silva et al., 2015(Silva et al., , 2016)), and others good accuracy for the TI method in control plots (Kwak et al., 2007;Oliveira et al., 2012).However, some authors observed a certain tendency of TI approaches to underestimate canopy surface heights or number of trees in the stands (Zandoná et al., 2008;Zawawi et al., 2015).
A peculiarity of the ALS survey is its inability, to date, to directly obtain diameters at breast height (dbh) or at 1.30 m for trees flown over, because the sent pulses are intercepted by the canopies before reaching the base of the trunks.Since this variable plays a key role in forest management (Campos & Leite, 2013) and given that the ABA does not deal with trees individually, a good method to estimate this variable through ALS data becomes necessary to estimate tree volume or support other diameter studies.
The most practical alternative to solve this problem is estimating dbh using tree height data through regression models based on, for instance, the heightdiameter relationships used to estimate tree height in a traditional inventory (Soares & Tomé, 2002;Campos & Leite, 2013).In this case, it should be highlighted that, besides the accuracy in the height obtained by ALS, the chosen model is also of fundamental importance to measure dbh without the risk of residual distribution trends or inconsistencies in the natural behavior of the dbh data.Although several models have been proposed for this (Hyyppä et al., 2001;Chen et al., 2007;Cao & Dean, 2013;Oliveira et al., 2014b), there is no consensus about their application in different situations since their aim is only fitting well to data; therefore, there is a need to develop a model with a biological pattern.
In this context, a methodologically reliable and consistent individual tree approach will allow obtaining the vertical structure and wood stock of a forest by using a single survey, with the equations from previously analyzed samples.Therefore, usual inventory methods could be replaced by airborne scanning, which offers greater agility and is carried out similarly to the total enumeration inventory or census (Oliveira et al., 2014b).
Given the peculiarities of TI methods, their impact on large-scale eucalyptus plantations is still unknown and should be investigated in order to develop alternatives to bypass their idiosyncrasies.
The objective of this work was to evaluate the application of airborne laser scanning (ALS) to a large-scale eucalyptus stand inventory by the method of individual trees, as well as to propose a new method to estimate tree diameter as a function of the height obtained from point clouds.

Materials and Methods
Pre-felling inventory data of eight eucalyptus forest areas -four in a high forest system and four in coppice -were used.Each area had one or more subunits, i.e., stands, defined by different clones and/or spacing and with ages varying from four to seven years, totaling 1,681 ha.These areas belong to Celulose Nipo-Brasileira S/A, a company located in a region characterized by rough terrain, called Vale do Aço, in the state of Minas Gerais,Brazil (19º18'45"S,42º19'44"W).
These areas were chosen because of their uniform and properly conducted stands, free from damage caused by wind, pests and diseases, or fire.Twenty-two clones of the Eucalyptus grandis W.Hill x Eucalyptus urophylla S.T.Blake hybrid were planted in these areas, spaced at 3.0x3.0or 3.0x3.3m.
A total of 128 rectangular-shaped sampling plots of approximately 300 m 2 (Table 1) were used, located randomly in each one of the eight areas and corrected according to the slope of the terrain (Soares et al., 2011).In each plot, outside bark diameter at breast height (dbh) was measured for all trees.Total height (Ht) was also determined for the first three trees whose dbh was measured; then, the trees were felled, and their volumes were estimated by the Smallian method (Soares et al., 2011).These data were used to fit, for each area, height-diameter and volume equations to estimate the height and volume of non-measured trees and also to obtain the diameter and volume of trees by ALS.
The following three models were fit for each area.The model of Schumacher & Hall (1933) was chosen to estimate tree volume: ln( ) where ln is the Naperian logarithm, b i are the model parameters, dbh is tree diameter at 1.30 m, and Ht is tree height.
The diameter equations were fit using the exponential model: dbh = e b +b Ht 0 1 which is compatible with the height model of Henriksen (1950): These last two models were fit with the R software (R Core Team, 2017), and their use is justified by the fact that they follow a biological pattern similar to that of the data and have a consistent relationship with each other, as shown below: where ln is the Naperian logarithm, e is Euler's number, b i are the model parameters, dbh is tree diameter at 1.30 m height, and Ht is tree height.
To evaluate the quality of the fit to the methodology developed for obtaining tree diameter, a graphical analysis of observed versus estimated values and residual histograms was performed considering all inventory data together.The accuracy of the values estimated by each equation was determined by the root mean square error (RMSE) and bias for each area separately: where n is the number of trees, is the mean of observed values, is the observed value for tree i, and is the estimated value for tree i.
For laser scanning, the Cessna 206 model aircraft (Cessna Aircraft Company Inc., Wichita, KS, USA) was used, with a cruising speed of 198 km h -1 at a flight height of 618 m, a 60° field of view, a track width (swath) of 713 m, a footprint of 0.31 m, and a scanning frequency of 300 kHz, acquiring an average of five points per square meter.
The ALS data were pre-processed with the Inpho UASMaster software (Trimble Inc., Sunnyvale, CA, USA) in order to classify the points for the return sequence.With the ArcGIS, version 10.2.2, software (Environmental Systems Research Institute, Redlands, CA, USA), the terrain and crown points (first return) were interpolated through the triangulated irregular network by the nearest neighbor to generate the following digital elevation models (DEM) with 50cm cells: digital terrain model (DTM); canopy digital surface model (DSM); and normalized forest model (NF), generated by subtracting the values of the DSM from those of the DTM.
To locate tree canopies, the following softening filter proposed by Hyyppä et al. (2001) was applied to transform the NF into a smoothed surface (Figure 1 A  and B): The local maxima filter was than applied in a twostep sequence.In the first step, it was used to assign the highest value among the cells in a window search of 5x5 cells.In the second, a conditional tool was used to identify which cells of this new filtered surface would be the same as those of the smoothed model; these cells were then converted to vector points (Figure 1 C) to be used in the extraction of the tree height values contained in the original NF.Therefore, the attribute table of this last vector file could be exported to be used as an ALS inventory database of each area, which is a list of the vectored point (tree top) with its respective height value.
Then, the outlier height values were excluded from the database of each area.These values were produced from isolated peaks in the NF located in open regions (Figure 1 C) or on the edges of the plots caused by shrubs, broken trees or branches identified by the local maxima filter, or even by an abrupt variation in the surface generated during triangulation.However, because the database is extensive, a case-by-case investigation of these points becomes operationally unfeasible.Consequently, all values below 5 m were excluded, assuming that trees outside the investigated age range would not represent a considerable economic value, as well as those above 40, 35, 30, and 25 m for areas with 7, 6, 5, and 4 years of age, respectively.These limit values were based on the tree heights obtained from field sampling.
Next, dbh and volume were estimated using the equations fit based on the three trees in the inventory plots of each area.Then, the wood stock of each area was obtained as the sum of all tree volumes.
Because the inventory plots are not georeferenced and the area flown over is extensive, it is unfeasible to perform a tree-by-tree comparison of the estimated height and volume generated from ALS with those observed for each sample.For this reason, the ALS estimates were compared with the final results obtained from field sampling.
In order to evaluate estimated heights, distribution histograms were drawn from ALS and field sampling data using a 2-m amplitude for classes, and the bilateral Kolmogorov-Smirnov (K-S) test was performed.

Results and Discussion
The estimates for the height and volume equations have well-distributed errors (Figure 2), with a frequency near the 0% interval, and the estimated values show good adherence to the observed ones.However, the residual dispersion indicated a slight tendency to overestimate the initial diameter values.This reflects in precision estimates (Table 2), as the bias was low for the hypsometric (≤0.06 m) and volumetric (≤0.1 m 3 ) equations, but between 0.0 and 0.16 cm for the diametric ones.The RMSE also indicated a low relative error of the estimates in all cases (≤16%).It should be noted that the model of Henriksen (1950) can present negative values for extremely small diameters, which was not observed in the present study.These indicators point to the efficiency of the proposed equation system in obtaining tree diameter and height, maintaining a biologically consistent pattern when estimating values outside the limits of the fitted data.This characteristic is an advantage compared with other empirical or asymptotical models, as found in Cao & Dean (2013) and Oliveira et al. (2014b), due to the lower risk of discrepant estimates in the case of extrapolations, especially in relation to asymptotic models, which could produce values tending to infinite.
After processing the inventory using both the ALS and traditional forest inventory methods, it was verified that scanning underestimated number of trees and wood stock, even surpassing confidence intervals (Table 3).The error percentage for number of stems varied from -8 to -35%, showing a low correlation with stand age (Figure 3 A); this indicates that the factor did not significantly interfere with the performance of the local maxima filter regarding the analyzed ages.
The obtained results contrast with those presented by Oliveira et al. (2014a), who found greater counting errors of -8.9% for seven-year-old plots, without detecting a simple relationship between age and accuracy; this pattern was also not evidenced in the present study.However, considering that tree canopies tend to increase over time (Wink et al., 2012) and may even overlap, greater detection errors are likely to occur with increasing stand age; therefore, further studies are needed to investigate this.In addition, due to the   = 0 1 2 .dbh, diameter at breast height, in centimeters; Ht, tree height, in meters; V, tree volume in cubic meters; b 0 and b 1 , model parameters; and RMSE, root mean square error.Pesq. agropec. bras., Brasília, v.53, n.12, p.1373-1382, Dez. 2018 DOI: 10.1590/S0100-204X2018001200010 rough terrain, the horizontal distances between trees on sloped areas are reduced and the spacing is smaller than in flat regions.In this case, the local maxima filter window may include more than one canopy at a time, in order to omit those of lower height.Therefore, additional studies should be conducted to evaluate the effect of different cell sizes and search windows on stands in sloped areas.
Regarding volume, the errors varied between -13 and -56%, also exceeding the confidence intervals of the traditional inventory (Table 3).However, contrary to the observed for number of trees, there was a strong correlation between volume error and age (Figure 3 B), tending to decrease as the age of the stand increased.This decrease is more evident for the high forest system, which shows a much higher percentage of errors than the coppice system for younger ages, but a lower one for advanced ages.This can be explained by the variation in stand volume over time; since younger stands have a smaller wood stock, a small variation in the estimate would to a greater amount than that in the more stocked stands.This could also explain the difference between the error trends for the two evaluated systems, which present different wood inventories and confidence intervals that vary from area to area.
Despite the relative errors, tree number and estimated volume were directly proportional to those estimated from field sampling (Figure 3 C and D); therefore, the ALS estimates can be corrected by applying the correction factors established by the comparative analysis with field samples.This novel result allows forest managers to bypass the problems related to the local maxima approach or an eventual underestimation of tree height by ALS, which is usually reported in the literature (Zandoná et al., 2008;Oliveira et al., 2014b).
Furthermore, these relationships show that ALS underestimates, in absolute terms, tree number or wood stock as the reference values increase.In this case, the reduction in the number of trees detected by the local maxima filter contributes greatly to errors in the estimation of wood stock in each area, since total  volume is obtained by the sum of the volumes of all individual trees.Another factor that could interfere with wood stock estimation is the accuracy of tree height obtained by ALS.The first analysis showed that the K-S test had no effect only on area P6 (Table 4); that is, except in this area, ALS generated a horizontal structure different from that from field sampling.This result could be better understood through the analysis of histograms, which indicates a discrepancy between the two surveys for most of the studied areas (Figure 4).It should be pointed out that the height distributions derived from ALS for the older areas (P7 and C7) shifted to the right of those from the field plots, indicating a higher frequency of trees with greater heights.The opposite occurred for the other areas, indicating the detection of more trees with heights lower than those obtained from field sampling.
It should be noted that the displacement of the distribution histograms does not directly indicate whether the height of individual trees was underor overestimated, but only represents the vertical structure of the stand as a whole.This means that if the distributions happen to be equal, such as in P6, there is a strong evidence that scanning is consistent in representing the vertical structure of the forest.Another important observation, based on the boxplot graphics (Figure 4), is that the height limits identified by field sampling were exceeded by those derived from ALS, even after the removal of outlier data.
These results show that the TI method failed to accurately represent the characteristics of the forest and that the main cause of the errors was the omission of a large number of trees by the application of the local maxima filter.
This type of error is common and varies from study to study, according to the structure of the stand and is highly influenced by the method of individualization adopted (Kaartinen et al., 2012).Vauhkonen et al. (2012), for instance, evaluated several canopy detection methods for different forest types and found an average accuracy of 54 to 75% for native European forests and of 86% for a eucalyptus stand in Brazil.A similar precision of 82.8% was reported by Zandoná et al. (2008) in dense plantations of 40-year-old Pinus sp., when associating the local maxima filter with a canopy segmentation method.In another work with young plantations, Oliveira et al. (2012) obtained errors between 1.15 and 3.42% in the canopy counts of two eucalyptus plots using the local maxima method with a 5x5 cell window.The accuracy of the latter work can be attributed to the small areas of 21.71 and 21.76 ha evaluated, the larger tree spacing of 4x3 m, and planting uniformity due to the location of the stands in a flat topography region in the south of the state of Bahia, Brazil.
Several factors can lead to errors in the estimation of wood stock, particularly the number of trees detected.As shown in the present study, the larger the number of trees in an area, the greater the omission errors and the greater the deviation from the actual wood stock of the forest (Table 3 and Figure 3 C and D).In addition, the height distribution of the detected trees is important since a small difference in the number of taller trees leads to greater variations in wood stock; this happens because the individual tree volume increases exponentially in relation to height according to a specific pattern for each site (Campos & Leite, 2013).This behavior explains both the deviation of -27% in the estimate of total volume in the P6 area (Table 3), despite its vertical structure being statistically similar to that of the reference (Table 4), as well as the smaller deviations of -16 -24% in the volume estimate in the P7 and C7 areas, respectively, whose vertical structures are different from those obtained from field sampling.
Another source of error is the estimation of tree height from point clouds.Although this is related to the adopted TI methodology, a greater effect is observed on the processes that precede the development of the digital NF.If heights are determined incorrectly, their value could be assigned to a correctly identified tree, propagating this error to the database.Some of the main factors that influence this type of error are the  method of interpolation used to generate the DEM, the density of the points adopted in scanning, and the lack of correction and calibration of the DEM (Kaartinen et al., 2012;Oliveira et al., 2012;Jakubowski, 2013;Hansen et al., 2015).However, there is evidence that laser scanning tends to underestimate tree height (Gaveau & Hill, 2003;Zandoná et al., 2008;Oliveira et al., 2014b).Furthermore, alterations in canopy surfaces caused by winds should also be taken into account.According to Ataíde et al. (2015), trees on sloped terrains are more susceptible to the action of winds, especially in regions with narrow valleys.In this situation the air flow is more intense, causing the canopies of the trees to come closer in a way that makes it difficult to individualize them in the surface model; their heights can also be reduced due to stem leaning.Regarding topography, studies have shown that the higher the variation, the greater the chances of deviation in scanning accuracy (Hodgson et al., 2003;Naesset, 2004).Consequently, since the areas evaluated in the present work are in rough terrain and without a wide scan calibration, it is possible that these errors were transmitted to the volume estimates.
Another point that should be highlighted is the delimitation of the forest to be inventoried, particularly in the case of homogeneous plantations.This means that, during the pre-processing of data, it is necessary for the polygons corresponding to the plots to be reliable representations of the planted areas to be used in the DEMs, otherwise, heights corresponding to unwanted objects, such as native forests or regeneration trees, roads, firebreaks, transmission antennas, fences, among others, may be included in the inventory.Therefore, it is important for those polygons to be refined with the aid of an up-to-date image of the site or a pseudo-image generated by the scan.

Conclusions
1.The proposed equation system is adequate to be applied to the height data derived from airborne laser scanning (ALS).
2. The tree individualization approach by the local maxima filter is efficient to assess forest tree number and wood stock from ALS data, as long as its results are previously calibrated with the field inventory.

Figure 1 .
Figure 1.Digital model image of the normalized forest before (A) and after (B) the application of the smoothing filter and after the local maxima filter (C).

Figure 2 .
Figure 2. Histograms and residual dispersion and graphs of the relationship between observed and estimated values for tree diameter at breast height (dbh), height, and volume.

Figure 3 .
Figure3.Relationship between the age of the area and the error related to number of trees (A) and wood stock (B) for the high forest and coppice systems, as well as relationship between the number of trees (C) and wood stock (D) estimated from field sampling and airborne laser scanning (ALS).

Figure 4 .
Figure 4. Distribution of tree heights estimated from field sampling and airborne laser scanning (ALS) in high forest and coppice areas.

Table 1 .
Number of sampling plots per forest area.

Table 2 .
Fitted equations and respective precision statistics.

Table 3 .
Number of trees (NT) and wood stock (WS) estimated from field sampling (FS) and airborne laser scanning (ALS).

Table 4 .
Results of the Kolmogorov-Smirnov test for two samples(1).