Comparing the Performance of Ground Filtering Algorithms for Terrain Modeling in a Forest Environment Using Airborne LiDAR Data

The aim of this study was to evaluate the performance of four ground filtering algorithms to generate digital terrain models (DTMs) from airborne light detection and ranging (LiDAR) data. The study area is a forest environment located in Washington state, USA with distinct classes of land use and land cover (e.g., shrubland, grassland, bare soil, and three forest types according to tree density and silvicultural interventions: closed-canopy forest, intermediate-canopy forest, and open-canopy forest). The following four ground filtering algorithms were assessed: Weighted Linear Least Squares (WLS), Multi-scale Curvature Classification (MCC), Progressive Morphological Filter (PMF), and Progressive Triangulated Irregular Network (PTIN). The four algorithms performed well across the land cover, with the PMF yielding the least number of points classified as ground. Statistical differences between the pairs of DTMs were small, except for the PMF due to the highest errors. Because the forestry sector requires constant updating of topographical maps, open-source ground filtering algorithms, such as WLS and MCC, performed very well on planted forest environments. However, the performance of such filters should also be evaluated over complex native forest environments.


INTRODUCTION
Airborne Light Detection and Ranging (LiDAR) is an active remote sensing technology that combines laser and positioning measurements to survey and map with high accuracy objects over a given surface (Carter et al., 2012).Airborne LiDAR has been a prominent technology in the representation of surfaces, especially in the generation of digital terrain models (Montealegre et al., 2015).
The Digital Terrain Model (DTM) derived from airborne LiDAR data is a representation of the bare surface, in other words, a surface free of any manmade and/or natural objects.DTMs are very useful for the delineation of watersheds and stream network extraction and are also a good source of data to perform geological and morphological analyses, as well as road planning (Sulaiman et al., 2010).However, the main challenge to obtain an accurate representation of a bare terrain is the efficiency of correctly classifying the ground returns out of the entire LiDAR dataset.According to Susaki (2012), the main errors that occur during the filtering of ground returns in a LiDAR dataset can be classified into two categories: i) type I or omission error, when the returns belonging to the ground are classified as other objects, such as vegetation or buildings; and (ii) type II or commission error, when small objects are erroneously classified as ground points.In addition, the accuracy of digital terrain modeling also depends on several other factors: a) sensor parameters and flight characteristics, such as the LiDAR pulse density (pulses•m -2 ), which is a consequence of factors such as elevation and flight speed (Ruiz et al., 2014); b) characteristics of the land surface, such as topography and presence of dense and complex vegetation coverage (Sithole & Vosselman, 2004); and c) the processing techniques used to generate the DTM, such as the ground filtering algorithm and subsequently the interpolation method, as well as the spatial resolution of the DTM (Liu, 2008).
Although there are several options to select algorithms for filtering ground returns from airborne LiDAR datasets, few studies have addressed the performance of these algorithms for DTM generation, especially across a gradient of different types of land use and land cover commonly found in planted forest environments.Considering this theme, the main objective of this study was to evaluate the performance of four ground filtering algorithms used to generate DTMs.

Study area
The study was conducted in a forest area belonging to Capitol State Forest located in western Washington state, USA.The study area covers approximately 122 hectares (Figure 1) and, according to Andersen et al. (2005), the forest is composed of species of conifers, such as douglas-fir (Pseudotsuga menziesii (Mirb.)Franco), western hemlock (Tsuga heterophylla (Raf.)Sarg.), and western red cedar (Thuja plicata Don.).

LiDAR data acquisition and processing
The LiDAR data are available at the Pacific Northwest Research Station (PNRS) portal of the US Forest Service (USFS).The LiDAR dataset is freely available for educational purposes.The LiDAR point cloud was downloaded from the official site of the LiDAR FUSION/LDV data processing and visualization program (McGaughey, 2015).The Saab TopEye system was shipped on a helicopter that flew over the entire study area (Figure 1).According to the data description, the LiDAR survey was performed in 1999 using a discrete feedback sensor.LiDAR data collection was performed with the following specifications: flying height of 200 m; flying speed of 25 m•s -1 ; swath width of 70 m, forward tilt of ± 8 degrees; laser pulse density of 3.5 points•m -2 ; laser footprint of 40 cm, and laser pulse rate of 7000 Hz.For each return, the number of pulses, number of returns per pulse, corresponding Universal Transverse Mercator (UTM) coordinates, elevation, acquisition angle, and return intensity were recorded.

Land Use and Land Cover description
Concomitant with the LiDAR survey, aerial photographs were also acquired.These images were orthorectified by the USFS and resampled to spatial resolution of 30cm.Aiming to evaluate the performance of four ground filtering algorithms in different classes of land use and land cover, a visual classification was performed (Figure 1).The following classes were considered: Class I: Shrubland Grassland is a class in which the vegetation is dominated by grasses rather than shrubs or small trees, although they may occur sparsely in some locations.Bare soil corresponds to areas with exposed terrain.In general, this class consists of areas in which clear cutting or harvesting of the trees occurs.

Ground filtering algorithms for the filtering procedures and classifying ground returns
Four different ground filtering algorithms were selected to generate DTMs.The four ground filtering algorithms used in this study are described ahead:

i) Weighted Linear Least Squares (WLS).
The WLS algorithm is available in GroundFilter t o o l i m p l e m e n t e d i n F U S I O N / L D V.
The FUSION v.3.30 (McGaughey, 2015) is free software developed and maintained by the US Forest Service and used for LiDAR data processing and visualization.The WLS algorithm was first proposed by Kraus & Pfeifer (1997), and it combines both filtering and interpolation procedures.The chosen window size (cell size) for WLS was seven meters; ii) Multi-scale Curvature Classification (MCC).
The MCC algorithm is available in the MCC-LiDAR software.The MCC was also developed and maintained by the USFS.The MCC was first proposed by Evans & Hudak (2007), and it aims to generate DTMs from LiDAR data.The MCC algorithm operates through the elimination of points exceeding a given curvature threshold calculated over an interpolated surface known as spline (Evans & Hudak, 2007).Adjustment is performed with scale and curvature parameters.
After testing several parameters, we chose the parameters of scale and curvature of 1.5 and 0.3, respectively; iii) Progressive Morphological Filter (PMF).
The PMF algorithm was first proposed by Zhang et al. (2003), and is currently available in ALDPAT software.The ALDPAT was developed by the National Center for Airborne Laser Mapping (NCALM).It is also free software to process LiDAR datasets.The PMF filters the ground returns using the elevation differences among cells of a moving window calculated through morphological operations such as erosion and dilation (Montealegre et al., 2015;Zhang et al., 2003).In this study, after testing several parameters, the following settings were adopted: window size of 3m, initial searching radius of 1m, and declivity threshold of 0.08;   (Isenburg, 2015).LASTools is a compilation of tools for LiDAR data processing.Unlike the three aforementioned algorithms, LASTools is the only commercial algorithm.However, it can be freely used for small datasets.Based on the methodology proposed by Axelsson (2000), the PTIN divides the LiDAR dataset in small blocks.First, the algorithm selects the lowest points in each block in order to start the filtering procedure.After that, a triangulated irregular network (TIN) is built with these selected points.Next, an imaginary point is added to the point cloud, making a TIN densification.Finally, the TIN is progressively densified until all points are classified as ground or objects on this ground (Montealegre et al., 2015).
In this study, the LASground tool was executed using a window size (step size) of 5m.

Generating and comparing digital terrain models -DTMs
GridSufaceCreate, a tool implemented in FUSION/LDV, was used to create DTMs after the point cloud classification.This tool uses random points to create a grid surface on which the value of each cell is the mean elevation of all points within it (McGaughey, 2015).DTMs were generated with 1m resolution, compatible with the point density from the collected LiDAR data.The workflow to generate the DTMs is presented in Figure 2.
For comparison purposes, elevation values from 20,000 points randomly selected in the study area were extracted from the DTMs generated by each algorithm to calculate the differences and statistics between them.Comparison between the data obtained by the classifiers was based on the absolute and relative root mean square error (RMSE), according to Equation 1 and Equation 2, and the mean absolute error (MAE), according to Equation 3.

( ) ( )
where x i is the elevation value from DTM 1, y i is the elevation value from DTM 2, ˆ x is the mean elevation from DTM 1, and ˆ y is the mean elevation from DTM 2. The elevation values for each land cover class generated by the algorithms were also compared with each other by the Kolmogorov-Smirnov test using the ks.test function from the R software (R Development Core Team, 2016).

RESULTS AND DISCUSSION
The WLS, MCC, and PTIN ground filtering algorithms returned a similar classification of terrain points, whereas the PMF showed a distinct pattern in relation to the others (Figure 3).The PMF filtering generated a homogeneous density of points throughout the area, whereas the filtering output from the other algorithms resulted in regions with higher density of points on roads or open spaces and areas with lower   ,910,755 (39.31%), 1,716,276 (35.31%), 1,327,550 (27.31%) and 121,071 (2.50%) points, respectively.The PMF algorithm classified a smaller number of points as ground than the other algorithms, corroborating the findings by Sulaiman et al. (2010).
The WLS and MCC filtered point clouds presented high density of points, mainly in the eastern region of the study area, where the classes of exposed soil and low density forest land use are predominant.In this region, greater density of ground points was obtained, because interception of the laser pulse by the vegetation is reduced.The DTMs were created for each algorithm using the GridSurfaceCreate tool and the filtered ground points.Figure 4 shows the created models as well as two terrain profiles for each model, one taken in the north-south direction and another in the east-west direction of the study area.In general, the generated DTMs are very similar.By the Kolmogorov-Smirnov test, the elevation distributions from the DTMs generated by the algorithms were statistically similar (Ks: D < 0.01; p-valor > 0.8) at a significance level Difference maps were generated by subtracting the different DTMs to highlight the similarities and divergences between the algorithms (Figure 5).The DTM generated by the PMF is the most distinct from the others, mainly in relation to those generated by the WLS and MCC algorithms.The differences in elevation between the DTMs generated by the PMF and the other algorithms are usually positive; this means that the PMF tended to underestimate the DTM elevation values.In contrast, the MCC algorithm presented results very similar to those of the WLS and PTIN algorithms, and the WLS presented only small differences compared with the PTIN.It is also possible to highlight that most differences between WLS and MCC are negative.In other words, the MCC algorithm presents higher elevation values, contrasting with the results generated by the PMF.
Regarding the land use and land cover classes, greater difference was observed in the DTMs from class I (Shrubland), followed by classes V (Grassland), VI (Bare Soil), and III (Open canopy forest).We observed that the differences in DTMs vary between classes and algorithms, according to the results of RMSE and MAE (Table 1).
The Shrubland area (Class I) presented the highest RMSE values, with 0.37m, 0.36m, and 0.35m, respectively, for the comparison of PMF in relation to MCC, WLS, and PTIN.In general, the PMF algorithm presented the greatest differences in relation to the others, both for RMSE (m and %) and MAE, and among  The PMF algorithm tends to underestimate terrain elevation, as observed for the relation with the other DMTs.In addition, some authors have observed that one of the main problems of the PMF algorithm is that it assumes a constant slope on the area (Hui et al., 2016), a very difficult condition to be observed in forest environments, whether native forest or commercial plantations.The same assumption is valid for the study area, which presents flat regions in the center and abrupt inclinations in the outermost areas.We observed that the largest errors occurred in the areas with the highest slopes, regardless of the land use and land cover class.
The lowest absolute RMSE values were observed in the bare soil, in the comparison between WLS and MCC with RMSE of 0.10 m, and between MCC and PTIN with 0.12 m, respectively.In general, considering both the grassland and bare soil classes, the differences between the DTMs from the MCC, PTIN and WLS algorithms were small, but they were greater when compared with the DTM from the PMF algorithm.As for the MAE values, the only comparison that presented positive values was between WLS and MCC for the  first two land use and land cover classes, whereas for the other classes, the values were null.The differences between the other algorithms for all other classes were negative in all cases, which means that MCC and WLS estimate elevation values slightly higher than the other algorithms.Although there are variations between the DTMs, the results are very similar, with a maximum 0.06% MAE and; therefore, the algorithms present only punctual differences.

CONCLUSIONS AND RECOMMENDATIONS
The four ground filtering algorithms (WLS, MCH, PMF, and PTIN) were able to properly eliminate objects above the ground in a forest environment, as the terrain models presented very small divergences.Considering that the forest sector requires updated topography data for planning and management purposes, we conclude (4.72ha); Class II: Closed canopy forest (canopy cover ≥ 70%; 35.85ha);Class III: Open canopy forest (canopy cover ≤ 40%; 19.75ha); Class IV: Intermediate canopy forest (40% ≥ canopy cover ≤ 70%; 35.57ha);Class V: Grassland (3.89ha); and Class VI: Bare soil (21.93ha).Shrubland is a vegetation community characterized by shrubs, also including small sparse trees.The open, intermediate and closed canopy forest classes are related to tree density and canopy coverage after silvicultural interventions.

Figure 1 .
Figure 1.Study area location in the USA (A), orthorectified aerial photography of Capitol Forest and land cover classes (B).

Figure 2 .
Figure 2. Flowchart of the LiDAR data processing.
Comparing the Performance of Ground... Floresta e Ambiente 2018; 25(2): e20160150 density of points, such as the regions of closed canopy forest.From the 4,859,996 original points, the WLS, MCC, PTIN and PMF algorithms classified as ground 1
Comparing the Performance of Ground... Floresta e Ambiente 2018; 25(2): e20160150 these differences, the smallest one occurred in relation to the PTIN algorithm.

Figure 5 .
Figure 5. Differences in elevation for the DTMs in classes of land cover.
The PTIN algorithm is available in the LASground tool found in LASTools

Table 1 .
RMSE and MAE of pared DTMs across land use and land cover types.= number of observations; RMSE = Root mean square error; MAE = Mean absolute error; SD = Standard deviation. N