Monitoring the understory in eucalyptus plantations using airborne laser scanning

In eucalyptus plantations, the presence of understory increases the risk of fires, acts as an obstacle to forest operations, and leads to yield losses due to competition. The objective of this study was to develop an approach to discriminate the presence or absence of understory in eucalyptus plantations based on airborne laser scanning surveys. The bimodal canopy height profile was modeled by two Weibull density functions: one to model the canopy and the other to model the understory. The Weibull parameters used as predictors in the logistic model successfully discriminated the presence or absence of understory. The logistic model composed of g_canopy, g_understory, and b_understory showed higher values of accuracy (0.96) and kappa (0.92), which indicates an adequate classification of the presence and absence of understory. Weibull parameters could be used as input in the logistic regression to effectively identify the presence and absence of understory in eucalyptus plantations.


Introduction
Forest plantations provide edaphic and microclimatic conditions that benefit the development of weeds in the understory (Vasic et al., 2012). Competition for site resources (water, light, and nutrients) from understory in the early stages of the forest causes productivity losses (Carrero et al., 2018; Rubilar et al., 2018). In late stages, the understory hinders silvicultural treatments, monitoring activities (e.g., inventory), and harvesting operations, and increases the risk of fires (Souza et al., 2007).
Understory control is usually applied during the early stages of the forest to improve the establishment of trees, ensuring full access to growth factors (Kogan et al., 2002; Liechty and Fristoe, 2013; Vargas et al., 2018; Zhou et al., 2018). The most common practices to control understory development in forest plantations are hoeing and herbicide application (Silva et al., 2012; Toledo et al., 2000).
Assessment of understory intensity in forest plantations is commonly based on visual and empirical monitoring. Recent studies have investigated the capability of remote sensing (RS) for monitoring (Hamraz et al., 2017a, b; Martinuzzi et al., 2009; Sumnall et al., 2016) and classification of understory (Hung et al., 2014). Several approaches have been used to estimate forest physical characteristics (e.g., volume, biomass, stand density) from light detection and ranging (LiDAR) point clouds, such as regression models (Sumnall et al., 2016), probability density function modeling (Coops et al., 2007), and machine learning (Singh et al., 2015). Discrete-return LiDAR is capable of characterizing vertical forest structure, including understory layers (Hamraz et al., 2017a, b; Sumnall et al., 2016). The canopy height profile (CHP) describes the vertical structure by the return frequency distribution within the canopy profile, from the ground to the maximum height. This study aims to develop a machine-learning approach based on logistic regression to classify the presence or absence of understory from the CHP extracted from airborne laser scanning data.

Materials and Methods
Four forest plantation sites in the Rio Doce basin, Minas Gerais State, Brazil (Figure 1), were considered in this study. The sites belong to a private company that produces cellulose from eucalyptus fiber and comprise legal reserves, areas of permanent protection, and eucalyptus plantations. Legal reserves and areas of permanent protection are regions of native vegetation protected by law (Machado and Anderson, 2016). This study focuses on plantation areas.
The municipalities of Marliéria, Dionísio, and Açucena have a tropical climate with dry winters (Köppen Aw). The municipality of Antônio Dias has a humid subtropical climate with dry winters and temperate summers (Cwb). The climate of Nova Era and Bela Vista of Minas is characterized by hot summers and dry winters (Cwa). All sites have annual rainfall ranging from 1361 mm to 1520 mm and mean annual temperatures ranging from 19.8 °C to 21.8 °C (Alvares et al., 2013).
Area I comprised 96 ha of 23-year-old eucalyptus stands. Area II consisted of 21-year-old eucalyptus stands, covering 47 ha. Area III was composed of 17-year-old eucalyptus stands planted on 222 ha. Finally, Area IV comprised 242 ha of 5-year-old eucalyptus stands. The highest values for tree height and diameter at breast height (DBH) were found in the older stands (Areas I and II). The youngest stands are in Area IV, which had the lowest height and DBH values (Table 1).

Research Article Understory in eucalyptus plantation
Sci. Agric. v.78, n.1, e20190134, 2021
In the GIS environment, we randomly sampled 50 circular plots of 16.93 m radius (900 m²) in each site, resulting in 200 plots. None of the plots overlapped. The sites were visited and the presence or absence of understory was visually determined by walking through the stands, following the company standards. During regular company operations (e.g., ant control, fertilization, or forest inventory), workers visually identify understory presence and report it to the responsible sector. The company then assigns a specialized group to the area to assess understory intensity and species community. Due to limitations of LiDAR technology discussed later in this article, this study did not aim to evaluate the species community of the understory.
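The plot layout can be illustrated with a short rejection-sampling sketch. This is only an illustration under stated assumptions: the stand is approximated by a rectangle (the real stand polygons and the GIS tool used are not described in the text), and the function name `sample_plots` and its arguments are hypothetical. A candidate centre is accepted only if it lies at least two radii from every centre already accepted, which guarantees that no two plots overlap.

```python
import math
import random

def sample_plots(width, height, n_plots=50, radius=16.93, seed=0, max_tries=100000):
    """Rejection-sample centres of non-overlapping circular plots in a rectangle.

    The rectangle (width x height, in metres) is an assumed stand shape.
    """
    rng = random.Random(seed)
    centres = []
    tries = 0
    while len(centres) < n_plots:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place all plots; area may be too small")
        # Keep the whole plot inside the rectangle.
        x = rng.uniform(radius, width - radius)
        y = rng.uniform(radius, height - radius)
        # Two circles of equal radius overlap when centres are closer than 2r.
        if all(math.hypot(x - cx, y - cy) >= 2 * radius for cx, cy in centres):
            centres.append((x, y))
    return centres

centres = sample_plots(width=1000.0, height=1000.0)
plot_area = math.pi * 16.93 ** 2  # close to the 900 m² plot size
```

The 16.93 m radius corresponds to a plot area of about 900 m², which is the plot size stated above.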
The airborne laser scanning (ALS) campaign was conducted in 2014, in the same period as the field inspection. The ALS clouds had average return densities of 13.51, 9.26, 9.23, and 7.87 pts m⁻² for Areas I, II, III, and IV, respectively.
We applied an outlier filter to the original ALS clouds to remove the returns outside the range of four standard deviations (McGaughey, 2015). The ground returns were classified using the Kraus and Pfeifer (1998; 2001) algorithm, considering an 8-meter window size (Andrade et al., 2018) and the parameters recommended by McGaughey (2015). Digital terrain models (DTM) with 1-meter resolution were created by averaging the ground points within each pixel. The ALS clouds were normalized by subtracting the elevation of the corresponding DTM pixel from each return elevation (Popescu and Wynne, 2004).
From the normalized clouds, we clipped the same 200 plots inspected in the field and computed the distribution of returns along the canopy height, i.e., the CHP. To reduce the influence of the growth stage (young stands with smaller trees and old stands with taller trees) on the logistic modeling, we rescaled the CHP to the range between zero (ground) and one (maximum height). While processing the data, we observed visually that the normalized CHPs shared a common division at half of the normalized height, producing consistently bimodal profiles (Figure 2). Due to this behavior, we established the value 0.5 as a breaking point to split the CHP into two parts: canopy and understory. One Weibull model was fitted to the canopy (returns above 0.5) and another to the understory (returns below 0.5) (Figure 2). The two-parameter Weibull function (Equation 1) is commonly used to model the CHP due to its flexibility in representing different distribution shapes (Coops et al., 2007; Silva et al., 2015).
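The preprocessing chain described above can be sketched in a few lines. This is a simplified illustration, not the software used in the study: it assumes returns are `(x, y, z)` tuples, that ground returns have already been classified, and that the outlier filter uses the global mean and standard deviation of elevations (the tool cited in the text may apply the rule differently). The function names are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev

def filter_outliers(points, k=4.0):
    """Keep returns whose elevation lies within k standard deviations of the mean."""
    zs = [p[2] for p in points]
    mu, sd = mean(zs), pstdev(zs)
    return [p for p in points if abs(p[2] - mu) <= k * sd]

def build_dtm(ground_points, cell=1.0):
    """Average the ground elevations that fall inside each cell x cell pixel."""
    cells = defaultdict(list)
    for x, y, z in ground_points:
        cells[(math.floor(x / cell), math.floor(y / cell))].append(z)
    return {key: mean(zs) for key, zs in cells.items()}

def normalize(points, dtm, cell=1.0):
    """Subtract the DTM pixel elevation from each return elevation.

    Returns over pixels with no ground point are skipped (a simplification;
    a real workflow would interpolate the terrain there).
    """
    out = []
    for x, y, z in points:
        key = (math.floor(x / cell), math.floor(y / cell))
        if key in dtm:
            out.append((x, y, z - dtm[key]))
    return out
```

After normalization, the third coordinate of each return is its height above ground, which is the quantity summarized by the CHP.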
$$f(x) = \frac{g}{b}\left(\frac{x}{b}\right)^{g-1}\exp\left[-\left(\frac{x}{b}\right)^{g}\right] \qquad (1)$$
where: f(x) is the Weibull probability density function of the lidar plot heights (x), with shape (g) and scale (b) parameters. The Weibull coefficients (g_canopy, b_canopy, g_understory, and b_understory) were used as predictor variables in the logistic modeling, and the understory classified in the field (presence or absence) as the response variable (Equation 2).
$$\mathrm{logit}(P) = \ln\left(\frac{P}{1-P}\right) = a_0 + a_1\, g_{\text{canopy}} + a_2\, b_{\text{canopy}} + a_3\, g_{\text{understory}} + a_4\, b_{\text{understory}} \qquad (2)$$
where: logit(P) is the log-odds of understory occurrence, P is the probability of understory occurrence, g_canopy, b_canopy, g_understory, and b_understory obtained from the Weibull distribution modeling are the predictor variables, and a_0 to a_4 are the logistic regression parameters. To validate the model, we used the stratified ten-fold cross-validation method (Singh et al., 2015).
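The extraction of the four Weibull predictors from one plot can be sketched as follows. This is a minimal illustration under several assumptions that the text does not settle: returns exactly at 0.5 are assigned to the canopy, non-positive heights are dropped before fitting (the logarithm requires x > 0), each part is fitted on its rescaled heights without any shift, and the parameters are estimated by maximum likelihood using bisection. The function names and the synthetic heights are hypothetical.

```python
import math
import random

def rescale_heights(heights):
    """Rescale normalized heights to [0, 1]: 0 is the ground, 1 the maximum height."""
    h_max = max(heights)
    return [max(h, 0.0) / h_max for h in heights]

def split_profile(scaled, split_at=0.5):
    """Split the rescaled returns into understory (< 0.5) and canopy (>= 0.5)."""
    understory = [s for s in scaled if s < split_at]
    canopy = [s for s in scaled if s >= split_at]
    return understory, canopy

def fit_weibull(sample, tol=1e-9):
    """Maximum-likelihood shape g and scale b of a two-parameter Weibull."""
    xs = [x for x in sample if x > 0]
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / len(xs)
    x_max = max(xs)

    def score(g):
        # Likelihood equation for the shape. Weights are divided by x_max**g
        # so that large g cannot underflow every term.
        w = [(x / x_max) ** g for x in xs]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / g - mean_log

    lo, hi = 1e-3, 1.0
    while score(hi) < 0 and hi < 1e3:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:                # bisection; score() increases with g
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    g = 0.5 * (lo + hi)
    b = x_max * (sum((x / x_max) ** g for x in xs) / len(xs)) ** (1.0 / g)
    return g, b

def plot_predictors(heights):
    """Return the four Weibull coefficients used as logistic predictors."""
    understory, canopy = split_profile(rescale_heights(heights))
    g_u, b_u = fit_weibull(understory)
    g_c, b_c = fit_weibull(canopy)
    return {"g_canopy": g_c, "b_canopy": b_c,
            "g_understory": g_u, "b_understory": b_u}

# Synthetic plot: a low layer plus a canopy layer around 25 m (made-up values).
rng = random.Random(42)
heights = ([rng.weibullvariate(3.0, 1.5) for _ in range(300)]
           + [rng.gauss(25.0, 2.0) for _ in range(700)])
predictors = plot_predictors(heights)
```

Rescaling changes only the scale parameter b, not the shape g, so the shape coefficients remain comparable between young and old stands.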
Logistic regression is a machine-learning algorithm that estimates the probability of an event occurring (Dangeti, 2017). It is based on classical statistical assumptions (Singh et al., 2015) and returns a statistical report that makes interpretation easier (Cutler et al., 2007). The logistic regression parameters were analyzed using a z-test to confirm their ability to classify the occurrence of understory in the eucalyptus stands. The non-significant predictors were removed and the logistic regression parameters were estimated again. The significance of the logistic regression parameters indicates the capacity of the predictors to distinguish the presence or absence of understory (Smart et al., 2012). The modeling was evaluated using the Akaike Information Criterion (AIC), analysis of variance, accuracy, and the kappa agreement index.
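A compact sketch of a logistic classifier evaluated with stratified k-fold cross-validation is given below. It is illustrative only: the coefficients are estimated by plain gradient ascent on the log-likelihood rather than by the iteratively reweighted least squares used in most statistical packages, a probability threshold of 0.5 is assumed, and the z-tests and AIC mentioned above are not computed. All function names are hypothetical, and the statistical software used in the study is not stated in the text.

```python
import math
import random

def sigmoid(t):
    """Numerically safe inverse logit."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def linear(a, xi):
    return a[0] + sum(aj * xj for aj, xj in zip(a[1:], xi))

def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Gradient ascent on the mean log-likelihood; returns [a0, a1, ...]."""
    n, k = len(X), len(X[0])
    a = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(linear(a, xi))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        a = [aj + lr * gj / n for aj, gj in zip(a, grad)]
    return a

def predict(a, xi, threshold=0.5):
    """Classify as presence (1) when the estimated probability reaches the threshold."""
    return 1 if sigmoid(linear(a, xi)) >= threshold else 0

def stratified_folds(y, k=10, seed=0):
    """Assign indices to k folds so each fold keeps similar class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, yi in enumerate(y) if yi == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def cross_validated_accuracy(X, y, k=10, **fit_kwargs):
    """Fit on k-1 folds, predict the held-out fold, and pool the results."""
    correct = 0
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train = [i for i in range(len(y)) if i not in held_out]
        a = fit_logistic([X[i] for i in train], [y[i] for i in train], **fit_kwargs)
        correct += sum(predict(a, X[i]) == y[i] for i in test_idx)
    return correct / len(y)
```

Here each row of `X` would hold the Weibull coefficients of one plot and `y` the field classification (1 for presence, 0 for absence). Stratification keeps the share of presence and absence plots similar in every fold.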

Results
The CHP from plots with dense understory showed a bell-shaped curve for both canopy and understory. For plots with no understory, the canopy revealed a bell-shaped curve, whereas the understory showed a negative exponential curve (Figure 3).
The logistic coefficient associated with the b_canopy variable in model 1 (Table 2) was not significant (p > 0.05). After removing the non-significant variable, the logistic model 2 (Table 2) was significant (p < 0.01) in discriminating the presence or absence of understory. Each variable contributed significantly (p < 0.001) to reducing the deviance compared to the model with only the intercept. Additionally, the lower AIC of model 2 (79.22) reinforces its better fit in comparison to model 1 (80.06). The coefficients of model 2 showed that the probability of understory occurrence decreased as b_understory and g_canopy increased. In contrast, increasing g_understory increased the probability of understory presence (Table 2). Model 2 showed higher values of accuracy (0.96) and kappa (0.92), which indicates an improvement in the discrimination power (Table 2). Model 2 can be used to adequately classify the presence (specificity = 0.95) and absence (sensitivity = 0.95) of understory. The confusion matrix showed 95 plots correctly classified with understory presence and 97 correctly classified with understory absence. Five plots resulted in false positives and three in false negatives, based on model 2.
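The reported accuracy and kappa can be reproduced from the confusion matrix counts. The sketch below assumes that, of the 200 plots, 95 are true positives, 97 are true negatives, five are false positives, and the remaining three are false negatives. The function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Overall accuracy and Cohen's kappa from a 2 x 2 confusion matrix."""
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the predicted and observed totals.
    p_presence = ((tp + fp) / n) * ((tp + fn) / n)
    p_absence = ((tn + fn) / n) * ((tn + fp) / n)
    p_chance = p_presence + p_absence
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return accuracy, kappa

acc, kappa = classification_metrics(tp=95, tn=97, fp=5, fn=3)
print(round(acc, 2), round(kappa, 2))  # 0.96 0.92
```

With these counts, the agreement expected by chance is 0.5, so kappa equals (0.96 - 0.5) / (1 - 0.5) = 0.92.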

Discussion
The return density of the lidar point clouds (ranging from 7.87 to 13.51 pts m⁻²) captured the bimodal distribution that usually characterizes a forest CHP (Mund et al., 2015). The CHP differentiated the presence and absence of understory mainly by the shape of the lower part of the CHP (scaled height < 0.5 in Figure 2).
A multi-layered forest blocks the laser energy pathway to the ground. In eucalyptus plantations with no understory, a bimodal distribution can be observed (Coops et al., 2007; Görgens et al., 2016). One mode is associated with the canopy (a bell-shaped curve) and another with the ground (a negative exponential curve) (Görgens et al., 2016). When understory is present in forest plantations, the lower mode generally moves up, changing the shape of the distribution into a bell-shaped curve (Figure 2).
The Weibull parameters successfully identified the presence or absence of understory, with an accuracy of 96 %. Weibull models have been an important descriptor of forest structure, reducing the CHP to a few values that can be quickly interpreted (Coops et al., 2007; Jaskierniak et al., 2011).
The b coefficient is related to the 63rd percentile of the distribution (Bailey and Dell, 1973). The b_understory coefficients presented higher values in the presence of understory, which means that laser beams were intercepted by obstacles above the ground. The g_understory coefficients revealed lower values (close to one) in the absence of understory, indicating that the distribution curve approaches an exponential form (Bailey and Dell, 1973). Wing et al. (2012) found an accuracy of 22 % using ALS-derived metrics and a logistic regression model to predict understory in a pine forest. In another study, a random forest algorithm based on ALS-derived metrics predicted the presence or absence of understory, reaching an accuracy of 83 % in a coniferous forest (Martinuzzi et al., 2009). Singh et al. (2015) found a kappa of 0.648 using a random forest to detect plant invasion in the understory of urban forests. Our accuracy and kappa were superior to those of previous studies. We attribute part of the improvement to modeling the CHP in two parts (canopy and understory) and to the use of Weibull parameters as regressors.
Currently, it is common in classification studies to apply modern machine-learning algorithms such as random forest and support vector machines. However, as suggested by Singh et al. (2015), logistic regression is based on classical statistical assumptions and has a very direct interpretation. The high values obtained for the accuracy and kappa index reinforce the suitability of the logistic regression. The kappa index confirmed a good agreement between the logistic regression classification and the field truth (Silván-Cárdenas and Wang, 2006).
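The two remarks on the Weibull coefficients, the 63rd percentile and the exponential limit, follow from the standard two-parameter Weibull with shape g and scale b. As a short worked check, assuming that parameterization, the cumulative distribution function evaluated at x = b does not depend on g:

$$F(x) = 1 - \exp\left[-\left(\frac{x}{b}\right)^{g}\right] \quad\Rightarrow\quad F(b) = 1 - e^{-1} \approx 0.632$$

so about 63 % of the returns in the fitted part lie below b. Setting g = 1 in the density gives

$$f(x) = \frac{1}{b}\exp\left(-\frac{x}{b}\right)$$

which is the negative exponential curve associated with plots without understory.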
Identification of the understory species is important to define the proper control approach (Silva et al., 2012). However, monochromatic remote sensing, such as airborne laser scanning, is not ideal for species differentiation. Hyperspectral images have been studied to fill this gap when surveyed together with ALS (Broadbent et al., 2014;Dalponte et al., 2019).
Accurate mapping of the understory is essential for planning silvicultural actions that prevent competition for growth factors. The logistic regression using the Weibull parameters as input performed well, and the final model appears appropriate to discriminate the presence or absence of understory in eucalyptus plantations. We recommend further investigation to explore the influence of different silvicultural regimes (e.g., higher tree density) and to discriminate species using complementary remote sensing techniques (e.g., hyperspectral imaging).