Multilevel mixed-effect models to predict wood volume in a hyperdiverse Amazon forest

Accurate wood volume predictions are critical in hyperdiverse forests because each species has specific size and shape traits. Although generic models at a multispecies level were widely used in Amazonian managed forests, they are subject to more significant bias due to interspecific variability. We used an extensive database of wood volume collected in managed forests to test the hypothesis that generic models violate the independence assumption due to that predictions vary with species-specific size. Our hypothesis was proved as residuals of the generic model were conditioned to species and specific size. The multilevel models were more accurate both in fitting and validation procedures, and accounted for variance derived from species and specific size, providing a more reliable prediction. However, we found that the size-specific models have a similar predictive ability to species-specific models for new predictions. This implies more practical estimates in hyperdiverse forests where fitting species-specific models can be complex. The findings are crucial for sustainable forest management as they allow for more reliable wood volume estimates, leading to less financial uncertainty and preventing damage to forest stocks through under or over-exploitation.


INTRODUCTION
Accurate predictions of individual tree volume are essential for forest management planning.These estimates are critical in hyperdiverse forests since each species has specific size and shape traits.Given this, local-specific volume models are required by the Brazilian forestry law (Brasil 2009) to control wood volume derived from managed forests in the Amazon (Leão et al. 2021).Allometric models in hyperdiverse forests are classified into two categories: species-specific, which often provides more accurate estimates, and generic or multispecies, which groups several species in a single equation and may lead to biased estimates due to the high interspecific allometry variability (Bojórquez et al. 2020;Magalhães et al. 2021).Some recent studies proved the greater accuracy of species-specific equations for volume predictions in Amazonian managed forests (Santos et al. 2020;Silva et al. 2022).However, it is still unclear how species may affect volume estimates, and further studies are needed to elucidate mechanisms for developing specific volume equations (Lima et al. 2020).
Another critical methodological issue is the ordinary least squares (OLS) fitting method, the primary method applied to fit allometric models (Zar 1968;Brown et al. 1989).Some error assumptions should be met by the data to fit OLS models, such as independence, normality, and homoscedasticity (Burnham and Anderson 2002;Zuur et al. 2010).These assumptions are often ignored, inducing incorrect research conclusions and biased predictions (Dutcă et al. 2018).Allometric models are not independent when data are hierarchically clustered into other variables, e.g., trees from the same species, plot/site, or age may be more similar in allometry (Kearsley et al. 2017).In these situations, models are subjected to nested sources of variability (Hall and Bailey 2001) derived from hierarchical data structure.
When OLS models violate the independence assumption in hierarchical data, another modeling approach is highly recommended (Dutcă et al. 2018).Mixed-effect models may be a suitable approach to account for the non-independence and error autocorrelation (Banin et al. 2012).This approach finds simultaneous estimators for all levels, incorporating random effects that provide hierarchical predictions (Robinson and Hamann 2011;Nascimento et al. 2020).Multilevel mixed-effects models are helpful in a wide range of forestry applications (Hall and Bailey 2001).Although significant improvements were obtained in monospecific stands, mixed models have been applied less frequently in tropical forests, where predictive gains could be high given the high species diversity and allometric variability (Colmanetti et al. 2020).Additionally, few studies used this approach for the improvement of volume estimates (Vismara et al. 2015), particularly in Amazonia.
We hypothesized that local generic models violate the independence assumption because predictions vary with species-specific size.If this hypothesis is not rejected, disregarding data hierarchical structure, two consequences are assumed for wood volume prediction in Amazonian managed forests: (1) generic models tend to underestimate the volume of large-sized species and overestimate the volume of smallsized species; and (2) this tendency can bias new predictions.In this context, this study aimed to apply multilevel mixedeffects modeling to evaluate the effects of hierarchical data on generic volume predictions, comparing the results with a generic multispecies model.

Study area and field data
The study area is Jamari National Forest (JNF), Rondônia state, southwestern Brazilian Amazon (09º30'00''S, 63º16'64''W).The local climate is classified as Kw, with a well-defined dry period in winter according to the Köppen system, with 2,400 mm of mean annual precipitation and 25 °C of mean annual temperature (Alvares et al. 2013).The vegetation is evergreen tropical rainforest, varying between open and dense forest types (Cysneiros et al. 2017a).JNF is a protected area of sustainable use, in which a part of the landscape is destined for forest management under reduced impact logging by authorized companies (Brasil 2000).We used an extensive local database of tree volume collected between 2014 and 2015 in managed forests.Wood volume (V) from 5,010 harvested trees (diameter at breast height, DBH ≥ 50 cm) was measured by Smalian's method (Machado and Figueiredo Filho 2014).The database included 21 species, with 80% (4,008) of trees used for model fitting and 20% (1,002) for validation.Datasets were randomly selected, maintaining the sampling ratio constant (80:20) for each species (Table 1).

Species-specific size
The maximum individual diameter and volume per species were used as size descriptors for clustering species in size groups.We specified three groups a priori by k-means clustering (k = 3), using the Ward method and Euclidian distance in factoextra R package (Kassambara and Mundt 2020).This procedure resulted in three hierarchical size groups (small, medium, and large) (Figure 1a), with distinct allometric patterns (Figure 1b,c).Average DBH and average wood volume were of up to 76 cm and 6.4 m 3 , respectively, for small-sized species; up to 98 cm and 10.5 m 3 for medium-sized species; and up to 120 cm and 17 m 3 for large-sized species.

Generic model
Only single predictor models were used to test our hypothesis.A generic model (GM) was fitted at a multispecies level to predict wood volume (V) as a function of diameter ACTA AMAZONICA at breast height (DBH).GM was fitted by the ordinary least squares (OLS) method.Log transformation was applied to obtain a linear relationship between volume and diameter (Figure 1c).This transformation is commonly used to describe tree allometric relationships with advantages for modeling performance (Brown et al. 1989), as it can produce more accurate estimations of parameters and confidence intervals (Xiao et al. 2011).Back-transformation biases were corrected by the correction factor given by the square of residual variance divided by two (Baskerville 1972).All statistics were computed using corrected predictions.Data normality was assessed with the Shapiro-Wilk test and variance homogeneity by the nonconstant variance score test.
Due to non-constant residuals, an explicit model for the variance was needed (Kangas et al. 2022).The variance was modeled as the power of DBH (Robinson and Hamann 2011).This provided significantly better fitting (likelihood ratio test, P < 0.001) than the previous OLS model, in which error was modeled as constant variance.The form of the final generic model was expressed as in Equation 1: where V is the stem volume (m³), DBH is the stem diameter at breast height (cm), β 0 is the intercept coefficient, β 1 is the slope coefficient, and ε ij is the error assumed as a power of DBH.This model does not consider the hierarchical structure of the data (Dutcă et al. 2018), however, several sources of variation may affect it (Robinson and Hamann 2011), including the differences among species (Hulshof et al. 2015).

Multilevel mixed models
Mixed-effects modeling account for intra and interspecific variability (Magalhães et al. 2021).These models were named multilevel (Hall and Bailey 2001) to alude at the purpose of the model to estimate hierarchical predictions simultaneously for different levels.We incorporated species and speciesspecific size as random effects, as specific size and allometric relationships (V x DBH) vary among the species (Figure 1b,c).Three hierarchical forms were compared to test the model's random structure: 1) fixed model, no random effects (Equation 1); 2) random intercept and fixed slope model (Equation 2); and 3) random intercept and slope model, both varying across levels (Equation 3).
where β 0 and β 1 are the fixed terms, respectively the intercept and slope coefficients of the general relationship; b 0 and b 1 are the random effect terms of β 0 and β 1 , under effect of i levels, that express the intraspecific effect and size-specific effect; and ε ij is the model error assumed as level-dependent.These models were tested to accommodate species and species size, assuming specific V x DBH relationships (Nascimento et al. 2020), so that predictions are provided for each hierarchical level of the database.The random structure selection was based on Akaike's information criterion (AIC) and likelihood ratio tests (LRT).In addition, we reported the marginal R 2 , which includes the variance of fixed effects, and conditional R 2 , which includes the variance of fixed and random effects (Nakagawa and Schielzeth 2013).The mixed models were fitted using the restricted maximum likelihood method (REML) to test the significance of including the random effects (Kearsley et al. 2017).

Data analysis
To test our hypothesis, the residuals of the generic model were evaluated by species and species size.ANOVA was used to test if residuals were level-dependent.Tukey's post-hoc test was used to verify how residuals differ between species and species size.The intraclass correlation coefficient (ICC) (Equation 4) was computed to evaluate the performance of multilevel models to account for error autocorrelation.ICC shows the proportion of variance derived from the differences between levels (Dutcă et al. 2018) and varies between 0 and 1: values close to 0 indicate that most of the variance derives from differences between trees within levels, and values close to 1 indicate a strong correlation between trees and that the variance results from differences between levels.
where τ 2 is the random variance caused by variation between levels; and σ 2 is the residual variance caused by difference between trees within levels.Bias (Equation 5), root mean square error (RMSE) (Equation 6), relative root mean square error (RRMSE) (Equation 7), coefficient of determination (R²) (Equation 8), and AIC values (Equation 9) were computed to compare the goodness-of-fit of the approaches.The validation dataset was used to test the models' predictive performance, using bias, RMSE, and RRMSE.We used a Tukey's test to compare the observed and estimated volume by species size to evaluate the expectation of bias propagation to new predictions.All statistical tests were performed at a 1% significance level.The models were fitted using the nlme R package (Pinheiro et al. 2021).Statistical assumptions of normality and variance homogeneity were checked for all models.All statistical analyses were performed using R version 4.0.3(R Core Team 2021).
(8) AIC = -2logL i + 2P i (9) where y i is the observed stem volume (m³), ŷ i is the predicted stem volume, ȳ i is the mean of observed stem volume, n is the number of observations (measured trees), L i is model maximum likelihood, and P i is the number of model coefficients.

Generic model
The hypothesis of independence violation for the generic model was accepted as residuals were dependent on species and species size (Tukey's test for species: P < 0.01; Tukey's test for size: P < 0.01; Figure 2).This means that the errors of the generic model were not random, but related to groups displaced from the overall mean (Figure 2).Although residuals for species may show distinct patterns, in general they followed the expected tendency of underestimation for large species and overestimation for small species.This tendency was more evident for species size, as expected for generic models in hyperdiverse forests.Residuals were significantly higher for large species and lower for small species (Tukey's test: P < 0.01), suggesting that generic models can be more accurate for medium-sized species and less accurate for smaller and larger species.

Multilevel mixed models
The mixed-effects model structure (Table 2) confirmed the occurrence of species-specific size allometry (Figure 1) as intercept and slope coefficients varied significantly with species and species size (Supplementary Material, Table S1).All coefficients of mixed models were significant (P < 0.01), with high variation between levels.Species-and size-specific models with random intercept and slope were obtained after the structure selection in a mixed modeling approach (Table 2).These were the most parsimonious models, significantly improving volume modeling (likelihood ratio tests: P < 0.01).Improvements were also obtained to account for the residual autocorrelation and deal with independence violation (low ICC), suggesting that multilevel mixed models can provide more reliable volume predictions in hyperdiverse forests.

Model comparison
The accuracy significantly differed among modeling approaches in fitting and validation procedures (Table 3).Compared to the generic model, the fit of multilevel mixed models improved the volume explanation by 8.4% (species-specific model) and 6.3% (size-specific model), while accuracy increased by 4.9% (0.38 m³) and 3.6% (0.28 m³), respectively.The species-specific model showed the best goodness of fit.Still, the size-specific model showed a lower residual autocorrelation (Table 2), appropriate for dealing with hierarchical data structure.When applied to the validation dataset, multilevel mixed models increased accuracy by 4.1% (0.33 m³) and 3.0% (0.25 m³), respectively.The validation procedure confirmed the second expected consequence derived from our hypothesis (that tendency of the generic model is propagated to new predictions) (Figure 3).The  ACTA AMAZONICA generic model significantly underestimated the volume of medium and large species and overestimated that of small species (Tukey's test: P < 0.01).Species-and size-specific models provided predictions significantly adherent to the observed volume (P > 0.01).

DISCUSSION
Most timber volume models fitted in the Brazilian Amazon are multispecies, although they do not consider the hierarchical structure of the data.This study showed that generic models are less recommendable for volume stock assessments in hyperdiverse forests because predictions depend on species-specific traits.Our results also demonstrated that multilevel mixed models improve volume predictions because they can account for the effects of species and species size.Additionally, species-and size-specific models presented a similar performance for new predictions.This carries practical applications to managing hyperdiverse forests, where occasionally it is not easy to fit species-specific models.

Non-independence of generic models
Generic multispecies models have been widely used for volume estimation in Amazonian managed forests (Tonini and Borges 2015;Gimenez et al. 2016;Romero et al. 2020) because they require only DBH measurements to be applied (Colmanetti et al. 2020).However, our results showed that the predictions of generic models cannot be considered independent in hyperdiverse forests because data are grouped within species.Due to the hierarchical data structure, the variance among trees is associated with intra-and interspecific variability, particularly relative to size alometry (Dutcă et al. 2018).Consequently, the residuals generated by the generic model were autocorrelated.As models that ignore clustered data often report biased results (Dutcă et al. 2018), the volume predictions varied among our size groups as expected, with underestimations in large-sized species and overestimations in small-sized species, propagating these biases to new predictions.

Improvements using multilevel mixed modeling
Including random effects improved model performance, especially in dealing with independence assumptions that compromise modeling inference (Hall and Bailey 2001).Mixed models improve volume predictions because they account for intraspecific variability, as a given DBH value can differ largely in its associated volume among levels.Therefore, our fitted mixed models accommodated the variance produced by species and species size, correcting prediction errors (Dutcă et al. 2018).Our results agree with previous studies that found that hierarchical models by species provide more accurate estimates than generic models (Bojórquez et al. 2020;Nascimento et al. 2020;Magalhães et al. 2021;Abreu et al. 2022).Compared to traditional modeling, mixed-effects fitting usually requires fewer sampled trees, reducing the sampling cost (Colmanetti et al. 2020).In addition, this approach easily allows additional hierarchies to be incorporated into fitted models (Lam et al. 2017).Mixed models also exclude the need to fit an equation for each species once a single model provides hierarchical predictions at several levels (Nascimento et al. 2020;Abreu et al. 2022).

Implications for volume prediction
Similarity among models in their species-and size-specific predictive ability suggest some practical implications for volume prediction in hyperdiverse forests.First, fitting mixed models are usually restricted to the same set of species, which is challenging in hyperdiverse forests (Colmanetti et al. 2020).Second, rare species, which are common in the Amazon forest (Schulze et al. 2008) are limiting for species-specific model fitting (Cysneiros et al. 2017a).Grouping species by size may solve this issue, but a large sample size is required to reliably classify new species into a size group (Cysneiros et al. 2017b).However, mixed models are efficient only when random effects are expressive (Vismara et al. 2015), i.e., if a random effect is similar to the population mean no benefit will be observed in using these models, and the fixed-effects models may be more appropriate (Colmanetti et al. 2020).Furthermore, even using species-specific equations, parameters can vary depending on the location (Colmanetti et al. 2020), which must be controlled when using equations across different sites.
The validation procedure showed that mixed models were more accurate than the tested generic model.Although the prediction deviation among models was numerically small, it represents a substantial wood volume.From an overall wood volume of 8,128 m³ in the validation dataset, 535 m³ were underestimated by the generic model, against 146 m³ by the species-specific and 134 m³ by the size-specific models.Considering the average Amazon logwood price as US$ 164 per m³ (ITTO 2023), the biased predictions of the generic model represent a financial uncertainty of approximately US$ 87,740, which represents only 20% (the validation dataset) of all trees harvested over two years at our study site.Notably, the bias was higher for large species, which is crucial for forest planning as large timber species are spatially clustered in the study area (Péllico Netto et al. 2017).
Allometric predictions are frequently based on variables measured in forest inventories for wood stock assessment in large areas (McRoberts and Westfall 2016;Leão et al. 2021) thus biased wood volume estimates can generate significant financial uncertainty and lack of control over remaining and managed stocks (Rolim et al. 2006;Leão et al. 2021).Sustainable forest management depends on reliable estimates of wood volume at tree and stand level, especially in hyperdiverse tropical forests.Further studies are needed to quantify the consequences of biased predictions for forest management financial and tactical planning, and improve the ACTA AMAZONICA estimates that regulate virtual credits for wood in Amazonian forests (Andrade et al. 2023).

CONCLUSIONS
Our study showed that generic multispecies models for volume prediction violate the independence assumption in hyperdiverse forests and can fail to make new predictions based on species and species size, prooving the expected consequences deriving from our hypothesis.The tendency to underestimate the volume of large-sized species and overestimate the volume of small-sized species was propagated to new predictions.Multilevel mixed models accounted for interspecific variability and reduced error autocorrelation, providing more accurate predictions of timber wood volume and reducing financial uncertainty and potential damage to forest stocks.
ACTA AMAZONICA Table S1.Fixed and random coefficients of the generic model (GM), species-specific multilevel model (MMsp), and size-specific multilevel model (MMsz) used to predict wood volume as a function of DBH for a dataset of 21 timber tree species from Jamari National Forest (Rondônia, Brazil).See Table 1 for the correspondence of species name acronyms used in MMsp levels with random effects.

Figure 2 .Figure 3 .
Figure2.Residuals for species and tree size groups of the generic model fitted at multispecies level to predict wood volume as a function of DBH for a dataset of 21 timber tree species from Jamari National Forest (Rondônia, Brazil).Different letters above the averages of size group residuals indicate statistically significant differences according to Tukey's test.In the boxplots, the box represents the interquartile range (IQR) of the residuals from the first quartile (Q1) to the third quartile (Q3); the solid line is the median of the residuals; the points are outlier residuals; and the whiskers are the range of the residuals outside the IQR (minimum and maximum values, respectively).

Table 2 .
Selection of mixed-effect model structure to predict wood volume as a function of DBH for a dataset of 21 timber tree species from Jamari National Forest (Rondônia, Brazil).R²m is the variance explained by fixed effects; all effects explain R²c; AIC is the Akaike information criterion; ICC is the intraclass correlation; LRT is the likelihood ratio test; GM = generic model; MMsp = species-specific multilevel model; MMsz = size-specific multilevel model.