FITTING A TAPER FUNCTION TO MINIMIZE THE SUM OF ABSOLUTE DEVIATIONS Lana

Multiple product inventories of forests require accurate estimates of the diameter, length and volume of each product. Taper functions have been used to precisely describe tree form, since they provide estimates for the diameter at any height or the height at any diameter. This study applied a goal programming technique to estimate the parameters of two taper functions that describe individual tree forms. The goal programming formulation generates parameters that minimize the total absolute deviations (MOTAD). The parameters generated by the MOTAD method were compared to those of the ordinary least squares (OLS) method. The analysis used a set of 178 trees cut from cloned eucalyptus plantations in the southern part of the state of Bahia, Brazil. The estimated parameter values for the two taper functions were very similar under the two methods. There was no significant difference between the two fitting methods according to the statistics used to evaluate the quality of the generated estimates. OLS and MOTAD were equally precise in the estimation of diameters and volumes outside and inside bark.


INTRODUCTION
The quality of a forest management plan relies on the precision of the biometric system used to estimate future tree volumes. The biometric system contains independent variables, which are measurable tree characteristics such as diameter, height, form and basal area, usually measured to monitor forest growth (Scolforo, 1993). The term "taper" is applied to the rate of decrease in diameter along the trunk. Taper functions provide estimates for the diameter at any height or the height at any diameter.
Tree form and size determine different outputs, and taper functions have been used to precisely describe these tree characteristics (Ahrens & Holbert, 1981; Husch et al., 1972; Lima, 1986; Assis, 2000). The vertical integration of production activities in forest companies, where outputs from one production stage become inputs to the next stage, makes precise tree volume estimation even more relevant (Ahrens & Holbert, 1981; Assis, 2000). Therefore, taper functions become the primary tool for estimating the volume of any part of the trunk, by means of the mathematical integration of the section area along the tree axis.
Sci. Agric. (Piracicaba, Braz.), v.63, n.5, p.460-470, September/October 2006
The parameters of a model representing the form of a tree are determined by specific fitting techniques, among which ordinary least squares has been the most frequently used. The objective of this work is to apply Goal Programming (GP) techniques as a fitting method to estimate the coefficients of two polynomial models used as taper functions, and to compare the results of this fitting method with the estimates produced by the least squares fitting method. This study contributes to the development of taper function fitting methods and to the analysis of such processes.

METHODOLOGY
Area Characterization
Data used in this study come from plantations of cloned Eucalyptus grandis × Eucalyptus urophylla located in the south of Bahia, municipality of Eunápolis, Brazil (16°17'59"S; 39°28'42"W; altitude 168 m). The regional climate (Köppen) is of the Af type, hot and humid tropical, without dry seasons, with an annual average temperature of 23.1 °C and average rainfall of 1250 mm year⁻¹.

Tree Volume Definition
One hundred seventy-eight E. grandis × E. urophylla trees, felled at age 5, with heights (H) varying from 20 to 30 m and diameter at breast height (DBH) varying from 9.23 to 23.00 cm, had their volumes rigorously determined. Data collected for each tree included circumference at breast height (CBH), total height (H), log length (h) and the circumference of the log's largest base inside bark (CIB_i) and outside bark (COB_i) for every log i.
The measurements of CIB and COB along tree trunks were made every meter, starting at 0.3 m from the soil (stump), resulting in a total of 4,333 observations. The volume of the section corresponding to the top of the tree was calculated with the cone formula, and the volumes of the other sections were calculated with the Smalian formula. The total volume (outside and inside bark) was obtained as the sum of the volumes of the different sections of the tree.
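The section-by-section volume calculation described above can be sketched as follows. This is only an illustration: the function names, the example circumferences and the section lengths are hypothetical, and it assumes circumferences in cm, lengths in m and volumes in m³.

```python
import math

def area_from_circumference(c_cm):
    """Cross-sectional area (m^2) of a circle with circumference c_cm (cm)."""
    # d = c / pi (cm); area = (pi / 40000) * d^2 = c^2 / (40000 * pi)
    return c_cm ** 2 / (40000.0 * math.pi)

def smalian_volume(c_large_cm, c_small_cm, length_m):
    """Smalian formula: mean of the two end areas times the section length."""
    a_large = area_from_circumference(c_large_cm)
    a_small = area_from_circumference(c_small_cm)
    return (a_large + a_small) / 2.0 * length_m

def cone_volume(c_base_cm, length_m):
    """Cone formula for the top section: base area times length over three."""
    return area_from_circumference(c_base_cm) * length_m / 3.0

def tree_volume(circumferences_cm, section_length_m, top_length_m):
    """Sum Smalian sections between successive measurements plus a cone top.

    circumferences_cm is ordered from the stump upwards (assumed layout).
    """
    pairs = zip(circumferences_cm[:-1], circumferences_cm[1:])
    logs = sum(smalian_volume(a, b, section_length_m) for a, b in pairs)
    return logs + cone_volume(circumferences_cm[-1], top_length_m)

# Hypothetical outside-bark circumferences (cm), one per meter of trunk
example = [60.0, 55.0, 50.0, 44.0, 37.0]
total = tree_volume(example, section_length_m=1.0, top_length_m=2.5)
```

The same routine would be run twice per tree, once with the inside-bark and once with the outside-bark circumferences.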

Model fitting approach
Two different polynomial models were used in this study to relate all diameters taken along the trunk, and their respective heights, with DBH, DAB (diameter at the base of the tree) and H. A detailed description of these models follows.

i) Model 1
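The polynomial model 1 (M1) can be represented, mathematically, as follows:

$$\left(\frac{d}{DAB}\right)^2 = \beta_0\left(\frac{L}{H}\right)^2 + \beta_1\left(\frac{L}{H}\right)^3 + \beta_2\left(\frac{L}{H}\right)^4 + \varepsilon \qquad (1)$$

where: d = diameter at height h from the soil (cm); L = H − h; β_i = parameters to be estimated; ε = estimation error.

Isolating d, a taper function is obtained to estimate the corresponding diameter at any height on the tree, if DAB, H and L are given.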
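A minimal sketch of the M1 taper function obtained by isolating d is given below. It assumes that M1 relates (d/DAB)² to (L/H)², (L/H)³ and (L/H)⁴, as implied by the coherence condition discussed in the OLS fitting section. The coefficient values are purely illustrative and are not the estimates reported in Table 1.

```python
import math

def taper_m1(h, H, dab, b0, b1, b2):
    """Diameter (cm) at height h (m) under the assumed M1 form.

    (d / DAB)^2 = b0*(L/H)^2 + b1*(L/H)^3 + b2*(L/H)^4, with L = H - h.
    """
    x = (H - h) / H
    ratio_sq = b0 * x ** 2 + b1 * x ** 3 + b2 * x ** 4
    # Guard against tiny negative values caused by rounding near the tip
    return dab * math.sqrt(max(ratio_sq, 0.0))

# Hypothetical coefficients that satisfy b0 + b1 + b2 = 1
B0, B1, B2 = 1.2, -0.9, 0.7

# Diameter profile of a hypothetical 25 m tree with DAB = 20 cm
profile = [(h, taper_m1(h, 25.0, 20.0, B0, B1, B2)) for h in (0, 3, 6, 12, 25)]
```

With coefficients that sum to one, the function returns DAB at h = 0 and zero at h = H, which is the coherence property the constraint is meant to guarantee.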
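Considering the sectional area A (m²) of a tree with diameter d (cm) at height h equal to (π/40000) d², and integrating this section along length L, we obtain the compatible volume equation:

$$V = \frac{\pi}{40000}\int_{L_2}^{L_1} d^2\,dL \qquad (2)$$

Substituting (1) into (2) and integrating we obtain:

$$V = \frac{\pi}{40000}\,DAB^2\left[\frac{\beta_0 L^3}{3H^2} + \frac{\beta_1 L^4}{4H^3} + \frac{\beta_2 L^5}{5H^4}\right]_{L_2}^{L_1} \qquad (3)$$

Setting L_2 = 0 (top of the tree) and L_1 = H (base of the tree), the volume equation for the whole tree becomes:

$$V = \frac{\pi}{40000}\,DAB^2\,H\left(\frac{\beta_0}{3} + \frac{\beta_1}{4} + \frac{\beta_2}{5}\right) \qquad (4)$$

and the equation used to estimate volumes at heights h_i = 3, 6 and 12 m is:

$$V_{h_i} = \frac{\pi}{40000}\,DAB^2\left[\frac{\beta_0\left(H^3 - (H-h_i)^3\right)}{3H^2} + \frac{\beta_1\left(H^4 - (H-h_i)^4\right)}{4H^3} + \frac{\beta_2\left(H^5 - (H-h_i)^5\right)}{5H^4}\right] \qquad (5)$$

ii) Model 2
Polynomial model 2 (M2) can be represented, mathematically, as follows:

Isolating d, we obtain the taper function to estimate the corresponding diameter at any height in the tree, if DBH, H and h are given.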
, the model becomes:

As in M1, integration of the section (π/40000) d² along length L (equation 3) results in the following compatible volume equation:

Again, setting h_2 = H (top) and h_1 = 0 (tree base), the volume equation for the whole tree becomes:

Equations used to estimate volumes at heights h_i = 3, 6 and 12 m are:

The stored data formed the "base" file, which was processed by a SAS© routine to generate the linear regression estimates and by LINDO© to solve the goal programming model.
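For M1, the compatible volume obtained by integrating the section area can be checked numerically. The sketch below assumes the M1 form (d/DAB)² = β_0 (L/H)² + β_1 (L/H)³ + β_2 (L/H)⁴ and compares the closed-form integral with a simple midpoint-rule integration. Function names, tree dimensions and coefficient values are hypothetical.

```python
import math

def volume_m1_closed(L1, L2, H, dab, b0, b1, b2):
    """Closed-form volume (m^3) between L2 and L1 (distances from the tip, m)."""
    def antiderivative(L):
        return (b0 * L ** 3 / (3 * H ** 2)
                + b1 * L ** 4 / (4 * H ** 3)
                + b2 * L ** 5 / (5 * H ** 4))
    return math.pi / 40000.0 * dab ** 2 * (antiderivative(L1) - antiderivative(L2))

def volume_m1_numeric(L1, L2, H, dab, b0, b1, b2, steps=10000):
    """Midpoint-rule integration of (pi/40000) * d^2 along L."""
    width = (L1 - L2) / steps
    total = 0.0
    for k in range(steps):
        x = (L2 + (k + 0.5) * width) / H
        d_sq = dab ** 2 * (b0 * x ** 2 + b1 * x ** 3 + b2 * x ** 4)
        total += math.pi / 40000.0 * d_sq * width
    return total

# Hypothetical tree and coefficients (b0 + b1 + b2 = 1)
H, DAB = 25.0, 20.0
B = (1.2, -0.9, 0.7)

whole_tree = volume_m1_closed(H, 0.0, H, DAB, *B)       # L from 0 (tip) to H (base)
up_to_6m = volume_m1_closed(H, H - 6.0, H, DAB, *B)     # base up to h = 6 m
whole_tree_check = volume_m1_numeric(H, 0.0, H, DAB, *B)
```

The volume up to a height h_i uses L_1 = H and L_2 = H − h_i, so the three merchantable heights (3, 6 and 12 m) only change the lower integration limit.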

β_i Parameter Estimation
Estimating the β_i parameters means making the estimation error (ε) as small as possible. The loss function to be minimized is defined from the differences between the observed data and the values estimated by the model.
In the ordinary least squares (OLS) method, the loss function to be minimized is the sum of squared residuals. Squaring the residuals prevents them from canceling out, but it weights large residuals more heavily and emphasizes their importance (Batista, 1998).
In goal programming, the loss function used to estimate the β_i parameters is the sum of absolute residuals, and the method is referred to as the MOTAD (minimization of total absolute deviations) method. The MOTAD method also prevents residuals from canceling out but, unlike the OLS method, it gives large residuals the same importance as small ones (Batista, 1998). Ignizio & Cavalier (1994) discuss the use of goal programming as an alternative tool for developing predictive functions. According to these authors, the OLS method is more frequently employed simply because it is easy to apply and because it generates confidence intervals based on the assumption that errors are normally distributed with equal variances, an assumption that sometimes may not hold. Alcântara et al. (2003) point out that the MOTAD method overcomes a deficiency of the OLS method when outliers are present in the data set, owing to MOTAD's lower sensitivity to extreme values.
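The different sensitivity of the two loss functions to extreme values can be illustrated with the simplest possible model, a single constant fitted to a sample. In that case the sum of squared residuals is minimized by the mean and the sum of absolute residuals by the median. The data below are invented for illustration only.

```python
from statistics import mean, median

def sum_squared(center, values):
    """Sum of squared residuals around a constant (the OLS loss)."""
    return sum((v - center) ** 2 for v in values)

def sum_absolute(center, values):
    """Sum of absolute residuals around a constant (the MOTAD loss)."""
    return sum(abs(v - center) for v in values)

clean = [9.8, 9.9, 10.0, 10.1, 10.2]
with_outlier = clean + [30.0]          # one extreme, hypothetical measurement

# Shift of each estimate caused by the single outlier
ols_shift = abs(mean(with_outlier) - mean(clean))
motad_shift = abs(median(with_outlier) - median(clean))
```

Here the least squares estimate moves by several units while the least absolute deviations estimate barely changes, which is the behavior attributed to MOTAD in the presence of outliers.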

Model fitting with the OLS method
The models adopted in the present study can be written as multiple linear regression models with three predictor variables, as follows: where:

The condition β_0 + β_1 + β_2 = 1 was imposed to enforce coherence. This is needed because (L/H)², (L/H)³ and (L/H)⁴ are equal to 1 when L is equal to H; at that point d equals DAB, so (d/DAB)² must also be equal to 1. The same reasoning can be used for M2.
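One way to impose the condition β_0 + β_1 + β_2 = 1 in a least squares fit is to substitute β_2 = 1 − β_0 − β_1 and solve the resulting two-parameter problem. The sketch below does this with the 2×2 normal equations in plain Python. It is an illustration under that assumption, not the SAS routine used in the study, and the synthetic data and coefficient values are hypothetical.

```python
def fit_ols_constrained(rows):
    """Least squares fit of y = b0*x1 + b1*x2 + b2*x3 with b0 + b1 + b2 = 1.

    rows is a list of (y, x1, x2, x3) tuples. Substituting b2 = 1 - b0 - b1
    gives (y - x3) = b0*(x1 - x3) + b1*(x2 - x3) + error.
    """
    s11 = s12 = s22 = s1v = s2v = 0.0
    for y, x1, x2, x3 in rows:
        u1, u2, v = x1 - x3, x2 - x3, y - x3
        s11 += u1 * u1
        s12 += u1 * u2
        s22 += u2 * u2
        s1v += u1 * v
        s2v += u2 * v
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-18:
        raise ValueError("predictors are collinear after substitution")
    b0 = (s1v * s22 - s2v * s12) / det
    b1 = (s11 * s2v - s12 * s1v) / det
    return b0, b1, 1.0 - b0 - b1

# Synthetic, noise-free observations generated from hypothetical coefficients
TRUE = (1.2, -0.9, 0.7)
rows = []
for k in range(1, 10):
    x = k / 10.0                       # relative length L/H
    y = TRUE[0] * x ** 2 + TRUE[1] * x ** 3 + TRUE[2] * x ** 4
    rows.append((y, x ** 2, x ** 3, x ** 4))

estimates = fit_ols_constrained(rows)
```

Because the constraint is built into the parametrization, the fitted curve passes through (d/DAB)² = 1 at L = H regardless of the data.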

Model fitting with the MOTAD method
The MOTAD method formulates the minimization of the sum of absolute deviations as a goal programming (GP) problem, a mathematical formulation for constrained multiple objectives. Deviations from these objectives are minimized, generating solutions close to certain aspiration levels. Aspiration levels are in fact goals associated with the multiple objectives, hence the name goal programming.
The Simplex algorithm for solving linear programming problems is normally used in the solution of GP problems. These problems can also be formulated under the same hypotheses, limitations and conditions of linear programming: linearity, divisibility and determinism (Lee et al., 1990).
The first application of GP to constrained regression was formulated by Charnes et al. (1955). Two deviations, DP_i and DN_i, are created for each pair of observations (X_i, Y_i). DP_i represents a positive deviation and DN_i represents a negative deviation. So, the linear model

In the MOTAD version of the GP problem applied to the taper function fitting process, i identifies each observation in a set of N measurements, the coefficients of the linear model are the main decision variables and the GP formulation becomes:

For M1, the constraints can be represented as:

and for M2, the expression becomes

where: i = i-th tree log and i-th line in the model; − are coefficients of the parameters to be estimated; and β_0, β_1 and β_2 are the parameters to be estimated.
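The idea behind the MOTAD formulation can be sketched in a few lines, under stated assumptions. The sketch takes the goal constraints to have the usual form β_0 X_1i + β_1 X_2i + β_2 X_3i + DN_i − DP_i = Y_i with DP_i, DN_i ≥ 0, and the objective to be the sum of all DP_i and DN_i; the sign convention is an assumption. Instead of calling a Simplex solver such as LINDO, it uses the fact that a linear program attains its optimum at a vertex, which here corresponds to a fit passing exactly through two observations once β_2 = 1 − β_0 − β_1 is substituted. Enumerating those candidates is only practical for small data sets; all names and data are hypothetical.

```python
from itertools import combinations

def fit_motad_constrained(rows):
    """Minimize sum(DP_i + DN_i) for y = b0*x1 + b1*x2 + b2*x3, b0+b1+b2 = 1.

    rows is a list of (y, x1, x2, x3). Candidate solutions are the fits that
    interpolate two observations exactly (the vertices of the LP).
    """
    data = [(y - x3, x1 - x3, x2 - x3) for y, x1, x2, x3 in rows]

    def total_abs_dev(b0, b1):
        return sum(abs(v - b0 * u1 - b1 * u2) for v, u1, u2 in data)

    best = None
    for (va, a1, a2), (vb, c1, c2) in combinations(data, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < 1e-12:
            continue                    # the two rows do not define a vertex
        b0 = (va * c2 - a2 * vb) / det
        b1 = (a1 * vb - c1 * va) / det
        total = total_abs_dev(b0, b1)
        if best is None or total < best[0]:
            best = (total, b0, b1)
    if best is None:
        raise ValueError("no pair of observations defines a unique fit")
    total, b0, b1 = best
    return (b0, b1, 1.0 - b0 - b1), total

def deviations(rows, betas):
    """Positive (DP) and negative (DN) deviations: fitted + DN - DP = y."""
    b0, b1, b2 = betas
    result = []
    for y, x1, x2, x3 in rows:
        fitted = b0 * x1 + b1 * x2 + b2 * x3
        result.append((max(fitted - y, 0.0), max(y - fitted, 0.0)))
    return result

# Synthetic data from hypothetical coefficients, with one outlier at L/H = 0.5
TRUE = (1.2, -0.9, 0.7)
rows = []
for k in range(1, 10):
    x = k / 10.0
    y = TRUE[0] * x ** 2 + TRUE[1] * x ** 3 + TRUE[2] * x ** 4
    if k == 5:
        y += 0.3
    rows.append((y, x ** 2, x ** 3, x ** 4))

betas, total = fit_motad_constrained(rows)
dev = deviations(rows, betas)
```

In this constructed example the fit ignores the single contaminated observation and the whole objective value comes from its DN term. For a data set of the size used in the study, an LP solver is the appropriate tool.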

Comparing the OLS and MOTAD methods
The coefficient of determination (R²), the root mean square error (Syx) and the dispersion of residuals were used to compare the results of the fitting process. Precision and accuracy analyses were also made, based on the estimates generated by each fitting method and according to four statistics used by Parresol et al. (1987) and Assis et al. (2002): bias (D), standard deviation of differences (SD), sum of squared relative residuals (SSRR) and residuals percentage (RP).

Both models and both fitting methods were compared and also ranked for the quality of the estimates obtained for volume outside and inside bark (V_ob and V_ib). The model with the worst value for each calculated statistic scored 1; otherwise the score was 2. The sum of scores for each model determined its final performance score.

Model fitting
The tested models fitted the data adequately, given that both models resulted in coefficients of determination above 96% (Table 1). As expected, models fitted with the MOTAD method produced coefficients of determination slightly lower than models fitted with the OLS method. The root mean square error ranged from 10.58% to 13.10% and was very similar for the two fitting methods. The t and F tests were significant at 5%, showing a strong correlation between the dependent variable (d/DAB or d/DBH) and the independent variables.

For diameters close to DBH and above, M2 shows residuals more clustered around zero (graphs c, d, g and h, Figure 1). Graphs (a), (b), (e) and (f) in Figure 1 show that the outside bark M1 taper function, with the dependent variable (d/DAB)², better represents the form of the trunk at the base of the tree. At the base of the tree, the dependent variable (d/DBH)² is overestimated by M2 when fitting outside bark data (graphs c and d, Figure 1) and underestimated by M2 when fitting inside bark data (graphs g and h, Figure 1).

Loss Function Minimization
Table 2 shows very small differences between the two models and the two fitting methods. Models fitted for inside bark diameters produced smaller sums of residuals. As expected, linear regression effectively resulted in the minimum sum of squared deviations, while goal programming produced the minimum sum of absolute deviations.

Table 1 - Statistics and estimated coefficients for models given by equations (1) and (7).

Diameter, V ib and V ob estimation
Table 3 shows the values of the four statistics selected to evaluate the quality of estimates for diameter at different heights: bias (D), standard deviation of differences (SD), sum of squared relative residuals (SSRR) and residuals percentage (RP).
The models were equally precise in the estimation of diameters outside bark at heights of 3, 6 and 12 m. This was observed for the OLS method as well as for the MOTAD method. Considering the total height of the trees, M1 was slightly superior to M2.
Low values were observed for bias. Although low, and considering the estimated diameters at 3, 6 and 12 m heights, this statistic shows that M1 tended to slightly overestimate the diameters outside and inside bark; M2 tended to slightly underestimate the diameters outside bark and overestimate the diameters inside bark.
Table 4 presents the scores for each diameter estimation model at different heights (3, 6, 12 m, and total height), for each statistic separately and for the final total score.
The statistics calculated for the volume estimation at different heights are shown in Tables 5 and 6.
The OLS and MOTAD methods fitted models 1 and 2 with similar precision. M1 showed consistently better results than M2 for inside bark volume estimation at any height. For outside bark, M1 was better only for heights of 3 and 6 m. For volumes up to 12 m, both models performed similarly, and for total tree outside bark volume M2 was more precise.
Both polynomial functions fitted the data well along the main part of the tree, except for the base and for the top height diameters: M2 slightly overestimates inside bark while underestimating outside bark, and M1 underestimates both outside and inside.

ate the opportunity to evaluate the sensitivity of the MOTAD fitting method to situations where the variance highly increases when measurement values also increase.

Figure 1 - Dispersion of residuals for the two models (1 and 2) and for the two fitting methods (MOTAD and OLS).

Table 2 - Sum of squared deviations (SSD) and sum of absolute deviations (SAD) for OLS and MOTAD fitting methods for models given by equations (1) and (7).

Table 3 - Bias (D), standard deviation of differences (SD), sum of squared relative residuals (SSRR) and residuals percentage (RP) for diameter estimation at different heights.

Table 4 - Ranking attributes (2 = best; 1 = worst) for diameter estimation according to some performance statistics: bias (D), standard deviation of differences (SD), sum of squared relative residuals (SSRR) and residuals percentage (RP).

Table 5 - Bias (D), standard deviation of differences (SD), sum of squared relative residuals (SSRR) and residuals percentage (RP) for volume estimation at different heights.