DISTRIBUTION OF TOTAL HEIGHT, TRANSVERSE AREA AND INDIVIDUAL VOLUME FOR Araucaria angustifolia (Bert.) O. Kuntze

This study aimed to test probability density functions for the distribution of the variables total height, transverse area and individual volume, considering three different class intervals. Data were obtained from the measurement of diameter (DBH) and total height and from estimation of the individual volume of 338 pine trees in a fragment of Mixed Ombrophylous Forest with an area of 15.24 ha, located in the Jardim Botânico campus of UFPR, Curitiba-PR. Ten functions were fitted, including models commonly used for diameter distribution as well as other recently developed models applied to forest science. Selection criteria included the Kolmogorov–Smirnov adherence test, the standard error of estimate in percentage and the adjusted coefficient of determination. Three class intervals were used, as obtained by the Sturges, Dixon & Kronmal, and Velleman criteria. The Normal function for the variable height, and the Weber function for the distribution of transverse area and individual volume, provided the best fit, considering the three class intervals adopted. The models fitted better for the larger class intervals obtained by the Sturges rule.


INTRODUCTION
Within the Mixed Ombrophylous Forest domain, the species Araucaria angustifolia has great prominence on account of its singular appearance, which stands in contrast to other trees of the same biome and makes it typical and exclusive of this forest formation (CARVALHO 1994). The abundance and quality of this Brazilian pine tree in past landscapes, combined with the diversified uses of its wood, led to extensive exploitation and consequently to the rapid disappearance of large areas of araucaria forest in primary vegetation formations, giving rise to many fragments, notably as a result of regeneration after the exploitation period.
Knowledge of diameter, height, transverse area and volume distributions is a prime requirement to ensure good forest management. Since the pioneering work of Barros et al. (1979) in Brazil, which tested probability density functions to fit diameter distributions in tropical forests, many studies have investigated diameter distribution using probability density functions (pdfs) for a variety of forest typologies, particularly plantations.
However, works involving height, transverse area and volume distributions are scarce; they include Alves Júnior et al. (2007), Gomide (2009), Silva et al. (2003) and Weber (2006). No research has been conducted on the distribution of these variables for Araucaria angustifolia. This work additionally investigated which class intervals provide the best adherence when fitting the probability density functions.
This work is thus mainly intended to test models that will describe the probability distribution of variables total height, transverse area and individual volume for species Araucaria angustifolia, in a fragment of Mixed Ombrophylous Forest, considering three different class intervals.

Study site
This work was conducted in a fragment of Mixed Ombrophylous Forest known as Capão da Engenharia Florestal, situated in the Jardim Botânico campus of UFPR. The Capão fragment covers an area of 15.24 ha, of which 12.96 ha consist of Mixed Ombrophylous Forest and 2.28 ha consist of sparse brush (capoeira rala) extending along the streamlet that borders the entire south side of the Capão. Swamps and woody grasses predominate in the capoeira area.
The study site sits between coordinates 25°26'50"-25°27'33" S and 49°14'16"-49°14'33" W, and the terrain is 890 to 915 meters above sea level. The area has a humid subtropical, mesothermal climate, with mild summers and frequent frosts in the winter; the average temperature is 17 °C and the annual precipitation is 1,500 mm, corresponding to a Cfb climate according to the Köppen classification.

Data used
A total of 349 pine trees were counted within the fragment, and the circumference at 1.30 m above the ground (CBH) and the total height of all trees were measured. A measuring tape was used for the CBH measurement and a Vertex III hypsometer for the height measurement. To estimate the total volume of the araucaria trees, volume equations previously developed for a Brazilian pine inventory by the Paraná Forest Research Foundation, FUPEF (1978), were used, given the close proximity and similar characteristics of that area to the study area.
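The conversion from the measured circumference to diameter and transverse area is not spelled out in the text. A minimal sketch, assuming a circular stem cross-section (the function names and the example circumference are illustrative, not taken from the study), could look like this:

```python
import math

def dbh_from_cbh(cbh_m: float) -> float:
    """Diameter at 1.30 m (m) from the circumference at 1.30 m (m).

    Assumes the stem cross-section is a circle.
    """
    return cbh_m / math.pi

def transverse_area(dbh_m: float) -> float:
    """Transverse area g (m^2) of a stem with diameter dbh_m (m)."""
    return math.pi * dbh_m ** 2 / 4.0

# Hypothetical tree with a 1.50 m circumference (not a measured value).
example_cbh = 1.50
example_dbh = dbh_from_cbh(example_cbh)
example_g = transverse_area(example_dbh)
print(f"DBH = {example_dbh:.3f} m, g = {example_g:.4f} m^2")
```

Under the same assumption the two steps collapse to g = CBH² / (4π), which is a convenient check on spreadsheet calculations.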

Frequency distribution of data
According to Hoaglin et al. (1983), to actually select the number of classes while organizing a data set, the number of observations (N) and some common sense as to how to arrange them should both be taken into account.
To determine the number of classes and the 'ideal' class interval for the data set, three empirical mathematical rules mentioned by Hoaglin et al. (1983) were used: the Sturges, Dixon & Kronmal, and Velleman rules. As far as the forestry field is concerned, equation 1 (the Sturges rule) is the most widely disseminated and one of the best known formulas to establish the number of classes of a data set (MACHADO & FIGUEIREDO FILHO 2006). According to Hoaglin et al. (1983), for samples with a large number of data, N > 100, this rule yields a much reduced number of classes, with larger class intervals and high frequencies, and is thus recommended where N < 50. Hoaglin et al. (1983) argued that the Dixon & Kronmal rule is effective and practical for constructing histograms, being more suitable where N > 100. The Velleman rule, on the other hand, is suggested for average sized samples, 50 < N < 100, generating better histograms for this interval than the Dixon & Kronmal rule.
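The three rules can be sketched as follows. The formulas are the forms in which these rules are commonly stated (Sturges: 1 + log2 N; Dixon & Kronmal: 10 log10 N; Velleman: 2 √N). Truncating the result to an integer is an assumption, and the function names are illustrative. For the 349 trees counted in the fragment, this sketch reproduces the 9, 25 and 37 classes reported in the results.

```python
import math

def sturges_classes(n: int) -> int:
    """Number of classes by the Sturges rule, 1 + log2(n), truncated."""
    return int(1 + math.log2(n))

def dixon_kronmal_classes(n: int) -> int:
    """Number of classes by the Dixon & Kronmal rule, 10 * log10(n), truncated."""
    return int(10 * math.log10(n))

def velleman_classes(n: int) -> int:
    """Number of classes by the Velleman rule, 2 * sqrt(n), truncated."""
    return int(2 * math.sqrt(n))

n_trees = 349  # trees counted in the fragment
for rule in (sturges_classes, dixon_kronmal_classes, velleman_classes):
    print(rule.__name__, rule(n_trees))
```

The spread between the rules grows with N, which is why the choice of rule matters for a sample of this size.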

Probability density functions
Ten probability density functions were tested to obtain the distribution of estimated frequencies for the variables total height (h), transverse area (g) and volume (v) (Table 1). Parameters were found for each function, for the different class intervals given by the rules used.
To determine the parameters of the probability density functions and estimate the number of trees, software applications Table Curve 2D and MS EXCEL 2007 were used.
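As an illustration of how a fitted pdf is turned into an estimated number of trees per class, the sketch below integrates a Normal distribution between class limits and multiplies by the total number of trees. The class limits, mean and standard deviation are hypothetical values chosen for the example, not results from this study, and the function names are illustrative.

```python
import math

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """Cumulative distribution function of the Normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def estimated_class_frequencies(class_limits, n_trees, mu, sigma):
    """Estimated number of trees in each class.

    class_limits is a sorted list of class boundaries; each class frequency
    is n_trees times the probability mass between consecutive boundaries.
    """
    return [
        n_trees * (normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma))
        for lower, upper in zip(class_limits[:-1], class_limits[1:])
    ]

# Hypothetical height classes (m) and parameters, for illustration only.
limits = [12.0, 15.0, 18.0, 21.0, 24.0]
frequencies = estimated_class_frequencies(limits, n_trees=349, mu=18.0, sigma=3.0)
print([round(f, 1) for f in frequencies])
```

The same pattern applies to any of the tested functions once its cumulative form, or a numerical integral of its density, is available.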

MACHADO, S. do A. et al.
Legend: f(x) = density function of variable x; x = random variable; xmax = maximum value of x; xmin = minimum value of x; μ = mean of x; σ = standard deviation of x; σ² = variance of x; π = constant pi (3.1416...); e = Euler's number (2.7182...); a, a1, a2, a3, am, b, c, c1, c2, d, h, e, l, d, g = parameters to be estimated; n = exponent of the Quadros polynomial; l1 = upper limit of class to fit function g1(x); l2 = upper limit of last class where g2(x) provides good fit; k = integral of g1(x) + g2(x) + g3(x).
Parameters of the 2P- and 3P-Weibull, Weber, Péllico and Quadros distributions were estimated by iteration from initial parameter values, with least squares fitting using the Levenberg-Marquardt algorithm.
With an MS EXCEL 2007 spreadsheet, the parameters and exponents of the two exponential functions g1(x) and g3(x) forming the Quadros pdf were determined by the SOLVER tool using a Simplex algorithm.
The Maximum Likelihood method was used to determine the coefficients for the Johnson SB function. To estimate the parameters for the Beta, Gamma, Normal and Log-Normal functions, MS EXCEL 2007 spreadsheets were created using the method of moments.
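The method of moments equates the sample mean and variance to their theoretical counterparts and solves for the parameters. The sketch below shows this for the Normal, Gamma and Log-Normal functions. It is a generic illustration, not the authors' spreadsheet: using the population variance (dividing by N) is an assumption, the function names are illustrative, and the data are made up.

```python
import math
import statistics

def moments_normal(data):
    """Return (mu, sigma): the sample mean and standard deviation."""
    return statistics.fmean(data), statistics.pstdev(data)

def moments_gamma(data):
    """Return (shape, scale) such that shape*scale = mean and shape*scale^2 = variance."""
    mean = statistics.fmean(data)
    variance = statistics.pvariance(data)
    return mean ** 2 / variance, variance / mean

def moments_lognormal(data):
    """Return (mu_log, sigma_log), the parameters of ln(x), matched to mean and variance of x."""
    mean = statistics.fmean(data)
    variance = statistics.pvariance(data)
    sigma_log_sq = math.log(1.0 + variance / mean ** 2)
    mu_log = math.log(mean) - sigma_log_sq / 2.0
    return mu_log, math.sqrt(sigma_log_sq)

# Made-up observations, for illustration only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print("Normal:", moments_normal(sample))
print("Gamma:", moments_gamma(sample))
print("Log-Normal:", moments_lognormal(sample))
```

Because the estimates come straight from two sample moments, no iterative fitting is needed, which is what makes this method practical in a spreadsheet.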

Best fit selection criteria
In order to compare and select the best model to represent the distribution of the relevant variables, the following were calculated: the adjusted coefficient of determination, known as the Schlaegel index (R²aj), the standard error of estimate in percentage (Syx%) and the Kolmogorov-Smirnov adherence test (α = 0.05).
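The exact expressions used for these statistics are not given in the text. The sketch below uses forms that are common for grouped frequency data and should be read as assumptions: Dcalc as the largest gap between cumulative observed and estimated frequencies divided by the number of trees, Dtab as the large-sample approximation 1.36/√N at α = 0.05, Syx% relative to the mean observed class frequency, and R²aj adjusted by the number of classes and parameters. Function names and example frequencies are illustrative.

```python
import math

def ks_dcalc(observed, estimated):
    """Largest absolute gap between cumulative frequencies, divided by the number of trees."""
    n_trees = sum(observed)
    cum_obs = cum_est = 0.0
    largest_gap = 0.0
    for obs, est in zip(observed, estimated):
        cum_obs += obs
        cum_est += est
        largest_gap = max(largest_gap, abs(cum_obs - cum_est))
    return largest_gap / n_trees

def ks_dtab(n_trees):
    """Approximate critical value at alpha = 0.05 for large samples."""
    return 1.36 / math.sqrt(n_trees)

def residual_sum_of_squares(observed, estimated):
    return sum((obs - est) ** 2 for obs, est in zip(observed, estimated))

def syx_percent(observed, estimated, n_params):
    """Standard error of estimate as a percentage of the mean observed class frequency."""
    n_classes = len(observed)
    syx = math.sqrt(residual_sum_of_squares(observed, estimated) / (n_classes - n_params))
    return 100.0 * syx / (sum(observed) / n_classes)

def r2_adjusted(observed, estimated, n_params):
    """Adjusted coefficient of determination (Schlaegel index form)."""
    n_classes = len(observed)
    mean_obs = sum(observed) / n_classes
    ss_total = sum((obs - mean_obs) ** 2 for obs in observed)
    ss_res = residual_sum_of_squares(observed, estimated)
    return 1.0 - (n_classes - 1) / (n_classes - n_params) * ss_res / ss_total

# Made-up class frequencies for a two-parameter function, for illustration only.
obs_freq = [10, 20, 40, 20, 10]
est_freq = [12, 18, 40, 18, 12]
d_calc = ks_dcalc(obs_freq, est_freq)
print("adherence:", d_calc < ks_dtab(sum(obs_freq)))
print("Syx% =", round(syx_percent(obs_freq, est_freq, 2), 2))
print("R2aj =", round(r2_adjusted(obs_freq, est_freq, 2), 4))
```

A function is taken to adhere to the data when Dcalc is lower than Dtab, which is the comparison reported in the results.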
After fitting the tested functions, the results were arranged in rank order, with score 1 attributed to the function best fitting each of the statistics, score 2 to the second best fitting function, and so on. The function achieving the lowest sum of scores, considering all statistics (Dcalc, Syx% and R²aj) for all class intervals (Sturges, Dixon & Kronmal and Velleman), was placed 1st in the overall ranking. Ranking was done independently for height, transverse area and volume.
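The scoring procedure can be sketched as below. The function labels (A, B, C), rule labels and statistic values are placeholders invented for the example, and ties are not treated specially, which is an assumption since the text does not say how ties were handled.

```python
def rank_scores(values, higher_is_better=False):
    """Map each function name to its score; score 1 goes to the best value."""
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    return {name: position + 1 for position, name in enumerate(ordered)}

def overall_ranking(stats_by_rule):
    """Sum the scores over all statistics and class-interval rules.

    Returns (function, total score) pairs, lowest total first.
    """
    totals = {}
    for rule_stats in stats_by_rule.values():
        for stat_name, values in rule_stats.items():
            # Only R2aj improves as it grows; Dcalc and Syx% improve as they shrink.
            scores = rank_scores(values, higher_is_better=(stat_name == "R2aj"))
            for name, score in scores.items():
                totals[name] = totals.get(name, 0) + score
    return sorted(totals.items(), key=lambda item: item[1])

# Placeholder statistics for three hypothetical functions under two rules.
example_stats = {
    "rule_1": {
        "Dcalc": {"A": 0.03, "B": 0.05, "C": 0.04},
        "Syx%": {"A": 12.0, "B": 20.0, "C": 15.0},
        "R2aj": {"A": 0.95, "B": 0.90, "C": 0.97},
    },
    "rule_2": {
        "Dcalc": {"A": 0.04, "B": 0.06, "C": 0.05},
        "Syx%": {"A": 25.0, "B": 22.0, "C": 30.0},
        "R2aj": {"A": 0.85, "B": 0.88, "C": 0.80},
    },
}
for name, total in overall_ranking(example_stats):
    print(name, total)
```

Restricting `stats_by_rule` to a single rule gives the per-rule ranking, which is how a function can rank first under one rule without ranking first overall.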
In addition, the frequency curves resulting from the fitted functions were plotted over the frequency histograms of each variable studied, using the class interval rule that performed best.

Data dispersion
Basic statistics of data dispersion for total height (h), transverse area (g) and individual volume (v), as well as the respective class ranges obtained by the Sturges, Dixon & Kronmal, and Velleman rules, are illustrated in Table 2. Based on these values, it was possible to obtain three different class intervals and frequency distributions for each variable under analysis, and thus enable fitting the probability density functions (pdfs).
The class intervals obtained by each of the rules provided a basis for calculating the respective number of classes by Sturges (9 classes), Dixon & Kronmal (25 classes) and Velleman (37 classes).
According to Sokal & Rohlf (1969), to ensure a smoother distribution with better adherence test fit, it is necessary to condense the relevant data into a reduced number of classes, a fact that has been confirmed in this study. Figure 1 illustrates that the histograms generated by the Dixon & Kronmal and Velleman rules have irregular distributions, producing discrepancies and discontinuities in the observed frequencies.
Those methods of data arrangement yield worse fit and accuracy statistics than the histograms generated by the Sturges rule. For this reason, the Sturges rule was selected for the three relevant variables to represent the inventory data and the model fittings in the form of histograms.
While comparing descriptive models of diameter distribution, Barros et al. (1979) observed that for a larger class interval the tested equations provided greater accuracy in frequency estimations. The findings in this study align with the results of Barros et al. (1979) in that increasing the class interval results in greater function fitting accuracy.

Distribution of total height
Statistical values of fit and accuracy and the frequencies estimated by the 10 probability models for distribution of total height, as determined by the three rules, are illustrated in Table 3 in rank order according to the predefined assessment method.
For all models, except for the Péllico function, which failed to show good fit for this distribution at the different intervals, the Dcalc value was lower than the Dtab value for α of 0.05, demonstrating that for these models, at the different class intervals, the Kolmogorov-Smirnov test indicated adherence.
Table 3 shows that the models behaved differently in the estimation of frequency distributions. Considering the three class intervals used, the best function for the frequency distribution of total height was the Normal, followed by the 3P-Weibull and the Johnson SB at 0.55 of the minimum height found.
However, considering the class interval generated by the Sturges rule alone, the best ranking function was the Johnson SB function, followed by the Quadros, 3P-Weibull, Normal, Weber, Beta, Log-Normal, 2P-Weibull and Gamma functions. It should be noted that ranking was based on the Dcalc, Syx% and R²aj statistics, as explained under Methods. While applying classic models to the height distribution in natural regeneration of Ocotea odorifera, Weber (2006) observed that the Weibull, Gamma, Beta, Exponential and Normal models failed to provide good fit, the Normal function being the worst of all. In this study, the Normal function was the most satisfactory among all tested functions, considering the three class intervals (rules) adopted.
From Table 3 it can be seen that the statistics of the fitted functions are much better for the Sturges rule, followed by the Dixon & Kronmal and Velleman rules. This clearly indicates that the class interval had a marked effect on function fitting. In comparing the Figure 1 histograms to those in Figures 2, 3 and 4, it was noted that the smaller the class interval, the more irregular the frequencies by class.
From Figure 2 it is possible to observe how the fitted functions describe the observed frequency distribution, according to the Sturges rule, for the totality of data. Considering the frequency distribution of total height for the larger class interval, as obtained by the Sturges rule, the Johnson SB pdf proved superior to the others both in fit and accuracy values and in the overlap of the fitted frequency distribution curves with the observations (Figure 2). The figure illustrates that the frequency distribution curves conformed well to the frequency histogram obtained by the Sturges rule, except for the Gamma pdf.
As can be noted from Table 4, all probability functions fitted to the observed frequencies of transverse area, except for the Normal pdf, which failed to show adherence by the Kolmogorov-Smirnov test, with a Dcalc value higher than the Dtab value at all class intervals.

Distribution of transverse area
The Weber and Péllico models provided the best fit, showing good performance for asymmetric distributions when considering all three rules. Considering model fit by the Sturges rule alone, the best performing function was the Quadros function, followed by the 3P-Weibull and Weber functions. Still on transverse area, it was also noted that the smaller the class interval, the worse the pdf statistics. Bartoszeck et al. (2004), while evaluating diameter distribution for Mimosa scabrella trees at different ages, sites and densities, observed that for a positively skewed distribution in plots of site I at age 3.9 years, the Johnson SB function proved more efficient and flexible, being ideal for virtually all site, age and density combinations.
The above finding of Bartoszeck et al. (2004) for diameter distribution in native Mimosa scabrella stands was not observed for the distribution of transverse area among araucaria in the fragment studied here. As discussed in the analysis of Figure 1, the Sturges rule was used for graphic representation of the data set. Figure 3 illustrates the frequency distributions estimated by the various functions, fitted to the observed frequency histograms.
The Quadros, 2P- and 3P-Weibull and Weber models behaved very similarly when plotted on the histogram originated by the Sturges rule. These models fitted the observed frequencies well, although their fit statistics are somewhat different, according to Table 4.

CONCLUSIONS
The Sturges rule provided more accurate fits. Overall, increasing the number of classes, and consequently decreasing the class interval, led to reduced accuracy in the estimation of the number of trees per hectare, despite having little effect on the Kolmogorov-Smirnov test values.
Considering the Sturges rule alone, the probability density functions providing the best fit for total height were the Johnson SB at 0.55 of the minimum height and the 3P-Weibull functions; for transverse area, the Quadros and 3P-Weibull functions provided the best fit; and for individual volume, the Quadros and Weber functions provided the best fit, followed by the 3P-Weibull function.
Analysis of the moment coefficients of skewness showed that the distribution of total height was very close to the Normal, while for the variables transverse area and volume the curves are positively skewed in all three class intervals.

Figure 1 - Histograms of total height (h), transverse area (g) and individual volume (v) as obtained by Dixon & Kronmal and Velleman rules.

Figure 3 - Behavior of distribution curves fitted to the frequency histogram using the Sturges rule for variable transverse area.

Figure 2 - Behavior of frequency distribution curves fitted to the frequency histograms using the Sturges rule.

Figure 4 - Fit curves of 10 probability models tested on the frequency distribution observed for variable individual volume.

Table 2 - Basic statistics of variables total height (h), transverse area (g), individual volume (v) and class ranges as obtained by Sturges, Dixon & Kronmal and Velleman rules.

Table 3 - Statistical values of fit and accuracy for each function tested, for three different data arrangements.
Table 4 presents statistics of 10 tested probability density functions for the frequency distribution of transverse area with class intervals obtained by Sturges, Dixon & Kronmal and Velleman rules.

Table 4 - Statistical values of fit and accuracy for functions tested at different class intervals.

Table 6 - Skewness and kurtosis for the distribution of total height, according to the different distribution rules.