Consistency of the results of path analysis among sugarcane experiments

The aim of this research was to evaluate the consistency of path analyses in sugarcane experiments based on genetic, phenotypic and genotypic correlations. Forty-four analyses were made with a view toward quantifying the direct and indirect effects of stalk height (SH), stalk diameter (SD) and number of stalks (NS) on sugarcane weight (SW). NS had the greatest direct effect on SW in all the analyses with the use of genetic and phenotypic correlations and in 12 analyses with use of the genotypic correlations. SD had a high direct effect on SW, going beyond NS in only one experiment, while SH had the lowest direct effect on SW in most of the experiments. The results showed greater consistency with the use of genetic and phenotypic correlations. In the balanced experiments, the phenotypic and genetic correlations showed equivalent results. NS is the main determinant of changes in sugarcane production.


INTRODUCTION
Correlation coefficients are useful for quantifying the association between two variables (Falconer and Mackay 1996). However, correlation is not necessarily a measure of cause and effect, and direct interpretation of its magnitude may lead to mistakes in selection strategy. In some cases, a high correlation may result from the effect of a third variable, or of a group of variables, on the response variable (Silva et al. 2012). Path analysis allows the correlation coefficients to be broken down into direct and indirect effects on a main variable (Tyagi and Lal 2007). This methodology is therefore especially worthwhile in the initial stages of sugarcane breeding programs, because it indicates the traits most suitable for indirect selection of the most productive families; quantifying production directly is slow at these stages owing to the large number of genotypes assessed (Barbosa and Silveira 2012). In addition, knowledge of a consistent pattern of relationship among the yield components may offer new perspectives on estimating sugarcane weight (SW), a common practice in studies where the plots are not harvested and sugarcane weight is estimated in accordance with Chang and Milligan (1992).
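For the three yield components considered in this study, the breakdown of each correlation with SW into one direct and two indirect effects can be written out explicitly. The display below is the standard path-analysis identity and is not printed in the article; the symbol p for a direct effect (path coefficient) is notation assumed here for illustration.

```latex
\begin{aligned}
r_{\mathrm{SH},\mathrm{SW}} &= p_{\mathrm{SH}} + r_{\mathrm{SH},\mathrm{SD}}\,p_{\mathrm{SD}} + r_{\mathrm{SH},\mathrm{NS}}\,p_{\mathrm{NS}} \\
r_{\mathrm{SD},\mathrm{SW}} &= r_{\mathrm{SD},\mathrm{SH}}\,p_{\mathrm{SH}} + p_{\mathrm{SD}} + r_{\mathrm{SD},\mathrm{NS}}\,p_{\mathrm{NS}} \\
r_{\mathrm{NS},\mathrm{SW}} &= r_{\mathrm{NS},\mathrm{SH}}\,p_{\mathrm{SH}} + r_{\mathrm{NS},\mathrm{SD}}\,p_{\mathrm{SD}} + p_{\mathrm{NS}}
\end{aligned}
```

In each row, the term containing p alone is the direct effect of that component on SW, and each product r p is an indirect effect exerted through another component.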
In experiments with sugarcane, some authors have used only the phenotypic correlation in path analysis (Sukhchain and Saini 1997, Tyagi and Lal 2007). Others have also used the genotypic correlation, obtained from the decomposition of variance and covariance components (Ferreira et al. 2007, Espósito et al. 2012, Silva et al. 2012). For Silva et al. (2009), genetic correlations (Pearson correlations) obtained from the genotypic mean values predicted by mixed models (REML/BLUP) are more efficient for determining the direct and indirect effects of the traits.
It is known that the genotypic values predicted by means of mixed model analysis lead to more precise and accurate inferences (Barbosa et al. 2004), increasing the efficiency of breeding programs. By means of BLUP, the phenotypic values are corrected for the environmental effects and are weighted by the heritability of the trait, which is estimated by the REML procedure. In this case, it is presumed that path analysis would be more effective when based on predicted genotypic values than when applied to phenotypic values. Kang et al. (1983) affirm that, from the practical point of view, the path coefficients obtained from the genotypic correlations are more important for deciding on the best selection criteria. However, according to Resende (2002), traits that are correlated genotypically but not phenotypically might not have practical value for selection, because selection is generally based on phenotype.
Thus, three possibilities for defining the correlations used in path analyses are found in the literature: the phenotypic correlation obtained via Pearson correlation on the phenotypic values, and two types of correlation of genetic origin, one obtained from analysis of variance and covariance, and the other via Pearson correlation of the genotypic values predicted through mixed models. Each author uses one or another type of correlation either in the belief that analyses based on it lead to more trustworthy conclusions, or simply because of ease of application or availability of software. Nevertheless, there is no consensus on the best strategy. The aim of this study is to evaluate the consistency of the results of path analysis in different sugarcane experiments through the use of genetic, phenotypic and genotypic correlations, as well as to identify the variable that has the greatest effect on SW, with a view toward optimizing the selection of sugarcane families in the field.
ARTICLE. BP Brasileiro. Crop Breeding and Applied Biotechnology 13: 113-119, 2013. Brazilian Society of Plant Breeding. Printed in Brazil.

MATERIAL AND METHODS
For the present study, 110 full-sib families and 100 half-sib families were used, derived from crosses undertaken in 2006 and 2010, respectively, at the Serra do Ouro Experimental Station of the Universidade Federal de Alagoas (Federal University of Alagoas), located in the municipality of Murici, AL, Brazil. After acclimatization, the plantlets originating from biparental crosses and polycrosses were sent for setting up the experiments in the experimental area of the Sugarcane Research and Breeding Center (Centro de Pesquisa e Melhoramento de Cana-de-Açúcar, CECA) of the Federal University of Viçosa, in the municipality of Oratórios, MG, Brazil (latitude 20º 25' S, longitude 42º 48' W, altitude 494 m asl, LVE soil).
The 110 full-sib families were distributed in five experiments, each with 22 families. The experiments were set up in 2007 in a randomized block design with five replications. Each plot was composed of 20 plants distributed in two 5 m furrows spaced at 1.40 m. The plant-cane data were collected in June 2008 and the ratoon-cane data in June 2009.
The 100 half-sib families were distributed in five experiments, each with 20 families. The experiments were set up in 2010 in a randomized block design with six replications. Each plot was composed of 10 plants distributed in one 5 m furrow, with a between-furrow spacing of 1.40 m. The ratoon-cane data were collected in June 2012.
In all the experiments, the traits evaluated were: average stalk height (SH) in meters, measuring one stalk from each plant, from the base of the stalk up to the first leaf with dewlap visible; stalk diameter (SD) in centimeters, with the sampling made at the third internode, counted from the base of the stalk to the apex, measuring a stalk from each plant with a caliper rule; and number of stalks per plot (NS).
The dependent variable, sugarcane weight (SW, in t ha⁻¹), was obtained differently depending on the type of experiment. In the full-sib families, SW was obtained by weighing all the plants of the plot with a spring scale and applying the expression SW = (total weight of the plot × 10)/tp, in which tp is the plot size in m². In the experiments with half-sib families, SW was estimated by weighing a 10-stalk sample from the plot with a spring scale and applying the expression SW = (NS × average weight of the stalk × 10)/tp.
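The two expressions for SW can be sketched in Python as follows. This is only an illustration under stated assumptions: weights are taken to be in kg (so that kg m⁻² × 10 gives t ha⁻¹), the plot size tp is taken as number of furrows × furrow length × spacing, and all function names and input numbers are hypothetical rather than taken from the experiments.

```python
def sw_from_plot_weight(total_plot_weight_kg, plot_area_m2):
    """Full-sib experiments: SW = (total weight of the plot x 10) / tp, in t/ha."""
    return total_plot_weight_kg * 10.0 / plot_area_m2


def sw_from_stalk_sample(number_of_stalks, sample_weight_kg, plot_area_m2,
                         sample_size=10):
    """Half-sib experiments: SW = (NS x average stalk weight x 10) / tp, in t/ha."""
    average_stalk_weight_kg = sample_weight_kg / sample_size
    return number_of_stalks * average_stalk_weight_kg * 10.0 / plot_area_m2


# Assumed plot sizes (furrows x length in m x spacing in m); not stated in the text.
FULL_SIB_PLOT_AREA_M2 = 2 * 5.0 * 1.40   # two 5 m furrows spaced at 1.40 m
HALF_SIB_PLOT_AREA_M2 = 1 * 5.0 * 1.40   # one 5 m furrow, 1.40 m between furrows

# Hypothetical plots: 140 kg harvested from a full-sib plot, and a half-sib plot
# with 50 stalks whose 10-stalk sample weighs 12.6 kg.
full_sib_sw = sw_from_plot_weight(140.0, FULL_SIB_PLOT_AREA_M2)
half_sib_sw = sw_from_stalk_sample(50, 12.6, HALF_SIB_PLOT_AREA_M2)
print(round(full_sib_sw, 1), round(half_sib_sw, 1))
```

Note that in the half-sib expression NS enters the calculation of SW directly, which is worth keeping in mind when reading the correlations between NS and SW in those experiments.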
The phenotypic and genotypic correlations were obtained from the decomposition of variance and covariance components, while the genetic correlations were obtained from mean values corrected by REML/BLUP. The statistical model adopted for full-sib families was y = Xr + Za + Wf + e, in which y is the vector of phenotypic data; r is the vector of block effects (fixed) added to the overall mean value; a is the vector of individual additive genetic effects (random); f is the vector of dominance effects (random); e is the vector of errors (random); and X, Z and W are the incidence matrices that relate the fixed and random effects of the model to the data.
For half-sib families, the following statistical model was adopted: y = Xr + Zg + e, in which y is the vector of phenotypic data; r is the vector of block effects (fixed) added to the overall mean value; g is the vector of genotypic effects (random); and e is the vector of errors (random). Capital letters represent the incidence matrices for the aforementioned effects.
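The article does not print the estimators used for the correlations based on variance and covariance components. As a rough sketch, the usual method-of-moments estimators for a randomized block design with random families can be coded as below; the formulas, function names and mean-square values are assumptions made for illustration, not the authors' actual computations.

```python
import math


def genotypic_correlation(msg_x, mse_x, msg_y, mse_y, mpg_xy, mpe_xy, n_reps):
    """Genotypic correlation from mean squares (MS) and mean products (MP).

    Assumes E(MSG) = error variance + n_reps * genotypic variance, and the
    analogous expectation for the mean products of traits x and y.
    """
    var_g_x = (msg_x - mse_x) / n_reps
    var_g_y = (msg_y - mse_y) / n_reps
    cov_g_xy = (mpg_xy - mpe_xy) / n_reps
    return cov_g_xy / math.sqrt(var_g_x * var_g_y)


def phenotypic_correlation(msg_x, msg_y, mpg_xy):
    """Phenotypic correlation among family means, from genotype MS and MP."""
    return mpg_xy / math.sqrt(msg_x * msg_y)


# Hypothetical mean squares and mean products for two traits, five replications.
r_g = genotypic_correlation(msg_x=10.0, mse_x=4.0, msg_y=12.0, mse_y=6.0,
                            mpg_xy=8.0, mpe_xy=1.0, n_reps=5)
r_f = phenotypic_correlation(msg_x=10.0, msg_y=12.0, mpg_xy=8.0)
print(round(r_g, 3), round(r_f, 3))
```

Because the genotypic estimate is a ratio of differences between mean squares, nothing bounds it within the interval from -1 to 1; with the hypothetical inputs above it exceeds one while the phenotypic correlation stays below one.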
Before carrying out path analysis, a diagnosis of multicollinearity was carried out on the 45 correlation matrices among the explanatory variables (X'X), i.e., on the three correlation matrices (phenotypic, genotypic and genetic) for each of the 15 experiments evaluated. The degree of multicollinearity was established based on the condition number (Montgomery and Peck 1992). Afterwards, path analysis was carried out according to the general causal diagram (Ferreira et al. 2007), in which sugarcane production is determined by its components (SH, SD and NS) by means of direct and indirect effects. The genotypic mean values were estimated via mixed models using the software Selegen-REML/BLUP (Resende 2007), while analyses of variance, correlation estimates, diagnosis of multicollinearity and path analyses were performed with the software R (R Development Core Team 2012).
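The path analysis step can be illustrated with a small, self-contained Python sketch (the study itself used R). It assumes the standard formulation in which the direct effects solve the linear system formed by the correlation matrix of the explanatory variables and their correlations with SW; the function names and the correlation values are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


def path_analysis(r_xx, r_xy):
    """Return direct effects, indirect effects, R2 and the residual effect."""
    direct = solve_linear_system(r_xx, r_xy)
    n = len(direct)
    # indirect[i][j]: effect of variable i on SW exerted via variable j
    indirect = [[r_xx[i][j] * direct[j] if i != j else 0.0 for j in range(n)]
                for i in range(n)]
    r2 = sum(p * r for p, r in zip(direct, r_xy))
    residual = math.sqrt(max(0.0, 1.0 - r2))
    return direct, indirect, r2, residual


# Hypothetical correlations, in the order NS, SD, SH (not values from the tables).
R_XX = [[1.0, -0.2, 0.1],
        [-0.2, 1.0, 0.3],
        [0.1, 0.3, 1.0]]
R_XY = [0.71, 0.37, 0.33]  # correlations of NS, SD and SH with SW

direct, indirect, r2, residual = path_analysis(R_XX, R_XY)
print([round(p, 3) for p in direct], round(r2, 3), round(residual, 3))
```

For every explanatory variable, the direct effect plus the sum of its indirect effects reproduces its correlation with SW, which is a convenient check on any implementation.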

RESULTS AND DISCUSSION
The mean values for SW in full-sib families ranged from 70 to 85 t ha⁻¹ in the plant-cane stage, with production differences of up to 39 t ha⁻¹ between the best and worst family of the same experiment. In the ratoon-cane stage, mean production ranged from 104 to 113 t ha⁻¹, and the greatest amplitude within a single experiment was only 2.69 t ha⁻¹. In half-sib families in the ratoon-cane stage, the mean values of SW ranged from 87 to 91 t ha⁻¹, with differences in production of up to 76 t ha⁻¹ between the best and worst family of the same experiment. The descriptive analyses for SW are shown in Table 1. In general, the statistics showed similar values between the experiments of full-sib families in the ratoon-cane and plant-cane stages.
Estimates of the correlation coefficients between SW and its components were positive, relatively high and significant (P < 0.05) by the t test (Table 2), suggesting that an increase in any of the components would cause a mean increase in SW. The significance of the genotypic correlations was evaluated with the expression given by Espósito et al. (2012). NS showed the greatest correlation with SW in 11 genetic
In this study, NS exhibited the greatest importance in determining SW, since the correlations between the two traits were of high magnitude.However, as studies of correlations do not give the relative importance of the production components on the main variable, path analyses were carried out for the purpose of making a breakdown of the correlation coefficients into direct and indirect effects.
Analysis of the condition number indicated weak multicollinearity (Nc < 100) in the correlation matrix of the explanatory variables in 44 of the 45 analyses performed. Only the genotypic correlation matrix of experiment II exhibited moderate multicollinearity (Nc = 138.4) (Table 3). It was therefore excluded from the analysis because, under these conditions, the variances associated with the path analysis estimators may reach excessively high values, resulting in estimates that are not very reliable (Montgomery and Peck 1992).
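The condition number used in this diagnosis is the ratio between the greatest and the lowest eigenvalue of the correlation matrix of the explanatory variables. A minimal Python sketch for the three-variable case is given below. It uses the closed-form trigonometric solution for the eigenvalues of a symmetric 3 × 3 matrix, a choice made here only to keep the example dependency-free. The upper cutoff of 1000 for severe multicollinearity is the convention usually attributed to Montgomery and Peck and is an assumption, since the text reports only the weak and moderate classes; the function names and the example matrix are hypothetical.

```python
import math


def eigenvalues_symmetric_3x3(a):
    """Eigenvalues of a symmetric 3 x 3 matrix, from greatest to lowest."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # the matrix is already diagonal
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # guard against rounding error
    phi = math.acos(r) / 3.0
    greatest = q + 2.0 * p * math.cos(phi)
    lowest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - greatest - lowest  # the trace equals the eigenvalue sum
    return [greatest, middle, lowest]


def condition_number(r_xx):
    """Nc = greatest eigenvalue / lowest eigenvalue."""
    eigenvalues = eigenvalues_symmetric_3x3(r_xx)
    return eigenvalues[0] / eigenvalues[-1]


def classify_multicollinearity(nc):
    """Classes by condition number; the 1000 cutoff is an assumed convention."""
    if nc < 100:
        return "weak"
    if nc <= 1000:
        return "moderate to strong"
    return "severe"


# Hypothetical matrix in which every pair of components has correlation 0.5.
EQUICORRELATED = [[1.0, 0.5, 0.5],
                  [0.5, 1.0, 0.5],
                  [0.5, 0.5, 1.0]]
nc = condition_number(EQUICORRELATED)
print(round(nc, 2), classify_multicollinearity(nc))
```

The sketch assumes a positive definite matrix; a genotypic matrix containing estimates above one may have a non-positive lowest eigenvalue, in which case the ratio is not meaningful.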
The phenotypic, genotypic and genetic path coefficients efficiently explained the variation in SW, as indicated by the high values of the coefficient of determination of the model (R² ≥ 0.80) and by the small residual effect (pε ≤ 0.45). This result indicates the strong contribution of these explanatory variables to sugarcane production (Tables 3, 4 and 5), confirming the possibility of practicing selection based on these variables.
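The two bounds quoted above are linked. Under the usual path-analysis definitions, which the article does not print and which are stated here as an assumption (with p denoting a direct effect), the coefficient of determination and the residual effect are:

```latex
R^2 = p_{\mathrm{SH}}\,r_{\mathrm{SH},\mathrm{SW}} + p_{\mathrm{SD}}\,r_{\mathrm{SD},\mathrm{SW}} + p_{\mathrm{NS}}\,r_{\mathrm{NS},\mathrm{SW}},
\qquad
p_{\varepsilon} = \sqrt{1 - R^2}
```

With the lowest reported coefficient of determination, R² = 0.80, this gives a residual effect of √0.20 ≈ 0.447, in agreement with the upper bound of 0.45.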
The NS trait exhibited the greatest direct effect on SW in 12 of the analyses with the use of genotypic correlations, and in all analyses with the use of genetic and phenotypic correlations. In general, the values obtained were similar between the experiments in the plant-cane and ratoon-cane stages and between full-sib and half-sib families (Tables 3, 4 and 5). Sukhchain and Saini (1997), Silva et al. (2009) and Espósito et al. (2012) also obtained high direct effects of NS on SW, suggesting selection of clones for SW based on this trait. The direct effects (genetic, genotypic and phenotypic) of SD on SW were in general high, ranging from 0.11 to 0.79; however, they were greater than the effects of NS on SW only in experiment VIII, and only with the genotypic correlation (Table 4). A similar result was obtained by Ferreira et al. (2007), who also found a greater direct effect of SD on SW with the use of genotypic correlation. However, with the phenotypic correlation of the same experiment, NS was the trait that exhibited the greatest direct effect on SW. The SH trait exhibited lower direct effects (genetic, genotypic and phenotypic) on SW, ranging from -0.33 to 0.57. The value of 0.57 was estimated in experiment IX, and, in this analysis, SH was the variable with the greatest direct effect on SW (Table 4).
In the present study, the values obtained from the genetic and phenotypic correlations in the experiments of half-sib families were identical. However, in the experiments of full-sib families, the genetic and phenotypic correlations exhibited differences, although the values of these correlations were consistent. The data of full-sib families are unbalanced because the participation of the parents in the biparental crosses was not equitable, as observed in the pedigree used in the analysis. This imbalance has led to different precisions in the prediction of the genotypic values, even with the use of the REML/BLUP procedure (Barbosa et al. 2004). Nevertheless, pedigree information in full-sib families has provided more accurate estimates of genetic parameters (Nunes et al. 2008, Atkin et al. 2009), indicating that, under these conditions, the genetic correlations estimated by mixed models (REML/BLUP) are more precise than the correlations estimated by analysis of variance (Durel et al. 1998). As the experiments of half-sib families are balanced, the genetic and phenotypic correlations did not exhibit differences, consequently generating the same results in path analyses (Table 5) and showing that the REML/BLUP procedure and analysis of variance are equivalent in balanced experiments.
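One way to see why balanced data can give identical genetic and phenotypic correlations is sketched below. It assumes a simplified setting in which the predicted genotypic value of each family is its mean shrunk towards the overall mean, with a single shrinkage factor per trait when the data are balanced and family-specific factors otherwise. This is an illustration only, not the Selegen model, and all names and numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shrink(values, factors):
    """Shrink each value towards the overall mean by its own factor."""
    mean = sum(values) / len(values)
    return [mean + f * (v - mean) for v, f in zip(values, factors)]


# Hypothetical family means for number of stalks (NS) and sugarcane weight (SW).
ns_means = [100.0, 120.0, 90.0, 130.0, 110.0]
sw_means = [80.0, 95.0, 70.0, 100.0, 92.0]

r_means = pearson(ns_means, sw_means)

# Balanced case: one shrinkage factor per trait, shared by all families.
r_balanced = pearson(shrink(ns_means, [0.6] * 5), shrink(sw_means, [0.8] * 5))

# Unbalanced case: family-specific shrinkage factors (here applied to NS only).
r_unbalanced = pearson(shrink(ns_means, [0.9, 0.5, 0.9, 0.5, 0.7]), sw_means)

print(round(r_means, 4), round(r_balanced, 4), round(r_unbalanced, 4))
```

A common positive shrinkage factor is a linear rescaling, which leaves the Pearson correlation unchanged, whereas family-specific factors alter the relative positions of the families and therefore the correlation.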
Conflicting values were obtained with the use of genotypic correlations, which may be due to overestimated correlations between the traits evaluated, as identified in the correlation matrix of experiment II, which exhibited correlations above one (Table 2) and, consequently, problems with multicollinearity. The genotypic correlations between all pairs of traits were greater than the phenotypic and genetic correlations in all the experiments, in agreement with the theoretical and practical results obtained by other authors (Cheverud 1988, Ferreira et al. 2007, Espósito et al. 2012). For Cheverud (1988), a high genotypic correlation without the same pattern of phenotypic correlation occurs due to the lack of precision in the estimates of the genetic components used in obtaining the genotypic correlation matrix. Although the results were more consistent with the use of genetic and phenotypic correlations, genotypic path analyses exhibited greater coefficients of determination and lower residual effects, explaining 100% of the variation in SW in eight analyses, six of them in experiments of full-sib families and two in experiments of half-sib families.

Table 1. Overall mean value, mean square of the genotype (MSG), experimental coefficient of variation (CV%), maximum and minimum values and amplitude of the genotypic mean values estimated by REML/BLUP for the sugarcane weight trait (SW, in t ha⁻¹) in 110 full-sib families (Exp I-X) and 100 half-sib families (Exp XI-XV) in sugarcane

Table 3. Path analysis based on phenotypic (rf), genotypic (rg) and genetic (rgblup) correlations of the production components number of stalks (NS), stalk diameter (SD) and stalk height (SH) on sugarcane weight (SW) in the experiments with full-sib families in the plant-cane stage
R² = coefficient of determination; pε = residual effect; Nc = condition number (greatest eigenvalue/lowest eigenvalue); Det = determinant of the matrix X'X.

Table 4. Path analysis based on phenotypic (rf), genotypic (rg) and genetic (rgblup) correlations of the production components number of stalks (NS), stalk diameter (SD) and stalk height (SH) on sugarcane weight (SW) in the experiments with full-sib families in the ratoon-cane stage

Table 5. Path analysis based on phenotypic (rf), genotypic (rg) and genetic (rgblup) correlations of the production components number of stalks (NS), stalk diameter (SD) and stalk height (SH) on sugarcane weight (SW) in the experiments with half-sib families in the ratoon-cane stage
R² = coefficient of determination; pε = residual effect; Nc = condition number (greatest eigenvalue/lowest eigenvalue); Det = determinant of the matrix X'X.