Consistency of the results of path analysis among sugarcane experiments

Consistência dos resultados da análise de trilha entre experimentos com cana-de-açúcar

Abstracts

The aim of this research was to evaluate the consistency of path analyses in sugarcane experiments based on genetic, phenotypic and genotypic correlations. Forty-four analyses were performed to quantify the direct and indirect effects of stalk height (SH), stalk diameter (SD) and number of stalks (NS) on sugarcane weight (SW). NS had the greatest direct effect on SW in all the analyses based on genetic and phenotypic correlations and in 12 analyses based on genotypic correlations. SD had a high direct effect on SW but exceeded NS in only one experiment, while SH had the lowest direct effect on SW in most of the experiments. The results were more consistent when genetic and phenotypic correlations were used. In the balanced experiments, the phenotypic and genetic correlations gave equivalent results. NS is the main determinant of changes in sugarcane production.

Saccharum spp.; correlations; genetic breeding


O objetivo dessa pesquisa foi avaliar a consistência das análises de trilha em experimentos com cana-de-açúcar, a partir de correlações genéticas, fenotípicas e genotípicas. Foram realizadas 44 análises visando quantificar os efeitos diretos e indiretos da altura de colmos (AC), diâmetro de colmos (DC) e número de colmos (NC) sobre massa de cana (MC). O NC apresentou o maior efeito direto sobre MC em todas as análises com o uso de correlações genéticas e fenotípicas e em 12 análises com uso das correlações genotípicas. O DC apresentou efeito direto elevado sobre MC, superando NC em apenas um experimento, enquanto que AC apresentou menor efeito direto sobre MC na maioria dos experimentos. Os resultados apresentaram maior consistência com o uso de correlações genéticas e fenotípicas. Nos experimentos balanceados as correlações fenotípicas e genéticas apresentaram resultados equivalentes. O NC é o principal determinante das alterações na produção de cana.

Saccharum spp.; correlações; melhoramento genético


ARTICLE

Bruno Portela Brasileiro (I,*); Luiz Alexandre Peternelli (I); Márcio Henrique Pereira Barbosa (II)

(I) Universidade Federal de Viçosa (UFV), Departamento de Estatística, 36.570-000, Viçosa, MG, Brazil

(II) UFV, Departamento de Fitotecnia

(*) E-mail: brunobiogene@hotmail.com

INTRODUCTION

Correlation coefficients are useful for quantifying the association between two variables (Falconer and Mackay 1996). However, correlation does not necessarily reflect cause and effect, and direct interpretation of its magnitude may lead to mistakes in selection strategy. In some cases, a high correlation results from the effect of a third variable, or of a group of variables, on the response variable (Silva et al. 2012). Path analysis allows the correlation coefficients to be broken down into direct and indirect effects on a main variable (Tyagi and Lal 2007). This methodology is therefore particularly worthwhile in the initial stages of sugarcane breeding programs, because it indicates the traits best suited to indirect selection of the most productive families; quantifying production directly is slow at these stages because of the large number of genotypes assessed (Barbosa and Silveira 2012). In addition, knowledge of a consistent pattern of relationship between the yield components may offer new perspectives on estimating sugarcane weight (SW), which is usually done in studies where the plots are not harvested and sugarcane weight is estimated according to Chang and Milligan (1992).

In experiments with sugarcane, some authors have used only the phenotypic correlation in path analysis (Sukhchain and Saini 1997, Tyagi and Lal 2007). Others, in addition to the phenotypic correlation, have also used the genotypic correlation obtained from the decomposition of variance and covariance components (Ferreira et al. 2007, Espósito et al. 2012, Silva et al. 2012). According to Silva et al. (2009), genetic correlations (Pearson correlations) obtained from the genotypic mean values predicted by mixed models (REML/BLUP) are more efficient in determining the direct and indirect effects of the traits.

Genotypic values predicted by mixed model analysis are known to lead to more precise and accurate inferences (Barbosa et al. 2004), increasing the efficiency of breeding programs. With BLUP, the phenotypic values are corrected for environmental effects and weighted by the heritability of the trait, which is estimated by the REML procedure. It is therefore presumed that path analysis would be more effective when based on predicted genotypic values than when applied to phenotypic values.

Kang et al. (1983) state that, from a practical point of view, the path coefficients obtained from the genotypic correlations are more important for deciding on the best selection criteria. However, according to Resende (2002), traits that are correlated genotypically but not phenotypically might not have practical value for selection, because selection is generally based on phenotype.

Thus, three possibilities for defining the correlations used in path analysis appear in the literature: the phenotypic correlation, obtained via Pearson correlation of the phenotypic values, and two types of correlation of genetic origin, one obtained from analysis of variance and covariance and the other via Pearson correlation of the genotypic values predicted by mixed models. Authors choose one type or another either because they believe it leads to more trustworthy conclusions or simply because it is easier to apply or better supported by software. Nevertheless, there is no consensus on the best strategy. The aim of this study was to evaluate the consistency of the results of path analysis in different sugarcane experiments using genetic, phenotypic and genotypic correlations, and to identify the variable with the greatest effect on SW, with a view to optimizing the selection of sugarcane families in the field.

MATERIALS AND METHODS

For the present study, 110 full-sib families and 100 half-sib families were used, derived from crosses made in 2006 and 2010, respectively, at the Serra do Ouro Experimental Station of the Universidade Federal de Alagoas (Federal University of Alagoas), located in the municipality of Murici, AL, Brazil. After acclimatization, the plantlets originating from biparental crosses and polycrosses were sent to the experimental area of the Sugarcane Research and Breeding Center (Centro de Pesquisa e Melhoramento de Cana-de-Açúcar - CECA) of the Federal University of Viçosa, in the municipality of Oratórios, MG, Brazil (latitude 20° 25' S, longitude 42° 48' W, altitude 494 m asl, LVE soil), where the experiments were set up.

The 110 full-sib families were distributed in five experiments, each with 22 families. The experiments were set up in 2007 in a randomized block design with five replications. Each plot was composed of 20 plants distributed in two 5 m long furrows spaced 1.40 m apart. The plant-cane data were collected in June 2008 and the ratoon-cane data in June 2009.

The 100 half-sib families were distributed in five experiments, each with 20 families. The experiments were set up in 2010 in a randomized block design with six replications. Each plot was composed of 10 plants distributed in one 5 m long furrow, with 1.40 m between furrows. The ratoon-cane data were collected in June 2012.

In all the experiments, the traits evaluated were: average stalk height (SH) in meters, measuring one stalk from each plant, from the base of the stalk up to the first leaf with dewlap visible; stalk diameter (SD) in centimeters, with the sampling made at the third internode, counted from the base of the stalk to the apex, measuring a stalk from each plant with a caliper rule; and number of stalks per plot (NS).

The dependent variable, sugarcane weight (SW, in t ha⁻¹), was obtained differently depending on the type of experiment. In the full-sib families, SW was obtained by weighing all the plants of the plot with a spring scale and applying the expression SW = (total weight of the plot × 10)/tp, in which tp is the plot size in m². In the experiments with half-sib families, SW was estimated by weighing a 10-stalk sample from the plot with a spring scale and applying the expression SW = (NS × average weight of the stalk × 10)/tp.
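The two expressions above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, weights are assumed to be in kg (the factor 10 converts kg per m² to t per ha), and the plot areas in the usage lines are assumed to equal number of furrows × furrow length × furrow spacing, which the text does not state explicitly.

```python
from typing import Sequence


def sugarcane_weight_t_ha(plot_weight_kg: float, plot_area_m2: float) -> float:
    """Full-sib case: SW = (total weight of the plot x 10) / tp.

    Assumes the weight is in kg and tp in m2, so that kg/m2 x 10 = t/ha.
    """
    if plot_area_m2 <= 0:
        raise ValueError("plot area must be positive")
    return plot_weight_kg * 10.0 / plot_area_m2


def estimated_sugarcane_weight_t_ha(
    n_stalks: int, sample_weights_kg: Sequence[float], plot_area_m2: float
) -> float:
    """Half-sib case: SW = (NS x average weight of the stalk x 10) / tp."""
    if not sample_weights_kg:
        raise ValueError("at least one sampled stalk is required")
    mean_stalk_kg = sum(sample_weights_kg) / len(sample_weights_kg)
    return sugarcane_weight_t_ha(n_stalks * mean_stalk_kg, plot_area_m2)


# Hypothetical full-sib plot: 2 furrows x 5 m x 1.40 m = 14 m2, 140 kg harvested.
full_sib_sw = sugarcane_weight_t_ha(140.0, 14.0)  # 100.0 t/ha

# Hypothetical half-sib plot: 1 furrow x 5 m x 1.40 m = 7 m2, 42 stalks,
# 10 sampled stalks of 1.5 kg each.
half_sib_sw = estimated_sugarcane_weight_t_ha(42, [1.5] * 10, 7.0)
```

In the half-sib case SW is a direct function of NS, which is worth keeping in mind when reading the NS and SW correlations for those experiments.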

The phenotypic and genotypic correlations were obtained from the decomposition of variance and covariance components, while the genetic correlations were obtained from mean values corrected by REML/BLUP. The statistical model adopted for full-sib families was y = Xr + Za + Wf + e, in which y is the vector of phenotypic data; r is the vector of block effects (fixed) added to the overall mean; a is the vector of individual additive genetic effects (random); f is the vector of dominance effects (random); e is the vector of errors (random); and X, Z and W are the incidence matrices that relate the fixed and random effects of the model to the data.

For half-sib families, the following statistical model was adopted: y = Xr + Zg + e, in which y is the vector of phenotypic data; r is the vector of block effects (fixed) added to the overall mean; g is the vector of genotypic effects (random); and e is the vector of errors (random). X and Z are the incidence matrices for these effects.

Before path analysis, multicollinearity was diagnosed in the 45 correlation matrices among the explanatory variables (X'X), i.e., in the three correlation matrices (phenotypic, genotypic and genetic) for each of the 15 experiments evaluated. The degree of multicollinearity was established from the condition number (Montgomery and Peck 1992). Path analysis was then carried out according to the general causal diagram (Ferreira et al. 2007), in which sugarcane production is determined by its components (SH, SD and NS) through direct and indirect effects. The genotypic mean values were estimated via mixed models with the software Selegen-REML/BLUP (Resende 2007), while analyses of variance, correlation estimates, diagnosis of multicollinearity and path analyses were performed with the software R (R Development Core Team 2012).
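To make the decomposition concrete, the following Python sketch solves the usual path-analysis normal equations, in which the correlation matrix among the components times the vector of direct effects equals the vector of correlations with SW. It is not the authors' R code: the function names and all numerical values are hypothetical, and the component order is assumed to be SH, SD, NS.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular or nearly so")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):  # back substitution
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def path_effects(r_xx, r_xy):
    """Return (direct, indirect) effects of the components on the main variable.

    direct[i] is the path coefficient of component i; indirect[i][j] is the
    effect of component i acting through component j, i.e. r_xx[i][j] * direct[j].
    """
    direct = solve_linear_system(r_xx, r_xy)
    n = len(direct)
    indirect = [
        [r_xx[i][j] * direct[j] if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    return direct, indirect


# Hypothetical correlations among SH, SD and NS, and of each with SW.
r_xx = [[1.0, 0.3, 0.1],
        [0.3, 1.0, -0.2],
        [0.1, -0.2, 1.0]]
r_xy = [0.36, 0.22, 0.66]
direct, indirect = path_effects(r_xx, r_xy)
for name, p, row, r in zip(("SH", "SD", "NS"), direct, indirect, r_xy):
    print(f"{name}: direct={p:.2f} indirect={sum(row):.2f} total={r:.2f}")
```

For each component, the direct effect plus the sum of its indirect effects recovers its correlation with SW, which is the sense in which path analysis "breaks down" a correlation coefficient.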

RESULTS AND DISCUSSION

The mean values for SW in full-sib families ranged from 70 to 85 t ha⁻¹ in the plant-cane stage, with production differences of up to 39 t ha⁻¹ between the best and worst family of the same experiment. In the ratoon-cane stage, mean production ranged from 104 to 113 t ha⁻¹, and the greatest range within a single experiment was only 2.69 t ha⁻¹. In half-sib families in the ratoon-cane stage, the mean values of SW ranged from 87 to 91 t ha⁻¹, with differences in production of up to 76 t ha⁻¹ between the best and worst family of the same experiment. The descriptive analyses for SW are shown in Table 1. In general, the statistics were similar between the experiments of full-sib families in the ratoon-cane and plant-cane stages.

Estimates of the correlation coefficients between SW and its components were positive, relatively high and significant (P < 0.05) by the t test (Table 2), suggesting that an increase in any of the components would, on average, increase SW. The significance of the genotypic correlations was evaluated with the expression used by Espósito et al. (2012). NS showed the greatest correlation with SW in 11 genetic correlation matrices (0.66 ≤ rgblup ≤ 0.89), in 13 phenotypic correlation matrices (0.66 ≤ rf ≤ 0.94) and in 10 genotypic correlation matrices (0.70 ≤ rg ≤ 0.96). In most experiments, SH was the trait with the second greatest genetic (0.51 ≤ rgblup ≤ 0.91), phenotypic (0.51 ≤ rf ≤ 0.98) and genotypic (0.52 ≤ rg ≤ 0.98) correlations with SW, followed by SD, which also presented high genetic (0.10 ≤ rgblup ≤ 0.77), phenotypic (0.51 ≤ rf ≤ 0.90) and genotypic (0.12 ≤ rg ≤ 0.90) correlations with SW (Table 2).

In this study, NS exhibited the greatest importance in determining SW, since the correlations between the two traits were of high magnitude. However, as studies of correlations do not give the relative importance of the production components on the main variable, path analyses were carried out for the purpose of making a breakdown of the correlation coefficients into direct and indirect effects.

Analysis of the condition number indicated weak multicollinearity (Nc < 100) in the correlation matrix of the explanatory variables in 44 of the 45 analyses performed. Only the genotypic correlation matrix of experiment II exhibited moderate multicollinearity (Nc = 138.4) (Table 3). It was therefore excluded from the path analysis because, under these conditions, the variances of the path coefficient estimators may become excessively high, making the estimates unreliable (Montgomery and Peck 1992).
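As an illustration of this diagnosis, the sketch below computes the condition number of a 3 × 3 correlation matrix as the ratio of its largest to its smallest eigenvalue, using a closed-form eigenvalue routine so that no external library is needed. The function names and example matrices are hypothetical. Only the Nc < 100 cut-off for weak multicollinearity is taken from the text; the upper band (100 to 1000 as moderate to strong, above 1000 as severe) is an assumption based on the commonly quoted Montgomery and Peck ranges.

```python
import math


def symmetric_3x3_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix, largest first (trigonometric form)."""
    off = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if off == 0.0:  # already diagonal
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p = math.sqrt((sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * off) / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # guard against rounding outside [-1, 1]
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return [largest, 3.0 * q - largest - smallest, smallest]


def condition_number(r_xx):
    """Largest eigenvalue divided by smallest; inf if the matrix is not positive definite."""
    eig = symmetric_3x3_eigenvalues(r_xx)
    return eig[0] / eig[2] if eig[2] > 0.0 else math.inf


def multicollinearity_degree(nc):
    """Classify Nc; only the 100 cut-off comes from the article (see lead-in)."""
    if nc < 100.0:
        return "weak"
    return "moderate to strong" if nc <= 1000.0 else "severe"


# Hypothetical matrices: two components correlated at 0.50 and at 0.99.
mild = [[1.0, 0.50, 0.0], [0.50, 1.0, 0.0], [0.0, 0.0, 1.0]]
harsh = [[1.0, 0.99, 0.0], [0.99, 1.0, 0.0], [0.0, 0.0, 1.0]]
for label, matrix in (("mild", mild), ("harsh", harsh)):
    nc = condition_number(matrix)
    print(label, round(nc, 1), multicollinearity_degree(nc))
```

A correlation matrix containing estimates above one, as reported for the genotypic matrix of experiment II, may not be positive definite; the sketch returns infinity in that case rather than a negative ratio.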

The phenotypic, genotypic and genetic path coefficients efficiently explained the variations in SW, as indicated by the high values of the coefficient of determination of the model (R² ≥ 0.80), as well as the small residual effect (pε ≤ 0.45). This result indicates the excellent contribution of these explanatory variables to sugarcane production (Tables 3, 4 and 5), confirming the possibility of practicing selection based on these variables.
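The two bounds quoted above are linked. The block below is a sketch that assumes the standard path-analysis identities, which the article does not print: p_i is the direct effect of component i (SH, SD or NS), r_ij is the correlation between components i and j, and r_iy is the correlation of component i with SW.

```latex
\begin{align}
  r_{iy} &= p_i + \sum_{j \neq i} r_{ij}\, p_j
         && \text{(direct effect plus indirect effects)} \\
  R^2 &= \sum_{i} p_i\, r_{iy}
         && \text{(coefficient of determination)} \\
  p_\varepsilon &= \sqrt{1 - R^2}
         && \text{(residual effect)} \\
  R^2 = 0.80 \;&\Rightarrow\; p_\varepsilon = \sqrt{0.20} \approx 0.447
\end{align}
```

Under these identities, a coefficient of determination of at least 0.80 implies a residual effect of at most about 0.45, which agrees with the two limits reported in the text.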

The NS trait exhibited the greatest direct effect on SW in 12 of the analyses based on genotypic correlations and in all analyses based on genetic and phenotypic correlations. In general, the values obtained were similar between the experiments in the plant-cane and ratoon-cane stages and between full-sib and half-sib families (Tables 3, 4 and 5). Sukhchain and Saini (1997), Silva et al. (2009) and Espósito et al. (2012) also obtained high direct effects of NS on SW, suggesting selection of clones for SW based on this trait. The direct effects (genetic, genotypic and phenotypic) of SD on SW were in general high, ranging from 0.11 to 0.79; however, they exceeded the direct effects of NS on SW only in experiment VIII, and only with the genotypic correlation (Table 4). A similar result was obtained by Ferreira et al. (2007), who also found a greater direct effect of SD on SW with the use of the genotypic correlation; with the phenotypic correlation of the same experiment, however, NS was the trait with the greatest direct effect on SW. The SH trait exhibited lower direct effects (genetic, genotypic and phenotypic) on SW, ranging from -0.33 to 0.57. The value of 0.57 was estimated in experiment IX, and, in this analysis, SH was the variable with the greatest direct effect on SW (Table 4).

In the present study, the values obtained from the genetic and phenotypic correlations in the experiments of half-sib families were identical. In the experiments of full-sib families, however, the genetic and phenotypic correlations differed, although their values were consistent. The data of full-sib families are unbalanced because the participation of the parents in the biparental crosses was not equitable, as observed in the pedigree used in the analysis. This imbalance led to different precisions in the prediction of the genotypic values, even with the use of the REML/BLUP procedure (Barbosa et al. 2004). Nevertheless, pedigree information in full-sib families has yielded more accurate estimates of genetic parameters (Nunes et al. 2008, Atkin et al. 2009), indicating that, under these conditions, the genetic correlations estimated by mixed models (REML/BLUP) are more precise than the correlations estimated by analysis of variance (Durel et al. 1998). As the experiments of half-sib families are balanced, the genetic and phenotypic correlations did not differ and consequently generated the same results in path analyses (Table 5), showing that the REML/BLUP procedure and analysis of variance are equivalent in balanced experiments.
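One way to see why balanced data give identical correlations is sketched below in Python. It relies on a deliberately simplified assumption that is not the authors' full mixed model: each predicted genotypic value is taken to be the overall mean plus a shrinkage weight times the deviation of the family mean. With balanced data every family receives the same weight, and Pearson correlation is unchanged by such a linear rescaling; with unbalanced data the weights differ between families and the correlation changes. All names and numbers are hypothetical.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shrink(family_means, weights):
    """Simplified BLUP-like prediction: mu + w_i * (mean_i - mu) for each family."""
    mu = mean(family_means)
    return [mu + w * (v - mu) for v, w in zip(family_means, weights)]


# Hypothetical family means for NS and SW in one experiment.
ns = [10.0, 12.0, 14.0, 16.0, 18.0]
sw = [80.0, 95.0, 90.0, 110.0, 120.0]

r_phenotypic = pearson(ns, sw)

# Balanced case: one weight per trait, shared by all families.
r_balanced = pearson(shrink(ns, [0.8] * 5), shrink(sw, [0.6] * 5))

# Unbalanced case: families with less information are shrunk more strongly.
unequal = [0.9, 0.5, 0.9, 0.5, 0.9]
r_unbalanced = pearson(shrink(ns, unequal), shrink(sw, unequal))

print(f"phenotypic={r_phenotypic:.4f} balanced={r_balanced:.4f} "
      f"unbalanced={r_unbalanced:.4f}")
```

The first two correlations coincide up to rounding, while the third differs, which mirrors the contrast between the half-sib and full-sib experiments described above.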

Conflicting values were obtained as a result of the use of genotypic correlations, which may have occurred due to overestimated correlations between the traits evaluated, as identified in the correlation matrix of experiment II, which exhibited correlations above one (Table 2) and, consequently, problems with multicollinearity. The values of genotypic correlations between all the pairs of traits were greater than the phenotypic and genetic correlations in all the experiments, agreeing with the theoretical and practical results obtained by other authors (Cheverud 1988, Ferreira et al. 2007, Espósito et al. 2012). For Cheverud (1988), high genotypic correlation without the same pattern of phenotypic correlation occurs due to the lack of precision in the estimates of the genetic components used in obtaining the genotypic correlation matrix. Although the results were more consistent with the use of genetic and phenotypic correlations, genotypic path analyses exhibited greater coefficients of determination and lower residual effects, explaining 100% of the variation in SW in eight analyses, with six being in experiments of full-sib families and two in half-sib families.

The SH and SD traits exhibited direct effects on SW less than the residual effects in 22 analyses. Among the traits, SH was the one with least influence on production, followed by SD, showing the lesser importance of these traits in the selection process. On the other hand, NS exhibited direct effect values greater than residual effects in all the experiments, once more showing that NS is the main determinant of the variations in SW.

Individual selection of clones (mass selection) applied in the initial stages of breeding programs, although commonly practiced, has proven to be inefficient (Kimbeng and Cox 2003, Resende and Barbosa 2006), and the selection of families followed by individual selection of clones is recommended. Thus, the best sugarcane families may be selected based on the number of stalks (NS), whose heritability based on family mean values has been greater than its heritability based on individual plants, as shown by Kimbeng and Cox (2003), Barbosa et al. (2005) and Oliveira et al. (2008). In addition, its direct effect on SW is of great magnitude. This practice may reduce the time and cost of evaluation and selection between and within families, increasing the efficiency of this process in the initial phases of sugarcane breeding programs, since indirect selection through less complex traits with greater heritability and ease of measurement may result in greater genetic progress than direct selection.

The results of path analysis obtained in the plant-cane and ratoon-cane stages show greater consistency with the use of genetic and phenotypic correlations, with phenotypic correlation being equivalent to genetic correlation only in balanced experiments. The number of stalks trait is the main determinant of alterations in the sugarcane weight trait.

ACKNOWLEDGMENTS

The authors thank the CNPq, CAPES and FAPEMIG for the financial support for research projects.

Received 20 October 2012

Accepted 25 March 2013

  • Atkin FC, Dieters MJ and Stringer JK (2009) Impact of depth of pedigree and inclusion of historical data on the estimation of additive variance and breeding values in a sugarcane breeding program. Theoretical and Applied Genetics 119: 555-565.
  • Barbosa MHP, Resende MDV, Peternelli LA, Bressiani JA, Silveira LCI, Silva FL and Figueiredo ICR (2004) Use of REML/BLUP for the selection of sugarcane families specialized in biomass production. Crop Breeding and Applied Biotechnology 4: 218-226.
  • Barbosa MHP, Resende MDV, Bressiani JA, Silveira LCI and Peternelli LA (2005) Selection of sugarcane families and parents by Reml/Blup. Crop Breeding and Applied Biotechnology 5: 443-450.
  • Barbosa MHP and Silveira LCI (2012) Breeding and cultivar recommendations. In Santos F, Borém A and Caldas C (eds.) Sugarcane: bioenergy, sugar and ethanol: technology and prospects MAPA/ACS: UFV/DEA, Brasília, p. 301-318.
  • Chang YS and Milligan SB (1992) Estimating the potential of sugarcane families to produce elite genotypes using univariate cross prediction methods. Theoretical and Applied Genetics 84: 662-671.
  • Cheverud JM (1988) A comparison of genetic and phenotypic correlations. Evolution 42: 958-968.
  • Durel CE, Laurens F, Fouillet A and Lespinasse Y (1998) Utilization of pedigree information to estimate genetic parameters from large unbalanced data sets in apple. Theoretical and Applied Genetics 96: 1077-1085.
  • Espósito DP, Peternelli LA, Paula TOM and Barbosa MHP (2012) Análise de trilha usando valores fenotípicos e genotípicos para componentes do rendimento na seleção de famílias de cana-de-açúcar. Ciência Rural 42: 38-44.
  • Falconer DS and Mackay TFC (1996) Introduction to quantitative genetics Longman, Edinburgh, 464p.
  • Ferreira FM, Barros WS, Silva FL, Barbosa MHP, Cruz CD and Bastos IT (2007) Relações fenotípicas e genotípicas entre componentes de produção em cana-de-açúcar. Bragantia 66: 605-610.
  • Kang MS, Miller JD and Tai PYP (1983) Genetic and phenotypic path analysis and heritability in sugarcane. Crop Science 23: 643-647.
  • Kimbeng CA and Cox MC (2003) Early generation selection of sugarcane families and clones in Australia: a review. Journal of the American Society of Sugar Cane Technologists 23: 20-39.
  • Montgomery DC and Peck EA (1992) Introduction to linear regression analysis John Wiley & Sons, New York, 527p.
  • Nunes AJR, Ramalho MAP and Ferreira DF (2008) Inclusion of genetic relationship information in the pedigree selection method using mixed models. Genetics and Molecular Biology 31: 73-78.
  • Oliveira RA, Daros E, Bespalhok-Filho JC, Zambon JLC, Ido OT, Weber H, Resende MDV and Zeni-Neto H (2008) Seleção de famílias de cana-de-açúcar via modelos mistos. Scientia Agraria 9: 269-274.
  • R Development Core Team (2012) R: a language and environment for statistical computing. Available at <http://www.R-project.org/> Accessed in April 2012.
  • Resende MDV (2002) Genética biométrica e estatística no melhoramento de plantas perenes Embrapa Informação Tecnológica, Brasília, 975p.
  • Resende MDV and Barbosa MHP (2006) Selection via simulated individual BLUP based on family genotypic effects in sugarcane. Pesquisa Agropecuária Brasileira 41: 421-429.
  • Resende MDV (2007) Selegen-Reml/Blup: sistema estatístico e seleção genética computadorizada via modelos lineares mistos Embrapa Florestas, Colombo, 358p.
  • Silva FL, Pedrozo CA, Barbosa MHP, Resende MDV, Peternelli LA, Costa PMA and Vieira MS (2009) Análise de trilha para os componentes de produção de cana-de-açúcar via blup. Revista Ceres 56: 308-314.
  • Silva PP, Soares L, Costa JG, Viana LS, Andrade JCF, Gonçalves ER, Santos JM, Barbosa GVS, Nascimento VX, Todaro AR, Riffel A, Grossi-de-Sa MF, Barbosa MHP, Sant'Ana AEG and Ramalho Neto CE (2012) Path analysis for selection of drought tolerant sugarcane genotypes through physiological components. Industrial Crops and Products 37: 11-19.
  • Sukhchain SD and Saini GS (1997) Inter-relationships among cane yield and commercial cane sugar and their component traits in autumn plant crop of sugarcane. Euphytica 95: 109-114.
  • Tyagi AP and Lal P (2007) Correlation and path coefficient analysis in sugarcane. The South Pacific Journal of Natural Science 25: 1-9.

Publication Dates

  • Publication in this collection: 20 Aug 2013
  • Date of issue: July 2013

Crop Breeding and Applied Biotechnology Universidade Federal de Viçosa, Departamento de Fitotecnia, 36570-000 Viçosa - Minas Gerais/Brasil, Tel.: (55 31)3899-2611, Fax: (55 31)3899-2611 - Viçosa - MG - Brazil
E-mail: cbab@ufv.br