The parametric restrictions of the Griffing diallel analysis model: combining ability analysis


INTRODUCTION
The expressions general combining ability (GCA) and specific combining ability (SCA) have been used since the 1940s (Sprague and Tatum, 1942) to designate properties of endogamic families and inbred lines under selection in hybrid production programs. Since then, maize breeders have been aware of the need to evaluate endogamic families involved in the selfing process, normally S3 progenies, in order to reduce the number of inbred lines. For this, the families are normally crossed with a tester (a single- or double-cross hybrid or an open-pollinated population) and the progenies are then evaluated experimentally. Outstanding endogamic families, capable of generating superior progenies, i.e., with a high GCA, would continue to be selfed. The inbred lines produced would also be evaluated by crossing them among themselves, a system called diallel. The aim would be to identify pairs of inbred lines that produce the best hybrids. This second stage of evaluation is known as the SCA test.
Specific methods for estimating the effects of GCA and SCA or the variances of these effects have been described. These methodologies generally consist of an analysis of variance of data from progenies obtained by diallel, whether complete (Griffing, 1956a,b), partial (Geraldi and Miranda Filho, 1988), circulant (Kempthorne and Curnow, 1961), or otherwise. The main characteristics of these methods are their generality, since they can be used for any species, and the ease of analysis and interpretation.
Among several methods of combining ability analysis, that described by Griffing (1956b) is probably the most used. Easy computer handling, guaranteed by the availability of appropriate formulas, and the care the author and others took to discuss in detail the value of the effect variance estimates or the effect estimates (Griffing, 1956a; Cruz and Vencovsky, 1989) in breeding programs, have contributed to the widespread use of this model. Nevertheless, not every aspect of this method has been evaluated in detail. If the diallel's parents are not a sample from a population, i.e., when the model is fixed, then the parametric restrictions associated with the statistical model must be addressed. Does the model really have to be restricted? Do the imposed restrictions satisfy the genetic parameters? Do the restrictions make analysis and interpretation easier? The objective of this study was to answer these and other questions.

MATERIAL AND METHODS
The data used here are those reported by Gardner and Eberhart (1966) (Table I).

Parametric values of the components of the Griffing (1956b) model
We consider a polygenic system with k genes, each with two allelic forms and no epistasis, that determines a quantitative character in a diploid species with sexual reproduction. Independent of the reproduction system of the species (cross-pollination or self-pollination), the genotypic mean of a population can be expressed as

$$M_j = \sum_{i=1}^{k} \left[ m_i + (2p_{ij} - 1)a_i + 2p_{ij}(1 - p_{ij})d_i \right]$$

where $m_i$ is the mean of the genotypic values of the homozygotes relative to locus i, $p_{ij}$ is the frequency in population j of the locus i gene that increases the trait expression, $a_i$ is the difference between the genotypic value of the homozygote with highest expression and $m_i$, and $d_i$ is the deviation due to dominance relative to locus i. Note that $p_{ij}$ is equal to 1 or to 0 if the species is autogamous. In the case of allogamous species, the population is taken to be in Hardy-Weinberg equilibrium and not endogamic. If population j is one of the N parents of a diallel, the genotypic mean of the hybrid produced by crossing populations j and j' is

$$M_{jj'} = \sum_{i=1}^{k} \left[ m_i + (p_{ij} + p_{ij'} - 1)a_i + (p_{ij} + p_{ij'} - 2p_{ij}p_{ij'})d_i \right]$$

The mean of the hybrids in which parent j participates, including population j, is

$$M_{j.} = \frac{1}{N}\sum_{j'=1}^{N} M_{jj'} = \sum_{i=1}^{k} \left[ m_i + (p_{ij} + p_i - 1)a_i + (p_{ij} + p_i - 2p_{ij}p_i)d_i \right]$$

where $M_{jj} = M_j$ and $p_i$ is the average frequency in the parents of the diallel of the locus i gene that increases character expression.
The diallel mean, i.e., the average over the N parents of the means $M_{j.}$ of the hybrids in which each parent participates, is

$$M_{..} = \frac{1}{N}\sum_{j=1}^{N} M_{j.} = \sum_{i=1}^{k} \left[ m_i + (2p_i - 1)a_i + 2p_i(1 - p_i)d_i \right]$$

The effect of GCA of population j corresponds to

$$g_j = M_{j.} - M_{..} = \sum_{i=1}^{k} (p_{ij} - p_i)\left[ a_i + (1 - 2p_i)d_i \right]$$

If the parents are open-pollinated populations, the greater the value of the GCA effect of a population, the greater the frequencies of the genes that increase the trait expression and the greater the differences between the gene frequencies of the population and the average frequencies in the diallel's parents. If the parents are inbred lines or pure lines, the greater the value of the GCA effect of a population, the greater the number of genes that increase the trait expression and, consequently, the greater the number of positive differences between the gene frequency of the population and the average frequency in the diallel's parents. Therefore, the effect of GCA is an indicator of the superiority of the population and of its divergence relative to the diallel's parents, thus providing the same information as the parameters 'population effect' ($v_j$) and 'variety heterosis' ($H_j$) of the Gardner and Eberhart (1966) model. The correlation between the GCA effect and $v_j$ approaches 1 as the degree of dominance approaches zero. For one gene and populations with $p_{ij}$ values equal to 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1, the correlation values are 0.87 and 1 when |d/a| = 2 and d/a = 0, respectively.
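As a rough numerical check of the one-gene correlations quoted above, the following Python sketch evaluates the genotypic means for the eleven gene frequencies 0, 0.1, ..., 1 and correlates the GCA effects with the population effects. It is only an illustration: the function names, the values m = 0 and a = 1, and the reading of $v_j$ as the deviation of each population mean from the mean of all parents are assumptions made here, not part of the original analysis.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (w - my) for u, w in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((w - my) ** 2 for w in y)
    return sxy / sqrt(sxx * syy)


def cross_mean(p1, p2, m, a, d):
    """One-gene genotypic mean of the cross of populations with frequencies p1, p2.

    With p1 == p2 this reduces to the population mean m + (2p - 1)a + 2p(1 - p)d.
    """
    return m + (p1 + p2 - 1) * a + (p1 + p2 - 2 * p1 * p2) * d


def gca_effects(freqs, m, a, d):
    """GCA of each parent: its row mean in the full N x N table minus the diallel mean."""
    n = len(freqs)
    table = [[cross_mean(pj, pk, m, a, d) for pk in freqs] for pj in freqs]
    row_means = [sum(row) / n for row in table]
    diallel_mean = sum(row_means) / n
    return [r - diallel_mean for r in row_means]


def population_effects(freqs, m, a, d):
    """Assumed definition of v_j: population mean minus the mean of all parents."""
    means = [cross_mean(p, p, m, a, d) for p in freqs]
    parent_mean = sum(means) / len(means)
    return [x - parent_mean for x in means]


freqs = [i / 10 for i in range(11)]  # 0, 0.1, ..., 1
for d in (0.0, 2.0):  # d/a = 0 and d/a = 2, with a = 1
    g = gca_effects(freqs, m=0.0, a=1.0, d=d)
    v = population_effects(freqs, m=0.0, a=1.0, d=d)
    print(f"d/a = {d}: corr(g, v) = {pearson(g, v):.2f}")
```

Under these assumptions the two printed correlations should agree with the values 1 and 0.87 given in the text.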
The genotypic means of population j and of the hybrid of parents j and j' can be expressed as

$$M_j = M_{..} + 2g_j + s_{jj}$$
$$M_{jj'} = M_{..} + g_j + g_{j'} + s_{jj'}$$

where $s_{jj} = -2\sum_{i=1}^{k}(p_{ij} - p_i)^2 d_i$ is the effect of SCA of a population with itself, and $s_{jj'} = -2\sum_{i=1}^{k}(p_{ij} - p_i)(p_{ij'} - p_i)d_i$ is the effect of SCA of populations j and j'.
When there is negative unidirectional dominance, the $s_{jj}$ values are positive. If the deviations due to dominance are positive, the $s_{jj}$ values are negative. When the SCA effect of a population with itself is null, the population has the same gene frequencies as the average frequencies in the group of the diallel's parents. Furthermore, the higher the absolute value of $s_{jj}$, the greater the differences between the gene frequencies in the population and the average frequencies in the diallel's parents. Therefore, $s_{jj}$ is also an indicator of the population's divergence relative to the parental group. For one gene and populations with $p_{ij}$ values equal to 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1, the correlation between the absolute values of $g_j$ and $s_{jj}$ is 0.965 for any degree of dominance (d/a ≠ 0). The $s_{jj}$ values thus provide the same inferences as the estimates of the parameters 'average heterosis' and 'variety heterosis' of the Gardner and Eberhart (1966) model. Although $s_{jj}$ is a measure of the population divergence relative to the diallel's parents and 'variety heterosis' is an indicator of the differences between the gene frequencies in the population and the average frequencies in the other parents, the correlation between the absolute values of these parameters for one gene is 1, independent of the populations and of the degree of dominance (d/a ≠ 0).
When there is negative unidirectional dominance, the lowest values of $s_{jj'}$ identify the populations with the greatest differences in gene frequencies between themselves and in relation to the average frequencies in the diallel's parents. The highest values identify the populations with the smallest differences in gene frequencies, but with different gene frequencies relative to the average frequencies in the diallel's parents. When there is positive unidirectional dominance, the lowest $s_{jj'}$ values are associated with populations with the smallest differences in gene frequencies, but which have differences in gene frequencies relative to the average frequencies in the diallel's parents. The highest values indicate populations with the greatest differences in gene frequencies between themselves and in relation to the average frequencies in the parental group. Independent of the direction of the dominance effects, $s_{jj'}$ values close to the average value indicate populations with small differences in gene frequencies between themselves and relative to the average frequencies in the parental group. If the $s_{jj'}$ value equals zero, the gene frequencies in one of the populations are equal to the average frequencies in the diallel's parents. The average value of the SCA effects of different populations is

$$\bar{s} = \frac{2}{N(N-1)}\sum_{j<j'} s_{jj'} = -\frac{1}{N(N-1)}\sum_{j=1}^{N} s_{jj} = \frac{2}{N(N-1)}\sum_{i=1}^{k}\sum_{j=1}^{N}(p_{ij} - p_i)^2 d_i$$

Therefore the SCA of populations j and j' is an indicator of the divergence between them and of their divergence from the diallel's parents, in a manner similar to the parameter 'specific heterosis' ($S_{jj'}$) of the Gardner and Eberhart (1966) model. For one gene and populations with $p_{ij}$ values equal to 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1, the correlation between the effect of SCA of two populations and specific heterosis is 0.998, independent of the degree of dominance (d/a ≠ 0). This and the other results given above show that the methods of Griffing (1956b) and of Gardner and Eberhart (1966) are not complementary but indeed absolutely equivalent in terms of inference.
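The behaviour of the SCA effects described above can be illustrated with a small one-gene calculation. The Python sketch below builds the full N x N table of genotypic means for the gene frequencies 0, 0.1, ..., 1, extracts $g_j$ and the SCA effects, and correlates the absolute values of $g_j$ and $s_{jj}$. The function names and the values m = 0, a = 1 and d = 0.5 are hypothetical choices made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (w - my) for u, w in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((w - my) ** 2 for w in y)
    return sxy / sqrt(sxx * syy)


def cross_mean(p1, p2, m, a, d):
    """One-gene genotypic mean of the cross (or of the population when p1 == p2)."""
    return m + (p1 + p2 - 1) * a + (p1 + p2 - 2 * p1 * p2) * d


def one_gene_effects(freqs, m, a, d):
    """Return (g, s): GCA effects and the N x N matrix of SCA effects."""
    n = len(freqs)
    table = [[cross_mean(pj, pk, m, a, d) for pk in freqs] for pj in freqs]
    diallel_mean = sum(sum(row) for row in table) / n ** 2
    g = [sum(row) / n - diallel_mean for row in table]
    # The diagonal gives s_jj = M_j - M.. - 2 g_j, off-diagonal cells give s_jj'.
    s = [[table[j][k] - diallel_mean - g[j] - g[k] for k in range(n)]
         for j in range(n)]
    return g, s


freqs = [i / 10 for i in range(11)]  # 0, 0.1, ..., 1
g, s = one_gene_effects(freqs, m=0.0, a=1.0, d=0.5)  # positive dominance
s_self = [s[j][j] for j in range(len(freqs))]

corr = pearson([abs(x) for x in g], [abs(x) for x in s_self])
print(f"corr(|g_j|, |s_jj|) = {corr:.3f}")
print("largest s_jj:", max(s_self))  # not positive when d > 0
print("largest |row sum| of SCA effects:", max(abs(sum(row)) for row in s))
```

With positive dominance no $s_{jj}$ value should be positive, the correlation should be close to the 0.965 quoted in the text, and each row of SCA effects (including $s_{jj}$) should sum to zero up to rounding error.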
The phenotypic means in a diallel table can be defined as

$$Y_{jj} = M_{..} + 2g_j + s_{jj} + e_{jj}$$
$$Y_{jj'} = M_{..} + g_j + g_{j'} + s_{jj'} + e_{jj'}$$

where $e_{jj}$ and $e_{jj'}$ are the average values of the residuals associated with the observations for population j and the hybrid of parents j and j', respectively.
The equations above define a statistical model that is necessarily restricted, since $E(g_j) = 0$ and $E(s_{jj'}) = 0$ (j fixed), in contrast to the usual statistical models with fixed effects, where the researcher decides whether to impose parametric restrictions or not. Thus, the restrictions associated with the previously defined model for combining ability analysis are

$$\sum_{j=1}^{N} g_j = 0 \quad \text{and} \quad \sum_{j'=1}^{N} s_{jj'} = 0$$

for all j, giving N + 1 linearly independent restrictions. This model is therefore not the same as that defined by Griffing (1956b) (method 2, model 1), whose restrictions are (i) $\sum_{j=1}^{N} g_j = 0$, (ii) $\sum_{j}\sum_{j' \geq j} s_{jj'} = 0$ and (iii) $s_{jj} + \sum_{j'=1}^{N} s_{jj'} = 0$, for all j, giving N + 1 linearly independent restrictions.
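To make the difference between the two sets of restrictions concrete, the following Python sketch solves the model equations for a table of means under each set. It is a minimal illustration under stated assumptions: the 3 x 3 table is hypothetical (it is not the Gardner and Eberhart data), the function names are invented here, and the sums over j' in the restrictions are taken to include j' = j.

```python
def fit_parametric(table):
    """Effects under sum_j g_j = 0 and sum_j' s_jj' = 0 (j' = j included)."""
    n = len(table)
    mean = sum(sum(row) for row in table) / n ** 2  # mean of the full N x N table
    g = [sum(row) / n - mean for row in table]
    s = [[table[j][k] - mean - g[j] - g[k] for k in range(n)] for j in range(n)]
    return mean, g, s


def fit_griffing(table):
    """Effects under sum_j g_j = 0 and s_jj + sum_j' s_jj' = 0 (method 2)."""
    n = len(table)
    distinct = [table[j][k] for j in range(n) for k in range(j, n)]
    mean = sum(distinct) / len(distinct)  # mean of the N(N + 1)/2 distinct entries
    g = [(sum(table[j]) + table[j][j] - (n + 1) * mean) / (n + 2)
         for j in range(n)]
    s = [[table[j][k] - mean - g[j] - g[k] for k in range(n)] for j in range(n)]
    return mean, g, s


# Hypothetical symmetric table: parents on the diagonal, hybrids off the diagonal.
table = [
    [10.0, 14.0, 13.0],
    [14.0, 12.0, 16.0],
    [13.0, 16.0, 11.0],
]

for name, fit in (("parametric", fit_parametric), ("Griffing", fit_griffing)):
    mean, g, s = fit(table)
    s_self = [round(s[j][j], 3) for j in range(len(table))]
    print(name, round(mean, 3), [round(x, 3) for x in g], s_self)
```

Both fits reproduce the cell means exactly, because each restricted model has as many free parameters as there are distinct means; only the way the means are split into $M_{..}$, $g_j$ and SCA effects differs.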
Note that restrictions (ii) and (iii) of the Griffing (1956b) model do not satisfy the parametric values of the SCA effects, since these values obey

$$\sum_{j'=1}^{N} s_{jj'} = 0 \quad \text{for all } j$$

and, therefore,

$$\sum_{j}\sum_{j' \neq j} s_{jj'} = -\sum_{j} s_{jj}$$

which is not zero when there is dominance. Why, then, were they considered by Griffing (1956b)? The answer is simply that they allow one to obtain formulas for the estimation of the effects and the effect variances and for the calculation of the sums of squares of the GCA and SCA. The availability of formulas is an indispensable condition for the widespread use of methods in quantitative genetics, and allows the elaboration of software for data processing by professionals who are not specialized in this field and in the theory of linear models. The model described here is of full column rank and, therefore, it is possible to develop formulas for estimating the effects and their variances and for computing the sums of squares. As will be shown, when there is dominance, the sum of squares attributable to the null hypothesis for GCA effects is not orthogonal to the sum of squares due to the hypothesis of nullity of the SCA effects.
Despite the differences between the two models, the hypotheses that can be tested coincide and include:
1. $H_{0(1)}$ - equality of the treatment means (parents and hybrids): to test this hypothesis is the same as testing that there are no gene frequency differences between the parents ($p_{ij} = p_i$ for all i and j).
2. $H_{0(2)}$ - nullity of the GCA effects ($g_j = 0$ for all j): in the case of rejection of the hypothesis $H_{0(1)}$, testing this hypothesis is the same as testing that the GCA effects of populations with different genetic structures are null. Of course, the rejection of $H_{0(2)}$ means that there are differences in the gene frequencies between the parents; acceptance of the hypothesis does not imply the contrary.
3. $H_{0(3)}$ - nullity of the SCA effects ($s_{jj'} = 0$ for all j and j'): if there are gene frequency differences between the parents, testing this hypothesis is the same as testing that there is no dominance ($d_i = 0$ for all i).

RESULTS AND DISCUSSION
Although there is a difference between the analyses of variance for grain yield of six corn populations and their hybrids using the model described here and that proposed by Griffing (1956b), in relation to the sum of squares attributable to the null hypothesis of the GCA effects (Table II), the inferences remain the same, namely, there are differences in the gene frequencies between the populations and there is dominance in the polygenic system under analysis.
The differences between the estimates of the GCA effects and the SCA effects of each population with itself in the two models did not change the inferences, since the correlations between the estimated values of $g_j$ and of $s_{jj}$ are, respectively, 0.9955 and 1. Nevertheless, the variances of the estimates of the effects and the contrasts between the effects of the model defined in this study are greater (Table III). Population 6 has the genes that increase yield at the highest frequency, and is the most divergent relative to the parental group. This population ought, therefore, to be selected for an intrapopulational improvement program. Population 4 is the second best, but not the second most divergent, and can also be chosen for intrapopulational improvement. The negative $s_{jj}$ values indicate unidirectional positive dominance.
Although the estimates of the SCA effects of different populations in the two models are different, the inferences that can be established are the same, since the correlation between the estimated values is 0.985. Differences between the variance values of the effects and the contrasts were also observed (Table IV). The variance estimates of the $s_{jj'}$ effects and of the contrasts between these effects are lower for the model developed above. Populations 3 and 4, 2 and 6, 1 and 2, and 3 and 6 showed the greatest differences in gene frequencies between themselves and in relation to the average frequencies in the diallel's parents. Thus, for interpopulational improvement programs, the second pair ought to be selected, because of the superiority of population 6. There was little divergence between parents 1 and 6, and 1 and 5, and between them and the group of parents, since their SCA effects come close to the average value (0.865). There were only slight differences in gene frequency between populations 1 and 3, and 2 and 3, although they were divergent in relation to the diallel's parents. The frequencies in these populations ought to be lower than the average frequencies.

CONCLUSIONS
The statistical models for combining ability analysis of a population group are necessarily restricted. The restrictions $\sum_{j}\sum_{j' \geq j} s_{jj'} = 0$ and $s_{jj} + \sum_{j'=1}^{N} s_{jj'} = 0$, for all j, of the model proposed by Griffing (1956b) (method 2, model 1) do not satisfy the parametric values of SCA effects. Although there are differences between the analysis according to the model described here and that suggested by Griffing (1956b), the inferences should be the same. A consequence of the restrictions of the Griffing (1956b) model is to allow the definition of formulas for estimating the effects, their variances and the variances of contrasts of effects, as well as for the calculation of orthogonal sums of squares. In conclusion, it is generally quite safe to use the Griffing model.

Table II - Analysis of variance of grain yield (bushels/acre) of six corn populations and their hybrids, based on the model with restrictions that satisfy the genetic parameter values and the Griffing (1956b) model (values in parentheses for general combining ability, GCA).

Table III - Estimates of the GCA effects ($g_j$), the SCA effects of a population with itself ($s_{jj}$), the variances of these effects and the variances of contrasts between them for the model defined in this study and for the Griffing (1956b) model.

Variance estimate        Model defined in this study    Griffing (1956b) model
V(g_j)                   0.8218                         0.7396
V(g_j - g_j')            1.9722                         1.7750
V(s_jj)                  6.4097                         3.8036
V(s_jj - s_j'j')         12.6222                        7.1000

Table IV - Estimates of the SCA effects of different populations ($s_{jj'}$) for the diallel analyses using the model defined in this study (values above the diagonal) and the Griffing (1956b) model (values below the diagonal), as well as estimates of variances of the effects and of contrasts between them.