The parametric restrictions of the Gardner and Eberhart diallel analysis model: heterosis analysis

The parametric restrictions of the diallel analysis model of Griffing, method 2 (parents and F1 generations) and model 1 (fixed), were studied in order to address the following questions: i) does the statistical model need to be restricted? ii) do the restrictions satisfy the genetic parameter values? and iii) do they make the analysis and interpretation easier? Objectively, these questions can be answered as follows: i) yes, ii) not all of them, and iii) the analysis is easier, but the interpretation is the same as in the model with restrictions that satisfy the parameter values. The main conclusions were that the statistical models for combining ability analysis are necessarily restricted; that, in the Griffing model (method 2, model 1), the restrictions on the specific combining ability (SCA) effects, imposed for all j, do not satisfy the parametric values; and that the same inferences should be drawn from analyses using the model with restrictions that satisfy the parametric values of the SCA effects and the model suggested by Griffing. One consequence of the restrictions of the Griffing model is that they allow formulas to be defined for estimating the effects, their variances and the variances of contrasts of effects, as well as for calculating orthogonal sums of squares.


INTRODUCTION
The importance of quantitative genetic analysis for breeders is reflected by the large number of publications in which this approach is used. The diallel genetic design and its various modifications have been used by breeders to evaluate the potential of populations for intrapopulational improvement and the usefulness of parents in interpopulational breeding programs, and to select inbred lines in hybrid development programs. Although several strategies for diallelic analysis have been proposed, few of them are commonly applied. The best-known methods are those developed by Jinks and Hayman (1953) and Hayman (1954a,b, 1958), both exclusively for homozygous parents, that by Griffing (1956a,b), valid for any species, that by Kempthorne and Curnow (1961), for circulant diallel cross, that by Gardner and Eberhart (1966), normally used when the parents are open-pollinated populations, and those by Miranda Filho and Geraldi (1984) and Geraldi and Miranda Filho (1988), which are adaptations of the Gardner and Eberhart and the Griffing methods, respectively, for partial diallels. Of these, the Griffing (1956b) and Gardner and Eberhart (1966) methods are doubtless the most frequently applied.
The main reasons that justify the widespread use of the Griffing (1956b) method are its generality, since the parents can be clones, pure lines, inbred lines, or populations of a self-pollinated, cross-pollinated or intermediate species, and the ease of analysis and interpretation; the latter also characterizes the method developed by Gardner and Eberhart (1966). The genetic interpretation of parameters in the Gardner and Eberhart and the Griffing models and the relationship between them have been discussed by Vencovsky (1970) and Cruz and Vencovsky (1989), thereby making the methods more accessible to breeders. However, the parametric restrictions associated with these statistical models have not yet been examined in depth. Does the model necessarily have to be restricted? Do the restrictions satisfy the genetic parameters? Do the restrictions help to make the analysis and interpretation easier? The objective of this study was to answer these and other related questions.

Abstract
The parametric restrictions of the diallel analysis model of Gardner and Eberhart (analysis II) were studied in order to address the following questions: i) does the statistical model really have to be restricted? ii) Do the restrictions satisfy the genetic parameter values? iii) Do the restrictions make analysis and interpretation easier? Objectively, the answers to these questions are: i) no, ii) not all, and iii) they facilitate the analysis, but the interpretation is the same as for the unrestricted model. The main conclusion was that the restrictions of the Gardner and Eberhart model related to specific heterosis effects, $\sum_{j'=1}^{N} S^*_{jj'} = 0$ ($j' \neq j$), for all $j$, do not satisfy the parametric values. The Gardner and Eberhart estimators of the variety heterosis effects, the specific heterosis effects and their variances differ from those of the unrestricted model. Any analysis using the unrestricted model and that of Gardner and Eberhart should lead to the same inferences, at least for those based on assessment of the population effects expressed as deviations from the average value, the heteroses, the average heterosis and the variety heteroses (the correlation between the estimates of the two models is 1). The limiting factor for the use of the unrestricted model is the lack of formulas for computing the sums of squares and for estimating the estimable function variances.
Departamento de Biologia Geral, Universidade Federal de Viçosa, 36571-000 Viçosa, MG, Brasil

MATERIAL AND METHODS
The same diallel analyzed by Gardner and Eberhart (1966) was considered here (Table I), although it is not an ideal data set because the tests on variety heterosis and specific heterosis are not significant at the 5% level.

Parametric values of the components of the Gardner and Eberhart (1966) model
We consider a polygenic system with k genes, each with two allelic forms and no epistasis, and N non-inbred populations in Hardy-Weinberg equilibrium involved in a diallel. The genotypic mean of population j (j = 1, 2, ..., N) is

$$M_j = \sum_{i=1}^{k}\left[ m_i + (2p_{ij} - 1)a_i + 2p_{ij}(1 - p_{ij})d_i \right] = m + v_j$$

where $m_i$ is the mean of the genotypic values of the homozygotes relative to locus i, $p_{ij}$ is the frequency in population j of the locus i gene that increases trait expression, $a_i$ is the difference between the genotypic value of the homozygote with highest expression and $m_i$, $d_i$ is the deviation due to dominance relative to locus i, and $v_j$ is the effect of population j. Interpreting $M_j$ is the same as interpreting $v_j$ because m is a constant. The mean or the effect of a population is an indicator of its superiority relative to other populations in terms of the frequency of the favorable genes. If there is no predominant overdominance, the higher the genotypic mean or the effect of a population, the greater the frequencies of the genes that increase trait expression.
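The average of the genotypic means of the diallel's parents is

$$M_{.} = \frac{1}{N}\sum_{j=1}^{N} M_j = \sum_{i=1}^{k}\left[ m_i + (2\bar{p}_i - 1)a_i + 2\left(\bar{p}_i - \overline{p^2_i}\right)d_i \right] = m + v_{.}$$

where $\bar{p}_i$ is the average frequency of the locus i gene that increases trait expression, $\overline{p^2_i}$ is the average genotypic frequency of the homozygote of greatest expression relative to locus i, and $v_{.}$ is the average of the population effects.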
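The genotypic mean of the hybrid between populations j and j' is

$$M_{jj'} = \sum_{i=1}^{k}\left[ m_i + (p_{ij} + p_{ij'} - 1)a_i + (p_{ij} + p_{ij'} - 2p_{ij}p_{ij'})d_i \right]$$

and the heterosis expressed in the hybrid is

$$H_{jj'} = M_{jj'} - \frac{1}{2}(M_j + M_{j'}) = \sum_{i=1}^{k}(p_{ij} - p_{ij'})^2 d_i$$

If there is dominance, the heterosis in the hybrid of parents j and j' indicates the degree of divergence between them. The higher the value of a heterosis, the greater the differences in gene frequency between the populations.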
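As a numerical illustration of the quantities defined in this section, the sketch below computes population means, a hybrid mean and the resulting heterosis from gene frequencies. It assumes the standard per-locus expressions under Hardy-Weinberg equilibrium, that is, m + (2p - 1)a + 2p(1 - p)d for a population and m + (p + q - 1)a + (p + q - 2pq)d for the hybrid of populations with frequencies p and q. The function names and all numeric values are hypothetical.

```python
# Sketch under stated assumptions: biallelic loci, no epistasis, Hardy-Weinberg
# equilibrium. Function names and numeric values are illustrative only.

def population_mean(p, m, a, d):
    """Genotypic mean of one population; p, m, a, d are per-locus lists."""
    return sum(mi + (2 * pi - 1) * ai + 2 * pi * (1 - pi) * di
               for pi, mi, ai, di in zip(p, m, a, d))

def hybrid_mean(p, q, m, a, d):
    """Genotypic mean of the hybrid between populations with frequencies p and q."""
    return sum(mi + (pi + qi - 1) * ai + (pi + qi - 2 * pi * qi) * di
               for pi, qi, mi, ai, di in zip(p, q, m, a, d))

def heterosis(p, q, m, a, d):
    """Hybrid mean minus the mid-parent value."""
    mid_parent = 0.5 * (population_mean(p, m, a, d) + population_mean(q, m, a, d))
    return hybrid_mean(p, q, m, a, d) - mid_parent

# Two hypothetical loci and two hypothetical populations.
m = [10.0, 5.0]
a = [2.0, 1.0]
d = [1.0, 0.5]
p_j = [0.2, 0.9]
p_k = [0.8, 0.4]

h = heterosis(p_j, p_k, m, a, d)
# The heterosis reduces to the sum over loci of (p - q)^2 * d.
closed_form = sum((pi - qi) ** 2 * di for pi, qi, di in zip(p_j, p_k, d))
print(round(h, 6), round(closed_form, 6))
```

The additive terms cancel in the difference, so the heterosis depends only on the dominance deviations and on the squared differences in gene frequency, which is why it can be read as a measure of divergence when there is dominance.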
The mean of the heteroses expressed in the hybrids of parent j (variety heterosis of parent j) is

$$H_j = \frac{1}{N-1}\sum_{j' \neq j} H_{jj'} = \sum_{i=1}^{k}\left[ p_{ij}^2 - 2p_{ij}\,\bar{p}_{i.(j)} + \overline{p^2}_{i.(j)} \right] d_i$$

where $\bar{p}_{i.(j)}$ is the average frequency of the locus i gene that increases trait expression in the diallel's parents, except parent j, and $\overline{p^2}_{i.(j)}$ is the average genotypic frequency of the homozygote with highest expression relative to locus i in the diallel's parents, except parent j.
The heterosis of a population is null when the gene frequencies in the population are equal to the average frequencies in the other parents of the diallel. Therefore, the higher the absolute value of the heterosis of a population, the greater the differences between the gene frequencies in the population and the average frequencies in the other parents of the diallel, i.e., the higher the divergence compared to the other parents.
The average heterosis is

$$H = \frac{2}{N(N-1)}\sum_{j<j'} H_{jj'} = M_H - M_{.}$$

where $M_H$ is the genotypic mean of the hybrids.
If there are differences in the gene frequencies between the diallel's parents, the average heterosis should be used to evaluate the existence and predominant direction of deviations due to dominance. If the null hypothesis (H = 0) is not rejected, this indicates either absence of dominance or bidirectional dominance. A value above zero means that the dominance effects are predominantly positive (the genes with some degree of dominance are those that increase trait expression). Unidirectional negative dominance is indicated if H is less than 0.
The genotypic mean of the hybrid between parents j and j' can be expressed as

$$M_{jj'} = \frac{1}{2}(M_j + M_{j'}) + H + H_j + H_{j'} + S_{jj'}$$

where $S_{jj'}$ is the specific heterosis of populations j and j', as defined by Gardner and Eberhart (1966). Finally,

$$S_{jj'} = H_{jj'} - H - H_j - H_{j'}$$

The discussion about specific heteroses is redundant for heterosis analysis involving the assessment of divergence between pairs of parents, although it provides more information than the $H_{jj'}$ values. For one gene and populations with $p_{ij}$ values of 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1, the correlation between $H_{jj'}$ and $S_{jj'}$ is approximately 0.84 for any degree of dominance (d/a ≠ 0).

If there is unidirectional negative dominance, the lowest $S_{jj'}$ values identify the populations with the greatest gene frequency differences between themselves and in relation to the average frequencies in the diallel's parents. The highest values identify populations with the smallest gene frequency differences, but with gene frequencies different from the average frequencies in the diallel's parents. When there is unidirectional positive dominance, the lowest $S_{jj'}$ values are associated with populations having the smallest differences in gene frequencies, but which show differences in their gene frequencies relative to the average frequencies in the diallel's parents. The highest values indicate populations with the greatest differences in gene frequencies between themselves and in relation to the parental group. Independent of the direction of the dominance effects, specific heterosis values close to the average value ($S_{..} = -2H$) or near zero, when expressed as deviations from the average effect, indicate populations with small gene frequency differences between themselves and in relation to the average frequencies in the parental group.
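The one-gene correlation quoted above can be reproduced with a short calculation. The sketch below assumes that the heterosis of a pair is (p_j - p_j')^2 d, that the variety heterosis of a parent is the mean heterosis of its N - 1 hybrids, that the average heterosis is the mean over all pairs, and that the specific heterosis is what remains after subtracting the average heterosis and the two variety heteroses. The helper names are hypothetical.

```python
# Sketch: one gene, eleven populations with gene frequencies 0, 0.1, ..., 1.
# Assumes H_jj' = (p_j - p_j')^2 * d, H_j = mean of H_jj' over j' != j,
# H = mean of all H_jj', and S_jj' = H_jj' - H - H_j - H_j'.
from itertools import combinations
from math import sqrt

def specific_heterosis_table(p, d):
    n = len(p)
    pairs = list(combinations(range(n), 2))
    het = {(j, k): (p[j] - p[k]) ** 2 * d for j, k in pairs}
    # Variety heterosis: mean heterosis of the hybrids of each parent.
    variety = [sum(v for pair, v in het.items() if j in pair) / (n - 1)
               for j in range(n)]
    average = sum(het.values()) / len(pairs)
    spec = {(j, k): het[(j, k)] - average - variety[j] - variety[k]
            for j, k in pairs}
    return het, variety, average, spec

def correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

p = [i / 10 for i in range(11)]
het, variety, average, spec = specific_heterosis_table(p, d=1.0)
pairs = sorted(het)
r = correlation([het[k] for k in pairs], [spec[k] for k in pairs])
mean_spec = sum(spec.values()) / len(spec)
print(round(r, 2))                         # correlation between H_jj' and S_jj'
print(round(mean_spec + 2 * average, 10))  # S.. + 2H, which should vanish
```

Under these assumptions the correlation rounds to 0.84 and does not change when d is rescaled, and the mean specific heterosis equals -2H.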
The specific heterosis of populations j and j' is therefore an indicator of the divergence between them (as is $H_{jj'}$) and of their divergence in relation to the diallel's parents. The analysis of heteroses, variety heteroses or specific heteroses is also redundant when assessing the average heterosis for the predominant direction of deviations due to dominance. If genes with some degree of dominance increase trait expression, the heteroses and variety heteroses are predominantly positive and the specific heteroses are all or nearly all negative. Negative heteroses, variety heteroses and positive specific heteroses indicate unidirectional negative dominance.
Based on the previous results, the phenotypic means of population j and the hybrid from populations j and j' are

$$Y_j = M_j + e_j$$

$$Y_{jj'} = \frac{1}{2}(M_j + M_{j'}) + H + H_j + H_{j'} + S_{jj'} + e_{jj'}$$

where $e_j$ and $e_{jj'}$ are the average errors associated with the phenotypic means of parent j and the hybrid between populations j and j'.
The statistical models above are not those defined by Gardner and Eberhart (1966) because there are no restrictions associated with them. These equations therefore define an unrestricted model. Normally, the unrestricted model is used to provide estimates of population genotypic means, of heteroses, of average heterosis, and of variety and specific heteroses, while the restricted model of Gardner and Eberhart (1966) is used to estimate population effects expressed as deviations from the mean effect, heteroses, average heterosis, as well as the effects of variety and specific heteroses. Nevertheless, the interpretations are absolutely equivalent, as will be shown.
The genotypic means of a population and a hybrid can be defined as shown below since $E(S_{jj'}) = S_{..} = -2H$:

$$M_j = M_{.} + v^*_j$$

$$M_{jj'} = M_{.} + \frac{1}{2}(v^*_j + v^*_{j'}) + H + H^*_j + H^*_{j'} + S^*_{jj'}$$

where $v^*_j = M_j - M_{.}$ is the effect of population j expressed as deviation from the average population effects, $H^*_j = H_j - H$ is the effect of the heterosis of population j, and $S^*_{jj'} = S_{jj'} - S_{..}$ is the effect of the specific heterosis of populations j and j'.
Since $M_{.}$, $H$ and $S_{..}$ are constants, there is no difference at all between the interpretation of the estimates of the population genotypic means, the variety heteroses and the specific heteroses in the unrestricted model, and the interpretation of the estimates of the population effects expressed as deviations from the average, the variety heterosis effects and the specific heterosis effects in the restricted model.
Thus, the statistical models that describe the phenotypic means in the diallel table can be expressed as

$$Y_j = M_{.} + v^*_j + e_j$$

$$Y_{jj'} = M_{.} + \frac{1}{2}(v^*_j + v^*_{j'}) + H + H^*_j + H^*_{j'} + S^*_{jj'} + e_{jj'}$$

The restricted model defined by these equations is not the same as that proposed by Gardner and Eberhart (1966) since the parametric restrictions are different. The restrictions necessarily associated with the restricted model described here are (i) $\sum_j v^*_j = 0$, (ii) $\sum_j H^*_j = 0$, and (iii) $\sum_j \sum_{j'>j} S^*_{jj'} = 0$, since $E(v^*_j) = E(H^*_j) = E(S^*_{jj'}) = 0$, yielding three restrictions.
The restrictions of the Gardner and Eberhart (1966) model are (i) $\sum_j v^*_j = 0$, (ii) $\sum_j H^*_j = 0$, (iii) $\sum_j \sum_{j'>j} S^*_{jj'} = 0$, and (iv) $\sum_{j' \neq j} S^*_{jj'} = 0$ for all j, giving N + 2 linearly independent restrictions. The difference between the two restricted models lies in the restrictions shown in item (iv), which do not satisfy the parametric values of $S^*_{jj'}$, since $\sum_{j' \neq j} S^*_{jj'} = H^*_j$ for all j. If the restrictions in (iv), of which N - 1 are linearly independent of the first three, are not consistent with the parametric values of the effects of specific heterosis, why were they considered by Gardner and Eberhart (1966)?
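The parametric value of the sum in item (iv) follows in a few steps. The derivation below assumes that the heterosis decomposes as $H_{jj'} = H + H^*_j + H^*_{j'} + S^*_{jj'}$, that $H_j = H + H^*_j$ is the mean heterosis of the N - 1 hybrids of parent j, and that $\sum_j H^*_j = 0$, as implied by the definitions in this section.

```latex
\begin{aligned}
\sum_{j' \neq j} S^*_{jj'}
  &= \sum_{j' \neq j} \left( H_{jj'} - H - H^*_j - H^*_{j'} \right) \\
  &= (N-1)H_j - (N-1)H - (N-1)H^*_j - \sum_{j' \neq j} H^*_{j'} \\
  &= (N-1)H^*_j - (N-1)H^*_j + H^*_j \\
  &= H^*_j
\end{aligned}
```

The third line uses $\sum_{j' \neq j} H^*_{j'} = -H^*_j$. The sum is therefore zero for every j only when all the variety heterosis effects are zero, so the restrictions in item (iv) cannot hold parametrically in general.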
The answer is that without them, the normal equation system X'Xβ° = X'Y is, as in the unrestricted model, consistent and undetermined, which makes it impossible to define formulas for estimating the variances of the effects and the contrasts of effects and for calculating the sums of squares. If the formulas could not be defined, the methodology would have a more limited use, especially for software developers who are normally not specialists in quantitative genetics and linear models. The problem of missing formulas in the original paper, another fact that certainly limited its use, was corrected in a later article (Gardner, 1967).
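The indeterminacy described above can be illustrated by comparing ranks with the number of parameters. The sketch below builds, for N = 6 parents, the design matrix of the model written in terms of the general mean, the population effects, the average heterosis, the variety heterosis effects and the specific heterosis effects, and then appends the restrictions as extra rows. Restriction (iv) is taken here to be the sum of the specific heterosis effects of each parent set to zero. The column ordering, the helper names `rank` and `build`, and the use of exact rational arithmetic are choices made for this example only.

```python
# Sketch: rank of the diallel design matrix with and without restrictions.
# Parameter order (an assumption of this example): mu, v*_1..v*_N, H,
# H*_1..H*_N, then S*_jj' for j < j'. Exact arithmetic via fractions.
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank by Gaussian elimination over the rationals."""
    mat = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(mat[0])):
        pivot = next((i for i in range(r, len(mat)) if mat[i][col] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(r + 1, len(mat)):
            factor = mat[i][col] / mat[r][col]
            if factor != 0:
                mat[i] = [x - factor * y for x, y in zip(mat[i], mat[r])]
        r += 1
        if r == len(mat):
            break
    return r

def build(n):
    pairs = list(combinations(range(n), 2))
    n_par = 2 * n + 2 + len(pairs)
    v0, h_bar, h0, s0 = 1, n + 1, n + 2, 2 * n + 2   # column offsets

    def row(entries):
        out = [Fraction(0)] * n_par
        for col, val in entries:
            out[col] += Fraction(val)
        return out

    half = Fraction(1, 2)
    design = [row([(0, 1), (v0 + j, 1)]) for j in range(n)]      # parents
    design += [row([(0, 1), (v0 + j, half), (v0 + k, half),
                    (h_bar, 1), (h0 + j, 1), (h0 + k, 1), (s0 + i, 1)])
               for i, (j, k) in enumerate(pairs)]                 # hybrids
    r_i = row([(v0 + j, 1) for j in range(n)])                    # (i)
    r_ii = row([(h0 + j, 1) for j in range(n)])                   # (ii)
    r_iii = row([(s0 + i, 1) for i in range(len(pairs))])         # (iii)
    r_iv = [row([(s0 + i, 1) for i, pr in enumerate(pairs) if j in pr])
            for j in range(n)]                                    # (iv)
    return n_par, design, [r_i, r_ii, r_iii], r_iv

n_par, design, first_three, item_iv = build(6)
print(n_par, rank(design))                    # parameters vs. rank of X
print(rank(design + first_three))             # with restrictions (i)-(iii)
print(rank(design + first_three + item_iv))   # with restrictions (i)-(iv)
```

Under these assumptions the design matrix has rank 21 for 29 parameters. The three restrictions (i) to (iii) raise the rank to 24, and only the full set reaches 29, which agrees with the N + 2 = 8 linearly independent restrictions mentioned in the text.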
Since the restricted and unrestricted models described here are the same, the estimable functions in one are the estimable functions in the other. This means that in the unrestricted model, the population effects expressed as deviations from the mean effect can be analyzed, as can be the effects of variety and specific heteroses. In the restricted model, population genotypic means and variety and specific heteroses can be analyzed. However, there are differences between these models and that proposed by Gardner and Eberhart (1966), not in relation to the analysis of variance nor to the testable hypotheses or to the estimable functions, but related to the estimators of several estimable functions and their variances. In any applied model, the tested hypotheses in the analysis of variance are:
1. $H_{0(1)}$: equality of the treatment means (parents and hybrids); testing this hypothesis implies testing that there are no differences in the gene frequencies between parents ($p_{ij} = p_i$ for all i and j).
2. $H_{0(2)}$: equality of the population means ($M_j = M_{.}$ for all j) or equality of the population effects ($v_j = v_{.}$ for all j) or nullity of the population effects expressed as deviations from the average effect ($v^*_j = 0$ for all j); if the hypothesis $H_{0(1)}$ was rejected, testing this hypothesis implies testing the equality of the effects of populations with different genetic structures. The rejection of $H_{0(2)}$ indicates gene frequency differences between the diallel's parents, but its acceptance does not indicate the contrary.
3. $H_{0(3)}$: nullity of the heteroses; if there are differences in the gene frequencies between populations, testing this hypothesis is equivalent to testing that there is no dominance ($d_i = 0$ for all i).
4. $H_{0(4)}$: nullity of the average heterosis; a redundant test in relation to $H_{0(3)}$, although the statistics are associated with different degrees of freedom for the numerator.
5. $H_{0(5)}$: equality of the variety heteroses ($H_j = H$ for all j) or nullity of the variety heterosis effects ($H^*_j = 0$ for all j); if there are differences between the gene frequencies of the parents and if there is dominance, testing this hypothesis is to test that the degree of divergence of each population in relation to the other parents of the diallel is a constant.
6. $H_{0(6)}$: equality of the specific heteroses ($S_{jj'} = S_{..} = -2H$ for all j and j') or nullity of the specific heterosis effects ($S^*_{jj'} = 0$ for all j and j'); if there are differences between the gene frequencies of the parents and if there is dominance, testing this hypothesis is to test that for all pairs of populations the magnitude of the divergence between themselves and between them and the diallel's parents is a constant.
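The different numerator degrees of freedom attached to these hypotheses can be tabulated for any number of parents. The sketch below uses the standard partition of analysis II for N parents and their N(N - 1)/2 hybrids; the function name and the dictionary keys are hypothetical.

```python
# Sketch: numerator degrees of freedom of the analysis II partition for
# N parents (standard partition; names are illustrative only).
def analysis_ii_df(n):
    return {
        "entries": n * (n + 1) // 2 - 1,        # H0(1)
        "varieties": n - 1,                     # H0(2)
        "heterosis": n * (n - 1) // 2,          # H0(3)
        "average heterosis": 1,                 # H0(4)
        "variety heterosis": n - 1,             # H0(5)
        "specific heterosis": n * (n - 3) // 2, # H0(6)
    }

df = analysis_ii_df(6)
print(df)
```

For six parents the heterosis degrees of freedom (15) split into 1 for the average heterosis, 5 for the variety heteroses and 9 for the specific heteroses, and varieties plus heterosis account for the 20 degrees of freedom among the 21 entries.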
In relation to the estimable functions, the following must be emphasized: all the estimable functions in the unrestricted model are also estimable in the Gardner and Eberhart (1966) model, although the converse is not true. Since in the latter the normal equation system is consistent and determined, the elements of the estimator of the parameter vector that estimate the variety and specific heterosis effects are exclusive to this model.

RESULTS AND DISCUSSION
The analysis of variance of the diallel table (Table I) is valid for the unrestricted and the Gardner and Eberhart (analysis II) models (Table II). At the 5% level of significance, the tests showed that i) there are differences in the gene frequencies between the populations, which justifies the inequality between their effects, ii) the deviations due to dominance contribute to the individual genotypic values for yield, iii) there are no differences between the varieties in their degree of divergence relative to the other parents of the diallel, and iv) there are no differences in the degree of divergence between population pairs and between the pairs and the diallel's parents. These results indicate that the populations chosen for an interpopulational improvement program should be those with higher frequencies of the genes that increase yield.

Table II - Diallel analysis of grain yield (bushels/acre) of six corn populations and their hybrids, based on the unrestricted and Gardner and Eberhart (1966) models.
In the analysis according to the Gardner and Eberhart (1966) model, the variety sum of squares is the sum of squares attributable to the hypothesis of equality of the population effects, assuming absence of dominance or the presence of only average heterosis. Consequently, the quotient between the variety mean square and the error mean square can be used to test the $H_{0(2)}$ hypothesis only if there is no dominance or if there is only average heterosis. Indeed, this was the situation described by Gardner and Eberhart (1966). Therefore, the test is adequate and has also been considered in the unrestricted model. If there were evidence of differences between the variety heteroses and/or the specific heteroses, the correct statistic for the test of the hypothesis of equality of the population effects would be F = 11.195/7.1 = 1.58 (probability = 0.18). The problem in this case is to obtain the appropriate variety sum of squares (55.975 in the example given), since it is not orthogonal to the heterosis sum of squares and it is therefore not possible to define an expression for its calculation.
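The F statistic quoted above can be rebuilt from its components as a quick arithmetic check. In the sketch below the sum of squares (55.975) and the error mean square (7.1) are taken from the text, while the 5 degrees of freedom (N - 1 for six populations) are an assumption inferred from the ratio 55.975/11.195. The variable names are hypothetical.

```python
# Sketch: the F statistic quoted in the text, rebuilt from its components.
# The 5 degrees of freedom (N - 1 for N = 6 populations) are an assumption
# inferred from 55.975 / 11.195; the other two values come from the text.
variety_ss = 55.975
variety_df = 6 - 1
error_ms = 7.1

variety_ms = variety_ss / variety_df   # mean square = sum of squares / df
f_stat = variety_ms / error_ms         # F = variety MS / error MS
print(round(variety_ms, 3), round(f_stat, 2))
```

The associated probability also depends on the error degrees of freedom, which are not stated in this section, so it is not recomputed here.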
The estimates of the main estimable functions in the unrestricted model, which are also estimable, as mentioned before, in the Gardner and Eberhart (1966) model, their variances and the variances of the contrasts between them are shown in Tables III and IV. As stated above, analyzing the estimates of the population genotypic means, and of the variety and specific heteroses is equivalent to analyzing the estimates of the population effects expressed as deviations relative to the average effect, the variety heterosis effects and the specific heterosis effects, because the correlation between the function estimates is 1.
These results showed that populations 4 and 6 are superior to the others in terms of frequency of favorable genes and can be used in intrapopulational improvement programs, and that the dominance effects are predominantly positive. As there are no significant differences between the variety heteroses and the specific heteroses, both populations can also be used in a reciprocal recurrent selection program. The heterosis estimates show that the populations with the greatest gene frequency differences are 2 and 6, and 3 and 6. If the hypotheses of equality of the variety and specific heteroses were rejected, the following inferences would still hold: i) population 6 is the most divergent from the other parents, and would necessarily be chosen for an interpopulational improvement program, ii) the populations which diverge most between themselves and in relation to the parental group are 3 and 4, and 1 and 2, iii) populations 1 and 3, and 2 and 3 diverge only slightly, but have gene frequencies that differ from the average frequencies in the parental group, and iv) the gene frequencies in populations 1 and 5, and 1 and 6 come close to the average values for the diallel's parents.

Tables V and VI show the estimates of the parameters for the model developed by Gardner and Eberhart (1966). The differences in fit relative to the unrestricted model are limited to the estimates of the variety and specific heterosis effects and their variances. The variances associated with the estimates of variety heterosis and their contrasts are considerably smaller for the functions obtained by fitting the unrestricted model. On the other hand, the estimates of the variances of the specific heterosis effects and their contrasts are smaller for the functions normally estimated by the Gardner and Eberhart (1966) model. However, the correlations between the estimates of $H^*_j$ and of $S^*_{jj'}$ obtained with the two models are high (1 and 0.96, respectively). Hence, the inferences that can be established tend to be the same as those obtained previously. If there were any statistical difference between the specific heteroses, only one inference would not conform with the results of the unrestricted model: the estimates of $S^*_{jj'}$ indicate that populations 4 and 5, and 2 and 5 are the least divergent between themselves and in relation to the parental group.

The restrictions of the Gardner and Eberhart (1966) model do not satisfy the parametric values of the specific heterosis effects. Consequently, the estimators of the effects of variety heterosis, specific heterosis and their variances differ from those of the unrestricted model. Analyses using the unrestricted and the Gardner and Eberhart (1966) models should lead to the same inferences, at least in the assessment of population effects expressed as deviations from the average effect, the heteroses, the average heterosis and the variety heteroses (the correlation between the estimates of the two models is 1). The use of the unrestricted model is limited by the lack of formulas for calculating the sums of squares and the variance estimates for estimable functions, although this does not exclude the possibility of developing the appropriate software for analysis. In conclusion, it is generally quite safe to use the Gardner and Eberhart model.

Table IV - ... and the variances of these and other linear combinations of the parameters, relative to grain yield (bushels/acre), in the unrestricted model.

Table I - Mean grain yield (bushels/acre) of six corn populations and their hybrids (Gardner and Eberhart, 1966).

Table III - Estimates of the population genotypic means ($M_j$), variety heteroses ($H_j$), population effects expressed as deviations from the average effect ($v^*_j$), variety heterosis effects ($H^*_j$) and the variances of these and other linear combinations of the parameters, relative to grain yield (bushels/acre), in the unrestricted model.

Table VI - Estimates of the heteroses ($H_{jj'}$; values below the diagonal), specific heterosis effects ($S^*_{jj'}$; values above the diagonal), average heterosis ($H$) and the variances of these and other linear combinations of the parameters, relative to grain yield (bushels/acre), based on the Gardner and Eberhart (1966) model.

Table V - Estimates of the population effects expressed as deviations from the average effect ($v^*_j$), variety heterosis effects ($H^*_j$) and the variances of these and other linear combinations of the parameters, relative to grain yield (bushels/acre), based on the Gardner and Eberhart (1966) model.