THEORY AND ANALYSIS OF PARTIAL DIALLEL CROSSES

This study presents the theory and analysis of partial diallel crosses based on Hayman's methods. This genetic design consists of crosses between two parental groups. It should be used when there are two groups of parents, for example, dent and flint maize inbred lines, and the breeder is not interested in assessing crosses between parents of the same group. Analyses are carried out using data from the parents and their F1 hybrids, allowing a detailed characterization of the polygenic systems under study and the choice of parents for hybridization. Diallel analysis allows the estimation of genetic and non-genetic components of variation and of genetic parameters, and the assessment of the following: genetic variability in each group; genotypic differences between parents of distinct groups; whether a parent has a common or a rare genotype in the group to which it does not belong; whether there is dominance; whether dominant genes increase or decrease trait expression (direction of dominance); the average degree of dominance in each group; the relative importance of mean effects of genes and dominance in determining a trait; whether, in each group, the allelic genes have the same frequency; whether genes are equally frequent in the two groups; the group with the greatest frequency of favorable genes; the group in which dominant genes are most frequent; the relative number of dominant and recessive genes in each parent; whether a parent has a common or a rare genotype in the group to which it belongs; and the genotypic differences between parents of the same group. An example with common bean varieties is considered.


INTRODUCTION
A diallel is a mating system that involves all possible crosses among a group of parents. This genetic design is used to study the polygenic systems that determine quantitative traits. Depending on the nature of the parents (individuals from an open-pollinated variety or an F2 generation, or selected open-pollinated varieties or line cultivars), inferences based on diallel results are useful in planning base populations for intra- and/or interpopulation improvement or in choosing parents for hybridization.
The partial diallel is one of many variations of the diallel (Griffing, 1956; Kempthorne and Curnow, 1961; Gardner and Eberhart, 1966) and consists of crosses between two parent groups. This mating system, also called experiment II, design II and factorial design, was proposed by Comstock and Robinson (1948, 1952). It is appropriate when there are distinct groups of populations, for example one group of dent inbred lines and another of flint inbred lines, derived from a reciprocal recurrent selection program, and the breeder is not interested in evaluating crosses between parents of the same group. The two parent groups can involve adapted or commercial populations and exotic germplasm or plant introductions, for example, dwarf and normal cashew tree clones, or small- and large-seeded cultivated common beans, among many others. As the parents are generally open-pollinated populations or pure lines chosen by the breeder, there is no base population for inferences. Inferences must be established in relation to the set of parents.
Jinks and Hayman (1953) and Hayman (1954, 1958) presented methods for diallel analysis, with a firm and refined theoretical basis, using data from parents and their F1 progeny or from parents, F1 and F2 generations. These methods have been used ever since (Jinks, 1954; Dickson, 1967; Chung and Stevenson, 1973; Kornegay and Temple, 1986; Nishimura and Hamamura, 1993), although the methodologies proposed by Griffing (1956) and Gardner and Eberhart (1966) are the most frequently employed. Limitations of these methods are discussed by Gilbert (1958), Nassar (1965), Coughtrey and Mather (1970), and Sokol and Baker (1977). As Hayman's proposals (1954, 1958) are not valid for partial diallels, a separate theory is required. The theory and analysis of partial diallel crosses presented here are a generalization of the methodology of Hayman (1954). If gene frequencies are the same in the two parent groups, the results are equal to those presented by Hayman.

General theoretical considerations
Consider a polygenic system with k genes controlling a quantitative trait in a diploid species and the following conditions: a) Mendelian inheritance; b) absence of reciprocal effects; c) absence of non-allelic interaction; d) no multiple allelism; e) no correlation in the distribution of non-allelic genes in the parents, and f) homozygous parents. If N parents (N ≥ 6) are divided into two groups, one with n and the other with n' parents (n + n' = N; n and n' ≥ 3), three polygenic systems are defined: one related to the group with n parents, one associated with the n' parents and the last related to the N parents. Let $p_r$ be the genotypic value of the rth parent of the n

If $p_t$ is the genotypic value of the tth parent in the N parental group, then: For all loci in the polygenic system under study, the parameter $d_a$ is the difference between the genotypic value of the homozygote of largest expression and the genotypic mean of the homozygotes ($m_a$). The variable $\theta$ assumes the value -1 or 1 depending on whether the parent is homozygous for a gene which diminishes or increases trait expression, respectively. If $u_a$ and $v_a$ are the frequencies of alleles that increase and reduce trait expression, respectively, the following expectation and variance apply to the n parental group: In relation to the n' parental group: In the group formed by N parents: $E(\theta_{ta}) = \bar{u}_a - \bar{v}_a = \bar{w}_a = p w_a + q w'_a$, with $\bar{u}_a = p u_a + q u'_a$, $\bar{v}_a = p v_a + q v'_a$, $p = n/N$, and $q = n'/N$, and Genotypic mean of the n parental group is: Genotypic mean of the n' parental group is:

Diallel analysis of parents and their F 1 hybrids
The genotypic mean of the hybrid derived from the rth and sth parents is: with $\sum_a h_a (1 - \theta_{ra}\theta_{sa}) = h_{rs}$ being the specific heterosis of the rth and sth parents. The parameter $h_a$ is the difference between the genotypic value of the heterozygote and $m_a$.
The genotypic mean of the hybrids from the rth parent, or mean of the rth array, is: where $\sum_a h_a (1 - w'_a\theta_{ra}) = h_r$ is the varietal heterosis of the rth parent.
The genotypic mean of the hybrids from the sth parent, or mean of the sth array, is: where $\sum_a h_a (1 - w_a\theta_{sa}) = h_s$ is the varietal heterosis of the sth parent.
The genotypic mean of the F1 hybrids is: where $\sum_a h_a (1 - w_a w'_a) = h$ is the average heterosis.
In the presence of dominance, the specific heterosis of the rth and sth parents is nil if the parents have the same genotype ($\theta_{ra}\theta_{sa} = 1$, for every a) and greatest when the parents carry distinct alleles at the k loci ($\theta_{ra}\theta_{sa} = -1$, for every a). The varietal heterosis of a parent is least when it carries the most frequent genes in the group of parents to which it does not belong and greatest when it carries the least frequent genes. If deviations due to dominance are of the same magnitude and the varietal heterosis in a group is constant, the allelic genes in the other parental group have the same frequency ($w_a$ or $w'_a$ equal to zero, for every a). Since $(1 - w_a w'_a) \geq 0$, if h > 0, positive unidirectional dominance is established. In other words, the deviations due to dominance should be predominantly positive. If h = 0 and there is dominance, positive and negative deviations are present (bi-directional dominance). Dominance is negatively unidirectional when h < 0. If $m_{L0} - m'_{L0}$ is greater than zero, there is evidence that genes which increase trait expression are more frequent in the n parental group. If this difference is negative, genes which increase trait expression should be more frequent in the n' parental group. When the difference is nil, genes are equally frequent in the two parental groups ($w_a = w'_a$, for every a) or, alternatively, some genes which increase trait expression are more frequent in one group of parents, while others are more frequent in the other group.
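
The three heterosis expressions above can be checked numerically. The sketch below is only an illustration under the stated model (homozygous parents coded by $\theta = \pm 1$ at each locus); the function names, the dominance deviations and the parental genotypes are hypothetical and are not taken from the paper.

```python
def specific_heterosis(theta_r, theta_s, h):
    """h_rs = sum over loci of h_a * (1 - theta_ra * theta_sa)."""
    return sum(ha * (1 - tr * ts) for ha, tr, ts in zip(h, theta_r, theta_s))


def mean_theta(group):
    """w_a: mean of theta over the parents of one group, locus by locus."""
    n = len(group)
    return [sum(parent[a] for parent in group) / n for a in range(len(group[0]))]


def varietal_heterosis(theta_parent, w_other, h):
    """h_r = sum over loci of h_a * (1 - w'_a * theta_ra)."""
    return sum(ha * (1 - w * t) for ha, w, t in zip(h, w_other, theta_parent))


def average_heterosis(w_group1, w_group2, h):
    """h = sum over loci of h_a * (1 - w_a * w'_a)."""
    return sum(ha * (1 - a * b) for ha, a, b in zip(h, w_group1, w_group2))


# Hypothetical example: k = 3 loci, n = 3 and n' = 3 homozygous parents.
h = [1.0, 0.5, 2.0]                                  # dominance deviations h_a
group1 = [[1, 1, -1], [1, -1, -1], [-1, 1, 1]]       # theta_ra for the n parents
group2 = [[1, 1, 1], [-1, -1, 1], [1, -1, -1]]       # theta_sa for the n' parents
w1 = mean_theta(group1)                              # w_a
w2 = mean_theta(group2)                              # w'_a

for r in group1:
    # The varietal heterosis of a parent is the mean of its specific heteroses.
    mean_specific = sum(specific_heterosis(r, s, h) for s in group2) / len(group2)
    print(round(mean_specific, 6), round(varietal_heterosis(r, w2, h), 6))
```

In this toy case the mean of the specific heteroses in an array equals the varietal heterosis of the common parent, and the mean of the varietal heteroses of a group equals the average heterosis, which is how the three definitions fit together.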

Genetic components of variation
Analysis of the diallel table allows the definition of the following second degree statistics:
(1) Variance of the genotypic means of the n parents
(2) Variance of the genotypic means of the n' parents
(3) Variance of the genotypic means of the N parents
(4) Covariance between the genotypic mean of the rth parent hybrid and the mean of the nonrecurrent parent (covariance in the rth array)
(5) Variance of the genotypic means of the rth parent hybrids (variance in the rth array)
(8) Covariance between the genotypic mean of the rth parent hybrid and the mean of the nonrecurrent parent array
Covariance between the genotypic mean of the sth parent hybrid and the mean of the nonrecurrent parent array
Analysis of the magnitude and significance of the additive components $D_{(1)}$ and $D_{(2)}$ allows several inferences to be drawn about the genetic variability of each parental group. If the additive component of a group is nil, the parents have the same genotype ($w_a^2 = 1$ or $w'^2_a = 1$, for every a) and, therefore, there is no genetic variability. When nonfixed genes are involved, the parents have distinct genotypes and the additive component is different from zero. Its value is largest when alleles are equally frequent.
The following differences are also useful: If the differences are nil, the genes are equally frequent in the two parental groups. If not, it can be concluded that the genes do not have the same frequency in the two groups. When $D_{(1)} - D_{(2)} < 0$, the n' parental group is more variable. If genetic variation is greater in the n parental group, then $D_{(1)} - D_{(2)} > 0$.
The F component of a parent can be negative, nil or positive. When negative, it indicates that the parent has more recessive than dominant genes ($h_a\theta_a < 0$). If positive, the parent has more dominant genes ($h_a\theta_a > 0$). If there is no dominance, the F values of the parents are nil. When dominance exists, F = 0 indicates that the parent carries approximately the same number of dominant and recessive genes. For any parental group, the F value of a parent is directly proportional to the number of dominant genes it carries that are not fixed in the other group of parents. Therefore, the F value of a parent is largest when it carries all the dominant genes that are not fixed in the other group. The average F value of a parental group indicates the frequencies of dominant and recessive allelic genes. When it is positive, dominant genes are more frequent than recessive alleles in that group ($h_a w_a > 0$ or $h_a w'_a > 0$). If recessive genes are more frequent in the group, its value is negative ($h_a w_a < 0$ or $h_a w'_a < 0$). If there is dominance and the average F value is nil, the alleles are equally frequent.
Given the presence of genetic variability among parents in a group, the $H_1$ component of the group will be nil in the absence of dominance and positive in the presence of dominance. The difference $H_{1(1)} - H_{1(2)}$ provides the same information as $D_{(1)} - D_{(2)}$. If there is genetic variability in the two parental groups, the components $H_{2r}$, $H_{2s}$ and $H_2$ will be positive in the presence of dominance and nil in its absence. Absence of variability in a parental group makes the components F and $H_2$ of parents from the other group, and their respective average values, nil. The dominance components $H_{1(1)}$, $H_{1(2)}$ and $H_2$ have the same magnitude when allelic genes have the same frequency in the two parent groups ($u_a = u'_a = 1/2$, for every a).
In the presence of dominance, the relative magnitude of the $H_2$ components of parents in a group supplies important information. The $H_2$ component of a parent is largest when it carries the least frequent genes of the group to which it belongs ($w_a\theta_{ra}$ or $w'_a\theta_{sa} < 0$, for every a), and smallest when it carries the most frequent genes ($w_a\theta_{ra}$ or $w'_a\theta_{sa} > 0$, for every a). Generally, the value of the $H_2$ component of a parent is inversely proportional to the concentration of the most frequent genes of its group that are nonfixed in the other group. The difference between the $H_2$ components of two parents from the same group can be nil if the parents have the same genotype or if the most frequent genes in the parents are different. If all $H_2$ components of parents in a group have the same magnitude, allelic genes are equally frequent in the group.
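
The array statistics listed at the start of this section (the covariance and the variance in each array) can be computed directly from a table of means. The sketch below is a minimal illustration, assuming a rectangular table with one row per parent of the n group and one column per parent of the n' group; the use of the sample divisor (number of values minus one), the function names and the data are assumptions made for the example, not details from the paper.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance (divisor len - 1, an assumption of this sketch)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def array_statistics(hybrid_rows, nonrecurrent_parents):
    """Return [(W, V), ...], one pair per array (one row of hybrids).

    W is the covariance between the hybrids of an array and the means of
    their nonrecurrent parents; V is the variance of the hybrids in the array.
    """
    return [(covariance(row, nonrecurrent_parents), covariance(row, row))
            for row in hybrid_rows]


# Hypothetical, purely additive data: each hybrid equals the mid-parent value.
group1 = [10.0, 14.0, 18.0]            # means of the n parents
group2 = [8.0, 12.0, 20.0]             # means of the n' parents
table = [[(p + q) / 2 for q in group2] for p in group1]

stats_r = array_statistics(table, group2)       # arrays of the n parents
columns = [list(col) for col in zip(*table)]    # transpose for the other group
stats_s = array_statistics(columns, group1)     # arrays of the n' parents
print(stats_r)
print(stats_s)
```

Because the toy data contain no dominance, every array of a group gives the same (W, V) pair, with the covariance equal to twice the variance.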

Average degree of dominance (h/d)
If $h_a = h$ and $d_a = d$, then $\sqrt{H_{1(1)}/D_{(1)}}$ and $\sqrt{H_{1(2)}/D_{(2)}}$ are equal to h/d. Therefore, $\sqrt{H_{1(1)}/D_{(1)}}$ and $\sqrt{H_{1(2)}/D_{(2)}}$ express the average degree of dominance in the polygenic systems defined by the n and n' parental groups, respectively.

Proportion between dominant and recessive genes in parents of the same group
Following Hayman's approach, it can be demonstrated that the ratios reflect the average value of the proportion between the numbers of dominant and recessive genes in the n and n' parental groups, respectively. A ratio close to zero indicates that parents in the group have few dominant genes and many recessive ones. A ratio near one indicates that parents in the group have the same number of dominant and recessive genes. A ratio larger than one indicates that parents in the group have many dominant genes and few recessive ones.

Direction of dominance
Defining $k_+$ and $k_-$ as the numbers of dominant genes which increase and decrease trait expression, respectively, average heterosis can be expressed in the following way: If $h_a = h$, $d_a = d$, $w_a = w$ and $w'_a = w'$, then: If $h_a = h$, $d_a = d$ and $w_a = w'_a = w$, then $h^2/H_2 = (k_+ - k_-)^2/k$. Therefore, if dominance exists in the polygenic system under study, but the above ratio is nil, the number of dominant genes with a positive effect is the same as the number of dominant genes with a negative effect. In other words, dominance is bi-directional. If the ratio is positive, the numbers of dominant genes which increase and decrease trait expression are unequal, that is, dominance is predominantly unidirectional.
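
As a worked illustration of the ratio $h^2/H_2 = (k_+ - k_-)^2/k$, the cases below use hypothetical gene numbers. They assume $h_a = h$, $d_a = d$, $w_a = w'_a = w$ and, as an extra simplification, that every locus shows dominance, so that $k = k_+ + k_-$.

```latex
% Hypothetical numbers, k = 8 loci, all showing dominance
% Case 1: k_+ = 6, k_- = 2 (predominantly positive dominance)
\frac{h^2}{H_2} = \frac{(k_+ - k_-)^2}{k} = \frac{(6 - 2)^2}{8} = 2
% Case 2: k_+ = k_- = 4 (bi-directional dominance)
\frac{h^2}{H_2} = \frac{(4 - 4)^2}{8} = 0
% Case 3: k_+ = 8, k_- = 0 (fully unidirectional dominance)
\frac{h^2}{H_2} = \frac{(8 - 0)^2}{8} = 8
```

The ratio is therefore zero only when positive and negative dominant genes balance, and under these assumptions it reaches k when all dominant genes act in the same direction.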

The relationship between variance and covariance in the arrays
Considering the additive-dominant model, the differences $W_r - V_r = (1/4)(D_{(2)} - H_{1(2)})$ and $W_s - V_s = (1/4)(D_{(1)} - H_{1(1)})$ are constant for every r and s. Thus, the coefficients of the regressions of $W_r$ as a function of $V_r$ ($W_r = \beta_0 + \beta_1 V_r$) and of $W_s$ as a function of $V_s$ ($W_s = \beta_0 + \beta_1 V_s$) are equal to 1. The intercepts of the regressions of $W_r$ on $V_r$ and of $W_s$ on $V_s$ are, respectively, $(1/4)(D_{(2)} - H_{1(2)})$ and $(1/4)(D_{(1)} - H_{1(1)})$. Similarly to Hayman (1954), it can be shown that the ($W_r$, $V_r$) and ($W_s$, $V_s$) values lie on the straight line delimited by the parabolas $W^2 = V_{0L0(2)} V_r$ and $W^2 = V_{0L0(1)} V_s$, respectively. If the analysis of variance of the differences $W_t - V_t$ indicates that $W_r - V_r$ and $W_s - V_s$ are constant and/or the regression analyses of $W_r$ on $V_r$ and of $W_s$ on $V_s$ show that $\beta_1 = 1$, the additive-dominant model adequately describes the observed results. When $\beta_1 \neq 1$, the additive-dominant model is inadequate. Hayman (1954, 1958) and Mather and Jinks (1974) presented detailed discussions of the causes of inadequacy of the additive-dominant model, including the presence of extranuclear gene effects, non-allelic interaction, multiple allelism, correlation in the distribution of non-allelic genes and heterozygosis in the parents. Hayman (1954, 1958) also discussed how to proceed with diallel analysis when one or more of the restrictions of the additive-dominant genetic model are not fulfilled.
Absence of a relationship between $W_r$ and $V_r$ and between $W_s$ and $V_s$ ($\beta_1 = 0$) indicates the absence of dominance in the polygenic system under study, since, when $h_a = 0$, $W_r = (1/2)D_{(2)}$, $V_r = (1/4)D_{(2)}$, $W_s = (1/2)D_{(1)}$ and $V_s = (1/4)D_{(1)}$. Thus, when W is plotted against V for each group, all points occur at the ((1/2)D, (1/4)D) position. When the additive-dominant model is adequate, the intercept of the regression of $W_r$ on $V_r$ indicates the average degree of dominance in the polygenic system defined by the n' parental group. The intercept of the regression of $W_s$ on $V_s$ expresses the average degree of dominance in the polygenic system defined by the n parental group. If $\beta_0 < 0$, there is evidence of overdominance. When $\beta_0 = 0$, there is complete dominance. Partial dominance exists when $\beta_0 > 0$. If the genes are equally frequent in the two parental groups, then: Thus, the coefficient of the regression of $W_t$ on $V_t$ ($W_t = \beta_0 + \beta_1 V_t$) is equal to one, and the intercept is $(1/4)(D - H_1)$, if the additive-dominant genetic model is adequate. The straight line is delimited by the parabola $W^2 = DV_t$.
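
The regression checks described in this section can be sketched as follows. This is a minimal illustration, not the authors' procedure: it fits the line by ordinary least squares, compares the slope with 1 and the intercept with 0 using t ratios, and leaves the critical t value to the user. The function names and the (V, W) values are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x with standard errors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    df = n - 2                                   # error degrees of freedom
    mse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / df
    return {"b0": b0, "b1": b1, "df": df,
            "se_b0": math.sqrt(mse * (1 / n + mx ** 2 / sxx)),
            "se_b1": math.sqrt(mse / sxx)}


def interpret(fit, t_crit):
    """Apply the decision rules of the text; t_crit is supplied by the user."""
    if abs((fit["b1"] - 1.0) / fit["se_b1"]) > t_crit:
        if abs(fit["b1"] / fit["se_b1"]) <= t_crit:
            return "no relationship between W and V (no evidence of dominance)"
        return "additive-dominant model inadequate (slope differs from 1)"
    if abs(fit["b0"] / fit["se_b0"]) <= t_crit:
        return "complete dominance (intercept not different from 0)"
    return "overdominance" if fit["b0"] < 0 else "partial dominance"


# Hypothetical array statistics for a group of six parents.
v_r = [2.0, 3.5, 5.0, 6.5, 8.0, 9.0]
w_r = [3.6, 4.9, 6.6, 7.9, 9.6, 10.4]

fit = fit_line(v_r, w_r)
# 2.776 is the two-sided 5% critical value of t with 4 degrees of freedom.
print(fit["df"], interpret(fit, t_crit=2.776))
```

With six arrays the error has four degrees of freedom and with three arrays only one, so the test has little power in small groups. A perfectly collinear data set would give zero standard errors, a case this sketch does not handle.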

The relationship between the sum of the variance and covariance in the arrays and the genotypic value of the common parent
It has already been demonstrated that the variance and covariance in an array are smallest when the common parent is a carrier of dominant genes and largest when it is a carrier of recessive genes. Thus, the relationship between the genotypic value of the common parent and the sum of the variance and covariance in the parent array ($p_r = \beta_0 + \beta_1 (W_r + V_r)$ or $p_s = \beta_0 + \beta_1 (W_s + V_s)$) defines the presence and direction of dominance. The regression coefficient of $p_r$ on $W_r + V_r$ is: The coefficient of the regression of $p_s$ as a function of $W_s + V_s$ is: The existence of a relationship between the genotypic value of the common parent and the sum of the variance and covariance in its array indicates the presence of unidirectional dominance. If the regression coefficient is positive, deviations due to dominance are predominantly negative, contributing to a decrease in trait expression. If the coefficient is negative, the deviations due to dominance contribute, exclusively or mostly, to an increase in trait expression. If there is no dominance, the points on the graphs of $p_r$ against $W_r + V_r$ and of $p_s$ against $W_s + V_s$ are $(p_r, (3/4)D_{(2)})$ and $(p_s, (3/4)D_{(1)})$, respectively. When there is dominance, but $\beta_1 = 0$, dominance is bi-directional. Points are randomly distributed on the graph of p against W + V.
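
The sign rule for the regression of the parental value on W + V can be written as a short sketch. It only reads the sign of the estimated slope; in practice the coefficient would also be tested for significance, which is omitted here. The function names and the data are hypothetical.

```python
def slope(x, y):
    """Least squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def direction_of_dominance(parent_values, w, v):
    """Regress the parental value on W + V and read the sign of the slope."""
    x = [wi + vi for wi, vi in zip(w, v)]
    b1 = slope(x, parent_values)
    if b1 > 0:
        return b1, "dominance deviations predominantly negative"
    if b1 < 0:
        return b1, "dominance deviations predominantly positive"
    return b1, "no unidirectional dominance detected"


# Hypothetical group of six parents: high-valued parents have small W + V.
p = [20.0, 18.0, 15.0, 12.0, 10.0, 9.0]
w = [1.0, 1.5, 2.5, 3.5, 4.0, 4.5]
v = [2.0, 2.4, 3.0, 4.0, 4.6, 5.0]

b1, verdict = direction_of_dominance(p, w, v)
print(round(b1, 3), verdict)
```

In this toy data set the parents with the largest values have the smallest W + V, so the slope is negative and dominant genes would be read as increasing trait expression.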

Nongenetic components of variation
Let $y_t$ be the average phenotypic value of a parent and $y_{rs}$ the average phenotypic value of the hybrid between the rth and sth parents. Then,
$y_t = p_t + e_t$ ($e_t \sim (0, E)$, independent)
$y_{rs} = f_{rs} + e_{rs}$ ($e_{rs} \sim (0, E')$, independent)
where $p_t$ and $f_{rs}$ are genotypic values, and $e_t$ and $e_{rs}$ are non-genetic effects. Letting $e_t$ and $e_{rs}$ be independent of the genotypic values as well as of each other, then:
If the phenotypic values of the parents and their hybrids in the diallel table are averages of b replications, the error mean square of the analysis of variance considering the N parents, divided by b, is an estimator of E. The error mean square of the analysis of variance of the nn' hybrids, divided by b, is an estimator of E'. If the parent and hybrid means have the same precision, then $\hat{E} = \hat{E}'$, and both are equal to the error mean square of the analysis of variance of parents and F1 hybrids, divided by b.
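
As a small worked example of these estimators, the sketch below divides an error mean square by the number of replications. The inputs are the two error mean squares (8.79 for parents, 19.20 for F1 hybrids) and the four replications reported in the APPLICATION section; the function name is hypothetical.

```python
def nongenetic_component(error_mean_square, replications):
    """Estimator of E (or E'): error mean square divided by b replications."""
    return error_mean_square / replications


# Values reported for the common bean example (four replications).
e_parents = nongenetic_component(8.79, 4)     # estimate of E
e_hybrids = nongenetic_component(19.20, 4)    # estimate of E'
print(e_parents, e_hybrids)
```

The two estimates differ by more than a factor of two, which is consistent with the choice, described in the application, of fitting two nongenetic components rather than a common one.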

This is carried out based on fitting the linear model
For estimation purposes, the ordinary ($\mathrm{Cov}(\varepsilon) = \sigma^2 I$), weighted or generalized least squares methods can be used (Hayman, 1954; Mather and Jinks, 1974). Alternatively, the maximum likelihood method (Hayman, 1960) can also be used.
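
To make the estimation step concrete, the sketch below solves the normal equations for ordinary or weighted least squares, $\hat{\beta} = (X'WX)^{-1}X'Wy$ with a diagonal weight matrix. It is a generic illustration only: the design matrix that links the observed statistics to the genetic and non-genetic components is not reproduced here, so the matrix X, the vector y and the function names below are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]      # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):                      # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def weighted_least_squares(X, y, weights=None):
    """Return (X'WX)^-1 X'Wy for diagonal W; weights=None gives ordinary LS."""
    if weights is None:
        weights = [1.0] * len(y)
    p = len(X[0])
    xtwx = [[sum(w * row[i] * row[j] for row, w in zip(X, weights))
             for j in range(p)] for i in range(p)]
    xtwy = [sum(w * row[i] * yi for row, yi, w in zip(X, y, weights))
            for i in range(p)]
    return solve(xtwx, xtwy)


# Hypothetical design: four observed statistics, two unknown components.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [2.0, 2.5, 3.0, 3.5]                  # generated from components (2.0, 0.5)
beta_ols = weighted_least_squares(X, y)
beta_wls = weighted_least_squares(X, y, weights=[1.0, 2.0, 2.0, 1.0])
print(beta_ols, beta_wls)
```

Weights would normally be the reciprocals of the variances of the observed statistics; a full generalized least squares fit would also need their covariances, which this sketch does not cover.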

APPLICATION
Table I presents the grain yield per plant, in grams, of nine common bean (Phaseolus vulgaris L.) lines and 18 F1 hybrids. Six parent lines belong to one group (group 1) and three to the other (group 2). The former group includes cultivars with good performance under low temperature conditions. Parents and their hybrids were assessed during the fall-winter of 1992 at the Federal University of Viçosa, in Viçosa, State of Minas Gerais, Brazil, using a completely randomized design with four replications. The results of the regression analyses of $W_r$ on $V_r$ (r = 1, 2, ..., 6) and of $W_s$ on $V_s$ (s = 1, 2 and 3) are in Table II. The existence of a relationship between covariance and variance in the arrays shows that yield per plant depends on nonadditive gene effects. As the regression coefficients are statistically equal to one, the additive-dominant model is suitable to describe the results. Tests of the hypothesis $H_0: \beta_0 = 0$ indicate that, in the polygenic systems defined by the two parent groups, dominance between nonfixed allelic genes is, on average, complete. Estimation of the genetic and nongenetic components of variation was carried out considering two nongenetic components and the ordinary least squares method. The error mean squares of the analyses of variance considering only parents and only F1 hybrids were 8.79 and 19.20, respectively. The estimates are presented in Table III. Diallel analysis indicated the presence of variability in the two parent groups, with a type I error probability of approximately 0.056 for group 2.
Since $D_{(1)} - D_{(2)}$ is positive (P = 0.09815), the genes that determine yield per plant are not equally frequent in the two parent groups. Genetic variability is greater in group 1. Estimates of the mean values of the allelic gene frequency products in the two groups support this inference. In group 1, there is evidence that allelic gene frequencies are close to 1/2 (uv = 0.24). In group 2, allelic genes have unequal frequencies (u'v' = 0.17). Considering that $m_{L0} - m'_{L0} = 3.32$ (0.01 < P < 0.05), it can be concluded that yield-increasing genes have a frequency of less than 1/2 in the second parental group. In short, the frequency of genes which increase trait expression is approximately 1/2 in group 1 and lower in group 2, which explains the greater genetic variability found in the former.
Parent F values from one group allow parents to be ranked according to the number of dominant genes they carry that are not fixed in the other parental set. Ranking group 1 parents according to an increasing concentration of dominant genes not fixed in group 2, the following order is obtained: Ricopardo 896, Ouro, DOR.241, Ouro Negro, RAB 94 and Antioquia 8. Ricopardo 896, Ouro and DOR.241 should have more recessive than dominant genes. Ouro Negro should have approximately similar numbers of dominant and recessive genes. RAB 94 and Antioquia 8 have more dominant than recessive genes. In group 2, the variety Batatinha carries the largest number of dominant genes not fixed in group 1. Both BAT-304 and FT-84-835 should have dominant and recessive genes in approximately equal numbers. Data indicate that recessive genes generally act to decrease plant yield, while dominant ones increase yield. This fact, established in the analysis of variance (not presented), seems to be corroborated by the estimate of $h^2 D_{(1)} D_{(2)} / (H_2 D^2)$, which indicates unidirectional dominance, despite its small magnitude. The regression analyses of $p_r$ on ($W_r + V_r$) and of $p_s$ on ($W_s + V_s$), however, do not confirm the above information. The results of these regression analyses, using the observed parent means as the genotypic values, are in Table IV.
Table notes: a Using the t statistic, with four and one degrees of freedom (d.f.) for the error in the analyses considering groups 1 and 2, respectively. b With four and one degrees of freedom in the analyses for groups 1 and 2, respectively. ** P < 0.01; * 0.01 < P < 0.05; + 0.05 < P < 0.10; ++ P > 0.10.
Estimates of the regression coefficients show the presence of statistically nonsignificant positive unidirectional dominance. This seems to indicate that many nonfixed dominant genes act to decrease yield, although the majority of them increase it. Estimates of the average degree of dominance are larger than one, although previous results indicate complete dominance. It is reasonable, therefore, to admit that they are statistically equal to one. As there is evidence of complete dominance, estimates of the proportion between dominant and recessive genes suggest that the parents in group 1 have, on average, more recessive than dominant genes, while in group 2 the parents have, on average, the same number of recessive and dominant genes. These results agree with those from the analyses of the parent F values, as the majority of the parents in group 1 have more recessive than dominant genes, while in group 2 the majority have equal numbers of dominant and recessive genes.
The estimated F value for group 1 indicates that recessive genes not fixed in group 2 are more frequent than dominant alleles. Taking the previous results into consideration, it can be concluded that this superiority is small. The estimate of $F_{(2)}$ suggests that dominant and recessive alleles not fixed in group 1 are equally frequent in group 2, which is inconsistent with previous results. The presence of bi-directional dominance and of fixed genes in the two parent groups may cause conflicting results in the genetic analysis. Many dominance component estimates show, as expected, the presence of dominance in the polygenic system under study. The $H_2$ values of RAB 94, Antioquia 8 and Batatinha show that these genotypes are rare in their respective groups. They carry the least frequent genes in their respective groups which are not fixed in the other group of parents. As already seen, they have more dominant genes than the other parents in their group.
The difference between the $H_2$ values of RAB 94 and Antioquia 8 is nil (P = 0.30045), indicating that they have similar genotypes with regard to yield-determining genes. The cross between them should not result in relevant heterosis, although it may be carried out to combine their nonfixed desirable genes in a single variety. As they have many dominant genes, which are rare in their group and not fixed in group 2, both can be used in crosses with the Batatinha variety, of group 2, which also has many dominant genes, rare in its group and not fixed in group 1. Estimates of specific heteroses (not presented) show that the RAB 94 and Antioquia 8 genotypes are not completely distinct from that of the variety Batatinha. The small magnitude of the two specific heteroses, nil from a statistical point of view, is not surprising, since these varieties have more dominant than recessive genes. Values of the varietal heteroses indicate that RAB 94 has more frequent genes of group 2 than Antioquia 8, while Batatinha has less frequent genes in group 1 than .
Use of RAB 94 and Batatinha and/or Antioquia 8 and Batatinha as parents is justifiable if the objective of the program is to obtain pure lines. The aim would be to combine, in a single line, the favorable dominant genes which are not fixed in the parents. The smaller variability expected in the segregant generations derived from crosses between Antioquia 8 and RAB 94 and, preferably, of Antioquia 8 and RAB 94 with Batatinha could favor selection of a line with a superior gene combination, containing the favorable dominant and recessive nonfixed genes from both parents. Crosses of RAB 94, Antioquia 8 and Batatinha with parents that have many recessive genes, even in the same group, should show high levels of heterosis, due to their genotypic differences. If the objective is the production of hybrids, several crosses, such as Antioquia 8 with FT-84-835 or BAT-304 and Batatinha with Ouro, Ricopardo 896 or Ouro Negro, may be considered. These crosses could also be considered in programs for pure lines, as there are various recessive genes that increase yield. Due to the accentuated genetic variability expected in the segregant generations of these crosses, it is necessary to use a large number of plants and/or families to increase the probability of selecting a variety with a superior combination of genes.

Table III - Estimates of the genetic and non-genetic components of variation and of genetic parameters, in relation to grain yield of common bean plants, in grams.
Indicator of the direction of dominance. $R^2$: determination coefficient. b Based on the t-test with seven degrees of freedom.

Table II - Summary of the regression analyses of $W_r$ on $V_r$ and of $W_s$ on $V_s$, regression coefficient estimates and significance level of the test of the hypothesis $H_0: \beta_1 = 1$ (see note a), in relation to grain yield of common bean plants, in grams.

Table I - Grain yield per plant, in grams, of nine common bean lines and their hybrids.

Table IV - Summary, for each group of common bean lines, of the regression analyses of the parent genotypic value on the sum of the variance and covariance in its array, in relation to grain yield per plant, in grams.