Components of variation of polygenic systems with digenic epistasis

In this paper an extension of the biometric model of Mather and Jinks for the analysis of variation with digenic epistasis is presented. Epistatic effects can contribute favorably to the determination of the genotypic values of selected individuals or families and of superior hybrids. Selection will be inefficient, however, if there is a large number of interacting genes because the epistatic components of the between-family and within-family genotypic variances are very high compared to the portion attributable to the average effects of genes. Selection tends to be efficient when the number of interacting genes is reduced, but this depends on the magnitude of the dominance and environmental variances. The dominance component (H) and the epistatic component due to interactions between homozygous and heterozygous genic combinations (J) can only be estimated separately when one or more quadratic statistics from the S3 generation, obtained by randomly mating F2 individuals, are used.


INTRODUCTION
Breeders can assess the potential of base populations for their use in breeding programs and the selection efficiency by assessing the relative importance of the additive, dominance and epistatic effects in determining each important trait, as well as choosing the selective procedure that will maximize genetic gain with one or more selection cycles. In a polygenic system, additive effects are effects which are individually attributable to genes determining a quantitative trait. The existence of differences between the additive genetic values of the individuals in a population is a necessary condition for intrapopulational breeding. The viability of a breeding program aimed at developing hybrids depends on the existence of dominance effects, that is, the interaction between allelic genes (Hallauer and Miranda Filho, 1988; Falconer and MacKay, 1996).
Epistatic effects are those effects due to interactions between non-allelic genes. Many questions remain as to the importance of epistasis in breeding programs. In cross-pollinated species, in which individual plants possess homozygous and heterozygous genic combinations, what is the importance of additive x additive, additive x dominance and dominance x dominance epistatic effects? If the objective of a breeding program is the development of superior pure lines, only additive x additive, additive x additive x additive, etc., epistatic effects can contribute to the superiority of a line in relation to the outstanding parent because each population is formed of one homozygous genotype. If the objective of the program is to develop single, double and three-way crosses, different kinds of epistatic effects can be important to ensure the production of a superior population because the genotype (or each genotype present) has homozygous and heterozygous genic combinations.
Inferences regarding the genetic control of quantitative traits are made by means of methods that employ linear and quadratic statistics, e.g., means, variances and covariances. The methods normally employed in determining genetic components of generation means and genotypic variances and covariances do not permit the assessment of the contribution of epistatic effects or the assessment of their relative importance compared to other effects. In genetic studies it is commonly thought that epistatic effects contribute little to the genotypic values of individuals, and that epistatic variance is small or negligible compared to both additive and dominance variance. However, there is evidence from many analyses that epistatic effects cannot always be ignored (Rishipal, 1993; Ramsay et al., 1994; Saha Ray et al., 1994; Rahman et al., 1994; Mgonja et al., 1994; Bartual et al., 1994; Das and Griffey, 1995; Barakat, 1996).
In a theoretical work on the analysis of the genetic effects of several oil palm traits, Baudouin et al. (1995) concluded that "Epistasis effects may contribute substantially to population means if the material tested is highly heterozygous, the genetic base is narrow (selected material or few individuals used) or there is linkage disequilibrium (due to further selection and insufficient intercrossing generations.)", although in the papers published by Balatero et al. (1995), Gingera et al. (1995) and Holtom et al. (1995) there was no evidence of epistasis. In all cited papers, the methodology used was generation mean analysis with first degree epistasis, either exclusively or associated with diallel (Bartual et al., 1994; Mgonja et al., 1994) or triple test cross analysis (Ramsay et al., 1994), or with analysis of variation without epistasis (Holtom et al., 1995; Barakat, 1996). A limitation of generation mean analysis is that the absence of the linear components attributable to epistatic effects does not imply the absence of epistasis, since the linear components of means can be null even when constituent effects are not. This happens when positive and negative effects cancel each other (Mather and Jinks, 1974; Kearsey and Pooni, 1996). This problem reveals the importance of analysis of variation using quadratic statistics in establishing inferences on the genetic control of quantitative traits.
In some genetic studies it is therefore necessary to take into account the contribution of epistatic effects to the expression of one or more traits under analysis to obtain unbiased estimates of the genetic parameters. Mather and Jinks (1974) and Kearsey and Pooni (1996) discuss the effect of epistasis on genotypic variances and covariances without considering the estimation of the epistatic components. The present paper is an extension of the model presented by the above authors in which I consider analysis of variation in the presence of first degree epistasis.

COMPONENTS OF VARIATION
The genotypic values of individuals for a digenic system with epistasis in which the genes have independent assortment and two allelic forms (A/a and B/b) are presented in Table I (Mather and Jinks, 1974), where d is the difference between the genotypic value of the homozygote with greatest expression and the mean of the genotypic values of the homozygotes (m), h is the difference between the genotypic value of the heterozygote and m, i is the epistatic effect due to the presence of two homozygous genic combinations in an individual, j is the epistatic effect attributable to the presence of one homozygous genic combination and one heterozygous genic combination and l is the epistatic effect due to the presence of two heterozygous genic combinations.

If P1 and P2 are two homozygous parents having different allelic genes at the two loci under consideration, regardless of the gene distribution in the parents, the following can be shown:

Variance (V) of the genotypic values of the F2 individuals is:
Variance of the genotypic means of the F3 families is:
Mean of the variances of the genotypic values of the individuals in the same F3 family is:
Covariance (W) between the genotypic value of the F2 individual and the genotypic mean of its F3 family is:
Variance of the genotypic means of the S3 biparental families (obtained by the random mating of F2 individuals) is:
Mean of the variances of the genotypic values of the individuals in the same S3 biparental family is:
Covariance between the mean of the genotypic values of the F2 parents and the genotypic mean of their S3 biparental family is:
Variance of the genotypic means of the groups of F4 families is:
Mean of the covariances between the genotypic value of an F3 individual and the genotypic mean of its F4 progeny in the same group of F4 families is:
Covariance between the genotypic value of an F2 individual and the genotypic mean of its F4 family group (covariance between the genotypic value of the F2 parent and the genotypic mean of its F4 descendants) is:
Variance of the genotypic values of the F3 individuals is:
Variance of the genotypic values of the S3 individuals is:
Variance of the genotypic means of the F4 families is:
Variance of the genotypic values of the F4 individuals is:
Covariance between the genotypic value of an F3 individual and the genotypic mean of its F4 family is:

Let us now consider a polygenic system with interaction between genic combinations of two loci and genes with independent assortment. If there are allelic differences for all loci among the initial parents, then:

where:

D = ∑ d_r² is a parameter determined by the sum of the squares of the deviations between the genotypic value of the homozygote with greatest expression and the mean of the homozygotes, for each locus of a polygenic system, and is a function of the additive effects;
H = ∑ h_r² is a parameter determined by the sum of the squares of the deviations between the genotypic value of the heterozygote and the mean of the homozygotes, in relation to each locus of a polygenic system, and is a function of the dominance effects;
I = ∑∑ i_rs² (r < s) is a parameter determined by the sum of the squares of the epistatic effects between two homozygous genic combinations (additive x additive epistatic component);
J = ∑∑ j_rs² (r ≠ s) is a parameter determined by the sum of the squares of the epistatic effects between a homozygous genic combination and a heterozygous genic combination (additive x dominance epistatic component);
L = ∑∑ l_rs² (r < s) is a parameter determined by the sum of the squares of the epistatic effects between two heterozygous genic combinations (dominance x dominance epistatic component);
DJ = ∑∑ d_r j_rs (r ≠ s) is a parameter determined by the sum of the products between the d deviation of a locus and the epistatic effect between a homozygous genic combination of the same locus and a heterozygous genic combination;
HL = ∑∑ h_r l_rs (r ≠ s; l_rs = l_sr) is a parameter determined by the sum of the products between the h deviation of a locus and the epistatic effect between the heterozygous genic combination of the same locus and another heterozygous genic combination.

The genotypic variances of the generations obtained by backcrossing are functions of the described genetic parameters and also of others that depend on the gene distribution in the parents.
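As an illustration of how the F2 genotypic variance follows from the digenic model, the sketch below enumerates the nine two-locus genotypes and computes the variance directly. It assumes the standard Mather and Jinks parameterization of the genotypic values (for example AABB = m + d_a + d_b + i and AaBb = m + h_a + h_b + l), F2 frequencies of 1/4, 1/2 and 1/4 per locus, and independent assortment. The function names, the dictionary keys and the numerical effects are hypothetical. The closed form used for comparison was obtained here by expanding the enumeration for this two-locus case and is not quoted from the paper.

```python
from itertools import product

# Per-locus F2 classes as (theta, z, frequency): theta is +1, 0 or -1 for the
# increasing homozygote, the heterozygote and the decreasing homozygote;
# z flags the heterozygote. Frequencies assume an F2 from a P1 x P2 cross.
F2_LOCUS = [(1, 0, 0.25), (0, 1, 0.50), (-1, 0, 0.25)]


def genotypic_value(ta, za, tb, zb, p):
    """Genotypic value under the assumed Mather-Jinks digenic model."""
    return (p["m"] + ta * p["da"] + tb * p["db"] + za * p["ha"] + zb * p["hb"]
            + ta * tb * p["i"]                          # additive x additive
            + ta * zb * p["jab"] + za * tb * p["jba"]  # additive x dominance
            + za * zb * p["l"])                         # dominance x dominance


def f2_variance(p):
    """Variance of F2 genotypic values by enumerating the nine genotypes."""
    mean = mean_sq = 0.0
    for (ta, za, fa), (tb, zb, fb) in product(F2_LOCUS, repeat=2):
        g = genotypic_value(ta, za, tb, zb, p)
        mean += fa * fb * g
        mean_sq += fa * fb * g * g
    return mean_sq - mean ** 2


def f2_variance_closed_form(p):
    """Two-locus expansion of the enumeration above (derived for this sketch)."""
    return (0.5 * (p["da"] ** 2 + p["db"] ** 2)
            + 0.25 * (p["ha"] ** 2 + p["hb"] ** 2)
            + 0.25 * p["i"] ** 2
            + 0.25 * (p["jab"] ** 2 + p["jba"] ** 2)
            + (3.0 / 16.0) * p["l"] ** 2
            + 0.5 * (p["da"] * p["jab"] + p["db"] * p["jba"])
            + 0.25 * (p["ha"] + p["hb"]) * p["l"])


# Hypothetical effects, chosen only to exercise every term.
example = dict(m=10.0, da=2.0, db=1.0, ha=1.0, hb=0.5,
               i=0.5, jab=0.3, jba=-0.2, l=0.4)
print(f2_variance(example), f2_variance_closed_form(example))
```

In this two-locus expansion the cross-product terms d·j and h·l enter with their own sign, which is why components of the DJ and HL type can lower a variance when they are negative.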

DISCUSSION
The possible kinds of digenic epistasis are shown in Table II for a polygenic system with k genes. The parameters [h], [i], [j] and [l] are due to dominance, additive x additive, additive x dominance and dominance x dominance components of means, respectively (Mather and Jinks, 1974; Kearsey and Pooni, 1996).
In the Fn generation the total (V_Fn), between-families (V_GbFn) and within-families (V_GwFn) genotypic variances can be expressed in the following way:

The covariance between the genotypic value of an Fn individual and the mean genotypic value of its Fn+1 progeny is:

After an infinite number of selfing generations, without selection, mutation, migration or genetic drift, we have:

Therefore, as expected, in the generation with an inbreeding coefficient of one (F = 1) the covariance between relatives, the differences between the genotypic values of the individuals in the population and the differences between the mean genotypic values of the families are due to the differences between the additive genetic values and the additive x additive epistatic values of the individuals. Consequently, epistatic effects between homozygous genic combinations can be important in the determination of the superiority of a pure line in relation to the best parent. In breeding programs with self-pollinated plants, the common objective for successive selection cycles is to fix the greatest number of favorable genes in a line. However, the effects of interaction between homozygous genic combinations of desirable genes can contribute in a negative way to the genotypic value of a selected line. If favorable genes increase trait expression and the component [i] is positive, the additive x additive epistatic effects contribute to the superiority of a pure line in relation to the outstanding parent. The same is true when the genes of interest decrease trait expression and the component [i] is negative.

Unlike the quadratic components of variation, which are always greater than or equal to zero, the components DJ and HL can be negative. The sign of the component DJ is determined by the signs of the additive x dominance epistatic effects (j). When DJ is positive, it can be concluded that epistatic effects due to interactions between homozygous and heterozygous genic combinations are predominantly positive, which is evidence of complementary genic action, recessive epistasis, dominant and recessive epistasis, duplicate genes with cumulative effects, or non-epistatic genic interaction (the last two with positive dominance). If the additive x dominance effects are mostly negative, DJ will be less than zero, indicating duplicate genic action, dominant and recessive epistasis, dominant epistasis, duplicate genes with cumulative effects, or non-epistatic genic interaction (the last two with negative dominance). When there are no additive x dominance epistatic effects, DJ = 0.
As the dominance effects (h) and the epistatic effects between heterozygous genic combinations (l) can be negative, null or positive, if the component HL is positive it can be concluded that these two kinds of effects predominantly have the same sign (both positive or both negative), indicating complementary genic action, recessive epistasis, dominant and recessive epistasis, duplicate genes with cumulative effects, or non-epistatic genic interaction (the last two with positive dominance). When HL is negative, there is evidence that the dominance and dominance x dominance effects have opposite signs, an indication of duplicate genic action, dominant and recessive epistasis, dominant epistasis, duplicate genes with cumulative effects, or non-epistatic genic interaction (the last two with negative dominance). If there are no dominance or dominance x dominance epistatic effects, then HL is zero.
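The link between the type of epistasis and the signs of DJ and HL can be checked on a two-locus example. The sketch below assumes the classical relationships under the Mather and Jinks parameterization: complementary genic action corresponds to d = h = i = j = l (a 9:7 F2 segregation) and duplicate genic action to d = h = -i = -j = -l (a 15:1 segregation). These relationships, the function names and the value d = 1 are assumptions made for illustration and are not taken from Table II.

```python
from collections import defaultdict
from itertools import product

# Per-locus F2 classes as (theta, z, frequency); see the genotype coding in
# the comments of genotypic_value below.
F2_LOCUS = [(1, 0, 0.25), (0, 1, 0.50), (-1, 0, 0.25)]


def effects(d, sign):
    """sign=+1: complementary (d=h=i=j=l); sign=-1: duplicate (d=h=-i=-j=-l)."""
    e = sign * d
    return dict(m=0.0, da=d, db=d, ha=d, hb=d, i=e, jab=e, jba=e, l=e)


def genotypic_value(ta, za, tb, zb, p):
    # theta = +1/0/-1 (homozygote / heterozygote / other homozygote),
    # z = 1 for the heterozygote, 0 otherwise.
    return (p["m"] + ta * p["da"] + tb * p["db"] + za * p["ha"] + zb * p["hb"]
            + ta * tb * p["i"] + ta * zb * p["jab"] + za * tb * p["jba"]
            + za * zb * p["l"])


def dj_hl(p):
    """Two-locus DJ = sum d_r j_rs and HL = sum h_r l_rs (r != s)."""
    dj = p["da"] * p["jab"] + p["db"] * p["jba"]
    hl = (p["ha"] + p["hb"]) * p["l"]
    return dj, hl


def f2_classes(p):
    """Map each distinct F2 genotypic value to its total frequency."""
    classes = defaultdict(float)
    for (ta, za, fa), (tb, zb, fb) in product(F2_LOCUS, repeat=2):
        classes[round(genotypic_value(ta, za, tb, zb, p), 9)] += fa * fb
    return dict(classes)


for label, sign in (("complementary", +1), ("duplicate", -1)):
    p = effects(1.0, sign)
    print(label, "DJ, HL =", dj_hl(p), "F2 classes =", f2_classes(p))
```

Under these assumptions both DJ and HL are positive in the complementary case and negative in the duplicate case, in line with the interpretation given in the text.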
Considering that the objective of a breeding program is intrapopulational breeding or hybrid development, epistatic effects between favorable homozygous genic combinations and heterozygous genic combinations, as well as between heterozygous genic combinations themselves, are causes of covariance between relatives and of genetic variability in the populations. Nevertheless, if these effects have a sign different from that of the additive effects of the favorable genes, they contribute negatively to the determination of the genotypic values of the individuals, thus limiting the genetic gain.
The relative importance of epistatic effects in determining a quantitative trait can be assessed from the analysis of the values presented in Tables III to IX. Table III shows the percentage of various genotypic variances and covariances attributable to differences between the additive and dominance genetic values of individuals in the population, assuming complete dominance and the absence of epistasis. Independent of the number of genes and the generations involved, differences between the individuals in relation to their additive genetic values always determine the major part of the genotypic variance and covariance when compared to the proportion attributable to deviations due to dominance, this being true even when there is epistasis (see Tables IV to IX). The difference between fractions attributable to additive and dominance effects increases as the population approaches homozygosis. In comparison to the values in Table III, the percentage attributable to the differences between the additive genetic values will be greater if there is partial dominance and less if there is overdominance.
Assuming deviations d of approximately the same magnitude, if there is complementary genic action or recessive epistasis, the proportion of genotypic variances and covariances due to epistatic effects increases with the number of interacting genes (Table IV). The same is true when there is duplicate genic action or dominant epistasis, also assuming d_r ≅ d for each d_r (r = 1, ..., k) (Table V). For the last two types of epistasis, since the components DJ and HL are negative and have a high magnitude compared to D, H, I and L, the values of many genotypic variances and covariances in initial segregant generations can be negative if the number of interacting genes is the same as the number of genes in the polygenic system. Even when the number of genes that interact is reduced, the total contribution of the epistatic components to genotypic variances and covariances is negative, decreasing their values.
In the case of dominant and recessive epistasis, it is not possible to assume that this type of interaction will occur for each pair of genes. In relation to any three of the genes in a polygenic system (e.g., A/a, B/b and C/c), this kind of epistasis is only possible for two pairs of genes (e.g., A/a and B/b, A/a and C/c), while for the third pair (B/b and C/c) epistasis is complementary or duplicate. Tables VI and VII show that the contribution of epistatic effects to genotypic variances and covariances is proportional to the number of interacting genes, and that the percentage of genotypic variances and covariances due to differences between epistatic genetic values increases with that number.
In the case of duplicate genes with cumulative effects or non-epistatic genic interaction (both with positive dominance), if the number of interacting genes is reduced and epistatic effects are of insignificant magnitude compared to deviations d, the greater part of genotypic variances and covariances will be attributable to differences between the additive genetic values of the individuals, a situation favoring selection (Table VIII). As the epistatic effects approach the values of the deviations d, the fractions of the genotypic variances and covariances due to differences between the additive, dominance and epistatic genetic values will be close to those for complementary epistasis (Table IV).
When this is true for duplicate genes with cumulative effects and non-epistatic genic interaction (with negative dominance) the values approach those seen for duplicate epistasis (Table V). Note that for these types of epistasis the contribution of the epistatic components to genotypic variances and covariances in segregant generations is negative (Table IX). An extreme situation occurs when all k genes in a polygenic system interact: in the initial generations the genotypic variances and covariances are negative because of the values of the DJ and HL components.
If the number of genes that interact approaches the number of genes in the polygenic system, the differences between epistatic genetic values of the individuals account for approximately 100% of the genotypic variances and covariances regardless of the type of epistasis, the generation and the relative values of the epistatic effects. The consequences are heritability that is low (close to zero) at the individual and family levels, even in advanced generations; inefficient selection; and, if the additive-dominance model is fitted, biased estimates of the additive and dominance components and consequently of heritability, predicted genetic gains, proportions of lines superior to the outstanding parent and other genetic parameters.
On the other hand, if the proportion of interacting genes is reduced, independently of the predominant kind of epistasis, it can be expected that as the population approaches homozygosis the percentage of the genotypic variances and covariances attributable to the differences between the additive genetic values of the individuals becomes relatively high (the upper limit of heritability), while other factors become less important, and consequently the efficiency of family and mass selection is increased. In the F3 and F4 generations the efficiency of within-family selection is practically the same, subsequently decreasing as the inbreeding coefficient approaches 1. Heritability at the family level tends to be greater than that at the individual level, and, disregarding environmental effects, the analysis of the values presented in Tables III to IX shows the superiority of family selection in comparison to mass selection, except when F = 1, in which case these different types of selection are equivalent. If the total and within-family environmental variances are of approximately the same magnitude, the superiority of mass selection in relation to the selection between plants within families is also evident because heritability at the level of individuals in the population tends to be greater than that at the level of individuals within families.
An important aspect of the model presented in this paper, which needs to be further studied, is that the estimation of the genetic components of variation D, H, I, J, L, DJ and HL depends on the inclusion of at least one variance associated with the S3 generation and/or of the covariance W_1S23, since in the genotypic variances and covariances of selfing generations the coefficients of the components H and J are the same. Therefore, if only estimates of the variances and covariances of selfing generations are available, only the components D, (H + J), I, L, DJ and HL are estimable. The estimation of (H + J) may not be limiting since the two components are due to genic effects not transmitted from generation to generation and they do not contribute to the expected genetic gain due to selection, tending to disappear when the inbreeding coefficient in the population approaches one. However, the calculation of the average degree of dominance and other H-dependent parameters is not possible. The estimation of the genetic and non-heritable components can be based on the weighted or ordinary least squares method (Mather and Jinks, 1974) or on the maximum likelihood method (Hayman, 1960).
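The confounding of H and J under selfing can be illustrated for one statistic, the total genotypic variance of generation Fn, in a two-locus system. The sketch below assumes the standard Mather and Jinks digenic parameterization, a selfing series started from the F1 of P1 x P2 (heterozygote frequency (1/2)^(n-1) per locus) and independent assortment, so that two-locus frequencies are products of the per-locus ones. The coefficient of each squared effect is extracted by setting that effect to 1 and all others to 0. Function names are hypothetical, and the between-family and within-family variances and the covariances are not covered.

```python
from itertools import product

PARAMS = ("da", "db", "ha", "hb", "i", "jab", "jba", "l")


def locus_classes(n):
    """Per-locus (theta, z, frequency) classes in generation Fn under selfing."""
    het = 0.5 ** (n - 1)      # heterozygote frequency, halved each generation
    hom = (1.0 - het) / 2.0   # each homozygote
    return [(1, 0, hom), (0, 1, het), (-1, 0, hom)]


def genotypic_value(ta, za, tb, zb, p):
    return (p["m"] + ta * p["da"] + tb * p["db"] + za * p["ha"] + zb * p["hb"]
            + ta * tb * p["i"] + ta * zb * p["jab"] + za * tb * p["jba"]
            + za * zb * p["l"])


def total_variance(n, p):
    """Total genotypic variance in Fn by enumerating two-locus genotypes."""
    mean = mean_sq = 0.0
    for (ta, za, fa), (tb, zb, fb) in product(locus_classes(n), repeat=2):
        g = genotypic_value(ta, za, tb, zb, p)
        mean += fa * fb * g
        mean_sq += fa * fb * g * g
    return mean_sq - mean ** 2


def coefficient(n, name):
    """Coefficient of the squared effect `name` in the Fn total variance."""
    p = dict.fromkeys(PARAMS, 0.0)
    p["m"] = 0.0
    p[name] = 1.0
    return total_variance(n, p)


for n in (2, 3, 4, 6, 60):
    row = {name: round(coefficient(n, name), 6) for name in ("da", "ha", "i", "jab", "l")}
    print("F%d" % n, row)
```

In this sketch the coefficients of h² and j² coincide in every selfing generation, so only their sum can be separated from the other terms, and both vanish as n grows, leaving the d² and i² terms with coefficient 1.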
The analysis of variation by the additive-dominance model with epistasis allows an assessment of the relative importance of epistatic effects in the genetic control of a trait, and favors an unbiased estimation of the additive (D) and dominance (H) components and of other genetic parameters that depend on these effects. For a better understanding of the control of a quantitative trait, information from the generation mean analysis, including epistasis, can be associated with information from the analysis of variation.
A comparative assessment of the linear components [d], [h], [i], [j] and [l] with the corresponding quadratic components should permit the clarification of the relative importance of additive, dominance and epistatic genic effects, and allow us to decide if non-additive effects are predominantly uni- or bidirectional and whether or not favorable genes are concentrated in one parent, as well as to elucidate the prevailing type of epistasis, etc., all of which allow a better planning of breeding programs.
If the objective of a breeding program is to develop superior lines, the magnitude of the epistatic components [i] and I and the sign of the former should be assessed. The analysis should permit us to infer whether or not fixation of favorable genes is associated with fixation of desirable epistatic effects due to the interaction between homozygous genic combinations increasing genetic gain. If the aim is to develop a hybrid, then it is necessary to analyze the contribution of the genetic effects represented by the parameters [h], H, [i], I, [l] and L, to select for heterosis ("hybrid vigor") in the desired direction, with greater heterosis being expected when such effects are predominantly directional.

CONCLUSIONS
If the additive x additive, additive x dominance and dominance x dominance epistatic effects have the same sign as the average effects of desirable genes, they contribute favorably to the determination of the genotypic values of selected individuals, families or hybrids. Nevertheless, if there is a large number of interacting genes the percentage due to epistatic effects of the total, between-family and within-family genotypic variances is very high in comparison to the portion attributable to the average effects of the genes, making the identification of the superior individuals or families inefficient. In this case, analysis according to the additive-dominance model will produce very biased estimates of genetic parameters. Depending on the magnitude of the dominance and environmental variances, when the number of interacting genes is reduced selection tends to be efficient and the fit with the additive-dominance model should be reasonable. The model for analysis of variation presents multicollinearity.

Table I - Possible genotypes of two loci.

Table II - Types of digenic epistasis.

Table IV - Percentage of genotypic variances (V) and covariances (W) attributable to differences between additive, dominance and epistatic genetic values, assuming complementary genic action or recessive epistasis (d_r ≅ d), 1000 genes, 10 and 1000 (values in parentheses) interacting.

Table VI - Percentage of genotypic variances (V) and covariances (W) attributable to differences between additive, dominance and epistatic genetic values, assuming dominant and recessive epistasis, 300 and 30 genes (values in parentheses), 3 interacting, and complementary genic action for one pair.

Table VII - Percentage of genotypic variances (V) and covariances (W) attributable to differences between additive, dominance and epistatic genetic values, assuming dominant and recessive epistasis, 300 and 30 (values in parentheses) genes, 3 interacting, and duplicate genic action for one pair.