
Estimation of the proportion of genetic variance explained by molecular markers



METHODOLOGY

*Part of a thesis presented by E.B. to ESALQ-USP in partial fulfillment of the requirements for the Doctoral degree.

Eduardo Bearzoti1 and Roland Vencovsky2

1Departamento de Ciências Exatas, Universidade Federal de Lavras, Caixa Postal 37, 37200-000 Lavras, MG, Brasil. Send correspondence to E.B. Fax: +55-035-829-1371. E-mail: bearzoti@esal.ufla.br

2Departamento de Genética, Universidade de São Paulo, Caixa Postal 83, 13400-970 Piracicaba, SP, Brasil.

ABSTRACT

Estimation of the proportion of genetic variance explained by molecular markers (p) plays an important role in basic studies of quantitative traits, as well as in marker-assisted selection (MAS) when the selection index proposed by Lande and Thompson (Genetics 124: 743-756, 1990) is used. The coefficient of determination ($R^2$) is frequently used to quantify this proportion. In the present study, a simple estimator of p is presented, which is applicable when a multiple regression approach is used and progenies are evaluated in replicated trials. Its sampling distribution was obtained and compared with that of $R^2$. Simulations indicated that, when the number of evaluated progenies is small, both statistics are generally unsatisfactory, due to bias and/or low precision. $R^2$ was found adequate only in situations where p is high. If a large number of progenies is evaluated (say, a few hundred), the proposed estimator $\hat{p}$ appears to be better, with acceptable precision and considerably lower bias than $R^2$. A normal approximation to the sampling distribution of $\hat{p}$ is given, using Taylor's expansion of the expectation and variance of this statistic. Approximate confidence intervals for p, based on the normal distribution, are reasonable if the number of progenies is large. The use of $\hat{p}$ in MAS is illustrated for estimation of the weight given to the molecular score when a selection index is used.

INTRODUCTION

Genetic studies of metric traits in cultivated plants have been increasingly enriched, in recent years, by the use of molecular markers. This technique has allowed the decomposition of the genetic basis of a trait into its constituent parts (QTL, or quantitative trait loci), as well as estimation of the number of QTL, the magnitude of their effects, the degree of dominance and the amount of epistasis (Edwards et al., 1987; Eshed and Zamir, 1996). Many techniques have also been proposed to construct linkage maps of QTL (Jiang and Zeng, 1995; Uimari et al., 1996). Furthermore, the use of molecular markers has been proposed for increasing the efficiency of breeding schemes (Helentjaris, 1992; Dudley, 1993). In this respect, marker-assisted selection (MAS) is said to be an effective means for the improvement of specific populations, especially when the heritability of the trait under selection is low and a high proportion of its additive genetic variance is explained by the markers (Lande and Thompson, 1990). The effectiveness of this technique was pointed out in some simulation studies (Zhang and Smith, 1992, 1993; Edwards and Page, 1994; Gimelfarb and Lande, 1994a,b), but there are few reports concerning its use in practice (e.g., Edwards and Johnson, 1994).

Lande and Thompson (1990) proposed the use of MAS in a selection index. This would combine both phenotypic and molecular information with appropriate weights to maximize gain from selection. In this procedure, a net score is assigned to each individual, according to its molecular marker constitution. In practice, this score could be estimated by the predicted value from a multiple regression model which uses the number of alleles of each marker locus as regressor variables.

In studies of quantitative traits, or in MAS, it is important to estimate the proportion p of genetic variance accounted for by molecular markers. Sometimes this partitioning of variance is made individually at each QTL, so that the relative contribution of specific genome regions to the expression of the trait can be quantified (Edwards et al., 1987). As for MAS, quantification of p is essential, since the coefficient of the molecular score in the selection index of Lande and Thompson (1990) is a function of this parameter. The coefficient of determination of the multiple regression model may be used to estimate p in MAS, as suggested by Gimelfarb and Lande (1994a,b). Indeed, this statistic is widely used in studies involving molecular data (Edwards et al., 1987; Reiter et al., 1991). However, the coefficient of determination may not be an adequate estimator of p, since the expected sum of squares due to regression involves variance components of environmental effects, as well as of genetic effects not associated with the molecular markers considered. Nevertheless, it seems to be appropriate when both phenotypic evaluation and genotyping for the marker loci are made at an individual level (e.g., single plants), and, consequently, the above-mentioned components are not estimable, as they are confounded with residual variation. When evaluation units consist of progenies or inbred lines, experiments can be carried out with an adequate number of replications. This allows unbiased estimates of these variance components and permits dividing the genetic variance into parts explained and unexplained by molecular markers. This kind of partition is not new. Ruiz and Barbadilla (1995) obtained expectations of matrices of mean squares and products, related to a multivariate regression model associated with molecular markers. Cockerham and Zeng (1996) derived expectations of mean squares for the use of North Carolina Design III in association with markers.

Although this division of genetic variance has already been proposed, it does not seem to have been fully exploited by plant breeders. Herein, a simple estimator of p, based on data from replicated trials and a multiple regression approach, is derived. Some of its properties are investigated and compared with those of the coefficient of determination. Its use in the selection index of Lande and Thompson (1990) is also discussed.

ESTIMATION OF p

If there is substantial linkage disequilibrium in a population, associations between marker loci and QTL can be identified by multiple regression analysis, taking as regressor variables the numbers of one of the alleles at each of the marker loci considered. This approach is most frequently used at the individual plant level, in F2 or backcross generations, but replicated progenies can also be used. The latter may increase the power of significance tests of the partial regression coefficients, due to the larger sample size. This is the case when inbred lines are used, or when descendants are obtained from selfed individuals, randomly sampled and genotyped for marker loci. Evaluation of the progenies would serve as a base for QTL detection, using their means as the dependent variable in the multiple regression model. This procedure is not restricted to selfed progenies. It can also be used with half-sib families of genotyped seed parents. Whatever the case, identification of marker-QTL associations is facilitated because quantitative traits are better evaluated in replicated trials. It is then expected that some proportion p of the genetic variation among progenies would be explained by the molecular markers, which can be estimated. This estimate can be used in the selection index proposed by Lande and Thompson (1990), if MAS is practiced using progenies as selection units.
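The regression of progeny means on marker allele counts described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: ordinary least squares on progeny means is assumed, the function names `solve_linear` and `fit_marker_regression` are hypothetical, and the toy data are invented for the example.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [[float(v) for v in row] + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(n):
            if i != col:
                factor = m[i][col] / m[col][col]
                for j in range(col, n + 1):
                    m[i][j] -= factor * m[col][j]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_marker_regression(progeny_means, allele_counts):
    """Least-squares fit of progeny means on marker allele counts.

    progeny_means[i] : mean of progeny i over its replications
    allele_counts[i] : number of copies (0, 1 or 2) of one allele at each
                       marker locus, for the genotyped parent of progeny i
    Returns (intercept, coefficients, molecular_scores).
    """
    design = [[1.0] + [float(x) for x in row] for row in allele_counts]
    size = len(design[0])
    # Normal equations (D'D) b = D'y, built column by column.
    dtd = [[sum(row[a] * row[c] for row in design) for c in range(size)]
           for a in range(size)]
    dty = [sum(row[a] * y for row, y in zip(design, progeny_means))
           for a in range(size)]
    coef = solve_linear(dtd, dty)
    # Molecular score: the regression part without the intercept.
    scores = [sum(b * x for b, x in zip(coef[1:], row[1:])) for row in design]
    return coef[0], coef[1:], scores


# Invented toy data: six progenies, two marker loci, noise-free means.
toy_counts = [[0, 0], [1, 0], [2, 1], [0, 2], [1, 1], [2, 2]]
toy_means = [2.0 + 1.5 * a - 0.5 * b for a, b in toy_counts]
toy_intercept, toy_coefs, toy_scores = fit_marker_regression(toy_means, toy_counts)
```

With real data the progeny means carry error and unexplained genetic effects, so the fitted scores only estimate the molecular score; the noise-free toy data merely make the fit easy to check.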

An estimator of p can be derived as follows. Suppose that t progenies are randomly sampled from a population, for example by selfing t individuals, and evaluated for a quantitative trait in r replications. The evaluation of the jth replication of the ith progeny can be described by the following model:

$$y_{ij} = m + g_i + e_{ij} \qquad (1)$$

where m is a constant, $g_i$ is the effect of the ith progeny, and $e_{ij}$ is the residual term. Since the last two terms are random, the sampling variance of experimental unit $y_{ij}$ is:

$$V[y_{ij}] = \sigma_g^2 + \sigma^2 \qquad (2)$$

where $\sigma_g^2$ and $\sigma^2$ are the among and within progenies components of variance. Now suppose that the number of one of the alleles of k marker loci is known, with respect to each selfed individual which generated a progeny. Then, a multiple regression analysis can be performed with the totals or averages observed in the progenies as the dependent variable. This implies a rearrangement in model (1) such that:

$$y_{ij} = a + \sum_{l=1}^{k} b_l X_{il} + h_i + e_{ij} \qquad (3)$$

where a is a constant, $X_{il}$ is the number of one of the alleles of marker locus l in the ith progeny, $b_l$ is the corresponding partial regression coefficient, and $h_i$ is the genetic effect not associated with molecular markers in progeny i. As defined in (3), the regression model will only account for the average effects of genes of putative QTL which are in linkage disequilibrium with respect to the marker loci. By incorporating appropriate parameters in model (3), like coefficients for squared terms ($X_{il}^2$) or crossproducts ($X_{il} \times X_{il'}$), non-additive effects can also be detected. The term $\sum_{l} b_l X_{il}$ was called the 'molecular net score' by Lande and Thompson (1990). Denoting this term by $m_i$, it follows from (1) and (3) that:

$$V[y_{ij} \mid m_i] = \sigma_h^2 + \sigma^2 \qquad (4)$$

where $\sigma_h^2$ is the portion of genetic variance not explained by molecular markers. Any sampling covariance between two different replications j and j' of the same progeny i is given by:

$$Cov[y_{ij}, y_{ij'} \mid m_i] = \sigma_h^2 \qquad (5)$$

Additionally, the total genetic variance among progenies, not conditioned on $m_i$, can be partitioned into:

$$\sigma_g^2 = \sigma_m^2 + \sigma_h^2 \qquad (6)$$

where

$$\sigma_m^2 = V[m_i] \qquad (7)$$

is the variance among molecular scores, that is, the portion of genetic variance explained by molecular markers. It is evident that traits for which $\sigma_h^2$ is close to zero can be more easily studied and manipulated. Parameter p is the ratio:

$$p = \sigma_m^2 / \sigma_g^2 \qquad (8)$$

If progenies are randomly sampled from a population, then estimates $\hat{\sigma}_m^2$ and $\hat{\sigma}_g^2$ can be obtained through the method of moments and the analysis of variance (ANOVA) technique. Hence p can be estimated by:

$$\hat{p} = \hat{\sigma}_m^2 / \hat{\sigma}_g^2 \qquad (9)$$

Total variation among experimental units can be partitioned as shown in Table I. Model (1), expressed in matrix notation, is:

$$y = X\theta + e \qquad (10)$$

where y, $\theta$ and e are the vectors of observations, parameters, and residuals, respectively, and X is the incidence matrix. Recalling (4) and (5), the dispersion matrix V of y, given $m_i$, is:

The following matrix:

$$P = XX^{+} \qquad (12)$$

where $X^{+}$ denotes the Moore-Penrose inverse, is the orthogonal projector associated with the full model (10), which yields the vector of estimated values $\hat{y} = Py$. Let $1_n$ denote an (n × 1) vector with all elements equal to 1. Defining:

then the sum of squares of progenies (Table I), adjusted for the constant m in model (1), is (Searle, 1971):

According to (3), variation among progenies can be partitioned into regression sum of squares and deviations from regression. Therefore:

where Z and X are incidence matrices, and $\beta$ and $\gamma$ are parameter vectors associated with regression and deviations from regression, respectively. Let $P_R$ denote the orthogonal projector of regression, that is:

Then the sum of squares of regression (Table I) is given by:

The sum of squares of deviations from regression is calculated by:

The quadratic forms (17) and (18) are independent, since the columns of Z obviously belong to the space generated by the columns of matrix X, and then:

The expectations of the quadratic forms (14), (17) and (18) can be obtained from Searle (1971):

The expectation of the sum of squares associated with progenies, ignoring the regression analysis, is:

Equating the first expression in (20) and (21), it follows that:

By substituting the above expression in (6), the component can be expressed as:

Therefore, expectation of the mean square associated with regression is:

Expectations of all mean squares are shown in Table I. The method of moments furnishes the following estimators:

where MSE, MSR and MSD are the mean squares of error, regression and deviations from regression. Therefore, proportion p can be estimated as given by expression (9).
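The method-of-moments step can be sketched as follows. Table I is not reproduced in this text, so the expected mean squares used here are an assumption: they are reconstructed for a completely randomized layout with t progenies, r replications and k markers, using SSP = SSR + SSD, and they agree with the ratio MSE / (MSD - MSE) given for the marker weight in Appendix A. The function name `estimate_components` is hypothetical.

```python
def estimate_components(msr, msd, mse, t, r, k):
    """Method-of-moments estimates from the ANOVA mean squares.

    Assumed expectations (reconstructed, not copied from Table I):
        E[MSE] = s2
        E[MSD] = s2 + r * s2_h
        E[MSR] = s2 + r * s2_h + r * (t - 1) * s2_m / k
    where s2 is the error variance, s2_h the genetic variance not explained
    by markers and s2_m the variance among molecular scores.
    """
    s2 = mse
    s2_h = (msd - mse) / r
    s2_m = k * (msr - msd) / (r * (t - 1))
    s2_g = s2_m + s2_h            # total genetic variance, as in (6)
    p_hat = s2_m / s2_g if s2_g != 0 else float("nan")   # expression (9)
    return {"s2": s2, "s2_h": s2_h, "s2_m": s2_m, "s2_g": s2_g, "p_hat": p_hat}


# Check on expected mean squares: plugging in the expectations themselves
# should return the variance components used to build them.
check = estimate_components(msr=1.72, msd=1.225, mse=1.0, t=100, r=2, k=5)
```

Because the estimates are differences of mean squares, `s2_h`, `s2_m` and hence `p_hat` can fall outside their parameter spaces in a given sample; the function returns them unmodified so that this behaviour stays visible.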

SAMPLING DISTRIBUTION OF $\hat{p}$

A frequent procedure for estimating genetic parameters, useful in plant breeding, is to use linear fixed models to describe phenotypic values, even if the components of such models are actually random. In a typical fixed model, variation due to each of the several factors in an ANOVA consists of the sum of various sums of squares, associated with contrasts among means of the corresponding factor. In a random model, this approach is also utilized to estimate variance components, by equating the mean squares to their expectations (moments or ANOVA method of estimation). Using balanced data and assuming observations to follow a normal distribution, this procedure and the restricted maximum likelihood method yield the same estimates, except for cases where the former method yields negative values (Graybill, 1961). Given that $\hat{p}$ is a function of quadratic forms, its sampling distribution can be derived. The quadratic forms, in the usual (fixed) regression model, have distributions associated with non-central chi-squares, which are tedious to work with. However, by considering the sources of variation in Table I as random, distributions of the quadratic forms are actually functions of central chi-squares, since the pivotal quantity Tr[Q] y'Qy / E[y'Qy], where y'Qy denotes a quadratic form and Tr the trace operator, is distributed as a chi-square with Tr[Q] degrees of freedom (Searle, 1971).

For derivation of the sampling distribution of $\hat{p}$, consider the random variables u = SSR, v = SSD and w = SSE. Estimator $\hat{p}$ is then expressible as:

Defining two other transformations of (u, v, w), say, s = v and x = w, it follows that $z = \hat{p}$ is distributed with density f(z) as (Appendix A):

As usual with linear combinations of independent chi-squares (Fleiss, 1971), a closed form of f(z) cannot be achieved. Nevertheless, for given values of t, r, k, $\sigma^2$ and the genetic components of variance, the distribution of $\hat{p}$ can be obtained by numerical integration or computer simulation. The density in (27) is quite laborious to handle; also, it cannot be directly used for interval estimation, since it is a function of the unknown components of variance. However, the moments $E[\hat{p}]$ and $V[\hat{p}]$ can be approximately obtained using Taylor's expansion of both the expectation and the variance of a ratio of two random variables (Mood et al., 1974). With appropriate estimators of such moments, confidence intervals can be constructed using a normal approximation. In Appendix B, derivations of $E[\hat{p}]$ and $V[\hat{p}]$ are presented, which are given by:

In a given experiment, $E[\hat{p}]$ can obviously be estimated by $\hat{p}$, and $V[\hat{p}]$ by (28.b), substituting the components of variance by their estimates. Table II shows the values of $E[\hat{p}]$ and $V[\hat{p}]$, approximated by Taylor's expansion, for k = 5 molecular markers, different numbers of treatments (progenies) and r = 2 replications. Two values were attributed to heritability (0.2 and 0.8) at the level of treatment means, which is defined as $h^2 = \sigma_g^2 / (\sigma_g^2 + \sigma^2/r)$. Two extreme values (0.1 and 0.9) were also chosen for p. It can be seen that $\hat{p}$ shows biases, which are greater with small numbers of progenies. In such cases (say, up to 100 progenies) biases are small when heritability is large, keeping p fixed. With p = 0.9, substantial biases are observed for large samples, and the approximation to $V[\hat{p}]$ is poor, since it yields negative values in most cases (Table II). However, it must be pointed out that these aspects are less evident as the number of markers increases (data not shown).
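The parameter combinations described above are stated in terms of heritability and p, so it can help to see how they translate into variance components. The sketch below simply inverts the definition of heritability at the level of progeny means; the function name and the choice of an error variance of 1 are assumptions made for illustration.

```python
def components_from_h2_and_p(h2, p, r, s2=1.0):
    """Return (s2_g, s2_m, s2_h) implied by h2, p, r and error variance s2.

    Uses h2 = s2_g / (s2_g + s2 / r), p = s2_m / s2_g and s2_g = s2_m + s2_h.
    """
    if not 0.0 < h2 < 1.0:
        raise ValueError("h2 must lie strictly between 0 and 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    s2_g = h2 * s2 / (r * (1.0 - h2))   # solve the heritability definition for s2_g
    s2_m = p * s2_g                     # part explained by the markers
    s2_h = s2_g - s2_m                  # part not explained by the markers
    return s2_g, s2_m, s2_h


# The four combinations of heritability and p mentioned in the text, r = 2.
for h2_value in (0.2, 0.8):
    for p_value in (0.1, 0.9):
        print(h2_value, p_value, components_from_h2_and_p(h2_value, p_value, r=2))
```

Only ratios of variance components matter for $\hat{p}$, so fixing the error variance at 1 loses no generality.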

Such an evaluation of $\hat{p}$, based on approximations to $E[\hat{p}]$ and $V[\hat{p}]$, is somewhat rough, in the sense that it is not clear to what extent the greater biases and the lower precision are real, or are due only to the quality of the approximations. This aspect was examined by Monte Carlo simulation of the sampling distribution of $\hat{p}$. Figure 1 displays frequency polygons of 5000 values each, generated according to (26), via computer simulation, for relatively small (t = 10, r = 2) and large (t = 100, r = 2) sample sizes, k = 5 molecular markers, and low heritability ($h^2$ = 0.2), which is a common situation in plant breeding. The algorithm of Best and Roberts (1975) was used to obtain the necessary inverses (percentage points) of the chi-square distributions. The normal approximations of $\hat{p}$, using the $E[\hat{p}]$ and $V[\hat{p}]$ values presented in Table II, are also shown in Figure 1. Under small sample size (situations a and b), the approximation is quite poor, getting worse as p increases. However, when sample size is large, the agreement with the data generated by Monte Carlo simulation is closer. Results suggest that, in such circumstances, the normal distribution could be used to construct approximate confidence intervals for $E[\hat{p}]$, if the confidence coefficients are not very high.
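A Monte Carlo experiment of this kind can be sketched as below. Several points are assumptions rather than the authors' procedure: the expected mean squares are reconstructed for a completely randomized layout (Table I is not available here), chi-square variates are drawn with `random.gammavariate` instead of inverting the distribution with the Best and Roberts (1975) algorithm, and all function names are hypothetical.

```python
import random


def chi_square(df, rng):
    """Draw a chi-square variate: chi2(df) is Gamma(shape=df/2, scale=2)."""
    return rng.gammavariate(df / 2.0, 2.0)


def simulate_p_hat(t, r, k, h2, p, n_sim=5000, seed=1):
    """Simulate values of p_hat from independent scaled chi-squares."""
    rng = random.Random(seed)
    s2 = 1.0                                  # error variance (arbitrary scale)
    s2_g = h2 * s2 / (r * (1.0 - h2))         # from h2 = s2_g / (s2_g + s2/r)
    s2_m, s2_h = p * s2_g, (1.0 - p) * s2_g
    # Assumed expected mean squares (reconstructed, see lead-in).
    e_mse = s2
    e_msd = s2 + r * s2_h
    e_msr = e_msd + r * (t - 1) * s2_m / k
    df_r, df_d, df_e = k, t - k - 1, t * (r - 1)
    values = []
    for _ in range(n_sim):
        msr = e_msr * chi_square(df_r, rng) / df_r
        msd = e_msd * chi_square(df_d, rng) / df_d
        mse = e_mse * chi_square(df_e, rng) / df_e
        s2_m_hat = k * (msr - msd) / (r * (t - 1))
        s2_g_hat = s2_m_hat + (msd - mse) / r
        values.append(s2_m_hat / s2_g_hat)
    return values


def fraction_outside_unit_interval(values):
    """Share of simulated estimates falling outside the parameter space [0, 1]."""
    return sum(1 for v in values if v < 0.0 or v > 1.0) / len(values)


small = simulate_p_hat(t=10, r=2, k=5, h2=0.2, p=0.1)
large = simulate_p_hat(t=100, r=2, k=5, h2=0.2, p=0.1)
frac_small = fraction_outside_unit_interval(small)
frac_large = fraction_outside_unit_interval(large)
```

Comparing `frac_small` with `frac_large` gives a quick impression of how much the estimator gains from a larger number of progenies; a histogram of `small` and `large` would play the role of the frequency polygons.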

Figure 1 - Sampling distributions of $\hat{p}$ and $R^2$.

The statistic most frequently used to quantify the proportion of genotypic (or phenotypic) variation explained by molecular markers is the coefficient of determination ($R^2$). The distribution of this statistic can be obtained similarly to the distribution of $\hat{p}$. Again, the use of non-central chi-square distributions is unnecessary. In terms of the variables u and v defined earlier, the coefficient of determination is $R^2 = u / (u + v)$. In contrast with $\hat{p}$, the sampling distribution of $R^2$ can be expressed in a closed form (Appendix A):

The sampling distribution of $R^2$ is also presented in Figure 1, for the same parameter combinations as previously assumed for the distribution of $\hat{p}$. It can be seen in situation (a) that both statistics seem inadequate to express the proportion of genetic variance explained by molecular markers. The coefficient of determination is strongly biased upward. Since its numerator is the sum of squares of regression, it is inflated by error and deviations from regression. On the other hand, estimator $\hat{p}$ apparently has a mode close to p = 0.1, but it clearly has a very low precision (high mean squared error). More than 80% of the values generated by the Monte Carlo simulation lay outside the parameter space (interval [0, 1]). In situation (b), with p = 0.9, the coefficient of determination is clearly a preferable statistic. These results suggest that, with small sample sizes (few progenies) and low heritability, statistics $\hat{p}$ and $R^2$ are very poor for quantifying the proportion of genetic variation explained by molecular markers, unless p is large. In such cases, $R^2$ seems to be suitable. In situations (c) and (d), with large sample size, precision of $\hat{p}$ is considerably higher (Figure 1) than that under small sample size. The precision of this estimator clearly tends to worsen as p increases from 0.1 to 0.9, as already seen in Table II. The distribution of $R^2$ seems to be biased downward, but to a lesser extent when p = 0.1. This reflects the fact that this statistic is defined as a ratio of sums of squares, and, under high values of t (in relation to the number of markers), the number of degrees of freedom of deviations from regression is high, which inflates the denominator of $R^2$. This aspect becomes more evident as t increases, with k fixed (data not shown).

When studying metric traits with the aid of molecular markers, the plant breeder usually faces situations similar to (c) and (d) of Figure 1, in the sense that heritability is generally low, and the number of progenies evaluated usually comprises a few to several hundred. Although the number of markers used is frequently higher than 5, they are sometimes tested one at a time. Another approach consists in applying methods to construct multiple regression models, such as backward elimination or stepwise selection (Reiter et al., 1991), which commonly reduce greatly the number of markers retained in the final analysis. Therefore, under large sample sizes, estimator $\hat{p}$ appears to be a better statistic to account for the proportion of genetic variance explained by molecular markers.
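The direction of the bias of $R^2$ discussed in this section can be explored with a small simulation. As before, this is a sketch under assumptions: the expected mean squares are reconstructed for a completely randomized layout rather than taken from Table I, chi-square variates come from `random.gammavariate`, and the function name is hypothetical.

```python
import random


def simulate_mean_r2(t, r, k, h2, p, n_sim=5000, seed=1):
    """Average of simulated R2 = u / (u + v), with u = SSR and v = SSD."""
    rng = random.Random(seed)
    s2 = 1.0
    s2_g = h2 * s2 / (r * (1.0 - h2))
    s2_m, s2_h = p * s2_g, (1.0 - p) * s2_g
    # Assumed expected mean squares (reconstructed, see lead-in).
    e_msd = s2 + r * s2_h
    e_msr = e_msd + r * (t - 1) * s2_m / k
    df_r, df_d = k, t - k - 1
    total = 0.0
    for _ in range(n_sim):
        # A sum of squares is its expected mean square times a chi-square.
        u = e_msr * rng.gammavariate(df_r / 2.0, 2.0)
        v = e_msd * rng.gammavariate(df_d / 2.0, 2.0)
        total += u / (u + v)
    return total / n_sim


r2_few_progenies_low_p = simulate_mean_r2(t=10, r=2, k=5, h2=0.2, p=0.1)
r2_many_progenies_high_p = simulate_mean_r2(t=100, r=2, k=5, h2=0.2, p=0.9)
```

Comparing each average with the corresponding value of p (0.1 and 0.9) shows whether $R^2$ over- or understates the proportion in that configuration under the assumed model.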

USE OF $\hat{p}$ IN MAS

Lande and Thompson (1990) suggested that marker-assisted selection (MAS) could be made by means of a selection index $I_i$, associated with the individual or progeny i, given by:

$$I_i = b_M m_i + b_F F_i \qquad (30)$$

where $m_i$ is the molecular net score of the individual or progeny i, $F_i$ is its phenotypic value, and $b_M$ and $b_F$ are their relative weights, respectively. Setting $b_F = 1$, the authors showed that the value of $b_M$ that maximizes gain from selection is:

$$b_M = \frac{1/h^2 - 1}{1 - p} \qquad (31)$$

It can be seen in (31) that the weight given to the molecular score in MAS increases as $h^2$ decreases and/or p increases. The use of this selection index requires estimates of $h^2$ and p to estimate $b_M$. It has been suggested that the coefficient of determination be used as an estimate of p (Gimelfarb and Lande, 1994a,b). If the selection units are replicated progenies, and many of them (say, a few hundred) are evaluated, then the estimator in (9) should be preferred, as was pointed out in the previous section.
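The dependence of the marker weight on heritability and p can be made concrete with a short calculation. The sketch below evaluates the Lande and Thompson (1990) weight with the phenotypic weight fixed at 1; the function names are hypothetical.

```python
def marker_weight(h2, p):
    """Relative weight of the molecular score when the phenotypic weight is 1."""
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must lie in (0, 1]")
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return (1.0 / h2 - 1.0) / (1.0 - p)


def selection_index(molecular_score, phenotype, h2, p):
    """Index value I = b_M * m + F for one selection unit."""
    return marker_weight(h2, p) * molecular_score + phenotype


weight_low_h2_high_p = marker_weight(0.2, 0.9)   # low heritability, high p
weight_high_h2_low_p = marker_weight(0.8, 0.1)   # high heritability, low p
```

For example, $h^2$ = 0.2 with p = 0.9 gives a weight of 40, so the index is dominated by the molecular score, whereas $h^2$ = 0.8 with p = 0.1 gives a weight below 0.3.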

Some properties of the resulting estimator of $b_M$ (herein denoted by $\hat{b}_M$), and implications of its use in MAS, can be investigated through its sampling distribution. If selection units are replicated progenies, then $h^2$ can be estimated by $\hat{\sigma}_g^2 / (\hat{\sigma}_g^2 + \hat{\sigma}^2/r)$. Recalling that $\hat{p} = \hat{\sigma}_m^2 / \hat{\sigma}_g^2$, and that $\hat{\sigma}_g^2 = \hat{\sigma}_m^2 + \hat{\sigma}_h^2$, it follows that:

$$\hat{b}_M = \frac{\hat{\sigma}^2}{r\,\hat{\sigma}_h^2} \qquad (32)$$

Estimator $\hat{b}_M$ can now be expressed as a function of MSD and MSE, namely $\hat{b}_M$ = MSE / (MSD - MSE), and its sampling distribution can be given in a closed form (Appendix A) as follows:

Estimate $\hat{b}_M$ will be negative if $\hat{\sigma}_h^2$ is negative. Figure 2 shows the density given in (33) for two extreme values of $b_M$ (1 and 40). It is seen that there are situations (high $b_M$) in which the probability of $\hat{b}_M$ being negative is considerably high. Negative estimates of components of variance obtained through the method of moments are commonly replaced by zero. Here, if $\hat{\sigma}_h^2$ is taken as zero, $\hat{b}_M$ will tend to infinity, meaning that selection will be based solely on molecular information. High or negative values of $\hat{b}_M$ can occur when a large fraction of the genetic variance is explained by markers and/or heritability is very low. Coefficient $b_M$ is a relative weight of the molecular score. If, for instance, $b_M$ > 10, selection will rely essentially on the molecular score, and phenotypic information will have a very small contribution to the index (as given in a, Figure 2). In situation b (Figure 2), $b_M$ is small, reflecting low p and/or high $h^2$. In such a case, $\hat{b}_M$ will almost certainly lie between 0.5 and 2, and so MAS will almost always be such that both phenotypic and molecular information will be important for selection.
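How often the estimated weight turns out negative can be examined by simulation. The sketch below uses the ratio v / (u - v) of Appendix A, with v = MSE and u = MSD; the degrees of freedom assume a completely randomized layout, and the values t = 100, r = 2 and k = 5 are hypothetical, since the settings behind Figure 2 are not stated in the text. Function names are likewise hypothetical.

```python
import random


def simulate_b_hat(b_m, t, r, k, n_sim=5000, seed=1):
    """Simulate b_hat = MSE / (MSD - MSE) for a given true weight b_m.

    With b_m = s2 / (r * s2_h), the ratio E[MSD] / E[MSE] equals 1 + 1 / b_m,
    so the error variance can be fixed at 1 without loss of generality.
    """
    rng = random.Random(seed)
    e_mse = 1.0
    e_msd = 1.0 + 1.0 / b_m
    df_d, df_e = t - k - 1, t * (r - 1)   # assumed degrees of freedom
    values = []
    for _ in range(n_sim):
        msd = e_msd * rng.gammavariate(df_d / 2.0, 2.0) / df_d
        mse = e_mse * rng.gammavariate(df_e / 2.0, 2.0) / df_e
        values.append(mse / (msd - mse))
    return values


def fraction_negative(values):
    """Share of simulated weights that are negative."""
    return sum(1 for v in values if v < 0.0) / len(values)


neg_when_b_is_1 = fraction_negative(simulate_b_hat(1.0, t=100, r=2, k=5))
neg_when_b_is_40 = fraction_negative(simulate_b_hat(40.0, t=100, r=2, k=5))
```

A negative value arises whenever MSD falls below MSE, which is the same event as a negative estimate of the unexplained genetic variance.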

Figure 2 - Sampling distribution of $\hat{b}_M$.

APPENDIX A

Sampling Distributions

1) $\hat{p}$

The theory of random models states that, under normality, the quantity Tr[Q] y'Qy / E[y'Qy], where y'Qy denotes a quadratic form, is distributed as a chi-square with Tr[Q] degrees of freedom. Let u, v and w denote the sums of squares of regression on molecular markers, deviations from regression, and error, respectively. If such sources of variation are interpreted as random, then their densities can be obtained through a simple transformation of the above quantity, using the expectations presented in Table I, thus yielding:

Since u, v and w are independent of one another, the joint distribution of (u, v, w) is merely the product of the densities in (A1). The estimator $\hat{p}$ is given by:

Two other functions of (u, v, w), say, s = v and x = w can be stated. This defines a three-dimensional transformation, with Jacobian:

The joint distribution of (z, s, x) is therefore:

Integrating (A4) with respect to s and x gives the marginal density in (27).

2) $R^2$

The coefficient of determination can be expressed in terms of the variables u and v, defined earlier, as z = u / (u + v). With the additional function s = v, the corresponding Jacobian is:

The joint distribution $f_{Z,S}(z, s)$ of (z, s) is therefore:

Integrating (A6) with respect to s gives the marginal distribution in (29).

3) $\hat{b}_M$

Recalling (32), and defining v = MSE and u = MSD, it follows that $\hat{b}_M = z = v / (u - v)$. Defining s = v, the Jacobian of this two-dimensional transformation is $s / z^2$. Then the joint distribution $f_{Z,S}(z, s)$ of (z, s) is:

Integrating (A7) with respect to s yields the marginal distribution in (33).

APPENDIX B

Approximations to $E[\hat{p}]$ and $V[\hat{p}]$

1) $E[\hat{p}]$

The expectation of a ratio of two random variables x and y can be approximated by (Mood et al., 1974):

$$E\left[\frac{x}{y}\right] \approx \frac{E[x]}{E[y]} - \frac{Cov[x, y]}{E^2[y]} + \frac{E[x]\,V[y]}{E^3[y]} \qquad (B1)$$

where Cov denotes covariance. The estimator $\hat{p}$ is a ratio of random variables, as above, with $x = \hat{\sigma}_m^2$ and $y = \hat{\sigma}_g^2$, as defined in (9). Expectations of x and y are, respectively, $\sigma_m^2$ and $\sigma_g^2$. The variance of a mean square is, according to Searle et al. (1992), $2E^2[y'Qy] / Tr^3[Q]$, and, as MSP and MSE are independent random variables, following the definition of y:

To calculate Cov[x,y] = E[xy] - E[x] E[y], it must be noted that:

It follows from (B3) that:

By making explicit the expectations and variances of mean squares in (B4), with the expectations in Table I, the covariance between x and y is found to be:

Substituting (B2) and (B5) in (B1) gives the expression in (28.a).

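2) $V[\hat{p}]$

The variance of a ratio of two random variables x and y can be approximated by (Mood et al., 1974):

$$V\left[\frac{x}{y}\right] \approx \left(\frac{E[x]}{E[y]}\right)^2 \left(\frac{V[x]}{E^2[x]} + \frac{V[y]}{E^2[y]} - \frac{2\,Cov[x, y]}{E[x]\,E[y]}\right) \qquad (B6)$$

Defining x and y in the same way as for $E[\hat{p}]$, it follows that:

Substituting (B2), (B5) and (B7) in (B6) results in the expression (28.b).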
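The two Taylor-series approximations for a ratio used in this appendix can be written as small generic functions. This sketch covers only the general ratio formulas; the specific variances and covariance of the variance-component estimators (B2, B5 and B7) are not reproduced, and the function names are hypothetical.

```python
def ratio_mean_approx(mu_x, mu_y, var_y, cov_xy):
    """Approximate E[x / y] by a Taylor expansion around (E[x], E[y])."""
    return mu_x / mu_y - cov_xy / mu_y ** 2 + mu_x * var_y / mu_y ** 3


def ratio_var_approx(mu_x, mu_y, var_x, var_y, cov_xy):
    """First-order approximation of V[x / y] around (E[x], E[y])."""
    relative_terms = (var_x / mu_x ** 2
                      + var_y / mu_y ** 2
                      - 2.0 * cov_xy / (mu_x * mu_y))
    return (mu_x / mu_y) ** 2 * relative_terms


# Sanity check: if the denominator is a constant (no variance, no covariance),
# the approximations reduce to E[x] / c and V[x] / c**2.
mean_check = ratio_mean_approx(mu_x=2.0, mu_y=4.0, var_y=0.0, cov_xy=0.0)
var_check = ratio_var_approx(mu_x=2.0, mu_y=4.0, var_x=1.0, var_y=0.0, cov_xy=0.0)
```

Both formulas divide by powers of the expected denominator, so they become unreliable when that expectation is close to zero relative to its standard deviation.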

ACKNOWLEDGMENTS

Research supported by CNPq. Publication supported by FAPESP.


(Received October 16, 1997)

REFERENCES

  • Best, D.J. and Roberts, D.E. (1975). Algorithm AS91. The percentage points of the chi-squared distribution. Appl. Stat. 24: 385-388.
  • Cockerham, C.C. and Zeng, Z.B. (1996). Design III with marker loci. Genetics 143: 1437-1456.
  • Dudley, J.W. (1993). Molecular markers in plant improvement: manipulation of genes affecting quantitative traits. Crop Sci. 33: 660-668.
  • Edwards, M. and Johnson, L. (1994). RFLP for rapid recurrent selection. In: Proceedings of the Symposium of Analysis of Molecular Marker Data. American Society for Horticultural Science, 33-40.
  • Edwards, M.D. and Page, N.J. (1994). Evaluation of marker-assisted selection through computer simulation. Theor. Appl. Genet. 88: 376-382.
  • Edwards, M.D., Stuber, C.W. and Wendel, J.F. (1987). Molecular-marker-facilitated investigations of quantitative trait loci in maize. I. Numbers, genomic distribution and types of gene action. Genetics 116: 113-125.
  • Eshed, Y. and Zamir, D. (1996). Less-than-additive epistatic interactions of quantitative trait loci in tomato. Genetics 143: 1807-1817.
  • Fleiss, J.L. (1971). On the distribution of a linear combination of independent chi-squares. J. Am. Stat. Assoc. 66: 142-144.
  • Gimelfarb, A. and Lande, R. (1994a). Simulation of marker assisted selection in hybrid populations. Genet. Res. 63: 39-47.
  • Gimelfarb, A. and Lande, R. (1994b). Simulation of marker assisted selection for non-additive traits. Genet. Res. 64: 127-136.
  • Graybill, F.A. (1961). An Introduction to Linear Statistical Models. McGraw-Hill, New York.
  • Helentjaris, T.G. (1992). RFLP analyses for manipulating agronomic traits in plants. In: Plant Breeding in the 1990's (Stalker, H.T. and Murphy, J.P., eds). CAB International, Wallingford, pp. 357-372.
  • Jiang, C. and Zeng, Z.B. (1995). Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140: 1111-1127.
  • Lande, R. and Thompson, R. (1990). The efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics 124: 743-756.
  • Mood, A.M., Graybill, F.A. and Boes, D.C. (1974). Introduction to the Theory of Statistics. McGraw-Hill, Tokyo.
  • Reiter, R.S., Coors, J.G., Sussman, M.R. and Gabelman, W.H. (1991). Genetic analysis of tolerance to low-phosphorus in maize using restriction fragment length polymorphisms. Theor. Appl. Genet. 82: 561-568.
  • Ruiz, A. and Barbadilla, A. (1995). The contribution of quantitative trait loci and neutral marker loci to the genetic variances and covariances among quantitative traits in random mating populations. Genetics 139: 445-455.
  • Searle, S.R. (1971). Linear Models. McGraw-Hill, New York.
  • Searle, S.R., Casella, G. and McCulloch, C.E. (1992). Variance Components. John Wiley, New York.
  • Uimari, P., Thaller, G. and Hoeschele, I. (1996). The use of multiple markers in a Bayesian method for mapping quantitative trait loci. Genetics 143: 1831-1842.
  • Zhang, W. and Smith, C. (1992). Computer simulation of marker-assisted selection utilizing linkage disequilibrium. Theor. Appl. Genet. 83: 813-820.
  • Zhang, W. and Smith, C. (1993). Simulation of marker-assisted selection utilizing linkage disequilibrium: the effects of several additional factors. Theor. Appl. Genet. 86: 492-496.

Publication Dates

  • Publication in this collection: 01 Mar 1999
  • Date of issue: Dec 1998

History

  • Received: 16 Oct 1997

Sociedade Brasileira de Genética, Rua Cap. Adelmio Norberto da Silva, 736, 14025-670 Ribeirão Preto, SP, Brazil. Tel.: (55 16) 3911-4130; Fax: (55 16) 3621-3552
E-mail: editor@gmb.org.br