SciELO - Scientific Electronic Library Online

Genetics and Molecular Biology

Print version ISSN 1415-4757

Genet. Mol. Biol. vol.31 no.1 São Paulo  2008 



Inclusion of genetic relationship information in the pedigree selection method using mixed models



José Airton Rodrigues Nunes (I); Magno Antonio Patto Ramalho (II); Daniel Furtado Ferreira (III)

(I) Departamento de Planejamento e Política Agrícola, Centro de Ciências Agrárias, Universidade Federal do Piauí, Teresina, PI, Brazil
(II) Departamento de Biologia, Universidade Federal de Lavras, Lavras, MG, Brazil
(III) Departamento de Ciências Exatas, Universidade Federal de Lavras, Lavras, MG, Brazil





Abstract

We used a mixed model approach and computer simulation to evaluate the inclusion of parentage information, as determined by the genealogy established in the pedigree method, in progeny selection. The simulations were based on a purely additive genetic model for one quantitative trait controlled by 20 unlinked segregating loci with equal effects and an allelic frequency of 0.5, for heritability values of 10%, 25%, 50% and 75% for selection based on an F4:5 progeny mean. We simulated 1000 experiments for each heritability value, each corresponding to the evaluation of 256 F4:5 progenies. The phenotypic values of the progenies were analyzed according to two models, one ignoring and one considering the additive genetic parentage among the progenies. The additive relationship coefficients among F4:5 progenies ranged from 0.0 to 1.75. The evaluated selection procedures were the phenotypic progeny mean (M) and the best linear unbiased predictor including parentage (BLUPA). The inclusion of parentage among progenies using the BLUPA procedure resulted in higher selection gains than when the relationship information was ignored, which possibly compensates for the additional work invested to obtain these records, above all in the case of low-heritability traits.

Key words: autogamous crops, BLUP, computer simulation, plant breeding.




Introduction

The pedigree method, proposed towards the end of the 19th century, is widely applied to improvement programs of self-fertilized plants and is mainly based on recording the genealogies among progenies over the selfing generations (Ramalho et al., 2001). However, not only does this procedure require time and dedication from the breeder, but the usefulness of this method for the selection process is also somewhat restricted. One possibility of using this parentage information in support of the selection process would be in progeny evaluations in experiments with replications. Such an approach could be useful since breeders of autogamous species are primarily interested in selecting progenies that, as homozygosity increases, accumulate a larger number of favorable alleles and therefore have the best additive genetic values (AGV), bearing in mind that the ultimate aim is the establishment of lines (Fehr, 1987). For quantitative traits, however, the phenotype does not always reflect the associated AGV. In this case, it would be important to use methodologies that optimize the use of the available information, in order to classify the progenies as closely as possible to the ranking given by the true AGV (White and Hodge, 1989). Several fixed model and mixed model procedures have been proposed to predict the AGV of progenies, including the best linear unbiased estimator (BLUE) method, the best linear predictor (BLP) technique and the best linear unbiased predictor (BLUP) approach (White and Hodge, 1989; Mrode, 1996; Lynch and Walsh, 1998; Resende, 2002).

The BLUP procedure has been the most widely used in the prediction of the genetic merit in animals (Mrode, 1996) and, more recently, it has been widely applied in plant improvement (Bernardo, 2002; Resende, 2002). Under unbalanced conditions this procedure not only has the advantage of making predictions more reliable compared to those obtained by the ordinary least square method but also incorporates information on related plants and thus optimizes the use of the available data in progeny comparisons (Bernardo, 2002).

We found no reports on the use of the genealogy established by the pedigree method in progeny selection in self-pollinated crops, and field experiments produce unreliable information (Wang et al., 2003). We therefore used computer simulation and a mixed model to evaluate the efficiency of selection incorporating this genetic relationship.



Material and Methods

The program was implemented in the Delphi 6.0 environment (Cantú, 2002). A simplified genetic model was assumed for a quantitative trait controlled by 20 independently segregating loci, with equal and additive effects, an allelic frequency of 0.5 and no dominance. The simulations considered heritability values of 10%, 25%, 50% and 75% for selection based on an F4:5 progeny mean ($h^2_p$). For each heritability we simulated 1000 F2 populations with 20 segregating loci, consisting of 64 plants each. The plant multiplication rates were assumed to be equal, with each plant generating 40 offspring.

Initially, the generations were advanced by the pedigree method with no visual selection. A segregating F2 population of 64 simulated plants gave rise to the 64 F2:3 progenies with 40 plants each. Two plants were randomly selected from each F2:3 progeny, resulting in 128 F3:4 progenies and the process repeated in the following generation to finally obtain 256 F4:5 progenies with 40 plants each (Figure 1).
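The generation advance described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Delphi program: the function names (`make_f2_plant`, `self_plant`, `advance_pedigree`), the 0/1 coding of the two alleles and the tuple used to record each pedigree path are all hypothetical choices made here.

```python
import random

N_LOCI = 20        # unlinked segregating loci (from the text)
N_F2 = 64          # plants in the F2 population (from the text)
N_OFFSPRING = 40   # offspring per selfed plant (from the text)
N_PICKED = 2       # plants taken at random from each progeny (from the text)


def make_f2_plant(rng):
    """F2 plant from an F1 heterozygous at every locus (alleles coded 0/1, an assumption)."""
    return [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(N_LOCI)]


def self_plant(plant, rng, n=N_OFFSPRING):
    """Self a plant: each offspring receives two independently drawn gametes per locus."""
    return [[(rng.choice(locus), rng.choice(locus)) for locus in plant]
            for _ in range(n)]


def advance_pedigree(seed=0):
    """Return 256 (pedigree_path, F4:5 progeny) pairs, with no selection applied."""
    rng = random.Random(seed)
    # Each entry is (genotype, path); path[0] identifies the F2 ancestor.
    parents = [(make_f2_plant(rng), (i,)) for i in range(N_F2)]
    for _ in range(2):  # F2 -> F3 plants, then F3 -> F4 plants
        chosen = []
        for plant, path in parents:
            progeny = self_plant(plant, rng)
            for k, picked in enumerate(rng.sample(progeny, N_PICKED)):
                chosen.append((picked, path + (k,)))
        parents = chosen
    # Selfing each of the 256 F4 plants gives the F4:5 progenies of 40 plants.
    return [(path, self_plant(plant, rng)) for plant, path in parents]
```

Keeping the path of each progeny (F2 ancestor, F3 plant, F4 plant) is what later allows the relationship among progenies to be written down.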



The phenotypic values for the plants of each F4:5 progeny ($y_i$) were simulated by adding normally distributed random errors to the genotypic values (GV), by the following model:

$$y_i = \mu + g_i + w_i$$

where $\mu$ is a constant (100 in the present case), $g_i$ is the genotypic effect of plant i (i = 1, 2, ..., 40) and $w_i$ is the environmental deviation associated with $y_i$.

The $g_i$ effect results from the cumulative effect of the 20 loci as already described in the genetic model. The additive effect ($a_l$) of the lth locus was assumed equal to 1.0, where l = 1, 2, ..., 20. The value of $g_i$ taking locus B with two alleles (B1 and B2) as reference is given by:
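A conventional parameterization consistent with the stated assumptions (purely additive effects, $a_l$ = 1.0) is sketched here. Treating B1 as the favorable allele and expressing each locus as a deviation from the midpoint of the two homozygotes are assumptions made for illustration, and may differ from the form the authors used.

$$g_i = \sum_{l=1}^{20} g_{il}, \qquad g_{il} = \begin{cases} +a_l & \text{for } B_1B_1 \\ 0 & \text{for } B_1B_2 \\ -a_l & \text{for } B_2B_2 \end{cases}$$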

The $w_i$ effects were randomly attributed based on a normal distribution with constant variance, i.e., $N(0, \sigma^2_w)$. The variance component $\sigma^2_w$ is the environmental variance among plants, which can be obtained by:

$$\sigma^2_w = \sigma^2_G \, \frac{1 - h^2}{h^2}$$

where $\sigma^2_G$ is the genetic variance among the F2 plants (i.e., $\sigma^2_G = \sigma^2_A + \sigma^2_D$), $\sigma^2_A$ is the F2 additive variance with, in this case, an allelic frequency of 0.5, $\sigma^2_A = \tfrac{1}{2}\sum_l a_l^2$, and since $a_l$ was assumed equal to 1.0 for all loci, then $\sigma^2_A = 10$, $\sigma^2_D$ is the F2 dominance variance and since dominance was assumed to be absent, then $\sigma^2_D = 0$, and $h^2$ is the F2 generation individual heritability.

The 40 simulated genotypes or plants per F4:5 progeny were divided into two virtual plots of 20 plants (n = 20) to produce two replications (r = 2) for each progeny. In the following equations, random errors were considered to be normally distributed among plots, with $e \sim N(0, \sigma^2_e)$ in relation to the mean phenotypic values of the plots. The variance component $\sigma^2_e$ is the environmental variance among plots.

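In the simulation the relation $\sigma^2_w / \sigma^2_e$ was considered fixed at eight (c = 8). The error terms varied according to values assumed for heritability:

$$h^2_p = \frac{\sigma^2_p}{\sigma^2_p + \dfrac{\sigma^2_e}{r} + \dfrac{\sigma^2_d}{rn}}$$

where $\sigma^2_p$ is the genetic variance among F4:5 progenies ($\sigma^2_p = \tfrac{7}{4}\sigma^2_A$), $\sigma^2_d$ is the phenotypic variance within a plot ($\sigma^2_d = \sigma^2_w + \sigma^2_{Gw}$, where $\sigma^2_{Gw}$ is the genetic variance within F4:5 progenies given by $\sigma^2_{Gw} = \tfrac{1}{8}\sigma^2_A$). Thus the individual F2 heritability ($h^2$) was determined as a function of the pre-fixed heritability values as:

$$\sigma^2_w = \frac{\sigma^2_p \,(1 - h^2_p)/h^2_p - \sigma^2_{Gw}/(rn)}{1/(cr) + 1/(rn)}, \qquad h^2 = \frac{\sigma^2_A}{\sigma^2_A + \sigma^2_w}$$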
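As a rough numerical sketch of this step, the variance components implied by each pre-fixed progeny-mean heritability can be computed as below. The sketch rests on two assumptions that should be checked against the original article: that c denotes the ratio of the environmental variance among plants to that among plots ($\sigma^2_w / \sigma^2_e$), and that the progeny-mean heritability takes the usual form $\sigma^2_p / (\sigma^2_p + \sigma^2_e/r + \sigma^2_d/(rn))$. The function and key names are hypothetical.

```python
N_LOCI, A_EFFECT, P = 20, 1.0, 0.5   # genetic model from the text
R, N = 2, 20                         # replications and plants per plot (from the text)
C = 8.0                              # ASSUMED to be sigma2_w / sigma2_e


def additive_variance(n_loci=N_LOCI, a=A_EFFECT, p=P):
    """F2 additive variance without dominance: sum over loci of 2pq * a^2."""
    return n_loci * 2.0 * p * (1.0 - p) * a ** 2


def variances_for(h2_p, r=R, n=N, c=C):
    """Variance components implied by a pre-fixed progeny-mean heritability h2_p."""
    sigma2_a = additive_variance()
    sigma2_p = 7.0 / 4.0 * sigma2_a    # genetic variance among F4:5 progenies
    sigma2_gw = 1.0 / 8.0 * sigma2_a   # genetic variance within F4:5 progenies
    # Solve h2_p = sigma2_p / (sigma2_p + sigma2_e/r + (sigma2_w + sigma2_gw)/(r*n))
    # for sigma2_w, using sigma2_e = sigma2_w / c (assumed direction of the ratio).
    non_genetic = sigma2_p * (1.0 - h2_p) / h2_p - sigma2_gw / (r * n)
    sigma2_w = non_genetic / (1.0 / (c * r) + 1.0 / (r * n))
    return {
        "sigma2_a": sigma2_a,
        "sigma2_p": sigma2_p,
        "sigma2_gw": sigma2_gw,
        "sigma2_w": sigma2_w,
        "sigma2_e": sigma2_w / c,
        "h2_f2": sigma2_a / (sigma2_a + sigma2_w),  # F2 individual heritability
    }
```

Calling `variances_for(h)` for h in (0.10, 0.25, 0.50, 0.75) gives one set of error variances per simulated scenario. Under these assumptions the implied F2 individual heritability is much lower than the progeny-mean heritability, because progeny means average over 40 plants.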

We analyzed 1000 experiments corresponding to the evaluation of 256 simulated F4:5 progenies, derived from the pedigree method. The analysis was based on the mean phenotypic data of the plots, using a completely randomized experimental design with two replications.

According to the description of the conduction by pedigree method, each F2 plant generated four F4:5 progenies (Figure 1). Based on this detailed pedigree, the matrix of the additive genetic parentages among the related progenies was determined, considering the F2 population as non-inbred. The phenotypic progeny data were then analyzed according to two models:

Model GI

The genetic relationship among progenies was ignored. The mean phenotypic data of the plots of the 256 F4:5 progenies were analyzed using a linear mixed model (Henderson et al., 1959), y = Xb + Za + e, where y is a 512 x 1 vector of the mean phenotypic data of the plots, X is a 512 x 1 fixed effect design matrix, b is a scalar fixed effect of the constant, Z is a 512 x 256 design matrix of the random effects of progenies, a is a 256 x 1 vector of progeny random effects with $a \sim N(0, G)$ and $G = A\sigma^2_a$, while e is a 512 x 1 vector of errors with $e \sim N(0, R)$ and $R = I\sigma^2_e$. The G matrix was designated by $I\sigma^2_a$ (i.e., A = I), indicating that the progenies were assumed to be unrelated. In this case, the component $\sigma^2_a$ is equal to the genetic variance among F4:5 progenies ($\sigma^2_p$).

Model GA

In this model the genetic relationship among progenies was considered by the inclusion of parentage among progenies. The mixed model for analysis was identical to model GI, except that the G matrix was designated by $A\sigma^2_a$, with A containing the additive relationship coefficients among F4:5 progenies, corresponding to twice Malécot's coancestry coefficient (Bernardo, 2002), and $\sigma^2_a$ refers directly to the F2 additive variance among plants ($\sigma^2_A$). In animal breeding the A matrix is referred to as the numerator relationship matrix (Mrode, 1996) and, in this case, it was given by:

$$A = I_{64} \otimes \begin{bmatrix} 1.75 & 1.5 & 1 & 1 \\ 1.5 & 1.75 & 1 & 1 \\ 1 & 1 & 1.75 & 1.5 \\ 1 & 1 & 1.5 & 1.75 \end{bmatrix}$$

where $\otimes$ is the Kronecker product.
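The block structure of A can be illustrated with NumPy's `np.kron`. The 4 x 4 block below is a reconstruction from standard selfing theory rather than a transcription of the article: twice the coancestry of two F4 plants is taken as 1 + F of their most recent common ancestor, with F = 0, 0.5 and 0.75 for F2, F3 and F4 plants, and the four progenies of each F2 plant are assumed to be ordered so that the two sharing an F3 parent are adjacent. The resulting range of 0 to 1.75 agrees with the range reported in the abstract.

```python
import numpy as np


def relationship_block():
    """Additive relationships among the four F4:5 progenies tracing to one F2 plant."""
    f2, f3, f4 = 0.0, 0.5, 0.75   # inbreeding coefficients under selfing, F2 non-inbred
    diag = 1.0 + f4               # a progeny with itself
    same_f3 = 1.0 + f3            # progenies from two F4 plants of the same F3 plant
    same_f2 = 1.0 + f2            # progenies tracing to different F3 plants, same F2 plant
    return np.array([
        [diag,    same_f3, same_f2, same_f2],
        [same_f3, diag,    same_f2, same_f2],
        [same_f2, same_f2, diag,    same_f3],
        [same_f2, same_f2, same_f3, diag],
    ])


def relationship_matrix(n_f2=64):
    """Block-diagonal A: progenies from different F2 plants are unrelated."""
    return np.kron(np.eye(n_f2), relationship_block())
```

With 64 F2 plants, `relationship_matrix()` is a 256 x 256 block-diagonal matrix, so its inverse can be obtained block by block if memory or speed is a concern.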

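The solutions for the random ($\hat{a}$) and fixed effects ($\hat{b}$) for both models were obtained by solving the following equation (Henderson et al., 1959):

$$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} \end{bmatrix} \begin{bmatrix} \hat{b} \\ \hat{a} \end{bmatrix} = \begin{bmatrix} X'R^{-1}y \\ Z'R^{-1}y \end{bmatrix}$$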

To obtain the previous solutions, the components of genetic and non-genetic variances were assumed to be unknown. These variance components were estimated using the restricted maximum likelihood (REML) method (Patterson and Thompson, 1971). Since the REML method employs an iterative process, the expectation-maximization (EM) numeric algorithm was applied (Dempster et al., 1977).

The predictions of the progeny random effects ($\hat{a}$) based on the fitted mixed model are BLUP predictions (Henderson, 1975). After fitting the GI model the predictions were denoted as BLUPI, while for the GA model the predictions were designated BLUPA. Additionally, the phenotypic progeny means (M) for each simulated experiment were also obtained.

It should be noted that, because the simulations were conducted under balanced conditions, using an orthogonal experimental design with no missing data, the BLUPI predictions have no selective advantage over the phenotypic means of the progenies (M) (Kennedy and Sorensen, 1988). Thus, only the results using the mean M will be shown, and these should be understood as being equal to BLUPI.
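The equivalence between BLUPI and the progeny means in a balanced design can be checked numerically with a small sketch of the mixed model equations. Here the variance components are treated as known, whereas the study estimated them by REML with the EM algorithm, so this illustrates only the prediction step. The function names and the tiny example design are hypothetical.

```python
import numpy as np


def balanced_design(q, r):
    """Design matrices for q progenies with r plot means each (progeny-major order)."""
    X = np.ones((q * r, 1))                   # overall constant
    Z = np.kron(np.eye(q), np.ones((r, 1)))   # progeny incidence
    return X, Z


def solve_mme(y, X, Z, A, sigma2_a, sigma2_e):
    """Solve Henderson's mixed model equations for R = I*sigma2_e and G = A*sigma2_a."""
    lam = sigma2_e / sigma2_a                 # equations multiplied through by sigma2_e
    lhs = np.block([
        [X.T @ X, X.T @ Z],
        [Z.T @ X, Z.T @ Z + lam * np.linalg.inv(A)],
    ])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    solution = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return solution[:n_fixed], solution[n_fixed:]   # (b_hat, a_hat)
```

With A = I the predictions are the progeny mean deviations multiplied by the same factor r / (r + lambda) for every progeny, so the ranking is unchanged. Passing a non-identity A instead lets each prediction borrow information from related progenies, which is where the two procedures can diverge.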

For each pre-fixed heritability, corresponding to 1000 simulated experiments, we obtained the mean estimates of the genetic variance among the F4:5 progenies ($\hat{\sigma}^2_p$) and of the heritability on an F4:5 progeny mean basis ($\hat{h}^2_p$) for both models (GI and GA). The selection procedures of the F4:5 progenies (mean M and BLUPA) were evaluated and compared based on the true genotypic values (GV). For both procedures we estimated the Spearman correlations (rS), the proportions of coincidence in the 5%, 10% and 25% selection fractions for the lower and upper extremes of the progeny ranking, and the mean GV of the superior selected progenies for different selection percentages (0.4% (best progeny), 5%, 10% and 25%).

The relative efficiency (RE) of BLUPA in relation to the mean M was determined by RE = {[rS(BLUPA, GV) - rS(M, GV)] / rS(M, GV)} x 100, where rS(BLUPA, GV) is the Spearman correlation between BLUPA and GV of the selected progenies, and rS(M, GV) is the Spearman correlation between the mean M and GV of the selected progenies. The relative efficiency was obtained also using proportions of coincidence. We also calculated the relative gain (RG) of BLUPA in relation to the mean M using RG = {[MGVBLUPA - MGVM] / MGVM} x 100, where MGVBLUPA is the mean genotypic values of the selected progenies calculated by BLUPA while MGVM is the mean genotypic values of the selected progenies calculated by the mean M method.
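The comparison criteria described above can be written as small helper functions. This is a minimal sketch: the function names are hypothetical, and the rule for converting a selection fraction into a number of progenies (rounding to the nearest integer, with at least one progeny) is an assumption, since the text does not state how fractional counts were handled.

```python
import math


def ranks(values):
    """Ranks starting at 1 for the smallest value; ties receive the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def selected(values, fraction, upper=True):
    """Indices of the progenies in the upper (or lower) selected fraction."""
    k = max(1, round(fraction * len(values)))   # rounding rule is an assumption
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=upper)
    return set(order[:k])


def coincidence(predicted, true_gv, fraction, upper=True):
    """Proportion of progenies selected on the predictor that are also selected on GV."""
    a = selected(predicted, fraction, upper)
    b = selected(true_gv, fraction, upper)
    return len(a & b) / len(a)


def relative_efficiency(stat_blupa, stat_m):
    """RE (%) of BLUPA over the mean M, for rS or for coincidence proportions."""
    return (stat_blupa - stat_m) / stat_m * 100.0


def relative_gain(mgv_blupa, mgv_m):
    """RG (%) of BLUPA over the mean M, from mean genotypic values of selected progenies."""
    return (mgv_blupa - mgv_m) / mgv_m * 100.0
```

For instance, `relative_efficiency(0.21, 0.15)` reproduces the 40% figure quoted in the Results for 10% heritability and s = 5%.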



Results

For both models, the mean estimates of the genetic parameters associated with the F4:5 progenies were close to the pre-fixed parametric values for all the heritabilities studied (Table 1). Nevertheless, in all the evaluations the genetic parameter estimates by the GA model, which includes parentage among progenies, were more accurate than those produced by the GI model. For instance, for 25% heritability the standard error associated with the estimate in the GA model was 33.5% but was 44.4% for the GI model. However, when 50% heritability was considered the same percentages were very similar, at 21.3% for the GA model and 22.4% for the GI model (Table 1). This demonstrates that it is advantageous to take into account genealogy (as normally occurs when using the pedigree method), although this advantage decreases as the character heritability increases (>50%).



The selection units (mean M and BLUPA) were evaluated regarding the correct ranking of F4:5 progenies using the true associated genotypic values (GV) as reference. As expected, the mean correlation estimates rS of the evaluated procedures increased with the heritability values (Table 2). The heritability represents a determination coefficient between the M and GV means, so that the mean values of the correlation estimates (rS(M, GV)) can be used to verify the quality of the simulations, since they are approximate estimators of $\sqrt{h^2_p}$ (Falconer and Mackay, 1996). The rS(M, GV) correlation values were near the expected ($\sqrt{h^2_p}$) values for all the heritabilities studied (Table 2), e.g. for 25% heritability the mean rS(M, GV) correlation estimate was 0.48 and therefore close to the population value of 0.5.
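For reference, the expected correlations implied by this relationship for the four simulated scenarios are worked out here. These are the theoretical square roots only, not values taken from Table 2.

$$r_{(M,GV)} \approx \sqrt{h^2_p}: \quad \sqrt{0.10} \approx 0.316, \quad \sqrt{0.25} = 0.500, \quad \sqrt{0.50} \approx 0.707, \quad \sqrt{0.75} \approx 0.866$$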



The rS(BLUPA, GV) mean correlations between BLUPA and GV were superior to the rS(M,GV) mean correlation values for all the heritability values studied (Table 2), demonstrating that the incorporation of genetic relationships results in greater efficiency regarding the correct classification of progenies, particularly in situations where heritability was less than 50%. For example, for 10% heritability the relative efficiency (RE) of BLUPA to mean M was 43.33% while for 50% heritability the RE dropped to only 14.5%, this being confirmed by the high rS(M, BLUPA) correlation (0.87) between BLUPA and mean M (Table 2).

The identification of the progenies in the extremes on their ranking is of greater relevance for breeders than the classification of all the progenies evaluated. For this we estimated the coincidence proportions (C(BLUPA, GV)) of selected progenies using the BLUPA and mean M methods and compared the results with selected progenies based on the real GV (Table 3) and found that for a fixed selection fraction (s) value the corresponding proportions of estimated coincidences in the lower and upper selected extremes were identical.



For all heritability values and selection fractions (s), the C(BLUPA, GV) between BLUPA and GV were higher than the C(M, GV) between the mean M and GV (Table 3), supporting our rS estimates (Table 2). As mentioned above, the RE of BLUPA in relation to the mean M in the coincidences with GV was proportionally greater for lower heritability values and selected fractions (s). For example, for 10% heritability and s = 5%, C(BLUPA, GV) was 0.21 and C(M, GV) was 0.15 (an RE of 40%), while at the same heritability but with s = 25% the RE was only 15.4%. When heritability was 50%, RE dropped to 26.3% for s = 5% and 13.3% for s = 25% (Table 3). This indicates that the efficiency of BLUPA could possibly be higher when breeders work with a trait of low heritability and apply high selection intensity.

Breeders want the selected progenies to have the highest possible genetic values, which ultimately reflect the gain achieved with selection, disregarding the progeny by environment interaction. Comparing, within each selected fraction (s), the mean GV of the progenies selected by BLUPA with that of the progenies selected by the mean M, it can be seen that the BLUPA procedure offers an advantage at all the heritabilities studied, although the relative gains (RG) were small. The RG increased continuously as heritability and s decreased (Table 4), e.g., for 10% heritability and s = 0.4% the RG for BLUPA was 0.77%, while for s = 25% it was 0.59%. With higher heritabilities RG was lower: at $h^2_p$ = 50%, RG was 0.65% for s = 0.4% and 0.48% for s = 25%.




Discussion

The fact that the dominance effect is not included in our genetic model does not constitute a severe restriction because the simulation involved F4:5 progenies that represent only 7/64 of the dominance variance (Ramalho et al., 2001). Furthermore, most of the characters of self-fertilized plants, including grain yield, usually show only a small dominance effect (Souza and Ramalho, 1995; Novoselovic et al., 2004). Van Oeveren and Stam (1992) have also verified that dominance has little importance in computer simulations of autogamous crops.

A restriction of the simulation was the lack of visual selection, which normally occurs in the pedigree method while the generations are being advanced (Fehr, 1987). However, there are many literature reports on the inefficiency of visual selection for characters with low (< 50%) heritability, which is the case for most characters of economic importance (Silva et al., 1994; Cutrim et al., 1997). Thus, taking two random plants to generate subsequent progenies probably has no appreciable effect on the results, especially for heritabilities lower than 50%.

It is worth mentioning that the BLUPA and mean M estimators are phenotypic data functions that both predict additive genetic values (AGV) associated with progenies. The best estimator is therefore the one that results in the AGV ranked closest to the ranking by the true AGV (White and Hodge, 1989). It should be noted that, with the adoption of the GA model, the predictions of the random effect of progenies (â) or BLUPA correspond to the predictions of the additive genetic value (AGV) of the progenies (Lynch and Walsh, 1998), indicating the theoretical superiority of the BLUPA procedure in relation to mean M.

An important aspect must be mentioned concerning the meaning of unbiasedness for BLUP, more specifically for BLUPA. As mentioned above, in the present context BLUPA is a predictor of the AGV of progenies (a) derived from the same breeding population, whose expectation, by definition, is zero [E(a) = 0] (Falconer and Mackay, 1996). In this context, BLUPA is unbiased in the sense that E(â) = E(a) (Robinson, 1991), where â denotes the AGV predictors. The conclusion that can be drawn is that, differently from the concept of unbiasedness for estimators of fixed effects, the unbiasedness property for BLUP does not refer to predictions of individual random effects [E(â) = a] but to the expected value of these effects. Summing up, when $h^2_p \to 1$ the predictions approach the phenotypic deviations of the progeny means, while with $h^2_p \to 0$ we have $\hat{a} = E(a \mid y) \to 0$, demonstrating that the shrinkage effect in BLUPA predictions is more marked when the $h^2_p$ values are low, resulting in lower rS(M, BLUPA) correlation estimates. Thus the simulation results consistently showed that information on parentage becomes more important as heritability diminishes, whereas with higher heritability (> 50%) the genotypic values are already well determined by the mean phenotypic values (M) (Duarte and Vencovsky, 2001).
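The shrinkage effect is easiest to see in the simpler case of unrelated progenies in a balanced design (the BLUPI case), shown here only as an illustration. With r plot means per progeny, $\bar{y}_{i.}$ the mean of progeny i, $\bar{y}_{..}$ the overall mean, and $\sigma^2_\varepsilon$ denoting the error variance of a plot mean (notation introduced here), the prediction is the mean deviation multiplied by the progeny-mean heritability:

$$\hat{a}_i = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_\varepsilon / r}\,(\bar{y}_{i.} - \bar{y}_{..}) = h^2_p\,(\bar{y}_{i.} - \bar{y}_{..})$$

so $\hat{a}_i \to \bar{y}_{i.} - \bar{y}_{..}$ as $h^2_p \to 1$ and $\hat{a}_i \to 0$ as $h^2_p \to 0$. In the BLUPA case the factor is no longer common to all progenies, because each prediction also draws on the records of related progenies.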

In general, our simulation showed that the inclusion of parentage among the progenies of the pedigree method using the BLUPA procedure resulted in slightly higher selection gains and more accurate estimates of genetic parameters than when this relationship information was ignored. This possibly compensates for the additional work invested in obtaining these records, especially when investigating low-heritability traits. Our results are supported by other published research showing that higher selection gains can be reached when using the GA model or BLUPA procedure (Durel et al., 1998; Bromley et al., 2000). A study by Panter and Allen (1995) comparing two BLUP models (with and without the inclusion of information about genetic parentage between lines) for the prediction of soybean crosses showed no marked differences between the BLUP models, yet the model that takes parentage into consideration performed better.



Acknowledgments

This research was financially supported by the Brazilian agencies CAPES and CNPq. The authors gratefully acknowledge Dr. Eduardo Bearzoti for his excellent comments and suggestions.



References

Bernardo R (2002) Breeding for Quantitative Traits in Plants. Stemma Press, Woodbury, 359 pp.

Bromley CM, Van Vleck LD, Johnson BE and Smith OS (2000) Estimation of genetic variance in corn from F1 performance with and without pedigree relationship among inbred lines. Crop Sci 40:651-655.

Cantú M (2002) Dominando o Delphi 6: A Bíblia. MAKRON Books, São Paulo, 1104 pp.

Cutrim VA, Ramalho MAP and Carvalho AM (1997) Eficiência da seleção visual na produtividade de grãos de arroz (Oryza sativa L.) irrigado. Pesq Agropec Bras 32:601-606.

Dempster A, Laird N and Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B 39:1-38.

Duarte JB and Vencovsky R (2001) Estimação e predição por modelo linear misto com ênfase na ordenação de médias de tratamentos genéticos. Sci Agric 58:109-117.

Durel CE, Laurens F, Fouillet A and Lespinasse Y (1998) Utilization of pedigree information to estimate genetic parameters from large unbalanced data sets in apple. Theor Appl Genet 96:1077-1085.

Falconer DS and Mackay TFC (1996) Introduction to Quantitative Genetics. 4th ed. Longman, London, 464 pp.

Fehr WR (1987) Principles of Cultivar Development: Theory and Technique. MacMillan Publishing Company, New York, 527 pp.

Henderson CR (1975) Best linear unbiased estimation and prediction under a selection model. Biometrics 31:423-447.

Henderson CR, Kempthorne O, Searle SR and Von Krosigk CM (1959) The estimation of environmental and genetic trends from records subject to culling. Biometrics 15:192-218.

Kennedy BW and Sorensen DA (1988) Properties of mixed-model methods for prediction of genetic merit under different genetic models in selected and unselected populations. In: Weir B, Goodman MM and Namkoong G (eds) Second International Conference Quantitative Genetics. North Carolina State University, Raleigh, pp 91-103.

Lynch M and Walsh B (1998) Genetics and Analysis of Quantitative Traits. Sinauer Associates, Inc., Sunderland, 948 pp.

Mrode RA (1996) Linear Models for the Prediction of Animal Breeding Values. Biddles, Guildford, 184 pp.

Novoselovic D, Baric M, Drezner G, Gunjaca J and Lalic A (2004) Quantitative inheritance of some wheat plant traits. Genet Mol Biol 27:92-98.

Panter DM and Allen FL (1995) Using best linear unbiased predictions to enhance breeding for yield in soybean: II Selection of superior crosses from a limited number of yield trials. Crop Sci 35:405-410.

Patterson HD and Thompson R (1971) Recovery of inter-block information when block sizes are unequal. Biometrika 58:545-554.

Ramalho MAP, Abreu AFB and Santos JB (2001) Melhoramento de espécies autógamas. In: Nass LL, Valois ACC, Melo IS and Inglis MCV (eds) Recursos Genéticos e Melhoramento de Plantas. Fundação MT, Rondonópolis, pp 201-230.

Resende MDV (2002) Genética Biométrica e Estatística no Melhoramento de Plantas Perenes. Embrapa Informação Tecnológica, Brasília, 975 pp.

Robinson GK (1991) That BLUP is a good thing: The estimation of random effects. Stat Sci 6:15-51.

Silva HD, Ramalho MAP, Abreu AFB and Martins LA (1994) Efeito da seleção visual para produtividade de grãos em populações segregantes do feijoeiro. II. Seleção entre famílias. Cienc Prat 18:181-185.

Souza GA and Ramalho MAP (1995) Estimates of genetic and phenotypic variance of some traits of dry bean using a segregating population from the cross Jalo x Small White. Rev Bras Genet 18:87-91.

Van Oeveren AJ and Stam P (1992) Comparative simulation studies on the effects of selection for quantitative traits in autogamous crops: Early selection versus single seed descent. Heredity 69:342-351.

Wang J, van Ginkel M, Podlich D, Ye G, Trethowan R, Pfeiffer W, Delacy IH, Cooper M and Rajaram S (2003) Comparison of two breeding strategies by computer simulation. Crop Sci 43:1764-1773.

White TL and Hodge GR (1989) Predicting Breeding Values with Applications in Forest Tree Improvement. Kluwer Academic Publishers, Dordrecht, 363 pp.



Send correspondence to:
José Airton Rodrigues Nunes
Departamento de Planejamento e Política Agrícola
Centro de Ciências Agrárias
Universidade Federal do Piauí, Campus Socopo
Bairro Ininga, 64049-550 Teresina, PI, Brazil

Received: February 2, 2007; Accepted: June 11, 2007.



Senior Editor: Ernesto Paterniani