Inclusion of genetic relationship information in the pedigree selection method using mixed models

We used a mixed model approach and computer simulation to evaluate the inclusion of parentage information, as determined by the genealogy established in the pedigree method, in progeny selection. The simulations were based on a purely additive genetic model for one quantitative trait controlled by 20 unlinked segregating loci with equal effects and an allelic frequency of 0.5, with heritability values of 10%, 25%, 50% and 75% for selection based on the F4:5 progeny mean. We simulated 1000 experiments for each heritability value, each corresponding to the evaluation of 256 F4:5 progenies. The phenotypic values of the progenies were analyzed according to two models, one ignoring and one considering the additive genetic parentage among the progenies. The additive relationship coefficients among F4:5 progenies ranged from 0.0 to 1.75. The evaluated selection procedures were the phenotypic progeny mean (M) and the best linear unbiased predictor including parentage (BLUPA). Including parentage among progenies through the BLUPA procedure resulted in higher selection gains than ignoring the relationship information, which possibly compensates for the additional work invested to obtain these records, above all in the case of low heritability traits.


Introduction
The pedigree method, proposed towards the end of the 19th century, is widely applied in improvement programs of self-fertilized plants and is mainly based on recording the genealogies among progenies over the selfing generations (Ramalho et al., 2001). However, not only does this procedure require time and dedication from the breeder, but the usefulness of these records for the selection process is somewhat restricted. One possibility of using this parentage information in support of the selection process would be in progeny evaluations in replicated experiments. Such an approach could be useful since breeders of autogamous species are primarily interested in selecting progenies that, on approaching homozygosis, accumulate a larger number of favorable alleles and therefore have the best additive genetic values (AGV), bearing in mind that the ultimate aim is the establishment of lines (Fehr, 1987). For quantitative traits, however, the phenotype does not always reflect the associated AGV. In this case, it would be important to use methodologies that optimize the use of the available information, in order to classify the progenies as closely as possible to the ranking given by the true AGV (White and Hodge, 1989). Several fixed model and mixed model procedures have been proposed to predict the AGV of progenies, including the best linear unbiased estimator (BLUE) method, the best linear predictor (BLP) technique and the best linear unbiased predictor (BLUP) approach (White and Hodge, 1989; Mrode, 1996; Lynch and Walsh, 1998; Resende, 2002).
The BLUP procedure has been the most widely used in the prediction of genetic merit in animals (Mrode, 1996) and, more recently, it has been widely applied in plant improvement (Bernardo, 2002; Resende, 2002). Under unbalanced conditions this procedure not only has the advantage of making predictions more reliable than those obtained by the ordinary least squares method but also incorporates information on related plants and thus optimizes the use of the available data in progeny comparisons (Bernardo, 2002).
Since we found no reports on the use of the genealogy established by the pedigree method in progeny selection in self-pollinated crops, and since field experiments produce unreliable information (Wang et al., 2003), we evaluated the efficiency of selection incorporating this genetic relationship using a mixed model computer simulation.

Methodology
The program was implemented in the Delphi 6.0 environment (Cantú, 2002). A simplified genetic model was assumed for a generic quantitative trait, considering 20 independently segregating loci with equal and additive effects, an allelic frequency of 0.5 and no dominance. The simulations considered heritability values of 10%, 25%, 50% and 75% for selection based on the $F_{4:5}$ progeny mean ($h_p^2$). For each $h_p^2$ heritability we simulated 1000 $F_2$ populations of 64 plants each, segregating at the 20 loci. The plant multiplication rates were assumed to be equal, with each plant generating 40 offspring.
Initially, the generations were advanced by the pedigree method with no visual selection. A segregating $F_2$ population of 64 simulated plants gave rise to 64 $F_{2:3}$ progenies with 40 plants each. Two plants were randomly selected from each $F_{2:3}$ progeny, resulting in 128 $F_{3:4}$ progenies, and the process was repeated in the following generation to finally obtain 256 $F_{4:5}$ progenies with 40 plants each (Figure 1).
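The generation advance described above can be sketched in a few lines of Python. This is only an illustration, not the authors' Delphi program: it assumes that the $F_2$ comes from selfing an $F_1$ that is heterozygous at all 20 loci (which gives the stated allelic frequency of 0.5), and the names `self_plant`, `pedigree` and `f45` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N_LOCI, N_F2, N_OFFSPRING = 20, 64, 40

def self_plant(plant, n_offspring, rng):
    """Self one diploid plant; `plant` is a (2, n_loci) array of 0/1 alleles."""
    n_loci = plant.shape[1]
    cols = np.arange(n_loci)
    # Unlinked loci: each gamete takes either parental allele independently per locus.
    pick_a = rng.integers(0, 2, size=(n_offspring, n_loci))
    pick_b = rng.integers(0, 2, size=(n_offspring, n_loci))
    gamete_a = plant[pick_a, cols]
    gamete_b = plant[pick_b, cols]
    return np.stack([gamete_a, gamete_b], axis=1)  # (n_offspring, 2, n_loci)

# Assumption: an F1 heterozygous at every locus, selfed to give the 64 F2 plants.
f1 = np.array([np.ones(N_LOCI, int), np.zeros(N_LOCI, int)])
parents = list(self_plant(f1, N_F2, rng))
pedigree = [(i,) for i in range(N_F2)]  # genealogy record: (F2 plant,)

# Two rounds of "self, then take two random plants": F2 -> F3 plants -> F4 plants.
for _ in range(2):
    next_parents, next_pedigree = [], []
    for plant, record in zip(parents, pedigree):
        progeny = self_plant(plant, N_OFFSPRING, rng)
        chosen = rng.choice(N_OFFSPRING, size=2, replace=False)
        for k, idx in enumerate(chosen):
            next_parents.append(progeny[idx])
            next_pedigree.append(record + (k,))
    parents, pedigree = next_parents, next_pedigree

# Each of the 256 selected F4 plants is selfed to give one F4:5 progeny of 40 plants.
f45 = np.stack([self_plant(p, N_OFFSPRING, rng) for p in parents])
print(f45.shape)  # (progenies, plants, homologues, loci)
```

Each entry of `pedigree` records the $F_2$ plant, the $F_3$ plant and the $F_4$ plant from which a progeny descends, which is the information later needed to build the relationship matrix.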
The phenotypic values for the plants of each $F_{4:5}$ progeny ($y_i$) were simulated by adding normally distributed random errors to the genotypic values (GV), by the following model: $y_i = \mu + g_i + w_i$, where $\mu$ is a constant (100 in the present case), $g_i$ is the genotypic effect of plant $i$ ($i$ = 1, 2, ..., 40) and $w_i$ is the environmental deviation associated with $y_i$. The $g_i$ effect results from the cumulative effect of the 20 loci, as already described in the genetic model. The additive effect ($a_l$) of the $l$th locus was assumed equal to 1.0, where $l$ = 1, 2, ..., 20. The value of $g_i$, taking locus B with two alleles ($B_1$ and $B_2$) as reference, is given by:
The $w_i$ effects were randomly attributed based on a normal distribution with constant variance, i.e., $w_i \sim N(0, \sigma_w^2)$. The variance component $\sigma_w^2$ is the environmental variance among plants, which can be obtained by: where $\sigma_G^2$ is the genetic variance among the $F_2$ plants (i.e., σ σ σ is the genetic variance within $F_{4:5}$ progenies given by σ σ ). Thus the individual $F_2$ heritability ($h_{F_2}^2$) was determined as a function of the pre-fixed $h_p^2$ heritability values as:
We analyzed 1000 experiments, each corresponding to the evaluation of 256 simulated $F_{4:5}$ progenies derived from the pedigree method. The analysis was based on the mean phenotypic data of the plots, using a completely randomized experimental design with two replications.
According to the description of the conduction by the pedigree method, each $F_2$ plant generated four $F_{4:5}$ progenies (Figure 1). Based on this detailed pedigree, the matrix $A$ of the additive genetic parentages among the related progenies was determined, considering the $F_2$ population as non-inbred. The phenotypic progeny data were then analyzed according to two models:

Model $G_I$
The genetic relationship among progenies was ignored. The mean phenotypic data of the plots of the 256 $F_{4:5}$ progenies were analyzed using a linear mixed model (Henderson et al., 1959), $y = X\beta + Za + e$, where $y$ is a 512 x 1 vector of the mean phenotypic data of the plots, $X$ is a 512 x 1 fixed effect design matrix, $\beta$ is a scalar fixed effect of the constant, $Z$ is a 512 x 256 design matrix of the random progeny effects, $a$ is a 256 x 1 vector of random progeny effects with $a \sim N(0, G)$ and $G = A\sigma_a^2$, while $e$ is a 512 x 1 vector of errors with $e \sim N(0, R)$ and $R = I\sigma_e^2$. The $G$ matrix was given by $I\sigma_p^2$ (i.e., $A = I$), indicating that the progenies were assumed to be unrelated. In this case, the $\sigma_a^2$ component is equal to the genetic variance among $F_{4:5}$ progenies ($\sigma_p^2$).

Model $G_A$
In this model the genetic relationship among progenies was considered by the inclusion of parentage among progenies. The mixed model for analysis was identical to model $G_I$, except that the $G$ matrix was given by $A\sigma_A^2$, with $A$ containing the additive relationship coefficients among $F_{4:5}$ progenies, corresponding to twice Malécot's coancestry coefficient (Bernardo, 2002: section 2.3.5.2), and $\sigma_a^2$ refers directly to the $F_2$ additive variance among plants ($\sigma_A^2$). In animal breeding the $A$ matrix is referred to as the numerator relationship matrix (Mrode, 1996) and, in this case, it was given by: , where $\otimes$ is the Kronecker product.
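As a rough illustration of how such a relationship matrix can be assembled with a Kronecker product, the sketch below builds a 256 x 256 matrix from a 4 x 4 block for the four progenies descending from one $F_2$ plant. The block entries are not taken from the paper: they are an assumption based on the standard result that, under selfing, two lines derived from a common ancestor with inbreeding coefficient $F$ have an additive relationship of $1 + F$, with $F$ = 0, 0.5 and 0.75 for $F_2$, $F_3$ and $F_4$ plants. The resulting coefficients do span the 0.0 to 1.75 range quoted in the abstract.

```python
import numpy as np

# Inbreeding coefficients under selfing, taking the F2 as non-inbred (assumption).
F_F2, F_F3, F_F4 = 0.0, 0.5, 0.75

same_line = 1 + F_F4   # a progeny with itself
same_f3 = 1 + F_F3     # progenies from the same F3 plant
same_f2 = 1 + F_F2     # progenies from the same F2 plant, different F3 plants

# Progeny order within an F2 family: (F3 plant 1: F4 a, F4 b), (F3 plant 2: F4 a, F4 b).
block = np.array([
    [same_line, same_f3,   same_f2,   same_f2],
    [same_f3,   same_line, same_f2,   same_f2],
    [same_f2,   same_f2,   same_line, same_f3],
    [same_f2,   same_f2,   same_f3,   same_line],
])

# Progenies from different F2 plants are unrelated, hence the block-diagonal structure.
A = np.kron(np.eye(64), block)
print(A.shape, A.min(), A.max())
```

Because the 64 $F_2$ families are mutually unrelated, the inverse of `A` needed in the mixed model equations is also block diagonal, so only the 4 x 4 block has to be inverted.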
The solutions for the random ($\hat{a}$) and fixed ($\hat{\beta}$) effects for both models were obtained by solving the following equation (Henderson et al., 1959):
$$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} \end{bmatrix} \begin{bmatrix} \hat{\beta} \\ \hat{a} \end{bmatrix} = \begin{bmatrix} X'R^{-1}y \\ Z'R^{-1}y \end{bmatrix}$$
To obtain the previous solutions, the components of genetic and non-genetic variances were assumed to be unknown. These variance components were estimated using the restricted maximum likelihood (REML) method (Patterson and Thompson, 1971). Since the REML method employs an iterative process, the expectation-maximization (EM) numeric algorithm was applied (Dempster et al., 1977).
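A minimal sketch of solving Henderson's mixed model equations is given below. Unlike the analysis described above, it treats the variance components as known rather than estimating them by REML, and it uses the simplification $R = I\sigma_e^2$, so that only the ratio $\lambda = \sigma_e^2 / \sigma_a^2$ is needed. The function name `solve_mme` and the toy data are hypothetical.

```python
import numpy as np

def solve_mme(y, X, Z, A, sigma2_a, sigma2_e):
    """Solve the mixed model equations for y = X b + Z a + e,
    with a ~ N(0, A * sigma2_a) and e ~ N(0, I * sigma2_e)."""
    lam = sigma2_e / sigma2_a
    A_inv = np.linalg.inv(A)
    lhs = np.block([
        [X.T @ X, X.T @ Z],
        [Z.T @ X, Z.T @ Z + lam * A_inv],
    ])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    solution = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return solution[:n_fixed], solution[n_fixed:]

# Toy balanced example: 4 unrelated progenies, 2 plots each (made-up plot means).
n_prog, n_rep = 4, 2
y = np.array([101.0, 103.0, 98.0, 99.0, 104.0, 106.0, 97.0, 100.0])
X = np.ones((n_prog * n_rep, 1))
Z = np.kron(np.eye(n_prog), np.ones((n_rep, 1)))  # plots sorted by progeny
beta_hat, a_hat = solve_mme(y, X, Z, np.eye(n_prog), sigma2_a=1.0, sigma2_e=3.0)

progeny_means = y.reshape(n_prog, n_rep).mean(axis=1)
print(beta_hat, a_hat, progeny_means)
```

With unrelated progenies and balanced data, the predictions are the deviations of the progeny means from the overall mean shrunk by the factor $r / (r + \lambda)$, so they rank the progenies exactly as the phenotypic means do. Passing a relationship matrix with non-zero off-diagonal elements in place of the identity lets information from related progenies contribute to each prediction.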
The predictions of the progeny random effects ($\hat{a}$) based on the overall fitted mixed model are BLUP predictions (Henderson, 1975). After fitting the $G_I$ model the predictions were denoted BLUP$_I$, while for the $G_A$ model the predictions were designated BLUP$_A$. Additionally, the phenotypic progeny means (M) for each simulated experiment were also obtained.
It should be noted that, due to the balanced conditions under which the simulations were conducted and the use of an orthogonal experimental design with no missing data, the BLUP$_I$ predictions have no selective advantage over the phenotypic means of the progenies (M) (Kennedy and Sorensen, 1988). Thus, only the results using the mean M will be shown, and these should be understood as being equal to those of BLUP$_I$.
For each pre-fixed $h_p^2$ heritability, corresponding to 1000 simulated experiments, we obtained the mean estimates of the genetic variance among the $F_{4:5}$ progenies ($\hat{\sigma}_p^2$) and of the heritability on an $F_{4:5}$ progeny mean basis ($\hat{h}_p^2$) for both models ($G_I$ and $G_A$). The selection procedures for the $F_{4:5}$ progenies (mean M and BLUP$_A$) were evaluated and compared based on the true genotypic values (GV). For both procedures we estimated the Spearman correlations ($r_S$), the proportions of coincidence in the 5%, 10% and 25% selection fractions for the lower and upper extremes of the progeny ranking, and the mean GV of the superior selected progenies for different selection percentages (0.4% (best progeny), 5%, 10% and 25%).
The relative efficiency (RE) of BLUP$_A$ in relation to the mean M was determined by RE = {[$r_{S(BLUPA, GV)}$ - $r_{S(M, GV)}$] / $r_{S(M, GV)}$} x 100, where $r_{S(BLUPA, GV)}$ is the Spearman correlation between BLUP$_A$ and GV of the selected progenies, and $r_{S(M, GV)}$ is the Spearman correlation between the mean M and GV of the selected progenies. The relative efficiency was also obtained using the proportions of coincidence. We also calculated the relative gain (RG) of BLUP$_A$ in relation to the mean M using RG = {[MGV$_{BLUPA}$ - MGV$_M$] / MGV$_M$} x 100, where MGV$_{BLUPA}$ is the mean genotypic value of the progenies selected by BLUP$_A$ and MGV$_M$ is the mean genotypic value of the progenies selected by the mean M.
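The comparison criteria above can be sketched as follows. This is an illustrative implementation, not the authors' code: the function names are hypothetical, the Spearman correlation is computed from simple ranks and therefore assumes no ties, and the number of selected progenies is taken as the selection fraction times the number of progenies, rounded to the nearest integer.

```python
import numpy as np

def spearman(x, y):
    """Spearman correlation as the Pearson correlation of ranks (assumes no ties)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return np.corrcoef(rank_x, rank_y)[0, 1]

def top_indices(values, s):
    """Indices of the upper fraction s of the ranking (at least one progeny)."""
    k = max(1, round(s * len(values)))
    return np.argsort(values)[-k:]

def coincidence(pred, true_gv, s):
    """Proportion of progenies selected by `pred` that are also selected by the true GV."""
    chosen = set(top_indices(pred, s).tolist())
    best = set(top_indices(true_gv, s).tolist())
    return len(chosen & best) / len(best)

def mean_gv_selected(pred, true_gv, s):
    """Mean true genotypic value of the progenies selected by `pred`."""
    return float(np.mean(np.asarray(true_gv)[top_indices(pred, s)]))

def relative_change(new, reference):
    """Percentage advantage of `new` over `reference`; used for both RE and RG."""
    return (new - reference) / reference * 100

# RE from the coincidence proportions quoted in the Results for
# 10% heritability and s = 5% (0.21 for BLUP_A against 0.15 for M).
print(relative_change(0.21, 0.15))
```

The same `relative_change` function gives RE when it receives two correlations or two coincidence proportions, and RG when it receives the two mean genotypic values returned by `mean_gv_selected`.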

Results
For both models, the mean estimates of the genetic parameters associated with the $F_{4:5}$ progenies were close to the pre-fixed parametric values for all the $h_p^2$ heritabilities studied (Table 1). Nevertheless, in all the evaluations the genetic parameter estimates from the $G_A$ model, which includes parentage among progenies, were more accurate than those produced by the $G_I$ model. For instance, for 25% $h_p^2$ heritability the standard error associated with the $\hat{\sigma}_p^2$ estimate was 33.5% in the $G_A$ model but 44.4% in the $G_I$ model. However, when 50% $h_p^2$ heritability was considered the same percentages were very similar, at 21.3% for the $G_A$ model and 22.4% for the $G_I$ model (Table 1). This demonstrates that it is advantageous to take genealogy into account (it is normally recorded when using the pedigree method), although this advantage decreases as the heritability of the character increases ($h_p^2 \geq 50\%$).
The selection units (mean M and BLUP$_A$) were evaluated regarding the correct ranking of $F_{4:5}$ progenies using the true associated genotypic values (GV) as reference. As expected, the mean correlation estimates $r_S$ of the evaluated procedures increased with the $h_p^2$ heritability values (Table 2). The $h_p^2$ heritability represents the coefficient of determination between the progeny means M and the GV, so that the mean values of the correlation estimates ($r_{S(M, GV)}$) can be used to verify the quality of the simulations, since they are approximate estimators of $\sqrt{h_p^2}$ (Falconer and Mackay, 1996). The $r_{S(M, GV)}$ correlation values were near the expected $\sqrt{h_p^2}$ values for all the $h_p^2$ heritabilities studied (Table 2), e.g., for 25% $h_p^2$ heritability the mean $r_{S(M, GV)}$ correlation estimate was 0.48 and therefore close to the population value of 0.5.
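For reference, the expected correlations implied by this check can be computed directly as the square roots of the four pre-fixed heritabilities. The snippet is only a worked calculation and the dictionary name `expected_r` is hypothetical.

```python
import math

# Expected correlation between progeny means and true genotypic values
# is the square root of the heritability on a progeny mean basis.
expected_r = {h2: math.sqrt(h2) for h2 in (0.10, 0.25, 0.50, 0.75)}

for h2, r in expected_r.items():
    print(f"h2 = {h2:.2f} -> expected r = {r:.3f}")
```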
The $r_{S(BLUPA, GV)}$ mean correlations between BLUP$_A$ and GV were superior to the $r_{S(M, GV)}$ mean correlation values for all the $h_p^2$ heritability values studied (Table 2), demonstrating that the incorporation of genetic relationships results in greater efficiency regarding the correct classification of progenies, particularly in situations where $h_p^2$ heritability was less than 50%. For example, for 10% $h_p^2$ heritability the relative efficiency (RE) of BLUP$_A$ over the mean M was 43.33%, while for 50% $h_p^2$ heritability the RE dropped to only 14.5%, this being confirmed by the high $r_{S(M, BLUPA)}$ correlation (0.87) between BLUP$_A$ and the mean M (Table 2).
The identification of the progenies at the extremes of the ranking is of greater relevance for breeders than the classification of all the progenies evaluated. We therefore estimated the coincidence proportions of the progenies selected using the BLUP$_A$ ($C_{(BLUPA, GV)}$) and mean M ($C_{(M, GV)}$) methods with the progenies selected based on the real GV (Table 3), and found that for a fixed selection fraction (s) the corresponding proportions of estimated coincidences in the lower and upper selected extremes were identical.
For all $h_p^2$ heritability and selection fraction (s) values, the $C_{(BLUPA, GV)}$ between BLUP$_A$ and GV was higher than the $C_{(M, GV)}$ between the mean M and GV (Table 3), supporting our $r_S$ estimates (Table 2). As mentioned above, the RE of BLUP$_A$ in relation to the mean M in the coincidences with GV was proportionally greater for lower $h_p^2$ heritability values and selected fractions (s). For example, for 10% $h_p^2$ heritability and s = 5%, $C_{(BLUPA, GV)}$ was 0.21 and $C_{(M, GV)}$ was 0.15 (an RE of 40%), while at the same $h_p^2$ heritability but with s = 25% the RE was only 15.4%. When $h_p^2$ heritability was 50%, RE dropped to 26.3% for s = 5% and 13.3% for s = 25% (Table 3). This indicates that the efficiency of BLUP$_A$ could possibly be higher when breeders work with a trait of low heritability and apply high selection intensity.
Breeders want the selected progenies to have the highest possible genetic values, which ultimately reflect the gain achieved with selection, disregarding the progeny by environment interaction. Comparing, within each selected fraction (s), the GV means of the progenies selected by BLUP$_A$ with those of the progenies selected by the mean M, it can be seen that the BLUP$_A$ procedure offers an advantage at all the $h_p^2$ heritabilities studied, although the relative gains (RG) were small. The RG increased continuously as $h_p^2$ heritability and s decreased (Table 4), e.g., for 10% $h_p^2$ heritability and s = 0.4% the RG for BLUP$_A$ was 0.77%, while for s = 25% it was 0.59%. With higher $h_p^2$ heritabilities RG was lower: at $h_p^2$ = 50%, RG was 0.65% for s = 0.4% and 0.48% for s = 25%.

Discussion
The fact that the dominance effect is not included in our genetic model does not constitute a severe restriction because the simulation involved $F_{4:5}$ progenies, which carry only 7/64 of the dominance variance (Ramalho et al., 2001). Furthermore, most of the characters of self-fertilized plants, including grain yield, usually show little dominance effect (Souza and Ramalho, 1995; Novoselovic et al., 2004). Van Oeveren and Stam (1992) have also verified that dominance has little importance in computer simulations of autogamous crops.
A restriction of the simulation was the lack of visual selection, which normally occurs in the pedigree method during the conduction stages (Fehr, 1987). However, there are many literature reports on the inefficiency of visual selection for characters with low (< 50%) heritability, which is the case for most characters of economic importance (Silva et al., 1994; Cutrim et al., 1997). Thus, taking two random plants to generate subsequent progenies probably has no appreciable effect on the results, especially for $h_p^2$ heritabilities lower than 50%.
It is worth mentioning that the BLUP$_A$ and mean M estimators are functions of the phenotypic data that both predict the additive genetic values (AGV) associated with progenies. The best estimator is therefore the one that results in the AGV ranked closest to the ranking by the true AGV (White and Hodge, 1989). It should be noted that, with the adoption of the $G_A$ model, the predictions of the random effect of progenies ($\hat{a}$), or BLUP$_A$, correspond to the predictions of the additive genetic value (AGV) of the progenies (Lynch and Walsh, 1998), indicating the theoretical superiority of the BLUP$_A$ procedure in relation to the mean M.
An important aspect must be mentioned concerning the meaning of unbiasedness for BLUP, more specifically for BLUP$_A$. As mentioned above, in the present context BLUP$_A$ is a predictor of the AGV of progenies ($a$) derived from the same breeding population, whose expectation, by definition, is zero [$E(a) = 0$] (Falconer and Mackay, 1996). In this context, BLUP$_A$ is unbiased in the sense that $E(\hat{a}) = E(a)$ (Robinson, 1991), where $\hat{a}$ denotes the AGV predictors. The conclusion that can be drawn is that, differently from the concept of unbiasedness for estimators of fixed effects, the unbiasedness property for BLUP does not refer to predictions of individual random effects [$E(\hat{a}) = a$] but to the expected value of these effects.
values are low, resulting in lower $r_{S(M, BLUPA)}$ correlation estimates. Thus the simulation results consistently showed that information on parentage becomes more important as $h_p^2$ heritability diminishes, whereas with higher $h_p^2$ heritability (> 50%) the genotypic values are already well determined by the mean phenotypic values (M) (Duarte and Vencovsky, 2001).
In general, our simulation showed that the inclusion of parentage among the progenies of the pedigree method using the BLUP$_A$ procedure resulted in slightly higher selection gains and more accurate estimates of genetic parameters than when this relationship information was ignored. This possibly compensates for the additional work invested in obtaining these records, especially when investigating low-heritability traits. Our results are supported by other published research showing that higher selection gains can be reached when using the $G_A$ model or BLUP$_A$ procedure (Durel et al., 1998; Bromley et al., 2000). A study by Panter and Allen (1995) comparing two BLUP models (with and without the inclusion of information about genetic parentage between lines) for the prediction of soybean crosses showed no marked differences between the BLUP models, yet the model which takes parentage into consideration performed better.
and since $a_l$ was assumed equal to 1.0 for all loci, then $\sigma_A^2 = L/2$, $\sigma_D^2$ is the $F_2$ dominance variance and, since dominance was assumed to be absent, $\sigma_D^2 = 0$, and $h_{F_2}^2$ is the $F_2$ generation individual heritability. The 40 simulated genotypes or plants per $F_{4:5}$ progeny were divided into two virtual plots of 20 plants (n = 20) to produce two replications (r = 2) for each progeny. In the following equations, random errors were considered to be normally distributed among plots, with e

Figure 1 - Scheme of conduction by the pedigree method.

Table 1 - Mean estimates of the genetic variance among $F_{4:5}$ progenies

Table 2 - Mean estimates of the Spearman correlation ($r_S$) and standard errors (in parentheses) between genotypic values (GV), phenotypic means (M) and BLUP predictions considering the additive parentage (BLUP$_A$) among $F_{4:5}$ progenies, conducted by the pedigree method, for different values of heritability on an $F_{4:5}$ progeny mean basis ($h_p^2$).

Table 3 - Mean values of the proportions of coincidences (C) and standard errors (values in brackets) in the selection proportions (s) of 5%, 10% and 25% of the superior or inferior $F_{4:5}$ progenies, conducted by the pedigree method, ranked by the parametric genotypic values (GV), phenotypic means (M) and BLUP considering the additive parentage (BLUP$_A$), for different values of heritability on an $F_{4:5}$ progeny mean basis ($h_p^2$).

Table 4 - Mean genotypic values and standard errors (values in brackets) in the selection proportions (s) of 0.4% (best progeny), 5%, 10% and 25% of the superior $F_{4:5}$ progenies, conducted by the pedigree method, ranked by the phenotypic means (M) and BLUP considering the additive parentage (BLUP$_A$), for different values of heritability on an $F_{4:5}$ progeny mean ba-