SELECTION IN SEVERAL ENVIRONMENTS BY BLP AS AN ALTERNATIVE TO POOLED ANOVA IN CROP BREEDING

Plant breeders often carry out genetic trials in balanced designs, which is not always the case with animal genetic trials. In plant breeding it is usual to select progenies tested in several environments by pooled analysis of variance (ANOVA). This procedure is based on the global averages for each family, although the genetic values of progenies are better viewed as random effects. Thus, the appropriate form of analysis is more likely to follow the mixed model approach to progeny tests, which has become common practice in animal breeding. Best Linear Unbiased Prediction (BLUP) is not a “method” but a feature of mixed model estimators (predictors) of random effects, and it may be derived in so many ways that it has the potential to unify the statistical theory of linear models (ROBINSON, 1991). When estimates of fixed effects are available, it is possible to combine information from several different tests by simplifying BLUP. In these situations Best Linear Prediction (BLP) also has unbiased properties, and this leads to BLUP from straightforward heuristics. In this paper some advantages of BLP applied to plant breeding are discussed. Our focus is on how to deal with estimates of progeny means and variances from many environments to work out predictions that have “best” properties (minimum variance linear combinations of progenies’ averages). A practical rule for relative weighting is worked out.


INTRODUCTION
One of the most typical features of plant genetic trials is their high level of balance and the better precision of variance component estimates compared with their animal counterparts. Plant breeders do not always keep accurate records of genetic relatedness; in some allogamous species, for instance, open-pollinated families are taken as half-sib progenies for selection purposes. However, this is compensated by the large number of progenies and the replication of the trials in the same locations and years, conditions often impossible to attain with animal trials.

1 Doctor of Exact Sciences, Associate Professor - Departamento de Ciências Exatas/DEX - Universidade Federal de Lavras/UFLA - Cx.P. 3037 - 37200-000 - Lavras, MG - juliobuenof@gmail.com
2 Doctor, Full Professor - Departamento de Genética - Escola Superior de Agricultura "Luiz de Queiroz"/ESALQ - Universidade de São Paulo/USP - Avenida Pádua Dias, 11 - Cx.P. 83 - 13400-970 - Piracicaba, SP - rvencovs@esalq.usp.br
The usual analysis that guides plant breeders in selecting progenies tested in several environments is the pooled ANOVA, based on the marginal averages of each family tested. The underlying assumption of this approach is that each genetic value has a constant effect. This is the heuristic of fixed-effects statistical modelling.
On the other hand, breeders know that the genetic values of individuals measured by the performance of their progenies are better viewed as random effects, representing small samples of the possible genotypes being tested. In this way, the genetic value of a family in each environment can be viewed as one of the possible realizations of an unobservable random variable (the "true" breeding value).
The intraclass correlation of those different realizations reflects the heritability for selection based on progeny means. Thus, the appropriate form of analysis is more likely to follow the mixed model approach to progeny tests, which has been common practice in animal breeding since Henderson's lifework.
In particular, Best Linear Unbiased Prediction (BLUP) has been considered the most appropriate form of analysis of genetic data in animal breeding trials. Following Robinson (1991), BLUP is not a "method" but a feature of estimators (predictors) of random effects, and it can be derived in so many ways that it has the potential to unify the statistical theory of linear models.
However, the availability of fair estimates of fixed effects, coupled with a large amount of historical data on variance components and heritability estimates, makes it possible to combine information from several different tests by relaxing some BLUP assumptions. The purpose of this work is to introduce and exemplify the advantages of Best Linear Prediction (BLP) applied to plant breeding. Our focus is on how to deal with estimates of progeny means and variances from many environments to work out predictions that have "best" properties (in the sense of being minimum variance linear combinations).
In analogy with Robinson (1991), we think that if breeders can obtain good estimates of fixed effects, the BLP "method" will also have unbiased properties, so that BLUPs are produced from straightforward heuristics.
The statistical steps that establish BLUP as a reasonable classical predictor of genetic values may be found in some seminal papers since Henderson et al. (1959). In particular, for the derivation of BLP and features such as the maximization of correct ranks (under normality assumptions), see Henderson (1963). For forestry breeding purposes, White & Hodge (1988) was the pioneering BLP work, and White & Hodge (1989) comprehensively covers both BLUP and BLP. In forestry breeding, Resende et al. (1993) were the first Brazilian researchers to introduce BLP (and soon after other mixed model techniques). Although these techniques are straightforward, we have found no works in which BLP was applied to crop species.

METHODOLOGY
A progeny trial is a way to predict the breeding value of parents from realizations of their progeny phenotypes. To make comprehensive selection decisions, plant breeders usually run the same trials in multiple environments (locations or years), as in Table 1. The mean of progeny i in environment j can be described by the model:

$$y_{ij} = m + A_j + p_i + (pA)_{ij} + \bar{e}_{ij}$$

In this model, $m$ is the general mean; $A_j$ is the fixed effect of environment j; $p_i$ is the random effect of the breeding value of progeny i; $(pA)_{ij}$ is the random effect of the interaction of the i-th breeding value with the j-th environment; and $\bar{e}_{ij}$ is the mean experimental error.
The underlying assumptions for BLP purposes are that the fixed effects and the variance components of the random effects at each level of the model are known. The following variance component estimates are then taken as the true variance parameters:
$\sigma_p^2$: progeny variance;
$\sigma_y^2$: phenotypic variance among overall means of progenies, a function of the progeny and (average) error variances;
$\sigma_{y_j}^2$: phenotypic variance among local means of progenies for environment j, a function of the variances of all the random terms.
Note that this is a usual set of assumptions if the objective is selection. In this situation, all nuisance parameters are taken as fixed, as in the animal breeding literature; for a comprehensive justification see White & Hodge (1988, 1989).
In this case, usual selection is treated as a special case of BLP in which only one parameter (the global means in the right column of Table 1) is available for guiding the selection process.
At the next hierarchical level, with one progeny mean per environment (as displayed in Table 1), another set of breeding values with more general BLP properties may be calculated as weighted averages of the environmental means. In this case, the weights are in some way inversely proportional to the environmental (and non-additive genetic) variances.
When there are no population differences (fixed effects) among the genetic values of progenies, the breeding values may be predicted by the following vector $\hat{g}$, which has BLP properties (SEARLE et al., 1992):

$$\hat{g} = C^{t} V^{-1} \left[ y - E(y) \right]$$

in which $C$ is the covariance matrix between the genetic values and their phenotypic realizations (t stands for the transposition operation); $V^{-1}$ is the inverse of the covariance matrix of the phenotypic data; $y$ is the phenotypic data vector; and E(.) stands for mathematical expectation.
Following White & Hodge (1989), both the variance among the predictions and the covariance between the predicted and true breeding values may be calculated by:

$$\mathrm{Var}(\hat{g}) = \mathrm{Cov}(\hat{g}, g) = C^{t} V^{-1} C$$

A "goodness of fit" measure for the prediction process could be calculated by the correlation between the true and predicted genetic values:

$$r_{g\hat{g}} = \sqrt{\frac{C^{t} V^{-1} C}{\sigma_g^2}}$$

In this expression $\sigma_g^2$ is the true genetic variance (assumed known) of the breeding values. This square root of the coefficient of genotypic determination is the so-called "accuracy" and in our context will be used for comparison purposes.
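The prediction and accuracy described above can be sketched numerically. The following is a minimal illustration, assuming the standard BLP form (weights b = V^{-1}C applied to deviations of the data from their expectations) for a single progeny; the function names and all numbers are hypothetical and are not taken from Table 1.

```python
import numpy as np

def blp_predict(C, V, y, y_expected):
    """Return g_hat = C^t V^{-1} (y - E(y)) and the weight vector b = V^{-1} C."""
    b = np.linalg.solve(V, C)  # solves V b = C without forming V^{-1} explicitly
    return float(b @ (y - y_expected)), b

def blp_accuracy(C, V, var_g):
    """Return sqrt(C^t V^{-1} C / var_g), the correlation between g and g_hat."""
    var_pred = float(C @ np.linalg.solve(V, C))  # Var(g_hat) = Cov(g_hat, g)
    return float(np.sqrt(var_pred / var_g))

# Hypothetical inputs for one progeny tested in 4 environments.
var_g = 1.0                                       # assumed true genetic variance
excess = np.array([1.0, 1.2, 1.6, 1.6])           # assumed non-genetic variance per environment
V = var_g * np.ones((4, 4)) + np.diag(excess)     # covariance of the 4 progeny means
C = var_g * np.ones(4)                            # covariance of each mean with g
y = np.array([10.5, 9.8, 11.2, 10.1])             # made-up progeny means
y_expected = np.array([10.0, 10.0, 10.5, 10.0])   # made-up environment means

g_hat, b = blp_predict(C, V, y, y_expected)
accuracy = blp_accuracy(C, V, var_g)
print(f"weights b = {np.round(b, 3)}")
print(f"g_hat = {g_hat:.3f}, accuracy = {accuracy:.3f}")
```

Solving the linear system rather than inverting V is only a numerical convenience; the result is the same linear combination of progeny means.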

RESULTS AND DISCUSSION
For the global means we can derive the result:

Using all the elements in Table 1, for more than 2 environments, we get:

in which b is given by:

The breeding value of progeny i can more realistically be calculated from:

The relative weights are 1.000, 0.833, 0.625 and 0.625.
The global mean approach in this case corresponds to ranking genetic values with the same weight for each environment, regardless of the differences in precision of the estimated means. To estimate the breeding progress from selecting the i-th progeny, the deviation of the progeny mean must then be multiplied by the heritability of selection based on global progeny means. The relative weights in this case are all 0.25.
As a practical rule, derived in Bueno Filho (1997) with proofs shown in the appendix, the relative weight of each environmental mean is in fact the product, taken over the other environments, of the differences between the phenotypic variances and the common progeny variance.
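The practical rule can be checked against a direct matrix calculation. The sketch below assumes that the covariance between the means of one progeny in two different environments equals the progeny variance, so that V has the phenotypic variances on the diagonal and the progeny variance elsewhere. The variance values are hypothetical: they were chosen only so that the relative weights follow the pattern quoted in the text, and they are not the values of Table 1.

```python
import numpy as np

def relative_weights(var_y_env, var_p):
    """Practical rule: weight_j is the product over k != j of (var_y_k - var_p)."""
    excess = np.asarray(var_y_env, dtype=float) - var_p
    raw = np.array([np.prod(np.delete(excess, j)) for j in range(excess.size)])
    return raw / raw.max()  # scale so that the largest weight is 1

def blp_weights(var_y_env, var_p):
    """Weights from b = V^{-1} C, with V = var_p * J + diag(var_y - var_p)."""
    var_y_env = np.asarray(var_y_env, dtype=float)
    n = var_y_env.size
    V = var_p * np.ones((n, n)) + np.diag(var_y_env - var_p)
    C = var_p * np.ones(n)
    b = np.linalg.solve(V, C)
    return b / b.max()

# Hypothetical variance components (not from Table 1).
var_p = 1.0
var_y_env = [2.0, 2.2, 2.6, 2.6]

rule = relative_weights(var_y_env, var_p)
direct = blp_weights(var_y_env, var_p)
print(np.round(rule, 3))
print(np.round(direct, 3))
```

With these inputs both routes give relative weights that round to 1.000, 0.833, 0.625 and 0.625, since each weight is proportional to the reciprocal of the excess of the phenotypic variance over the progeny variance.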
In the example described in Table 1 we get the weights shown in Table 2.

Plant breeders are often interested in selecting progenies tested in different locations and years, but in general not every progeny is tested in all environments.
Selection based on average environmental conditions is the most common objective, although in some special cases "target environments" are chosen, either because they are the most (or least) productive locations or because of some desired (or undesired) climate or soil features, typical years with particular experimental properties, etc.
Although pooled ANOVA is a well-known technique, it is not designed to handle unbalanced data of this type, which shows dependencies between factor levels. However, the application of the more general BLP to such situations is straightforward, as highlighted in the example of Table 3, which shows experimental means from the experiments described in Table 2, and Table 4, which shows the corresponding breeding value predictions.

The most important factor in determining relative weights for selection is the similarity of progeny performance in the tested and target environments. For example, in Table 4 selection for A1 uses the specific means of progenies in environment A1 weighted by the progeny variance in environment A1 (which includes the progeny by environment interaction variance), whereas when selecting for the average environment this interaction is not included in the weighting process.
It is remarkable in Table 4 that progeny 3 has a positive predicted genetic value for environment A1, which may lead to its selection although it was not tested there! In conventional pooled ANOVA we do not even use the means of untested progenies, and the target environment approach is unthinkable.

One of the most interesting features of best linear unbiased predictors (BLUP) is their ability to maximize correct ranking for normal populations. This is a theoretically proven fact, well established by some 40 years of mixed model studies, including Monte Carlo simulations. This means that, where the ranks based on global means and on BLP differ, the greater errors must be expected from ANOVA, because BLP, being more similar to BLUP, makes weaker assumptions about the covariance structure. The examples sometimes given, in which wrong rankings result from poor variance component estimates, are insignificant for practical plant breeding purposes.
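The case of a progeny predicted for an environment in which it was not tested can be illustrated with a small sketch. It assumes that the covariance between a progeny mean and the genetic value in the target environment is the progeny variance, plus the interaction variance only when the mean comes from the target environment itself. The function `blp_for_target`, the parameter `var_pa` and all numbers are hypothetical and are not the data of Tables 3 and 4.

```python
import numpy as np

def blp_for_target(y_dev, tested, var_y_env, var_p, var_pa, target=None):
    """Predict one progeny's genetic value from the environments where it was tested.

    y_dev     : deviations of the progeny means from the environment means
    tested    : boolean mask, True where the progeny was actually tested
    var_y_env : phenotypic variance among progeny means in each environment
    target    : index of the target environment, or None for the average environment
    """
    y_dev = np.asarray(y_dev, dtype=float)
    tested = np.asarray(tested, dtype=bool)
    var_y_env = np.asarray(var_y_env, dtype=float)
    n = var_y_env.size
    V = var_p * np.ones((n, n)) + np.diag(var_y_env - var_p)
    C = var_p * np.ones(n)
    if target is not None:
        C[target] += var_pa  # assumption: interaction is shared only within the target
    idx = np.flatnonzero(tested)  # keep only the environments with data
    b = np.linalg.solve(V[np.ix_(idx, idx)], C[idx])
    return float(b @ y_dev[idx])

# Hypothetical progeny, not tested in environment 0 (missing mean stored as NaN).
var_p, var_pa = 1.0, 0.3
var_y_env = [2.0, 2.2, 2.6, 2.6]
y_dev = [np.nan, 0.5, 0.3, 0.4]
tested = [False, True, True, True]

pred_a1 = blp_for_target(y_dev, tested, var_y_env, var_p, var_pa, target=0)
pred_avg = blp_for_target(y_dev, tested, var_y_env, var_p, var_pa)
print(f"prediction for environment 0: {pred_a1:.3f}")
print(f"prediction for the average environment: {pred_avg:.3f}")
```

Under these assumptions the untested progeny still receives a prediction for environment 0, borrowed from its performance elsewhere through the common progeny variance, and that prediction coincides with its average-environment prediction.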
Pedigree records and molecular information on genetic relatedness greatly increase the superiority of BLUP and BLP over pooled ANOVA.
In the example given in this paper, it is possible to calculate BLUP values for all progenies and any target environment by using the BLUPs of progeny breeding values calculated independently in each environment. This potentially simplifies the computational task of calculating BLUP in a single model and allows simpler estimates that combine different years, locations, experimental designs, generations, types of progeny, relatedness of genetic material, etc.
Restricted Maximum Likelihood (REML) is becoming a standard in likelihood-based analysis of linear mixed models (SEARLE et al., 1992). Although much of the previous BLP work demonstrates its readiness for use at the operational level, statistical analysis of genetic trials is mainly concerned with producing complex models to manage REML-like estimates and predictions directly.
However, a robust handling of progeny breeding values as random variables can be easily managed using BLP techniques from usual tables that

Derivation of b'
We will consider the covariance matrices in a more amenable form without loss of generality. This is achieved by taking a single vector of constants from C' and the following form for V:

in which $J_{n \times p}$ is the matrix that has only 1 as its elements; $\sigma_{pA_i}$ can be any linear combination of variance components in which other terms plus the additive genetic variance occur (BUENO FILHO, 1997).
For any number (n) of environments, the determinant of V may be calculated as:

This is a recurrence relationship that may be proved as follows. This relation is used to prove the analogous relation for the b vector, which has the following elements:

(1) and (2) can easily be proved for special cases, e.g.:

Let us call $V_j^*$ the matrix that has element $d_j = 0$.

Taking equation (1) as true, it then follows that:

The same operations for including an environment in set k must be:

the last columns for calculating the determinants:

in which $V_j^*$ is the nucleus of the cofactors for row j and column n of the matrix $V_n$. These matrices may be obtained by letting $d_j = 0$, as follows for $V_2$ and $V_3$:

It may be easily shown that this expression will be negative for odd j values and positive for even j values. So for any j this will be:

The negative signs result from the combination of odd exponents of cofactors or from changing the order of column $J_{n-1,1}$ when taking determinants of minor order. So, this expression can be simplified to:

(1) and the $V_{k+1}$ determinant result in:

and so on, which is the same as equation (1), q.e.d.
In an analogous way, b' in equation (2) is calculated by sums of the cofactors for the j columns as follows:

By analogy, we get for the j-th element:

The target environment approach could be easily adapted by taking $1 + d_i$ instead of 1 in the i-th element of the c' covariance vector. This leads to an additional factor of $d_i b'_j$ in the weighting for environment j.
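The relations discussed in this derivation can be checked numerically. The sketch below assumes the scaled form V = J + diag(d), with J a square matrix of ones and c a vector of ones. Under that assumption the matrix determinant lemma gives a closed form for the determinant, a recurrence when one environment is added, and weights proportional to the product of the other d values. The function names and the d values are hypothetical.

```python
import numpy as np

def leave_one_out_products(d):
    """For each j, the product of all d_k with k != j."""
    d = np.asarray(d, dtype=float)
    return np.array([np.prod(np.delete(d, j)) for j in range(d.size)])

def det_closed_form(d):
    """det(J + diag(d)) = prod(d) + sum_j prod_{k != j} d_k."""
    d = np.asarray(d, dtype=float)
    return float(np.prod(d) + leave_one_out_products(d).sum())

def det_recurrence(d):
    """Same determinant built one environment at a time:
    |V_n| = d_n * |V_{n-1}| + prod(d_1 .. d_{n-1}), starting from |V_1| = 1 + d_1."""
    d = np.asarray(d, dtype=float)
    det = 1.0 + d[0]
    for n in range(1, d.size):
        det = d[n] * det + np.prod(d[:n])
    return float(det)

d = np.array([1.0, 1.2, 1.6, 1.6])        # hypothetical values
V = np.ones((d.size, d.size)) + np.diag(d)

b_direct = np.linalg.solve(V, np.ones(d.size))            # b = V^{-1} c with c = 1
b_products = leave_one_out_products(d) / det_closed_form(d)

print(np.linalg.det(V), det_closed_form(d), det_recurrence(d))
print(np.round(b_direct, 4), np.round(b_products, 4))
```

The three determinant values agree, as do the two weight vectors, which is consistent with the practical rule that each relative weight is the product of the d values of the other environments.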

Table 2 - Relative weights for BLP selection based on environmental means of progenies.
is that linear combinations of BLP have BLP properties. This suggests that working out BLUP values rather than local means could lead to BLUP properties of breeding values, and these can be calculated in a BLP-like

Table 4 - Estimated means from 10 progenies; predictions of breeding values for 2 targets of selection: average environment and selection for environment 1, with ranks for selection.

Table 3 - Estimated means from 10 progenies in four environmental situations.