Pre-weaning performance evaluation of a multibreed Aberdeen Angus × Nellore population using different genetic models

This work aimed to estimate the genetic effects that influence the pre-weaning performance of animals from multibreed crosses. Weaning weight records of 79,521 animals were used, progeny of 1,020 bulls and 61,898 cows of the Aberdeen Angus and Nellore breeds and of several genetic groups resulting from their crosses. Five genetic models were tested: model 1, containing the fixed breed genetic effects (direct and maternal additive effects, direct and maternal heterozygous effects, direct and maternal epistatic effects, and direct and maternal joint additive effects); model 2, equal to model 1 but excluding the direct and maternal joint additive effects; model 3, equal to model 1 but excluding the direct and maternal epistatic effects; model 4, equal to model 1 but excluding the direct and maternal epistatic effects and the direct and maternal joint additive effects; and model 5, equal to model 1 but excluding the direct and maternal heterozygous effects, the direct and maternal epistatic effects and the direct and maternal joint additive effects. The models were analyzed by three methods: least squares, ridge regression, and restricted maximum likelihood. The dominant additive models usually used for genetic evaluations do not adequately describe the variation in pre-weaning performance, making it necessary to add the heterozygous and epistatic effects. The joint additive effects do not significantly improve the fit of the analysis model and introduce an unnecessary bias attributed to multicollinearity, whereas the heterozygous effects are efficient in representing a quadratic additive breed effect.


Introduction
In Brazil, as in other countries, the use of multibreed populations for commercial purposes has been growing, given the possibility to increase the efficiency of production determined by heterosis and complementarity between breeds used in crossing, enhanced by the selection of superior animals.
The response to selection is proportional to the accuracy of prediction of breeding values (Falconer & Mackay, 1996). In turn, the accurate estimation of the genetic value of individuals in a breeding program depends, in large part, on the effects considered in the statistical model used to evaluate the animals.
According to Cardoso & Tempelman (2004), a model that includes additive and non-additive fixed genetic effects, in addition to additive random genetic effects, may be suitable for genetic evaluation of multibreed populations. However, Kinghorn (1993), Fries (1996), Sharma et al. (2000) and Pimentel et al. (2006) emphasized the importance of effects such as epistasis, heterozygosity and complementarity between breeds, in addition to the additive effects.
The most widely used method to derive prediction equations is the least squares method. However, when there are strong linear relationships between independent variables (multicollinearity), the estimation of individual regression coefficients by least squares tends to be unstable, often with high standard errors, which can lead to erroneous inferences (Bergmann & Hohenboken, 1995).
The ridge regression method (Hoerl & Kennard, 1970) is one of the alternative methods of estimation that provide a more informative analysis of the data in the presence of multicollinearity.
The purpose of this study was to evaluate the importance of including non-additive genetic effects in the model used to analyze pre-weaning performance records of a multibreed cattle population derived from crosses between the Aberdeen Angus and Nellore breeds.

Material and Methods
The original database contained performance records of 121,241 calves, progeny of 1,359 bulls and 84,465 cows of the Aberdeen Angus and Nellore breeds and of the different genetic groups resulting from crosses between them, collected from 1986 to 2002 on 75 farms located in the states of Goiás, Minas Gerais, Mato Grosso do Sul, Paraná, Rio Grande do Sul, São Paulo and Tocantins.
The data were structured using SAS (2001) so that the file could be used for estimation of genetic and environmental effects. During data editing, the following variables were created: contemporary group (GC), comprising animals born on the same farm (1-75), in the same year (1986-2002) and in the same season (1-4), of the same sex and belonging to the same management group until weaning (1-388), totaling 2,439 contemporary groups; age class at weaning (IDC), corresponding to the weaning age in months (5-11); and age class of the cow at calving (IVC), indicating the age of the cow in years (3-15). Records were eliminated for calves that did not result from artificial insemination or controlled breeding, bulls with fewer than 10 offspring, contemporary groups with fewer than five animals, weaning weights (PD) outside the range of ± 2.5 standard deviations from the population mean, genetic groups with fewer than 100 animals or observed on only one farm, farms with a single genetic group, and cows younger than three or older than fifteen years at calving. After editing, the working file comprised the weaning weights of 79,521 calves, progeny of 1,020 bulls and 61,898 cows, totaling 135,051 animals in the relationship matrix.
The coefficients for the direct (AD) and maternal (AM) additive breed effects were defined by the contribution of genes of the Nellore breed to the genetic composition of each individual. The coefficients for the joint additive direct (ACD) and maternal (ACM) effects between the Aberdeen Angus and Nellore breeds were calculated as ACD = AD × (1 - AD) and ACM = AM × (1 - AM), respectively, assuming a quadratic additive effect.
The direct and maternal heterozygosities were calculated as $h_{ij} = \alpha^{t}_{i}\,\alpha^{v}_{j} + \alpha^{t}_{j}\,\alpha^{v}_{i}$, in which $\alpha^{t}_{i}$ and $\alpha^{v}_{i}$ denote the proportion of genes of breed i in the mother and father of the animal, respectively.
The effects of direct and maternal epistasis were calculated as the average heterozygosity present in the gametes that generate each individual or as the average heterozygosity in the parents of an individual (Fries et al., 2000).
For the maternal heterozygous and epistatic effects, genetic information on the maternal grandparents of the individuals was necessary. When this information was not available, the dams were assumed to have been produced by inter se mating.
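To illustrate how these coefficients follow from the breed composition of the parents, the sketch below computes them for the two-breed (Angus × Nellore) case. It is a minimal sketch, not the authors' code: the names `Animal`, `mate` and `direct_coefficients` are hypothetical, the Nellore fraction of a calf is assumed to be the average of its parents' fractions, and epistasis is taken as the average heterozygosity of the two parents, the second of the two definitions given above.

```python
from dataclasses import dataclass


@dataclass
class Animal:
    nellore: float         # Nellore gene fraction (the AD coefficient), 0.0 to 1.0
    heterozygosity: float  # breed heterozygosity carried by the animal itself


def heterozygosity(sire_nellore: float, dam_nellore: float) -> float:
    # Two-breed case: Nellore from one parent times Angus from the other,
    # summed over both directions (Angus fraction = 1 - Nellore fraction).
    return sire_nellore * (1.0 - dam_nellore) + (1.0 - sire_nellore) * dam_nellore


def mate(sire: Animal, dam: Animal) -> Animal:
    # Assumption: the calf receives half of each parent's breed composition.
    return Animal(
        nellore=(sire.nellore + dam.nellore) / 2.0,
        heterozygosity=heterozygosity(sire.nellore, dam.nellore),
    )


def direct_coefficients(sire: Animal, dam: Animal) -> dict:
    calf = mate(sire, dam)
    ad = calf.nellore
    return {
        "AD": ad,                          # direct additive (Nellore fraction)
        "ACD": ad * (1.0 - ad),            # joint additive, ACD = AD * (1 - AD)
        "HD": calf.heterozygosity,         # direct heterozygosity
        "ED": (sire.heterozygosity + dam.heterozygosity) / 2.0,  # direct epistasis
    }


angus = Animal(nellore=0.0, heterozygosity=0.0)
nellore = Animal(nellore=1.0, heterozygosity=0.0)
f1 = mate(angus, nellore)

f1_coef = direct_coefficients(angus, nellore)  # first cross
f2_coef = direct_coefficients(f1, f1)          # inter se mating of F1 animals
```

Under these assumptions an F1 calf has full heterozygosity and no epistasis, whereas an F2 calf keeps half of the heterozygosity and carries the full epistatic coefficient. The maternal coefficients would be obtained by applying the same function to the parents of the dam.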
Five models were tested using three different methods of analysis: least squares method (MQM) and ridge regression with λ = 0.05 (RC) using SAS statistical package (2001), and Restricted Maximum Likelihood Method (REML) using the Multiple Trait Derivative Free Restricted Maximum Likelihood (MTDFREML) application described by Boldman et al. (1995).
For the least squares and ridge regression methods, the statistical model (model 1) can be described as: $Y_{ijkl} = \mu + GC_i + IDC_j + IVC_k + \beta_1 AD + \beta_2 AM + \beta_3 HD + \beta_4 HM + \beta_5 ED + \beta_6 EM + \beta_7 ACD + \beta_8 ACM + \varepsilon_{ijkl}$, in which $Y_{ijkl}$ = observed weaning weight of the l-th animal; $\mu$ = overall mean of the trait; $GC_i$ = effect of the i-th contemporary group; $IDC_j$ = effect of the j-th age class at weaning; $IVC_k$ = effect of the k-th age class of the cow at calving; $\beta_1, \beta_2, \beta_3, \beta_4, \beta_5, \beta_6, \beta_7$ and $\beta_8$ = regression coefficients for the direct and maternal additive, direct and maternal heterozygous, direct and maternal epistatic, and direct and maternal joint additive effects, respectively; and $\varepsilon_{ijkl}$ = random error associated with each observation, NID(0, $\sigma^2$).
The ridge regression method (Hoerl, 1962) consists of adding a scalar (λ) to the elements of the main diagonal of the coefficient matrix, in order to break the linear dependence among its columns. The estimator becomes: $\hat{\beta} = (X'X + \lambda I)^{-1} X'y$. Originally, Hoerl & Kennard (1970) proposed a graphical method, the ridge trace, to choose the value of λ. In this method, the parameter estimates are plotted against various values of λ, starting from zero, and the value chosen is the one from which the estimates stabilize. However, high values of λ shrink the regression coefficients and their standard errors, which could affect the estimation of the model effects, especially those not involved in multicollinearity.
The procedure implemented in the SAS statistical package, as described by Freund & Littell (2000), was the following: $\hat{\beta}_{SAS} = [R_{XX}(I + D_{\lambda})]^{-1} R_{Xy}$, in which $R_{XX}$ = correlation matrix of the explanatory variables; $R_{Xy}$ = vector of correlations between the explanatory variables and the response variable; and $D_{\lambda}$ = diagonal matrix with the λ values on the diagonal.
The parameters and breeding values were obtained by REML with a convergence criterion of $1 \times 10^{-9}$, under the model described in matrix form as: $y = X\beta + Wg + Z_1 a + Z_2 m + e$, in which y = vector of observed weaning weights; X = incidence matrix associated with the fixed environmental effects (GC, IDC and IVC); β = vector of solutions for the fixed environmental effects; W = incidence matrix for the fixed genetic effects (direct and maternal additive, direct and maternal heterozygous, direct and maternal epistatic, and direct and maternal joint additive); g = vector of solutions for the fixed genetic effects; $Z_1$ = incidence matrix associated with the random direct additive genetic effect of each animal; a = vector of solutions for the random direct additive genetic effects; $Z_2$ = incidence matrix associated with the random maternal additive genetic effect of each animal; m = vector of solutions for the random maternal additive genetic effects; and e = vector of random residuals.
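The estimator above, $(X'X + \lambda I)^{-1} X'y$, can be sketched numerically as follows. This is an illustrative example on synthetic, nearly collinear data, not the analysis performed in this study: the function name `ridge_estimate`, the simulated variables and the grid of λ values are assumptions, and the data are used on their raw scale rather than in the correlation form used by SAS.

```python
import numpy as np


def ridge_estimate(X: np.ndarray, y: np.ndarray, lam: float) -> np.ndarray:
    """Solve (X'X + lam * I) beta = X'y; lam = 0 gives ordinary least squares."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)


# Synthetic data with two nearly collinear explanatory variables.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)

beta_ols = ridge_estimate(X, y, 0.0)
beta_ridge = ridge_estimate(X, y, 0.05)

# A simple ridge trace: estimates for increasing values of lambda,
# to be inspected for the point at which they stabilize.
ridge_trace = {lam: ridge_estimate(X, y, lam) for lam in (0.0, 0.01, 0.05, 0.1, 0.5)}
```

Printing `ridge_trace` shows how the two unstable least squares coefficients move as λ grows, which is the information a ridge trace plot conveys.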
Briefly, the models were the following: model 1 = as described above; model 2 = model 1 without the direct and maternal joint additive effects; model 3 = model 1 without the direct and maternal epistatic effects; model 4 = model 1 without the direct and maternal epistatic, direct and maternal joint additive effects; and model 5 = model 1 without the direct and maternal heterozygous, direct and maternal epistatic and joint additive direct and maternal effects.
The efficiency of the methods and the comparison among models were assessed by the coefficients of determination (R²), the F test, the C(p) statistic of Mallows, the variance inflation factors, and the magnitude (Pearson) and rank (Spearman) correlations, in addition to the likelihood ratio test.
To test whether the inclusion of parameters for fixed genetic effects effectively improved the fit of the model, an F test was performed, providing evidence against the simpler model (x-1) if the F value is greater than the tabulated value of the distribution $F(gl_{x-1} - gl_x,\ gl_x,\ \alpha)$.
The F value was calculated using the following formula, as described by Weisberg (1980): $F = \dfrac{(SQR_{x-1} - SQR_x)/(gl_{x-1} - gl_x)}{SQR_x / gl_x}$, in which $SQR_x$ and $SQR_{x-1}$ are the residual sums of squares of models x and x-1, respectively; and $gl_x$ and $gl_{x-1}$ are the residual degrees of freedom of models x and x-1, respectively, in which x represents models 2, 3, 4 or 5.
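As a worked illustration of this comparison, the sketch below computes the F value from the residual sums of squares and residual degrees of freedom of two nested models, with the extra sum of squares per extra parameter in the numerator and the residual mean square of the larger model in the denominator. The function name and the numbers are hypothetical and are not results of this study.

```python
def nested_f_value(sqr_simple: float, gl_simple: int,
                   sqr_complete: float, gl_complete: int) -> float:
    """F value for comparing a simpler model against a more complete one.

    sqr_* are residual sums of squares and gl_* residual degrees of freedom;
    the simpler model has the larger residual sum of squares and more
    residual degrees of freedom.
    """
    extra_ss_per_df = (sqr_simple - sqr_complete) / (gl_simple - gl_complete)
    residual_mean_square = sqr_complete / gl_complete
    return extra_ss_per_df / residual_mean_square


# Hypothetical values: dropping two regression coefficients raises the
# residual sum of squares from 1000 to 1200.
f_value = nested_f_value(sqr_simple=1200.0, gl_simple=98,
                         sqr_complete=1000.0, gl_complete=96)
```

With these hypothetical inputs the F value is 9.6 on 2 and 96 degrees of freedom, to be compared with the tabulated value at the chosen α.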
As described by Freund & Littell (2000), the C(p) statistic of Mallows (1973), a measure of the error variance plus the bias introduced by the exclusion of a variable, is calculated as follows: $C(p) = \dfrac{SQR(p)}{MSE} - (N - 2p) + 1$, in which MSE is the mean square error of the full model; SQR(p) is the residual sum of squares of a model containing a subset of p variables; and N is the number of observations. When C(p) > (p + 1) for a model containing p explanatory variables, there is evidence of bias caused by the exclusion of an important variable from the model.
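The C(p) criterion can be illustrated with a short sketch that evaluates SQR(p)/MSE - (N - 2p) + 1 and applies the C(p) > p + 1 rule. The function names and the input values are hypothetical and serve only to show the arithmetic.

```python
def mallows_cp(sqr_p: float, mse_full: float, n_obs: int, p: int) -> float:
    """C(p) = SQR(p) / MSE - (N - 2p) + 1, with MSE taken from the full model."""
    return sqr_p / mse_full - (n_obs - 2 * p) + 1


def suggests_exclusion_bias(cp: float, p: int) -> bool:
    """Evidence of bias from an excluded variable when C(p) > p + 1."""
    return cp > p + 1


# Hypothetical subset model with p = 3 variables and N = 100 observations.
cp = mallows_cp(sqr_p=1050.0, mse_full=10.0, n_obs=100, p=3)
biased = suggests_exclusion_bias(cp, p=3)
```

Here C(p) = 105 - 94 + 1 = 12, which exceeds p + 1 = 4, so the hypothetical subset model would be flagged as biased.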
Multicollinearity in regression analysis is usually identified by examining the correlations between pairs of explanatory variables. However, in some cases, associations among three or more variables cannot be detected by examining pairwise correlations. A more efficient way of diagnosing multicollinearity is to examine the variance inflation factors (FIV), defined by $FIV_i = \dfrac{1}{1 - R_i^2}$, in which $R_i^2$ is the squared multiple correlation coefficient that results from the regression of variable $x_i$ on all other explanatory variables included in the model.
The likelihood ratio test was used to verify whether the models differ statistically, based on the difference between their values of -2 log L. This test is based on a chi-square distribution with g degrees of freedom and an error probability of 5%, in which g is the difference in the number of estimated parameters between the compared models (Dobson, 1990).
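The variance inflation factor of a variable, 1 / (1 - R²) with R² taken from the regression of that variable on all the other explanatory variables, can be computed as in the sketch below. The data are synthetic and the function name is hypothetical; the example only shows that two nearly collinear variables receive large factors while an unrelated variable stays close to one.

```python
import numpy as np


def variance_inflation_factors(X: np.ndarray) -> np.ndarray:
    """Return 1 / (1 - R_i^2) for each column i of X.

    R_i^2 comes from regressing column i on all other columns plus an intercept.
    """
    n, k = X.shape
    factors = []
    for i in range(k):
        target = X[:, i]
        others = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


# Synthetic example: x2 is almost a copy of x1, x3 is unrelated to both.
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

fiv = variance_inflation_factors(X)
```

The same values can be read from the diagonal of the inverse of the correlation matrix of the explanatory variables, which is a convenient check on the computation.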

Results and Discussion
In this study, the animals were weaned on average at 217.63 days of age, with an average weight of 177.69 kg, and their dams calved at an average age of 5.59 years. About 80% of the animals were weaned at six to eight months of age, and approximately 25% of the dams were four years old at calving, indicating that this was a fairly young herd.
The working file was first analyzed with the five models by the least squares method. All fixed genetic covariate effects on weaning weight were significant (P<0.01), except for the direct additive effect in model 1 and the direct joint additive effect in model 3 (Table 1). The models explained about 60% of the variation in weaning weight; model 5, which contained only additive fixed genetic effects, showed the poorest fit, with a loss of about seven percentage points in the coefficient of determination (R²), confirmed by the high estimate of the C(p) statistic.
However, the wide variation in the regression coefficients is a sign of possible collinearity among the independent variables, which was confirmed by the analysis of correlations between these variables, whose correlation coefficients ranged from -0.12 (between the direct additive and maternal joint additive effects) to 0.98 (between the epistatic and maternal heterozygous effects), with an average of 0.45.
Because of this multicollinearity, the ridge regression method was employed, indicated by Hoerl & Kennard (1970) as one of the alternative estimation methods that provide a more informative analysis of the data in the presence of multicollinearity (Table 2).
From the analysis of the ridge trace chart (Hoerl & Kennard, 1970), we chose λ = 0.05, the smallest value that broke the multicollinearity, reducing the variance inflation factors without dramatically reducing the regression coefficients and their standard errors.
The comparative F values show a loss of fit whenever a pair of effects is removed. Only the comparison between models 1 and 2 was not significant, which indicates that withdrawal of the joint additive genetic effects does not significantly impair the fit of the model. The variance inflation factor measures the increase in the variance of the regression coefficients caused by correlation among the independent variables. A variance inflation factor greater than 10 indicates the occurrence of multicollinearity (Freund & Wilson, 1998). According to Dias et al. (2003), this diagnosis is performed because, in the presence of multicollinearity, statistical tests may fail to detect significant differences among the factors. Thus, the effects most involved in multicollinearity are the heterozygous and joint additive effects, and the application of λ = 0.05 in the ridge regression method was enough to substantially reduce the variance inflation factors (Table 3).
To verify the possible effects of multicollinearity and the differences imposed by the use of different models in multibreed genetic evaluation, we estimated the genetic and phenotypic covariances later used to predict breeding values in the studied population.
Estimates of direct heritability and their standard errors ranged from 0.29 ± 0.020 to 0.35 ± 0.022 (Table 4), showing the possibility of genetic gain in this population through mass selection for weaning weight. Lower values were reported by Everling et al. (2001) for progeny of crosses between the Angus and Nellore breeds (0.23), by Ferraz Filho et al. (2002) for the Tabapuã breed (0.23) and by Weber (2008) for the Aberdeen Angus breed (0.24). However, heritabilities higher than those found in this study were described by Kaps et al. (2000), 0.53, and by Souza et al. (2007), 0.56, for animals of the Aberdeen Angus breed, and by Lopes et al. (2008) for the Nellore breed in the three Southern states of Brazil (from 0.41 to 0.47). Maternal heritability values indicated less variability in the random maternal additive effect than in the direct one, with estimates ranging from 0.15 ± 0.015 to 0.26 ± 0.019. These figures are evidence of the importance of the maternal effect on the development of calves in the pre-weaning phase.
On the other hand, the values of the genetic correlation between the direct and maternal additive genetic effects ($r_{am}$) show antagonism between these two groups of genes. Negative values of $r_{am}$ are not uncommon in the available literature; authors such as Marques et al. (2000), Everling et al. (2001), Corrêa et al. (2006) and Souza et al. (2007) reported values ranging from -0.83 to -0.17.
According to Crump et al. (1997), problems with maternal effects and their correlations may be related to the structure of the analyzed data, depending on the number of generations covered by the available pedigree. In this study, only 12% of the females born had offspring with performance records, which may bias the estimates of maternal effects. This is because the genes responsible for this effect are expressed only in females, and therefore the assessment of their importance may depend on the existence, in the data set, of a considerable number of female relatives with evaluated progeny.
The values of the likelihood ratio test indicate significant differences between the tested models (models 4 and 5; models 3 and 4; models 2 and 3; and models 1 and 2). However, the comparisons between models 1 and 2 and between models 3 and 4 showed smaller differences, with lower values of the likelihood ratio test. The models in each of these pairs differ only in the joint additive effects, and the small differences show the minor importance of these effects for the variation in weaning weight in this population.
A reduction in the residual variance was observed with model 3. However, the high variance inflation factors for the heterozygous and joint additive effects present in that model, combined with the large variation in the estimates of fixed genetic effects among the different models (Table 5), confirm the existence of multicollinearity, in addition to the problems of estimation by REML in the presence of this type of dependence among estimators.
The magnitude (Pearson) and rank (Spearman) correlations among breeding values predicted by the different models ranged from 0.911 to 0.989 and from 0.819 to 0.978, respectively. The average accuracy of breeding values estimated by model 5 was 26% lower than that of breeding values estimated by model 1 (0.55 and 0.74, respectively). The greatest similarity between breeding values was observed between models 1 and 2, confirming the results obtained with the C(p) statistic, the F test and the likelihood ratio test.
The results of this study show that the inclusion of heterozygous effects in the analysis model (model 4) increases the coefficient of determination in relation to the additive model (model 5). Very small changes occurred when the joint additive effects were added, and these effects have the further drawback of being strongly involved in multicollinearity.
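The two kinds of correlation used to compare breeding values can be illustrated with the sketch below, in which the rank (Spearman) correlation is obtained as the Pearson correlation of the ranks. The breeding values shown are invented for illustration and the function names are hypothetical; the rank transform used here assumes there are no tied values.

```python
import numpy as np


def pearson(a: np.ndarray, b: np.ndarray) -> float:
    """Magnitude correlation between two sets of predicted breeding values."""
    return float(np.corrcoef(a, b)[0, 1])


def spearman(a: np.ndarray, b: np.ndarray) -> float:
    """Rank correlation: Pearson correlation of the ranks (no ties assumed)."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return pearson(rank_a, rank_b)


# Invented breeding values of five animals predicted under two models.
ebv_model_a = np.array([10.0, 4.0, -2.0, 7.0, 0.5])
ebv_model_b = np.array([8.0, 5.0, -1.0, 3.0, 1.0])

r_magnitude = pearson(ebv_model_a, ebv_model_b)
r_rank = spearman(ebv_model_a, ebv_model_b)
```

In this invented example two animals swap positions between the models, giving a rank correlation of 0.9; a rank correlation below one means that the choice of model would change which animals are selected.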
The inclusion of epistatic effects in model 2 promoted a substantial gain in fit. These effects negatively affect the pre-weaning performance of calves in the studied population, in agreement with Trematore et al. (1998), Fries et al. (2000) and Pimentel et al. (2006). This model was very similar to the most complete model (model 1), with the advantage of not containing multicollinearities that may add bias to the estimates. Deleterious effects of epistasis were also observed by Arthur et al. (1999), Fries et al. (2000), Roso et al. (2005) and Pimentel et al. (2006). The first authors, studying a Hereford-Brahman population, reported the most pronounced effect: a difference of -61.5 ± 19 kg in weaning weight adjusted to 240 days of age, in favor of animals with minimal epistasis. Fries et al. (2000) explain that, when breeds are crossed, genes are forced to interact and cooperate with other genes to which they are not accustomed. A crossbred animal must then be out of harmony, and it is expected that epistasis, if important, has a negative effect.
The negative direct additive genetic effects suggest that performance potential increases with an increasing contribution of genes of the Aberdeen Angus breed; conversely, the positive value for the maternal additive effects indicates higher performance for offspring of cows with a higher proportion of Nellore genes. This behavior may be due to the fact that most herds of this population have been raised in the Southeast and Midwest regions of Brazil, where major environmental constraints are known to exist and where the Nellore breed performs better. McMorris & Wilton (1986), studying biologically different genetic groups, noticed that heavier cows and/or higher milk producers have increased nutritional needs during the dry and lactation periods. Similarly, Ferrel & Jenkins (1985), studying the interaction between the morphological type of the cows and the environment, reported that cattle require energy for maintenance, growth, pregnancy and lactation, and that the requirements for each of these functions vary with the type of cattle. Exclusively under pasture without supplemental feeding, the size of the cow may become a very important factor in the efficiency of production systems, since 70-75% of total energy requirements are for maintenance functions. These maintenance requirements vary more than the requirements for other functions and seem to be more closely associated with the genetic potential for production measures (growth rate and milk production), indicating that animals with high genetic potential for production, such as those of the Aberdeen Angus breed, may have fewer advantages or more disadvantages in a restrictive environment.

Table 5 - Regression coefficients for the models analyzed by the restricted maximum likelihood method for weaning weight

Conclusions
The dominant additive models commonly used in genetic evaluations do not correctly describe the variation in pre-weaning performance of cattle in a multibreed population, and the heterozygous and epistatic effects should be included in the analysis models. The inclusion of joint additive effects does not substantially improve the fit of the analysis models, besides introducing a bias attributed to multicollinearity. The inclusion of the heterozygous effects in the models is efficient in representing the quadratic additive effect of the breed. The study of the relationships among the independent variables must always precede the choice of a genetic analysis model; otherwise, effects and breeding values will be estimated with low accuracy, generating expectations of genetic progress that will not be confirmed.

Table 1 - Regression coefficients, coefficient of determination (R²) and C(p) statistic for the models analyzed by the least squares method for weaning weight

Table 2 - Regression coefficients and F test for the models analyzed by the ridge regression method (λ = 0.05) for weaning weight

Table 3 - Variance inflation factors for the effects contained in the different models analyzed by the ridge regression method for weaning weight

Table 4 - Estimates of genetic parameters and covariance components for weaning weight in a multibreed Aberdeen Angus × Nellore cattle population evaluated by five models