Heteroskedasticity for weaning weight of Charolais-Zebu crossbred calves 1

- The objective of the present study was to analyze models with genetic and/or residual heteroskedasticity for genetic evaluation of the weaning weight of Charolais-Zebu crossbred calves. Weaning weight data from 56,965 crossbred calves were analyzed using animal models with different combinations of genetic and residual heteroskedasticity and/or homoskedasticity. The inference on a posteriori distributions of genetic parameters were by the Monte Carlo method via Markov chains. The model with genetic and residual heteroskedasticity was the best fit on the data. Groups of animals with different genetic compositions, expressed as percentages of Charolais-Zebu breed alleles and individual and maternal heterozygosis, had different genetic variances. These genetic variances could be modeled by linear functions of the Charolais and Zebu genetic variances and the variance attributed to segregation. The breed compositions, the individual and maternal heterozygosis, the sex and age of dam at calving were significant sources of residual heteroskedasticity. The a posteriori means for heritabilities and sire and dam classifications were altered due to genetic and residual heteroskedasticity.


Introduction
Crosses in beef cattle have accounted for the increase in volume of crossbred animal data available for genetic evaluation. The interest in using crossbred cows and, or, bulls in some production systems has suggested the need to develop alternatives for genetic evaluation of these animals so that sires and dams with different genetic compositions can be compared properly.
The assumption of homogeneity in genetic variances, normally presumed in intra-breed genetic evaluations, may not be true in the case of multiple-breed populations (Cardoso & Tempelman, 2004;Oliveira et al., 2010), because the breeds used may have been submitted to different selection processes causing modifications in their genetic compositions, especially mean and variance, in different directions. Arnold et al. (1992) proposed expanding the genetic evaluation procedures using mixed models to incorporate data from crossbred animals into genetic evaluations, while Lo et al. (1993) proposed an alternative to obtain between-parent covariance in multiple-breed populations considering specific genetic variances for the purebred breeds and a linear function of the genetic variances and between-breed segregation variance to obtain crossbred group genetic variances. Segregation variance can be interpreted as the increase in the additive genetic variance in the F2 generation compared with the F1 generation (Lande, 1981), which can be attributed to recombination for gamete formation of the F1 parents.
Differences in management and data collection precision (Martins, 2002;Cardoso et al., 2005), such as problems in modeling some environmental effects, can cause residual heteroskedasticity in pure or crossbred animal populations and Bayesian Hierarchical models can be used to overcome these situations (Cardoso et al., 2005). Violations in the assumptions of homogeneity in genetic and residual variances can result in errors of animal classification and reduce genetic progress. Thus, the present study was carried out with the objectives of comparing the fit of models with different assumptions on genetic and residual variances, identifying heteroskedasticity sources and verifying the impact of violations on the assumptions for variances in the genetic evaluation of Charolais-Zebu crossbred cattle for the weaning weight.

Material and Methods
The data used in the present study were supplied by Associação Brasileira de Criadores de Canchim (ABCCAN) and included the genealogical data and the weaning weight of Charolais-Zebu crossbred calves used to obtain Canchim breed animals.
The data file analyzed consisted of 56,965 weaning weight observations (225 days of age) of calves born between January 1988 and February 2005, offspring of 1,600 bulls and 27,122 cows (with 1,929 maternal grandsires) that calved at 2-15 years of age, distributed in 4,458 contemporary groups from 247 farms, located in the southern (RS, SC and PR), southeastern (SP, RJ and MG), midwestern (MS, MT, GO and DF), northeastern (BA, PE, PI and MA) and northern (TO and PA) regions of Brazil. The relationship matrix consisted of 87,312 animals.
The contemporary groups were formed by the concatenation of the variables farm, year and birth season (season 1 between September and November, season 2 between December and February of the following year, season 3 between March and May and season 4 between June and August), calf sex and feeding management at weaning (pasture, fertilized pasture, fertilized pasture with rotational grazing, irrigated pasture, supplemented pasture and feedlot).
To model contemporary group and genetic interaction effects, calves were placed in genetic groups according to the individual and maternal genetic compositions as presented by Toral et al. (2010 and2011). Other information on the editing of the relationship matrix, calculations of the expected Charolais allele percentages, heterozygosis and the crosses analyzed can be obtained in Toral et al. (2009Toral et al. ( , 2010Toral et al. ( and 2011. Bayesian Hierarchical Models (Sorensen & Gianola, 2002) were used to analyze the data available. To represent the weaning weight of animal i present in vector y of observations (56,965 x 1), the following linear model was used: , [1] where: μ represents a constant inherent to all the observations; β, fixed effect vector (4,478 x 1) (co-variables and classificatory effects); a, random vector of animal additive genetic effects (56,965 x 1); m, random vector of maternal additive genetic effects (27,122 x 1); p, random vector of not correlated maternal environment effects (27,122 x 1); q, random vector of not correlated contemporary group and genetic group interaction (9,013 x 1); and are known incidence line vectors, that relate y i with β, a, m, p and q, respectively. Initially e i ~ NID(0,σ 2 e i ) was assumed for all i =1, ..., 56,965, where σ 2 e i represents a specific residual variance for each i.
The following effects were included in the β vector: contemporary group (4,458); regression coefficients for individual and maternal Charolais percentage (2); regression coefficients for individual and maternal heterozygosis (2); regression coefficients for the effect of cow age at calving (8); regression coefficients for the effect of cow age at calving and cow genetic composition interaction, nested in calf sex (8). The effect of cow age at calving was modeled using segmented polynomials, with knots at 6.33 and 10.66 years of age, and nested in calf sex. The first segment considered the linear and quadratic coefficients and the other segments only considered the quadratic coefficients.
The residual variances were considered as multiplicative functions of the fixed effects: , [2] where: σ 2 e functions as reference parameter in the equation [2], similarly to that represented by μ in equation [1], but on a multiplicative scale; and γ = [γ 1 γ 2 … γ 20 ] ′ specific regression parameters, which can cause residual heteroskedasticity, using the information in the form of covariables p = [p 1 i p 2 i … p 20 i ] ′ specific to each animal i.
The following posterior density was assumed for the fixed effects , where β 0 and V β were hyperparameters of a bounded distribution. For the random effects, the following a priori densities were assumed: ; ; ; , where: I t represents identity matrixes of order t; σ 2 p , variance of the non-correlated permanent maternal environmental effect; and σ 2 q , variance of the not correlated contemporary group and genetic group interaction effect.
In multiple-breed populations, the matrices of additive genetic variance G (ϕ a ) and G (ϕ m ) can be functions of more than one dispersion parameter, contained in ϕ a and ϕ m (Cardoso & Tempelman, 2004). The ith element of the diagonal of the matrices G (ϕ a ) and G (ϕ m ) were computed using the tabular method by Lo et al. (1993) (G(ϕ a )) -1 was computed as , [4] where: I, identity matrix of order 87,312; P, progeny-parent relating matrix; and Ω(ϕ a ), a diagonal matrix with the i th element defined as . [5] (G(ϕ m )) -1 was computed as in [4], substituting the direct additive with the maternal genetic variances.
Inverted χ 2 a posteriori densities were assumed for the components of variance: where: υ x and S 2 x represent the a posteriori density hyperparameters.
Four models were used to analyze the weaning weight data of crossbred Charolais-Zebu calves with differences in the residual and genetic variances specified in [2] and [3], respectively: Model 1 (GhoRho) -the regression parameters specified in γ = [γ 1 γ 2 … γ 20 ] ′ were all equal to one, so that [2] was reduced to σ 2 Sm CZ = 0, and the genetic variances for the different genetic groups were equal. This model is equivalent to a conventional intrabreed model for the random effects, assuming genetic and residual homoskedasticity.
Model 3 (GheRho) -the regression parameters specified in γ = [γ 1 γ 2 … γ 20 ] ′ were all equal to one, so that [2] was reduced to σ 2 e i = σ 2 e ; specific genetic variances to each breed and segregation variance different from zero. In this model, genetic heteroskedasticity and residual homoskedasticity were considered.
Model 4 (GheRhe) -no restrictions for the regression parameter values in [2] and the variances in [3]. This model presented genetic and residual heteroskedasticity.
The inferences for the four models were based on methods of Monte Carlo using Markov chains (MCMC) with 220,000 cycles. The samples were obtained at every 200 cycles and those from the first 20,000 cycles were disregarded. The chain size and the discarding and sampling intervals were defined from graphic analysis of the samples, a posteriori means and effective sample size obtained in preliminary analyses. The INTERGEN (Cardoso, 2008) program was used to carry out these analyses.
The a posteriori means and standard deviation for the parameters of interest and the effective sample size (ESS) were calculated. The ESS, obtained for each parameter of the models, is an estimate of the number of independent samples, containing information equivalent to that contained in the 1,000 independent samples, obtained after the discard period (Cardoso et al., 2005). An estimate of the Monte Carlo variance, denoting initial positive sequence estimator (Var (μ^)), and the lag-t autocovariance of the stationary Markov chain (γ^(t)) were used to calculate the ESS for the parameter (s = 1, ..., S) of the w model (Geyer, 1992, cited by Sorensen & Gianola, 2002), as follows: . [6] The following model fit and comparison criteria were considered: Deviance Information Criterion (DIC) (Spiegelhalter et al., 2002, cited by Sorensen & Gianola, 2002 and the Bayesian factor (Good, 1958, cited by Sorensen & Gianola, 2002. In the INTERGEN program, these criterion are computed as follows (Cardoso, 2008): where: DIC w is the DIC for the model w (M w ); m is the number of samples of the a posteriori distribution; and are marginal probabilities of the observations y; θ j contains the parameters obtained in sample j; and θ _ is the a posteriori mean of the parameters. Lower DIC values represent better fit (Sorensen & Gianola, 2002).
The a posteriori means of the expected progeny differences (PMEPDs) for the direct and maternal additive genetic effects, of the sires and dams that had at least one calf with weaning weight recorded between March 2000 and February 2005 were considered. These PMEPDs were adjusted (PMEPDs aj ) by procedures similar to those adopted by Notter & Cundiff (1991) and Van Vleck & Cundiff (2005), to enable comparison between sires and dams of different genetic groups, but that can be used to produce animals of the Canchim breed. PMEPDs aj were obtained for bulls in groups 5/8 Charolais, Canchim and MA (21/32 Charolais) and for cows in groups V (7/16 Charolais), 5/8 Charolais, Canchim and MA. PMEPDS aj were also obtained for Charolais bulls and cows from group A (5/16 Charolais), because this cross, whose products are from the MA group, accounted for almost 20% of the data. The PMEPDs aj 's for the direct additive effects (PMEPDs aj.i ) of animal i were obtained by , [9] where: PMEPDs aj.i represents the PMEPDs aj of the direct additive genetic effect of animal i; PMb 1 , the a posteriori mean of the weaning weight regression coefficient in function of the Charolais breed allele percentages (CP) of the animals; and CP i , the CP of each animal. Kendall correlations were calculated between the a posteriori means of the expected progeny differences (EPD) of the sires and dams of the genetic groups used to produce the Canchim and MA animals obtained in the four situations assessed, using the PROC CORR of SAS (Statistical Analysis System, version 9.1).

Results and Discussion
The lower value for the DIC of the model with genetic and residual heteroskedasticity showed that this model provided the best fit compared with the others. This result was confirmed by the Bayesian Factor (Table 1).
The Bayesian Factor values to compare the model with genetic and residual heteroskedasticity indicated greater a posteriori chance of the first providing a better fit to the weaning weight data, compared with the other models. These results are in line with those reported by Cardoso & Tempelman (2004) and Cardoso et al. (2005), for the post weaning weight gain in Hereford-Nellore population and also agree with the results by Oliveira et al. (2010) for post weaning weight gain in an Angus-Nellore population. Differences in the genetic variances can be attributed to the different selection processes and selection intensities. The founding animals of the Charolais breed were selected to increase the growth rate, which resulted in a large-size breed (Santiago, 1975). Zebu breed animals, in spite of not being subjected to the systemic processes of selection in their locations of origin, were exposed to the action of natural selection for a long time. Regarding the selection time practiced on the Charolais breed, selection for growth in the Zebu breeds can be considered only recent.
According to Cardoso et al. (2005), the model with residual heteroskedacisity was a better fit than the model that assumes residual homoskedasticity. In the present case, the residual variance was modeled as a function of the calf and cow genetic compositions, cow age at calving and the cow age at calving and genetic composition interaction. Residual heteroskedasticity in function of cow age can be explained, at least partially, by alterations in the number of observations over time, because older cows are culled and there are problems in modeling the effect of cow age at calving on the trait in question, especially for the ages with fewer records. To consider this situation, statistical models suitable for modeling the effect of cow age at calving on weaning weight and that are robust regarding alterations in the number of observations assessed and implemented in the genetic evaluations are important.
Differences in the residual variances in function of the genetic group can be attributed to the fact that, in general, the animals in different genetic groups were subjected to different feeding and management systems so that the genetic potentials of each group were properly exploited. The maintenance of at least two genetic groups under the same feeding and management systems (generally contemporary group) could be a useful alternative to reduce the influence of heteroskedasticity on the genetic evaluations. However, this measure could run into theoretic and practical problems when applied on the farms, and the best alternative seems to be the use of techniques that consider the existence of different residual variances for each genetic group, as presented in the this study.
The low DIC value of the model with genetic homoskedasticity and residual heteroskedasticity, compared with that with genetic heteroskedasticity and residual homoscedasticity, and the comparison of these models by the Bayesian Factor indicated greater a posteriori likelihood that the model with the hypothesis of residual heteroskedasticity would be a better fit to the weaning weight data compared with that of the hypothesis of genetic heteroskedasticity. However, this result could not be generalized because the magnitude of the differences between genetic and residual variances depends on the GhoRho -genetic and residual homoskedasticity; GhoRhe -genetic homoskedasticity and residual heteroskedasticity; GheRho -genetic heteroskedasticity and residual homoskedasticity; GheRhe -genetic and residual heteroskedasticity. Under the assumption of genetic and residual heteroskedasticity, the a posteriori means of the direct additive and maternal genetic variances for the Charolais and Zebu breeds were different, and the variances of the Zebu breeds were greater (Table 2).
This result was in line with those reported by Elzo & Wakeman (1998) and Cardoso et al. (2005), who found small additive genetic variance for weaning weight and post weaning weight gain, respectively, in the European breeds (Angus and Hereford) compared with the Zebu breeds (Brahman and Nelore). These results can be explained because the European (Charolais) and Zebu animals were subjected to different selection processes. While the first group was selected for greater growth rate in temperate environments, the second was selected for greater adaptation to tropical environments. Furthermore, different selection intensities may also have caused alterations in the genetic variances, because of alterations in the allele frequencies, in gametic disequilibrium and inbreeding (Sorensen & Kennedy, 1984).
The variance attributed to between-breed segregation is a measure of how much additive genetic variance was superior in the F2 generation compared with the F1 generation (Lande, 1981;Lo et al., 1993), and can be attributed to recombination for gamete formation of the F1 parents. In the present case, the direct additive genetic variance attributed to the between-breed segregation represented 26.8% and 22.2% of the direct additive genetic variance in the Charolais and Zebu breeds, respectively. These values were superior to the values of 6.3% and 4.1% reported by Elzo & Wakeman (1998) for weaning weight for the Angus and Brahman breeds, and to the values of 5.6% and 12.9% reported by Oliveira et al. (2010) for post weaning weight gain for Angus and Nellore breeds, both assuming residual homoscedasticity. Cardoso et al. (2005) worked with post weaning weight gain and heteroskedasticity and observed that the variance in between-breed segregation represented 26.7% and 7.5% of the additive genetic variance in the Hereford and Nellore breeds, respectively.
The maternal additive genetic variances, attributed to between-breed segregation, represented 36.4% and 26.0% of the maternal additive genetic variances in the Charolais and Zebu breeds, respectively. The variances attributed to the permanent maternal environment effect and the genetic group and contemporary group interaction were also responsible for a significant part of the phenotypic variance, -specific Charolais and Zebu breed direct additive genetic variances and the Charolais-Zebu breed segregation variance, respectively, for the models with genetic heteroskedasticity; σ 2 m -maternal additive genetic variances, for the models with genetic homoskedasticity; σ 2 m C , σ 2 m Z and σ 2

Sm CZ
-specific maternal additive genetic variances for the Charolais and Zebu breeds and Charolais and Zebu breed segregation variance, respectively, for the models with genetic heteroskedasticity; σ 2 p -permanent maternal environmental variance; σ 2 q -variance of genetic group and contemporary group interaction; and σ 2 e -residual variances for the models with residual homoskedasticity and residual reference parameters for the models with residual heteroskedasticity. Table 2 -A posteriori means (PMEANS) and standard deviation (PSTD) for the components of variance (CV) for weaning weight and effective sample size (ESS) according to the assumptions on the genetic and residual variances which confirmed the importance of including these effects in genetic evaluations of weaning weight. Considering the a posteriori means for the regression coefficients responsible for the residual heteroskedasticity (γ m parameters in [2]), it was observed that increases in the percentage of Charolais breed alleles (individual) and of heterozygosis (individual and maternal) resulted in less residual variability, because the a posteriori means for these coefficients(γ 1 ,γ 3 and γ 4 ) were less than one. Cardoso et al. (2005) found probability intervals for the effect of individual genetic composition that included the value one and so did not obtain evidence of the effect of calf genetic composition on the residual variability. However, these authors reported that when the individual heterozygosis was large, the residual variability would be smaller. This result is in line with the concept of genetic homeostasis (Lerner, 1954), according to which heterozygote individuals are less influenced by environmental factors.
The a posteriori mean for the effect of the Charolais allele percentage in the cows (γ 2 ) indicated that the calves of cows with greater Charolais percentage were raised in situations of greater residual variability. Considering that the increase in the Charolais percentage in the cow increases in size and nutritional requirement, it is possible that the performance of cows with greater percentages of Charolais alleles was more vulnerable to environmental alterations, thus causing greater environmental variability for the calves. On the other hand, increase in the Charolais percentage in the calves may not be associated to increase in residual variance because the maternal heterozygosis may diminish the effect of the unfavorable environment on the variability of the individual performance.
The a posteriori means for the effect of cow age at calving and the effect of cow age at calving multiplied by the Charolais allele percentage of the cow, for males (γ 5 to γ 12 ) and females (γ 13 to γ 20 ), were in general close to Figure 1 -A posteriori means and standard deviations for the residual variances of the weaning weight (RV, kg 2 ) in function of the cow age at calving and calf sex, for the crosses available in the data set, assuming genetic heteroskedasticity.
one, suggesting little residual variability in function of these effects when considered alone. However, when the a posteriori means and standard deviation were analyzed for each genetic group present on the database, in function of cow age at calving, it was observed that together, these effects modified the residual variances (Figure 1). The a posteriori means and standard deviation for the residual variances, assuming genetic homoskedasticity, presented tendencies similar to those in Figure 1 (data not shown).
The greatest a posteriori means and standard deviation of the residual variances were observed for the groups with fewer records (15/32 Charolais cow × Zebu bull, 1/2 Charolais cow × Zebu bull, Canchim cow × Zebu bull and 5/16 Charolais cow × Canchim bull). Increases in the a posteriori means and standard deviation with increase in cow age at calving were observed for all the groups. The positive association between residual variances and cow age at calving may have occurred in function of selection and reduction in the number of cows giving birth at more advanced ages and because of the problems of modeling the effect of cow age at calving on the trait. When considering that the data corrected for the fixed effects of environment, in which animals raised in conditions with greater residual variability have proportionately less genetic contribution in their composition than the weighting that would be applied to them (Martins, 2002), it is possible that data of the calves of older cows were overvalued in the genetic evaluation that reduced the efficiency of this process.
The residual variances for the males were also in most cases superior to the residual variances for the females. This result was in agreement with reports by Rodriguez-Almeida et al. (1995) and Cardoso et al. (2005). Males have greater growth potential than females in function of sexual dimorphism. However, the size of the differences between males and females depend on the environmental conditions for the expression of these genetic potentials. In favorable environments, phenotype expression is limited by the Figure 2 -A posteriori means and standard deviation for the direct heritabilities for weaning weight, in function of cow age at calving and calf sex, for the crosses available in the data set, assuming genetic heteroskedasticity.
genetic potential for growth which, in general, is greater in males, which can, then, cause less residual variance in males. In restrictive environments (under which part of the data used in the present study were obtained), phenotypic expression is also limited by the environment. If the environmental conditions are unfavorable, such as nutrition, for example, those individuals with greater nutritional requirements in the case of the males become more vulnerable and present more variable responses. Significant variations were observed between 0.08 and 0.16 in the a posteriori means of direct heritability (Figure 2), which was expected due to the alterations in the genetic and residual variances.
The direct heritabilities for the weaning weight obtained in the present study were lower than the values of 0.48 and 0.35 estimated by Mello et al. (2002) and Toral et al. (2007), respectively, who used only data from the Canchim breed, calves of Canchim cows, but was similar to the estimate of 0.17 reported by Barichello et al. (2010).
The a posteriori means of maternal heritability that, in the present study, ranged from 0.03 and 0.06 (Figure 3), were intermediary to the values of 0.04 and 0.09 reported for the Canchim breed (Mello et al., 2002;Barichello et al., 2010).
It is possible that, in addition to the differences between data sets, including the effects of genetic groups contributed to obtaining different values for the direct and maternal heritabilities. The values obtained in the present study indicated that selection can be used as a tool to modify herd genetic composition in the sense of increasing weaning weight. However, crossbreeding can also be used to reach this objective because the weaning weight is also influenced by non-additive genetic effects.
Values lower than one were obtained for the classification correlations between the PMEPDs aj for the direct additive genetic effects of the models with different assumptions regarding the genetic and residual variances (Table 3), suggesting differences in the individuals selected as genetically superior. Figure 3 -A posteriori means and standard deviation for the maternal heritability for weaning weight in function of cow age at calving and calf sex for the crosses available in the data set, assuming genetic heteroskedasticity.

Conclusions
Models with the assumptions of genetic and residual heteroskedasticity were the best fit compared with the models that assumed homogeneous genetic variances and/ or residual homoskedasticity. The variation attributed to between-breed segregation, breed composition and individual and maternal heterozygosis, sex and cow age at calving together were sources of genetic and residual heteroskedasticity. The violations of the assumptions for the variances in the weaning weight resulted in alterations in the genetic values and bull and cow classifications.  Table 3 -Kendall correlations between the a posteriori means of the expected progeny differences for the direct additive genetic effects, obtained under the different assumptions on the genetic variances (GV) and residual variances (RV) for bulls and cows used to obtain Canchim and MA (21/32 Charolais) animals