Genetic parameters for milk production by using random regression models with different alternatives of fixed regression modeling

Test-day milk yield records from the first three lactations of 25,500 Holstein cows were used to estimate genetic parameters for milk yield with two alternative definitions of the fixed regression in random regression models (RRM). Legendre polynomials of fourth and fifth orders were used to model the regression of the fixed curve (defined from the average of the population or of multiple sub-populations formed by grouping animals that calved at the same age and in the same season of the year) and the random lactation curves (additive genetic and permanent environment). The Akaike information criterion (AIC) and the Bayesian information criterion (BIC) indicated that the models with multiple regressions of fixed lactation curves had the best fit for first-lactation test-day milk yields, whereas the models with a single regression of the fixed curve had the best fit for the second and third lactations. Heritability estimates for milk yield during lactation did not vary among models and ranged from 0.22 to 0.34, from 0.11 to 0.21, and from 0.10 to 0.20 in the first, second and third lactations, respectively. Similarly, estimates of genetic correlations did not vary among models. The use of single or multiple regressions for fixed lactation curves in RRM does not influence the estimates of genetic parameters for test-day milk yield across lactations.


R. Bras. Zootec., v.40, n.3, p.557-567, 2011

Introduction
The use of more precise model definitions in genetic evaluations contributes to increasing the efficiency of selection programs. Improvements in the modeling of environmental effects that influence the productive traits of dairy cattle can be achieved by using test-day models. These models fundamentally evaluate differences in the lactation curves of animals (Bormann et al., 2003). Among the models that consider test-day production, the random regression model has been widely shown to increase the accuracy of breeding value predictions and has already been implemented in the official genetic evaluations of dairy cows in many countries (Strabel et al., 2004).
The use of random regression models with parametric functions as covariates makes it possible to split the shape of the lactation curve into two parts: a general (fixed) part, which captures similarities among the lactation curves within specific groups of animals (e.g., similar age, stage of lactation, parity and season of birth), and a second (random) part specific to each individual animal (Bormann et al., 2003).
Choosing the most appropriate model depends on decisions concerning the effects to be included in the model, especially those accounting for similarities among the lactation curves of groups of animals under the influence of a common environmental effect (fixed part) or for the individual lactation curves of animals (random part). In the fixed part, one or multiple regressions of fixed lactation curves have been evaluated (Strabel et al., 2004; Muir et al., 2007; Costa et al., 2008).
The present study aimed to evaluate the use of one or multiple regressions of fixed lactation curves in random regression models for the estimation of genetic parameters of test-day milk yield in the first three lactations of Holstein cows in Brazil.

Material and Methods
Data consisted of 2.03 million test-day records made available by the Associação Brasileira de Criadores de Gado Holandês (ABCBRH). Only records of the first three lactations of cows calving between 1993 and 2004 were considered in the analyses. Test-day records taken before the 5th day or after the 305th day of lactation were excluded. Age range by parity was 20 to 48 months for first lactation, 33 to 67 months for second lactation and 45 to 87 months for third lactation. Cows were required to have a minimum of six test-day records per lactation. Test-day observations were deleted if a herd-year-month test day contained fewer than four observations. After these restrictions, 25,528, 11,767 and 4,265 records, respectively, from the first, the second and the third lactation of 41,560 cows remained available for analyses with random regression models using Legendre polynomials of orders four and five (Table 1).
Four classes of age at calving (20 to 24, 25 to 29, 30 to 34 and 35 to 48 months) and four calving seasons (January through March, April through June, July through September and October through December) were combined to produce 16 age-season classes. Two models were fitted to the data. In the first model, the regressions of fixed lactation curves were defined by the average test-day milk yield of each sub-population that calved within the same age-season class (multiple regressions). In the alternative model, the regression of the fixed lactation curve was defined by the average test-day milk yield of the whole population (one single regression). The models were designated U4, U5, M4 and M5, according to the order of the Legendre polynomial (four or five) and to the use of one (U) or multiple (M) regressions of fixed lactation curves.
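The combination of age and season classes described above can be illustrated with a minimal sketch. The class limits are those listed in the text; the function names and the numbering of the 16 classes (1 to 16, age class varying slowest) are hypothetical choices made only for illustration.

```python
# Age-at-calving classes (months) and calving seasons (calendar months),
# with the limits listed in the text. The 1..16 numbering is an assumption.
AGE_CLASSES = [(20, 24), (25, 29), (30, 34), (35, 48)]
SEASONS = [(1, 3), (4, 6), (7, 9), (10, 12)]


def _class_index(value, bounds):
    """Return the zero-based index of the interval that contains value."""
    for idx, (low, high) in enumerate(bounds):
        if low <= value <= high:
            return idx
    raise ValueError(f"{value} is outside the defined classes")


def age_season_class(age_months, calving_month):
    """Map age at calving and calving month to one of 16 age-season classes."""
    age_idx = _class_index(age_months, AGE_CLASSES)
    season_idx = _class_index(calving_month, SEASONS)
    return age_idx * len(SEASONS) + season_idx + 1


# Example: a cow calving at 26 months of age in May.
example_class = age_season_class(26, 5)
```

In a multiple regression (M) model each of these classes would receive its own set of fixed regression coefficients, whereas a single regression (U) model would use one set for the whole population.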
The random regression model U, which used the same sub-model to fit the fixed (average population test-day milk yield lactation curve), additive genetic and permanent environment effects, was defined by the following terms: $y_{ijkl}$ = milk yield of cow k in lactation l, in period of lactation t, within classes i (herd-year-month of test day) and j (season of calving); $RAMC_i$ = fixed effect of herd-year-month of test day; $E_j$ = calving season j; $b_m$ = linear regression coefficient of milk yield on age at calving; $x_{ijkl}$ = age at calving; $q_{km}$ = vector of fixed regression coefficients specific to modeling the average lactation curve of the population; $a_{km}$ and $p_{km}$ = vectors of random regression coefficients that describe, respectively, the additive genetic and permanent environmental effects; $Z_{klm}$ = n-th parameter of the Legendre polynomial of order 4 or 5; $e_{ijkl}$ = random residual effect associated with $y_{ijkl}$.
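A form of model U consistent with the terms defined above can be sketched as follows. This is a reconstruction from the listed definitions only: the summation over m = 1, ..., n (with n = 4 or 5) is an assumption, and the subscript on b and the cow index on q are kept as printed, although a single linear coefficient and a population-level curve would not normally require them.

```latex
y_{ijkl} = RAMC_i + E_j + b_m\,x_{ijkl}
         + \sum_{m=1}^{n} q_{km}\,Z_{klm}
         + \sum_{m=1}^{n} a_{km}\,Z_{klm}
         + \sum_{m=1}^{n} p_{km}\,Z_{klm}
         + e_{ijkl} \qquad \text{(U)}
```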
Similarly, the random regression model M, which used the same sub-model to fit the fixed (average age-season test-day milk yield lactation curves), additive genetic and permanent environment effects, was defined by the following terms: $y_{ijkl}$ = milk yield of cow k in lactation l, in period of lactation t, within classes i (herd-year-month of test day) and j (age-season of calving); $RAMC_i$ = fixed effect of herd-year-month of test day; $\beta_{jm}$ = vector of fixed regression coefficients specific to modeling the average lactation curve of each age-season of calving class; $a_{km}$ and $p_{km}$ = vectors of random regression coefficients that describe, respectively, the additive genetic and permanent environmental effects; $e_{ijkl}$ = random residual effect associated with $y_{ijkl}$; $Z_{jlm}$ = n-th parameter of the Legendre polynomial of order 4 or 5, used to describe the additive genetic and permanent environment effects as well as to model the fixed (average) lactation curves of animals belonging to each age-season of calving class. The temporary environmental (residual) variance was assumed to be homogeneous throughout the lactation period.
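A form of model M consistent with the terms defined above can be sketched as follows. As a reconstruction from the listed definitions, it assumes a summation over m = 1, ..., n (with n = 4 or 5) and keeps the indices as printed; calving season and age at calving do not appear as separate terms because the definitions for this model list only the age-season specific fixed curves.

```latex
y_{ijkl} = RAMC_i
         + \sum_{m=1}^{n} \beta_{jm}\,Z_{jlm}
         + \sum_{m=1}^{n} a_{km}\,Z_{jlm}
         + \sum_{m=1}^{n} p_{km}\,Z_{jlm}
         + e_{ijkl} \qquad \text{(M)}
```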
The variance and covariance components of the regression coefficients for random additive genetic and permanent environmental effects were estimated using the REMLF90 program (Misztal, 2005). Convergence was assumed when the difference between the -2logL values of the likelihood functions obtained in consecutive iterations was smaller than $10^{-11}$. The fit of the different models was compared by examining the value of the likelihood function (-2logL), the Akaike information criterion (AIC; Akaike, 1973) and the Bayesian information criterion (BIC; Schwarz, 1978).
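The two information criteria named above follow standard definitions, AIC = -2logL + 2p and BIC = -2logL + p ln(N), which can be sketched as below. The parameter count assumes that a polynomial of order k contributes k regression coefficients per random effect and that one homogeneous residual variance is fitted; the exact p and N used in the study are not stated in the text, so both are assumptions, and the function names are hypothetical.

```python
import math


def aic(minus_two_log_l, n_params):
    """Akaike information criterion: -2logL + 2p."""
    return minus_two_log_l + 2 * n_params


def bic(minus_two_log_l, n_params, n_obs):
    """Bayesian information criterion: -2logL + p * ln(N)."""
    return minus_two_log_l + n_params * math.log(n_obs)


def n_covariance_params(n_coef, n_random_effects=2, n_residual=1):
    """Count (co)variance parameters under the stated assumptions.

    Each random effect (additive genetic, permanent environment) has a
    symmetric n_coef x n_coef covariance matrix of regression coefficients.
    """
    return n_random_effects * n_coef * (n_coef + 1) // 2 + n_residual


# Illustration with a made-up -2logL value, not a result from the study.
p4 = n_covariance_params(4)
aic_example = aic(100000.0, p4)
bic_example = bic(100000.0, p4, 200000)
```

For both criteria a smaller value indicates a better trade-off between fit and number of parameters; BIC applies the heavier penalty whenever ln(N) exceeds 2.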

Results and Discussion
Covariance estimates between the random regression coefficients for additive genetic and permanent environment effects, obtained by the single and multiple regression models, are presented in Tables 2 and 3. For those models, regardless of the lactation, for both additive genetic and permanent environment effects, the highest variance estimates were associated with the first coefficient ($a_1$ and $p_1$). The correlations between regression coefficients for both additive genetic and permanent environment effects ranged from -0.67 to 0.33 for the first, from -0.46 to 0.31 for the second and from -0.53 to 0.30 for the third lactation (Tables 4 and 5). The largest values were observed for the additive genetic effect. In general, correlation estimates were similar among models U4, U5, M4 and M5, indicating no difference in the estimates of correlations between the random regression coefficients when modeling single or multiple regressions of fixed lactation curves. Araújo et al. (2006), in Brazil, and Pool et al. (2000), in the Netherlands, reported correlations among regression coefficients estimated with Legendre polynomials of orders 4 and 5 ranging from -0.54 to 0.57 and from -0.53 to 0.35, respectively, for first-lactation milk production of Holstein cows.
The interesting mathematical properties of orthogonal polynomials (Strabel & Jamrozik, 2006) have led this function to be considered the most suitable for fitting test-day milk yield data (Schaeffer, 2004). However, the biological interpretation of their coefficients is more difficult than for other parametric functions, such as the Wilmink function and the polynomial function of Ali & Schaeffer. Strabel & Jamrozik (2006) reported that the first Legendre polynomial coefficient ($a_1$), associated with the additive genetic effect, is related to total milk production and the second ($a_2$) is related to persistency of the lactation curve. According to this interpretation, the genetic correlation estimates between the additive genetic regression coefficients ($a_1$ and $a_2$) indicate a low genetic association between total milk production and persistency. The estimates from the M4 and M5 models were lower than those obtained from the U4 and U5 models (Tables 4 and 5). The trajectories of the additive genetic, permanent environment and temporary environment variance estimates for test-day milk yields showed a similar pattern for all lactations, for Legendre polynomials of orders 4 and 5, and for both alternatives of modeling the regression of fixed lactation curves (Figures 1 to 3).
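The Legendre covariates discussed above can be illustrated with a minimal sketch that evaluates normalized Legendre polynomials at a given day in milk. It assumes the common convention of standardizing days in milk from the 5 to 305 day range used in this study to the interval [-1, 1] and of scaling each polynomial by sqrt((2j + 1)/2); the text does not state which standardization was used, and the function name is hypothetical.

```python
import math


def legendre_covariates(dim, n_coef, dim_min=5, dim_max=305):
    """Return n_coef normalized Legendre covariates for a day in milk.

    Days in milk are first mapped linearly to x in [-1, 1]. The polynomials
    come from the recurrence (j+1) P_{j+1} = (2j+1) x P_j - j P_{j-1}.
    """
    x = -1.0 + 2.0 * (dim - dim_min) / (dim_max - dim_min)
    p = [1.0, x]
    for j in range(1, n_coef - 1):
        p.append(((2 * j + 1) * x * p[j] - j * p[j - 1]) / (j + 1))
    # Normalization so that each polynomial has unit norm on [-1, 1].
    return [math.sqrt((2 * j + 1) / 2.0) * p[j] for j in range(n_coef)]


# Covariates at mid lactation (day 155 maps to x = 0) with four coefficients.
z_mid = legendre_covariates(155, 4)
```

Under this convention the first covariate is constant over lactation, which is consistent with the first coefficient being related to total production, and the second covariate is linear in days in milk, which is consistent with its relation to persistency.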
Additive genetic variance estimates did not differ between the single and multiple regression of fixed lactation curves models. The highest estimates were observed between 210 and 270 days of lactation, decreasing thereafter until the end of lactation (Figure 1).
The decrease in genetic variance in the extreme periods of lactation of the Holstein breed has been commonly observed in several studies (Pool et al., 2000; López-Romero & Carabaño, 2003; Costa et al., 2008). These authors have pointed out that these trends may be associated with a lower number of production records and with the inadequacy of the mathematical functions to describe the environmental and genetic effects in these periods of lactation.
The permanent (Figure 2) and temporary environment (Figure 3) variance estimates obtained by single or multiple regressions of fixed lactation curves models were practically identical, differing only in their magnitudes across lactations.
Permanent environmental variances showed higher variability at the beginning and at the end of lactation (Figure 2). These results are similar to those observed by Pool et al. (2000), López-Romero & Carabaño (2003), Cobuci et al. (2005), Fujii & Suzuki (2006) and Araújo et al. (2006) for first-lactation test-day milk yield of Holstein cows. The temporary environment (residual) variance estimates increased from the first to the third lactation and showed similar magnitudes for single (U) or multiple (M) regression of fixed lactation curves models (Figure 3). Residual variance estimates decreased by approximately 7.5% when the order of the Legendre polynomial increased from four to five.
The estimates of residual variance for the first, the second and the third lactation were, respectively, 6.08, 8.89 and 10.59 kg² for models M4 and U4, and 5.61, 8.21 and 9.79 kg² for models M5 and U5. These values are close to the 7.16, 6.23 and 4.79 kg² obtained in Holstein cattle by Cobuci et al. (2005), Araújo et al. (2006) and Costa et al. (2008), respectively. Costa et al. (2008) concluded that, although the residual variance for test-day milk yield was heterogeneous across the lactation period, assuming a homogeneous residual variance would be a parsimonious option when fitting test-day milk yields of Holstein cows in Brazil. However, El Faro & Albuquerque (2003) indicated that heterogeneous residual variances should be considered when modeling Caracu test-day milk yield by random regression. Fujii & Suzuki (2006) reported heterogeneity of residual variances over the years but concluded that ignoring it did not affect the ranking of breeding values of the best bulls and cows of the Holstein breed in Japan.
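The reduction of approximately 7.5% in residual variance from order four to order five can be checked directly from the estimates reported above. The short sketch below uses only those six values; the variable and function names are illustrative.

```python
# Residual variance estimates (kg^2) reported in the text for the first,
# second and third lactations.
RESIDUAL_ORDER_4 = [6.08, 8.89, 10.59]  # models M4 and U4
RESIDUAL_ORDER_5 = [5.61, 8.21, 9.79]   # models M5 and U5


def percent_reduction(old, new):
    """Relative decrease from old to new, in percent."""
    return 100.0 * (old - new) / old


reductions = [
    percent_reduction(v4, v5)
    for v4, v5 in zip(RESIDUAL_ORDER_4, RESIDUAL_ORDER_5)
]
# Each lactation shows a reduction between 7.5% and 7.8%.
```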
Heritability estimates were practically the same for the single and multiple regression of fixed lactation curves models but were larger for the first than for the second and third lactations (Figure 4). The small differences in heritability estimates among models (U4, U5, M4 and M5) do not indicate a preferred order of the Legendre polynomial. This result does not agree with the reports of several authors who have emphasized that the order of the Legendre polynomial in random regression models is an important aspect, inasmuch as the genetic parameter estimates may differ according to the order of this polynomial (Cobuci et al., 2006; Costa et al., 2008).
Heritability estimates obtained by the models that considered multiple regressions for fixed lactation curves (M4 and M5) ranged from 0.22 to 0.32, from 0.11 to 0.21, and from 0.10 to 0.19, respectively, for the first, the second and the third lactations, and from 0.23 to 0.34, from 0.11 to 0.21 and from 0.10 to 0.20 for the models that considered a single regression of the fixed lactation curve (U4 and U5). Higher heritability values for the first lactation than for the following two lactations were also observed by Liu et al. (2000).
Trends of the heritability estimates during the second and the third lactation were more similar to each other than to those of the first lactation. In general, values increased from the beginning until 210-240 days of lactation and decreased thereafter until the end of the lactation. These trends are similar to those observed by Liu et al. (2000), Jakobsen et al. (2002), Bormann et al. (2003) and De Roos et al. (2004).
Variations in heritability estimates across lactation were associated with different trends in genetic and permanent environment variances. During lactation, while permanent environmental variance remained relatively constant (Figure 2), the additive genetic variance gradually increased (Figure 1), resulting in higher estimates of heritability in mid lactation than in the extreme periods. During late lactation, estimates of heritability decreased due to a simultaneous reduction in genetic variance and increase in permanent environmental variance.
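The dependence of heritability on the variance trajectories described above can be sketched with the usual random regression expression h²(t) = z'Gz / (z'Gz + z'Pz + σ²e), where z holds the Legendre covariates at day t, and G and P are the covariance matrices of the additive genetic and permanent environment coefficients. The matrices below are hypothetical 2 x 2 values chosen only for illustration, not estimates from this study; the residual variance of 6.08 kg² is the first-lactation value reported in the text for order four.

```python
def quad_form(z, m):
    """Compute z' M z for a vector z and a square matrix M (lists)."""
    n = len(z)
    return sum(z[i] * m[i][j] * z[j] for i in range(n) for j in range(n))


def heritability(z, g, p, residual_var):
    """Heritability at one day in milk from covariance functions."""
    var_a = quad_form(z, g)   # additive genetic variance at this day
    var_pe = quad_form(z, p)  # permanent environment variance at this day
    return var_a / (var_a + var_pe + residual_var)


# Hypothetical covariance matrices of regression coefficients (kg^2).
G_EXAMPLE = [[4.0, 0.5], [0.5, 1.0]]
P_EXAMPLE = [[8.0, 0.0], [0.0, 2.0]]

# Hypothetical covariate vector; 6.08 kg^2 is the reported residual variance.
h2_example = heritability([1.0, 0.0], G_EXAMPLE, P_EXAMPLE, 6.08)
```

With a homogeneous residual variance, as assumed in this study, any change in heritability along the lactation comes from the two quadratic forms, which is why the estimates follow the trends of the genetic and permanent environment variances.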
Similarly to the estimates of heritability, genetic correlations between milk production on selected days of lactation did not differ among lactation orders, regardless of the model used for the regression of the fixed lactation curve and of the order of the Legendre polynomial (Figures 5 and 6). As usually observed in most studies using random regression models, genetic correlations between adjacent test days were high but decreased as the interval between them increased.
The values of genetic correlations ranged from 0.13 to 0.99 for the first, from 0.16 to 0.99 for the second, and from 0.20 to 0.99 for the third lactation. The variation in genetic correlation estimates was larger in the first lactation than in the second and third lactations, although the trends within lactation were similar for all parities. These results are in agreement with previous studies that have reported an effect of parity on the estimation of genetic parameters in dairy cattle (Liu et al., 2000; Guo et al., 2002). Jakobsen et al. (2002) and Araújo et al. (2006) reported genetic correlation estimates higher than 0.40 for first-lactation test-day milk yield of Holstein cows, and therefore much higher than some estimates observed in this study. However, lower estimates, even close to zero, were obtained for genetic correlations between test-day milk yields in extreme periods of lactation by Strabel & Misztal (1999) and Cobuci et al. (2005) when using different functions in random regression models.
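Genetic correlations between two days of lactation are commonly obtained from the covariance function as r(i, j) = z_i'G z_j / sqrt(z_i'G z_i · z_j'G z_j). The sketch below illustrates this calculation with a hypothetical covariance matrix G and hypothetical covariate vectors; none of the values are estimates from this study.

```python
import math


def bilinear(zi, g, zj):
    """Compute zi' G zj for vectors zi, zj and a square matrix G (lists)."""
    return sum(
        zi[a] * g[a][b] * zj[b]
        for a in range(len(zi))
        for b in range(len(zj))
    )


def genetic_correlation(zi, zj, g):
    """Genetic correlation between yields at two days in milk."""
    cov = bilinear(zi, g, zj)
    return cov / math.sqrt(bilinear(zi, g, zi) * bilinear(zj, g, zj))


# Hypothetical genetic covariance matrix and covariates for two test days.
G_EXAMPLE = [[4.0, 0.0], [0.0, 1.0]]
r_example = genetic_correlation([1.0, 1.0], [1.0, -1.0], G_EXAMPLE)
```

Covariate vectors for nearby days are nearly identical, which gives correlations close to one, while vectors for distant days differ more, which is consistent with the decline in correlation as the interval between test days increases.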
The choice of the best random regression model has commonly been based on the Akaike information criterion (AIC) or the Bayesian information criterion (BIC). Both criteria allow the comparison of non-nested models and penalize more heavily parameterized models; BIC is the more stringent of the two and tends to favor more parsimonious models. The values of -2logL increased and AIC and BIC decreased as the order of the Legendre polynomial increased in each lactation. Regardless of the order of the polynomial used, AIC and BIC values indicated that the models including multiple regressions of fixed lactation curves gave a better fit to test-day milk yield in the first lactation, whereas the single regression of fixed lactation curve model was the best for fitting test-day milk yields in the second and third lactations (Table 6). Misztal et al. (2000) observed that the choice of the mathematical function to describe fixed and random effects is the key element in fitting random regression models. After evaluating the Wilmink and Ali & Schaeffer parametric functions and Legendre polynomials, Cobuci et al. (2006) and Costa et al. (2008) reported that the Legendre polynomials of orders 4 and 5 were the most appropriate for fitting test-day milk yield and persistency of milk yield, by random regression, in Holstein cows in Brazil.
Most studies do not indicate a consensus about the best criterion for choosing among the different models and functions used to fit test-day records by random regression. According to Liu et al. (2006), since genetic evaluation essentially involves the use of computers, the final decision about a model may also be based on computational requirements. Besides this practical aspect, flexibility and robustness should also be considered when deciding on the best random regression model to be used (Druet et al., 2003; Cobuci et al., 2006).
Indicating a specific model to fit test-day milk yield has been difficult. According to Kamidi (2005), this empirical state is likely to remain as long as modelled production patterns span combinations of climate, management, health, age and parity order effects.

Conclusions
The use of multiple regressions for fixed lactation curves improves the goodness of fit of test-day milk yield by random regression models without changing the magnitude of genetic parameters for milk yield along the lactations of Holstein cows. The single regression should not be used to replace the multiple regression of fixed lactation curves model.

Figure 1 - Additive genetic variance in the first (La1), the second (La2) and the third (La3) lactations obtained by random regression models with Legendre polynomial of order four (A) or five (B).

Figure 2 - Permanent environmental variance in the first (La1), the second (La2) and the third (La3) lactations obtained by random regression models with Legendre polynomial of order four (A) or five (B).

Figure 3 - Residual variance estimates in the first (La1), second (La2) and third (La3) lactations obtained by random regression models with Legendre polynomial of order four (A) or five (B).

Figure 4 - Heritability estimates in the first (La1), second (La2) and third (La3) lactations obtained by random regression models with Legendre polynomial of order four (A) or five (B).

Figure 5 - Genetic correlations in the first (A), second (B) and third (C) lactations obtained by fitting single (left) or multiple (right) regression models with Legendre polynomial of order four.

Figure 6 - Genetic correlations in the first (A), second (B) and third (C) lactations obtained by fitting single (left) or multiple (right) regression models with Legendre polynomial of order five.

Table 2 - Estimates of covariance components for random regression coefficients obtained by single regression models
U4 and U5 - models using Legendre polynomials of orders four and five and including single (U) regression of fixed lactation curve.

Table 3 - Estimates of covariance components of the random regression coefficients obtained by multiple regression models

Table 4 - Estimates of correlations between the random regression coefficients obtained by single regression models

Table 5 - Estimates of correlations between the random regression coefficients obtained by multiple regression models
M4 and M5 - models using Legendre polynomials of orders four and five and including multiple (M) regressions of fixed age-season of calving lactation curves.

AKAIKE, H. Information theory and an extension of the maximum likelihood principle. In: INTERNATIONAL