Persistency of lactation using random regression models and different fixed regression modeling approaches

Milk yield test-day records on the first three lactations of 25,500 Holstein cows were used to estimate genetic parameters and predict breeding values for nine measures of persistency and 305-d milk yield in a random regression animal model using two criteria to define the fixed regression. Legendre polynomials of fourth and fifth orders were used to model the fixed and random regressions of lactation curves. The fixed regressions were adjusted for average milk yield on populations (single) or subpopulations (multiple) formed by cows that calved at the same age and in the same season. Akaike Information (AIC) and Bayesian Information (BIC) criteria indicated that models with multiple regression lactation curves had the best fit to test-day milk records of first lactations, while models with a single regression curve had the best fit for the second and third lactations. Heritability and genetic correlation estimates between persistency and milk yield differed significantly depending on the lactation order and the measures of persistency used. These parameters did not differ significantly depending on the criteria used for defining the fixed regressions for lactation curves. In general, the heritability estimates were higher for first (0.07 to 0.43), followed by the second (0.08 to 0.21) and third (0.04 to 0.10) lactation. The rank of sires resulting from the processes of genetic evaluation for milk yield or persistency using random regression models differed according to the criteria used for determining the fixed regression of lactation curve.


Introduction
Production traits are traditionally the primary concern in defining selection objectives in dairy cattle breeding programs.However, in recent years, other traits that contribute to improving dairy herd management or efficiency of production have deserved greater attention.In order to give emphasis on productive efficiency, animal performance related to management have been incorporated into the selection objectives focused on total economic merit.Several countries have included functional traits and selection indices of total merit in their breeding programmes (Mark, 2004).
The inclusion of traits called "functional" in the selection objectives is justified by their direct impact on the economy of production systems (Boettcher et al., 1999).Improvements of functional traits in dairy cattle such as persistency of lactation, somatic cell count and speed of milking, among others, make it possible to increase profitability by reducing the milk production costs of the dairy herd.
Currently, the studies of persistency of production have been based on the fitting of random regression models (Cobuci et al., 2005), considering their fundamental properties that make it possible to take into account the differences that may exist among the lactation curves of dairy cows (Bormann et al., 2003).According to Strabel & Jamrozik (2006), the association of random regression models with the mathematical properties of orthogonal polynomials allows a simple and an efficient differentiation between production and persistency of lactation.
With respect to the fixed effects included in these models, single or multiple fixed regressions of the lactation curve have usually been used (Strabel et al., 2004;Muir et al., 2007;Costa et al., 2008).It may be argued that an improvement in the quality of the fit of random regression models for test-day milk yield can be obtained by modifying the assumption associated to the effect of the fixed regression of the lactation curve, commonly based on the average of population of cows with records included in the analyses.
The objective of this study was to evaluate an alternative modeling of the fixed regression of the lactation curves for sub-populations of cows grouped according to their age and season of calving.

Material and Methods
The data used in the study were provided by Associação Brasileira de Criadores da Raça Holandesa (ABCBRH) and consisted of 2,032,305 test-day milk yield records from nine lactations of Holstein cows calving between 1987 and 2004.In order to achieve greater consistency of data, the following editing criteria were considered: 1) only data from the first three lactations and availability of the first one to include the second and third lactations of the same cow; 2) calving years from 1993 to 2004; 3) test days within the period of 6 and 305 days in milk; 4) a minimum of the first six test day records; 5) age at calving between 20 and 48 months, 33 and 67 months and 45 and 87 months respectively for the first, second and third lactations; 6) a minimum of four records in each sub-class of herd-yearmonth test-day; and 7) exclusion of records of cows with unknown sire and dam.Also, test-day milk yield records were combined into sixteen age-season groups defined by four classes of age of cow at calving (20 to 24; 25 to 29; 30 to 34; and 35 to 48 months), and four seasons of calving (January to March; April to June; July to September; and October to December).
After editing, 363,894 test-day records of 41,560 cows from first (61.67%),second (28.43%) and third (10.30%) lactation were used for the analyses (Table 1; Figure 1).The total number of sires and dams of cows in the database was 987 and 19,915, 645 and 9,798, and 377 and 3,814, respectively, for first, second and third lactation.The final database was used to model the additive genetic, permanent environment and residual variances by random regression with Legendre polynomials of orders four and five.In all models, the residual variance was assumed homogeneous throughout the lactation period.Two modeling approaches, differing by the definition of the fixed regression of lactation curves, were evaluated for each Legendre polynomial order.The first model considered only one fixed regression based on the average test-day milk of the population (S-model for a single regression curve) and the second one modelled a fixed regression for the average test-day milk of each subpopulation corresponding to an age-season class (M-model for multiple fixed regression curves).These models, used in each lactation order, are respectively identified as S or M and described as: in which y ijk1 = test-day milk yield of cow k in the lactation 1, in period of lactation t within the classes i (herd-year-month test-day) and j (season of calving ); HYM i = fixed effect of herd-year-month test-day; S j = calving season j; b m = linear regression coefficient of milk yield on age at calving; x ijkl = age at calving; q km = vector of fixed regression coefficients specific to modeling the average lactation curve of the population; a km and p km = vectors of random regression coefficients that describe, respectively, the additive genetic and permanent environmental effects; Z klm represents the n-th parameter of the Legendre polynomial of order four or five, used in the description of the random genetic and permanent environment effects, as well as, in modeling the fixed regression of the lactation curve (average) of the The variance and covariance components of regression coefficients for random additive genetic, permanent environmental and residual effects were estimated using program REMLF90 (Misztal, 2005).Convergence was assumed when the difference between the -2log values of the likelihood functions obtained in consecutive iterations was smaller than 10 -11 .
Genetic parameter estimates for nine measures of persistency (Table 2) and 305-d milk (M305), including genetic and permanent environmental correlations between them as well as prediction of breeding values, were obtained for each model and lactation order as described by Cobuci et al. (2004).

Results and Discussion
The milk yield along lactation, which defines the lactation curve, may be divided in three phases: the first one is ascending and occurs between calving and peak production; the second one is relatively constant and occurs around peak production; and the third one, which is descending, is between the peak production until the end of the lactation.The last phase is used to describe the persistency of lactation, which may be defined as the ability of the cow to maintain the production after the peak.The persistency of milk yield is higher in the first lactation, followed by the second and third lactations (Figure 2).
In general, the values for the AIC, BIC and -2log(L) parameters in each lactation (Table 3) indicate that the goodness of fit improves as the order of the Legendre polynomial increases.Other studies using test-day records of Holstein cows in Brazil (Cobuci et al., 2006;Costa et al., 2008) indicated that Legendre polynomials of orders four and five are the most appropriate to fit random regression models for milk yield and persistency of lactation.These results are similar to those reported by Guo & Schaeffer (2002) for other breeds and countries.Also, it can be observed that regardless of the polynomial order, the AIC and BIC tests indicated the models that considered the single fixed regression models, best fitted milk yield in the second and third lactations (Table 3).Measures Author   Kistemaker (2003).Cobuci et al. (2006) reported that it is not necessary to use the same order of Legendre polynomials to model the additive genetic and the permanent environmental effects and also for the fixed regression of the lactation curve.These authors compared the fitting of different orders of the Legendre polynomials and Wilmink functions to model genetic and non-genetic effects and observed that the Legendre polynomial of order five best fitted test day milk yield, based on both AIC and BIC criteria.

EBV EBV PS
Although these criteria have been commonly used for evaluating goodness of fit when comparing alternative models, there seems not to be a consensus concerning the most suitable criterion to use in the process of choice.In fact, Liu et al. (2006) used seven different criteria to compare eighteen random regression models to fit testday milk yield and reported that each criterion indicated a different model as the best fitting one.Based on these results, they proposed an index to combine the information of those criteria to assist in the decision about the best model to fit the test-day milk yield of Holstein cows in Canada.
Another criterion used to evaluate goodness of fit is the sum of the residual variance obtained along the lactation (Liu et al., 2006).The estimates of the sum of residual variance obtained in this study for the first, second and third lactations were 6.08, 8.89 and 10.59 for models S4 and M4 and 5.61, 8.22 and 9.79 for models S5 and M5, respectively.The differences between estimates of sums of residual variances over each lactation from models using Legendre polynomials of orders four and five suggest that models using the highest polynomial order (degree) are the preferred ones, corroborating the results based on the AIC and BIC criteria (Table 3).On the other hand, there was no difference between the estimated sum of the residual variance from models that included single (S) or multiple (M) fixed regressions of lactation curves.
Although the AIC and BIC criteria indicated the models using Legendre polynomials of order five as the best ones, the choice of the model to use in genetic evaluations must not ignore other practical issues.Aspects such as computational demand, flexibility and robustness may also be taken into account as auxiliary criteria when making a decision about which models to use in genetic evaluations (Druet et al., 2003;Cobuci et al., 2006).
According to Liu et al. (2006), because genetic evaluations intensively require computing time, the choice of models to be used may also consider the computational demand.They reported that the official genetic evaluations of Holstein breed in Canada use the model with Legendre polynomial of order five and that higher order models may not be computationally feasible.
In summary, the results from several studies indicate that the use of a unique model to appropriately fit test-day milk records from different herds, regions or countries is a difficult goal to be achieved.According to Kamidi (2005) this is an empirical idea and it only would be reached if models used to fit test-day milk yield do not represent the combination of climate, management, age and lactation order effects.
In general, the heritability estimates obtained by different models were higher for the first lactation (0.07 to S4, M4, S5 and M5 -random regression models including single (S) or multiple (M) fixed regressions using Legendre polynomials of order four or five, respectively.CM -ranking order of the models fitting Legendre polynomials of the same order, according to AIC and BIC criteria.CL -ranking order of all models using Legendre polynomials of order four or five, in each lactation, according to AIC and BIC criteria.
0.43), followed by the second (0.08 to 0.21) and the third (0.04 a 0.10) lactations (Tables 4 and 5).This trend differs from results observed in other countries, where estimates of heritability (0.11 to 0.39) increased with the increase of the order of lactation (De Roos et al., 2004;Weller et al., 2006;Strabel & Jamrokiz, 2006;Togashi et al., 2007).It must be remarked that all these studies were based on only multiple trait (lactations) analysis.Possible reasons for differences in estimates reported from different studies are those related to the definitions of the persistency measure as well as to the statistical models used in the analyses.However, the results from this study are similar to those reported by Jamrozik et al. (1998), who used single trait models to analyse the persistency of milk production in each lactation order.The differences between heritability estimates for most measures of persistency in the first lactation, using models with Legendre polynomials of order four (S4 and M4) or five (S5 and M5), were small.Also, regardless of the polynomial order, small differences were observed between heritability estimates obtained from models including single (S4 and S5) or multiple (M4 and M5) fixed regressions (Tables 4  and 5).
However, the heritability estimates obtained for most measures of persistency in the second and third lactations did not differ between models of the same polynomial order (S4 and M4) or (S5 and M5) indicate that modeling one fixed regression of the test-day milk yield of a single population or multiple fixed regressions of the test-day milk of age-season of calving groups (sub-populations) did not affect the estimates of heritability for persistency.
Overall, the heritability estimates changed significantly over the persistency measures (PS1 to PS9), suggesting that a moderate fraction of the total variation of the persistency is due to additive genetic factors.The heritability estimates for PS5, PS6, PS7, PS8 and PS9 measures in the first lactation, obtained from models M4 and M5, were similar to those reported by Cobuci et al. (2006) for Holstein herds in the State of Minas Gerais, Brazil.On the other hand, the heritability estimates for PS2, PS3 and PS4 measurements were higher than the 0.24 and 0.23, 0.20 and 0.18, and 0.09 and 0.11 obtained for Holstein cows by Jakobsen et al. (2002) and Cobuci et al. (2006), respectively.
It is important to remark that the heritability estimates for the PS5 and PS6 measures were similar for all models (S4, M4, M5 and S5), in all lactation orders (Tables 4 and 5).The PS5 and PS6 measures proposed by Jakobsen et al. (2002) and by Cobuci et al. (2004), respectively, were recommended to be used in genetic evaluations of Holstein cattle in Brazil, based on their smaller genetic correlations with 305-d milk when compared with several other persistency measurements (Cobuci et al., 2006).Similarly to results obtained for heritability, changes in magnitude of the estimates of genetic and permanent environmental correlations between measures of persistency and 305-d milk obtained from all models (S4, M4, M5 and S5) in all lactations (Tables 4 and 5) were small.These results also indicate that modeling fixed regression for one single population or multiple fixed regressions for ageseason groups did not affect the estimates of genetic and permanent environmental correlations between persistency and 305-d milk.
Changes in estimates of these correlations were more expressive among the persistency measurements than among models (polynomial order).In general, estimates of genetic correlation were higher than the estimates of permanent environmental correlation.Genetic correlations ranged from 0.02 to 0.41 in the first lactation, from -0.09 to 0.41 in the second lactation and from -0.08 to 0.52 in the third lactation, indicating that persistency measures in each lactation have low genetic association with the 305-d milk.
The ideal measure of persistency is characterized by high heritability and low genetic correlation with 305-d milk (Jakobsen et al., 2002).The estimates of genetic correlation obtained for PS5 and PS6 (-0.08 to 0.22) were remarkably lower than those obtained for the other measures evaluated in this study.Cobuci et al. (2006) reported genetic correlations between persistency and the 305-d milk in first lactation ranging from -0.05 to 0.50 for these same six measures by fitting models M4 and M5, respectively.These estimates are higher than those obtained in this present study, but the genetic correlations for the measures PS5 and PS6 were lower than all other measures of persistency evaluated in both studies.Estimates of genetic correlations between PS1, PS2, PS3, PS4, PS5 and P305 in the first lactation obtained by fitting model M4 were lower than those reported by Jakobsen et al. (2002) in Holstein cows.
Estimates from previous studies indicate that genetic correlations between 305-d milk in first and second and in first and third lactations are similar and range from 0.40 to 0.60, but lower than values higher than 0.90 reported for genetic correlations between 305-d milk in second and third lactations (Van Der Linde, 2000;De Roos et al., 2004).Gengler (1995) reported that using multi-trait models may increase the accuracy of estimates of genetic parameters and prediction of breeding values for milk yield or persistency of sires.Under this scenario, the estimated breeding values (EBV) of sires for different persistency measures, obtained by the best fitting models (S5 and M5), defined by the AIC and BIC criteria (Table 3), were compared.Averages of EBV for persistency and 305-d milk differed between models S5 and M5 (Tables 6, 7 and 8, respectively).
The estimated standard deviations of EBV were lower for the PS1, PS6 and PS8 than for the other measures.The ranges of EBV were similar for these measures using models S5 and M5.For instance, the EBV obtained by model M5 for the PS6 in first lactation ranged from -4.7 to 3.0 kg, nearly 7.7 kg for the 987 sires (Table 6).
Rank correlations were slightly lower in the first than in the other two lactations.In the first lactation (Table 9), the correlations for PS5, PS6 and PS8 measurements were ≤ 0.82 when selecting the 20 best sires based on model S5.All the other estimates (Tables 10 and 11) are larger than 0.89, with the exception of the estimate 0.61 for PS6 in the third lactation when selecting the best 20 sires out of 377 based on model S5.
Overall, the results of this study indicate that fitting single or multiple fixed regressions did not affect the estimates of genetic parameters, but led to differences in ranking of sires for persistency and milk yield of Holstein cattle in Brazil.
The choice of the most suitable model for genetic evaluation of persistency depends on the goodness of fit,      which, in turn, requires decisions concerning the effects to be included in the model, primarily those defining the fixed regression effect.According to Cole & VanRaden (2006), genetic selection for persistency of lactation may become an important technology in the future; therefore, it should be considered in selection objectives to improve the productive efficiency of dairy herds (Hickson et al., 2006).

Conclusions
The inclusion of multiple instead of single fixed regression in random regression models using Legendre polynomials changes the ranking order of predicted breeding values of sires in genetic evaluation for persistency of lactation and 305-d milk yield of Holstein cattle in Brazil.

Figure 2 -
Figure 2 -Average lactation curve of milk yield in the first, second and third lactations of Holstein cows.

Table 1 -
Description of the data set used in the analyses to fit random regression models

Table 2 -
Measures of persistency of lactation used in random regression analyses EBV t -estimated breeding value on day t of lactation. 1as cited by

Table 3 -
Estimates of Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and -2log of the likelihood function (L) obtained by random regression models using Legendre polynomials and single or multiple fixed regressions to fit test-day milk yield of the first three lactations of Holstein cows

Table 4 -
Heritabilities (diagonal) and genetic (above) and permanent environment (below) correlations between the nine different measures of persistency and 305-d milk obtained by models using Legendre polynomials of order four and single (S) or multiple (M) fixed regressions S4, M4 -random regression models respectively including single (S) or multiple (M) fixed regressions using Legendre polynomials of order four.PS 1 -PS 9 -measure of persistency.

Table 5 -
Heritabilities (diagonal) and genetic (above) and permanent environment (below) correlations between the nine different measures of persistency and 305-d milk obtained by models using Legendre polynomials of order five and single (S) or multiple (M) fixed regressions S5, M5 -random regression models respectively including single (S) or multiple (M) fixed regressions using Legendre polynomials of order five.PS 1 -PS 9 -measure of persistency.

Table 6
S5, M5 -random regression models respectively including single (S) or multiple (M) fixed regressions using Legendre polynomials of order five.PS 1 -PS 9 -measure of persistency; M305 -breeding value for 305-day milk yield.

Table 7 -
Mean, standard deviation, maximum and minimum values of breeding values for persistency and milk yield in second lactation, obtained for 645 sires by fitting random regression models respectively including single (S) or multiple (M) fixed regressions using Legendre polynomials of order five S5, M5 -random regression models respectively including single (S) or multiple (M) fixed regressions using Legendre polynomials of order five.PS 1 -PS 9 -measure of persistency; M305 -breeding value for 305-day milk yield.

Table 8
S5, M5 -random regression models respectively including single (S) or multiple (M) fixed regressions using Legendre polynomials of order five.PS 1 -PS 9 -measure of persistency; M305 -breeding value for 305-day milk yield.

Table 9 -
Estimates of rank correlation between breeding values obtained by models including single or multiple fixed regressions using Legendre polynomials of order five for measures of persistency and 305 milk yield in the first lactation of Holstein cows 1 -PS 9 -measure of persistency; M305 -breeding value for 305-day milk yield. PS

Table 10 -
Estimates of rank correlation between breeding values obtained by models including single or multiple fixed regressions using Legendre polynomials of order five for measures of persistency and 305 milk yield in the second lactation of Holstein cows 1 -PS 9 -measure of persistency; M305 -breeding value for 305-day milk yield. PS

Table 11 -
Estimates of rank correlation between breeding values obtained by models including single or multiple fixed regressions using Legendre polynomials of order five for measures of persistency and 305 milk yield in the third lactation of Holstein cows PS 1 -PS 9 -measure of persistency; M305 -breeding value for 305-day milk yield.