Estimation of genetic parameters for test-day milk yield in Holstein cows using a random regression model

Test-day milk yield records of 11,023 first-parity Holstein cows were used to estimate genetic parameters for milk yield during different lactation periods. (Co)variance components were estimated by the restricted maximum likelihood method using two random regression models, RRM1 and RRM2, which were compared by the likelihood ratio test. Additive genetic variances in RRM1, and additive genetic and permanent environmental variances in RRM2, were described using the Wilmink function. Residual variance was assumed to be constant throughout lactation in both models. The heritability estimates obtained by RRM1 (0.34 to 0.56) were higher than those obtained by RRM2 (0.15 to 0.31). Given the high heritability estimates for milk yield throughout lactation and the negative genetic correlations between test-day yields in different lactation periods, the RRM1 model did not fit the data adequately. Overall, genetic correlations between individual test days tended to decrease at the extremes of the lactation trajectory, showing values close to unity for adjacent test days. The inclusion of random regression coefficients to describe permanent environmental effects led to a more precise estimation of the genetic and non-genetic effects that influence milk yield.


Introduction
Nowadays, the genetic evaluation of dairy cattle using models that consider test-day milk yields has been officially adopted by several countries (Jensen, 2001). However, a few years ago, aggregated test-day milk yield data representing total milk yield, normally standardized for a period of 305 days, were used as a standard measure in evaluations of production traits.
Recently, 305-day milk yield (P305) was replaced by test-day milk yield (Ferreira et al., 2002), with the latter approach showing several advantages: a) it permits the removal of environmental variation in phenotypic data on milk yield, since test-day milk yield considers the specific environmental effects for each production record, which is not possible when P305 data are used (Visscher and Goddard, 1995); b) it grants a more accurate evaluation of cows, due to the use of a larger number of records per cow, as compared to the same records fitted to P305 (Rekaya et al., 1999); c) it is not affected by the accuracy of the different prediction methods for P305 (Rekaya et al., 1999), because it permits the use of part-lactation information, without the need for adjustment factors and/or lactation prediction; d) it facilitates the genetic evaluation of lactation persistency (Jensen, 2001); e) it permits a more accurate estimation of the genetic and permanent environmental effects that affect milk yield.
Several models have been proposed for the genetic evaluation of dairy cattle (Strabel and Misztal, 1999). According to Jamrozik et al. (1997a) and Van Der Werf et al. (1998), random regression models (Henderson Jr., 1982; Laird and Ware, 1982) are more flexible, accurate and precise than traditional models (multiple trait models).
In view of their various advantages and the recent progress in the area of computer science, the use of random regression models has been indicated for studies analyzing production traits (Pool and Meuwissen, 1999, 2000; Olori et al., 1999; Lidauer et al., 2000; Mrode et al., 2000; Kettunen et al., 2000), as well as traits such as somatic cell count (Jamrozik et al., 1998; Liu et al., 2001), conformation (Jonest et al., 1999; Uribe et al., 2000; Veerkamp et al., 2001a), feed intake and body weight (Veerkamp and Thompson, 1999), in addition to traits that are important for the measurement of the longevity of the animals (Veerkamp et al., 2001b).
Thus, the objectives of the present study were to determine the parameters required for the use of random regression models in the evaluation of test-day milk yield in first-lactation Holstein cows, using the restricted maximum likelihood (REML) method, to estimate genetic parameters for test-day milk yield, and to determine the influence of the inclusion of random regression coefficients in the regression models on the description of permanent environmental effects, using a mathematical function with three coefficients.

Field data
Data comprised test-day records of first-lactation Holstein cows calving from 1997 through 2001. The records were obtained from the Milk Recording Organization of the Minas Gerais Association of Breeders of Holstein Cattle and comprise the National Zootechnical Archive of Dairy Cattle managed by the Embrapa Dairy Cattle Research Center.
To obtain data sets of a consistent size, the following records were eliminated: data from daughters of sires that did not have at least four daughters per class of herd-year-month of test day, cows younger than 18 or older than 48 months at calving, lactations shorter than 120 days, daily milk yield records lower than 2.60 or higher than 79.80 kg, and animals that were neither pure of origin nor at least 31/32 Holstein. After applying these criteria, a total of 87,045 records from 11,023 lactations of cows, daughters of 936 sires belonging to 251 herds distributed within 10 locations in the State of Minas Gerais, were left to be analyzed.
The analysis considered test-day milk yields obtained between the 6th and 305th days of lactation after calving, with an average of eight test-day records per cow. The individual test-day milk yields were grouped into four subclasses of cow's age at calving (20 to 24, 25 to 29, 30 to 34, and 35 to 48 months) and four calving seasons (January to March, April to June, July to September, and October to December). These groups were combined into 16 classes of age-calving season and included in the random regression models as fixed effects. The contemporary groups were characterized by the combination of herd and year and month of test-day milk yield.
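The combination of four age subclasses and four calving seasons into 16 fixed-effect classes can be sketched as follows. This is a minimal illustration only: the function names and the numbering of the classes from 1 to 16 are assumptions made here and are not taken from the study.

```python
# Hypothetical helper names; class numbering 1..16 is an assumption for illustration.
AGE_BOUNDS = [(20, 24), (25, 29), (30, 34), (35, 48)]  # months at calving


def age_subclass(age_months):
    """Return 0..3 for the four age-at-calving subclasses."""
    for index, (low, high) in enumerate(AGE_BOUNDS):
        if low <= age_months <= high:
            return index
    raise ValueError("age at calving outside the 20 to 48 month classes")


def season_subclass(calving_month):
    """Return 0..3 for Jan-Mar, Apr-Jun, Jul-Sep and Oct-Dec."""
    if not 1 <= calving_month <= 12:
        raise ValueError("calving month must be between 1 and 12")
    return (calving_month - 1) // 3


def age_season_class(age_months, calving_month):
    """Combine both subclasses into a single class label from 1 to 16."""
    return 4 * age_subclass(age_months) + season_subclass(calving_month) + 1
```

For instance, a cow calving at 27 months in May falls in the second age subclass and the second season, which this numbering maps to class 6.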
The summary of the data used in this study and the distribution of test days within the classes formed by cow's age and calving season are shown in Tables 1 and 2, respectively.

Models
The following models were used to evaluate the inclusion of random regression coefficients in the description of permanent environmental effects on milk production records by random regression models: where $y_{ijkl}$ = test-day record $l$ of cow $j$ obtained at days in milk $t_j$ in subclass $i$ (herd-year-month of test day) and $k$ ...
The mathematical function that describes the shape of the lactation curve of the animals, proposed by Wilmink (1987), was represented by
$$y = a_1 + a_2 t + a_3 \exp(-0.05t), \quad (3)$$
where $a_1$, $a_2$ and $a_3$ = parameters of the function, with $a_1$ being associated with the initial milk yield, $a_2$ with the decline in milk yield after peak lactation, and $a_3$ with the increase in milk yield until peak lactation.
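As a minimal sketch of how the Wilmink function is evaluated, the code below builds the three (co)variables 1, t and exp(-0.05t) for a given day in milk and combines them with a set of coefficients. The function names are assumptions, and the coefficient values in the usage note are hypothetical and chosen only to give a plausible curve shape; they are not estimates from this study.

```python
import math


def wilmink_covariates(t):
    """Return the (co)variables [1, t, exp(-0.05 t)] for days in milk t."""
    return [1.0, float(t), math.exp(-0.05 * t)]


def wilmink_yield(t, a1, a2, a3):
    """Evaluate y = a1 + a2*t + a3*exp(-0.05*t) at days in milk t."""
    z = wilmink_covariates(t)
    return a1 * z[0] + a2 * z[1] + a3 * z[2]
```

With hypothetical coefficients a1 = 30, a2 = -0.03 and a3 = -10, `wilmink_yield(0, 30, -0.03, -10)` returns 20.0, because the exponential term equals 1 at t = 0; as t grows, the exponential term vanishes and the curve approaches the straight line a1 + a2 t.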
In model 1 (RRM1), the permanent environmental and residual variances were considered to be constant throughout the lactation period, while in model 2 (RRM2), only the residual variance was considered to be constant. Residual variance was considered to be homogeneous throughout lactation, due to limitations of the program used for the random regression analyses.

Estimation of variances and covariances
Estimation of the (co)variance components by the random regression models yields a matrix containing the variances and covariances of the random regression coefficients. Thus, the variances in milk yield during different lactation periods are obtained from this (co)variance matrix and from the vector of (co)variables that describe the shape of the lactation curve of each animal.
The estimates of genetic variance ($\hat{g}_{tt}$), determined by RRM1 and RRM2, and of permanent environmental variance ($\hat{p}_{tt}$), determined by RRM2, for milk yield at any days in milk $t$ were obtained by:
$$\hat{g}_{tt} = z_t' \hat{G} z_t \quad \text{and} \quad \hat{p}_{tt} = z_t' \hat{P} z_t,$$
where $\hat{G}$ and $\hat{P}$ = matrices of genetic and permanent environmental variances and covariances between random regression coefficients, respectively; $z_t$ = vector of (co)variables related to a specific test day $l$ measured at days in milk $t$.
The estimates of genetic and permanent environmental (co)variances between two test days at days in milk $t$ and $t'$, $\hat{g}_{tt'}$ and $\hat{p}_{tt'}$, for $t' \neq t$, were obtained by:
$$\hat{g}_{tt'} = z_t' \hat{G} z_{t'} \quad \text{and} \quad \hat{p}_{tt'} = z_t' \hat{P} z_{t'},$$
where $\hat{G}$, $\hat{P}$ and $z_t$ are as described above, and $z_t'$ is the transpose of $z_t$. The (co)variance components of the models were estimated by the REML method, using the REMLF90 program (Misztal, 2001) on a Linux operating system. A value lower than $10^{-9}$ of the square of the relative differences between consecutive estimates was defined as the convergence criterion.
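The step from the (co)variance matrix of the regression coefficients to variances and covariances on the days-in-milk scale is a quadratic (or bilinear) form in the Wilmink (co)variables. The sketch below illustrates this in plain Python; the function names are assumptions, and the matrix `G_HYPOTHETICAL` is an invented, positive definite example and not an estimate from Table 4.

```python
import math


def wilmink_covariates(t):
    """Return the (co)variables [1, t, exp(-0.05 t)] for days in milk t."""
    return [1.0, float(t), math.exp(-0.05 * t)]


def bilinear_form(u, matrix, v):
    """Compute u' * matrix * v for list-based vectors and a nested-list matrix."""
    return sum(u[i] * matrix[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))


# Hypothetical 3 x 3 genetic (co)variance matrix of the regression
# coefficients, invented for illustration only.
G_HYPOTHETICAL = [
    [10.0, -0.02, -5.0],
    [-0.02, 0.0001, 0.01],
    [-5.0, 0.01, 20.0],
]


def genetic_covariance(t, s, g=G_HYPOTHETICAL):
    """Genetic covariance between yields at days in milk t and s."""
    return bilinear_form(wilmink_covariates(t), g, wilmink_covariates(s))


def genetic_variance(t, g=G_HYPOTHETICAL):
    """Genetic variance of yield at days in milk t (covariance of t with itself)."""
    return genetic_covariance(t, t, g)
```

The same two functions apply unchanged to a permanent environmental matrix, since only the matrix passed as `g` differs.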

Estimation of genetic parameters
The estimates of heritability for milk yield at days in milk $t$, using RRM1 and RRM2, respectively, were obtained by:
$$\hat{h}^2_t = \frac{\hat{g}_{tt}}{\hat{g}_{tt} + \hat{\sigma}^2_p + \hat{\sigma}^2_e} \quad \text{and} \quad \hat{h}^2_t = \frac{\hat{g}_{tt}}{\hat{g}_{tt} + \hat{p}_{tt} + \hat{\sigma}^2_e},$$
where $\hat{\sigma}^2_p$ = permanent environmental variance, constant throughout lactation in RRM1, and $\hat{\sigma}^2_e$ = residual variance.
The estimates of genetic (determined by RRM1 and RRM2) and permanent environmental correlations (determined by RRM2) between test-day $t$ and $t'$ milk yields were calculated by:
$$\hat{r}_{g_{tt'}} = \frac{\hat{g}_{tt'}}{\sqrt{\hat{g}_{tt}\,\hat{g}_{t't'}}} \quad \text{and} \quad \hat{r}_{p_{tt'}} = \frac{\hat{p}_{tt'}}{\sqrt{\hat{p}_{tt}\,\hat{p}_{t't'}}}.$$

Comparison of the models
Differences between models 1 and 2 were evaluated by the likelihood ratio test (LRT). Thus, to compare model $i$, which contains additional random regression coefficients describing the permanent environmental effect, with model $j$, in which these regression coefficients were not considered, the following likelihood ratio test (Rao, 1973) was used:
$$LRT_{ij} = 2\log_e L_i - 2\log_e L_j,$$
where $\log_e L_i$ and $\log_e L_j$ = the natural logarithms of the restricted likelihood function of models $i$ and $j$, with $i$ = (2) and $j$ = (1), respectively. The null hypothesis ($H_0$) stated that the restricted likelihood functions of the two models did not differ, i.e., $H_0$: $-2\log_e L_i = -2\log_e L_j$. Thus, to reject the null hypothesis, the calculated value of $LRT_{ij}$ was compared to the chi-square ($\chi^2_{tab}$) table value with three degrees of freedom, with the level of significance set at 1%.
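Heritability as the ratio of genetic to total phenotypic variance, and a correlation as a covariance scaled by the two standard deviations, can be sketched as below. This is an illustration under the usual definitions, not the authors' code; the function names are assumptions and the numbers in the usage note are hypothetical.

```python
import math


def heritability(genetic_var, perm_env_var, residual_var):
    """Genetic variance divided by the sum of genetic, permanent
    environmental and residual variances at a given day in milk."""
    return genetic_var / (genetic_var + perm_env_var + residual_var)


def correlation(covariance, variance_t, variance_s):
    """Covariance between two days in milk divided by the product of
    the corresponding standard deviations."""
    return covariance / math.sqrt(variance_t * variance_s)
```

For example, with hypothetical variances of 2.0 (genetic), 3.0 (permanent environmental) and 5.0 (residual), `heritability(2.0, 3.0, 5.0)` returns 0.2. In a model with a constant permanent environmental variance the same value of `perm_env_var` is passed for every day in milk, whereas in a model with permanent environmental regression coefficients it changes with t.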

Results and Discussion
Mean milk yield and standard deviation, as well as the number of test-day records and the percentage of cows with the respective number of test-day records per lactation period are shown in Table 3. Small variations were observed in the standard deviation of milk yield of 10 test-day records obtained during the lactation period. In addition, 76.12% of the lactations were incomplete, i.e., there were fewer than 10 test-day records per lactation.
The $-2\log_e L$ values of the restricted likelihood function were 301,665.2392 and 301,248.1940 for RRM1 and RRM2, respectively. Application of the likelihood ratio test showed that inclusion of the Wilmink function in the description of the permanent environmental effects significantly improved the fit of the model, considering that the difference between the $-2\log_e L$ values was greater than the table value. Therefore, based on this test, the RRM2 model would be more adequate for the genetic evaluation of test-day milk yield in Holstein cows from the State of Minas Gerais.
According to Jensen (2001), different models can be proposed to evaluate test-day milk yield traits by random regression models. However, no consensus exists regarding the best model to fit milk yield data. In principle, the model that maximizes the genetic progress of the study population should be selected for the genetic evaluation of animals.
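The arithmetic behind this comparison can be checked with a few lines of code. The two -2 log-likelihood values are those reported above; the critical value of 11.345 is the standard chi-square table value for three degrees of freedom at the 1% level and is supplied here as an assumption, since the text does not print it. Variable names are illustrative.

```python
# -2 log_e L values reported in the text for the two models.
MINUS_2_LOG_L_RRM1 = 301665.2392
MINUS_2_LOG_L_RRM2 = 301248.1940

# Standard chi-square table value (3 degrees of freedom, 1% level);
# not printed in the text, supplied here for illustration.
CHI_SQUARE_CRITICAL = 11.345

# The likelihood ratio statistic is the difference between the -2 log_e L
# values of the reduced model (RRM1) and the fuller model (RRM2).
lrt_statistic = MINUS_2_LOG_L_RRM1 - MINUS_2_LOG_L_RRM2
reject_null = lrt_statistic > CHI_SQUARE_CRITICAL

print(f"LRT = {lrt_statistic:.4f}, reject H0: {reject_null}")
```

The difference is about 417.05, far above the table value, which is why the null hypothesis of equal fit is rejected in favor of RRM2.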

(Co)variance and correlation between random regression coefficients
In total, 8 and 13 (co)variance components were simultaneously estimated by RRM1 and RRM2, respectively (Table 4).
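These counts follow from the number of distinct elements in a symmetric matrix. The sketch below assumes three regression coefficients per random effect (the three terms of the Wilmink function), a single constant permanent environmental variance in RRM1, and one residual variance in each model; the function and variable names are illustrative.

```python
def distinct_covariance_elements(n_coefficients):
    """Number of distinct variances and covariances in a symmetric
    n x n matrix: n * (n + 1) / 2."""
    return n_coefficients * (n_coefficients + 1) // 2


N_WILMINK_COEFFICIENTS = 3

# RRM1: genetic 3 x 3 matrix + one permanent environmental variance + one residual variance.
components_rrm1 = distinct_covariance_elements(N_WILMINK_COEFFICIENTS) + 1 + 1

# RRM2: genetic and permanent environmental 3 x 3 matrices + one residual variance.
components_rrm2 = 2 * distinct_covariance_elements(N_WILMINK_COEFFICIENTS) + 1

print(components_rrm1, components_rrm2)
```

Under these assumptions the totals are 8 for RRM1 and 13 for RRM2, matching the numbers given above.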
The inclusion of the random regression coefficients in model RRM2 to describe permanent environmental effects promoted a decrease in the magnitude of the (co)variances and genetic correlations between random regression coefficients (Tables 4 and 5, respectively). However, Rekaya et al. (1999) reported that the inclusion of permanent environmental random regression coefficients did not significantly affect the genetic correlation between the coefficients of the model.
The estimates of genetic correlations between random regression coefficients for genetic and permanent environmental effects determined by RRM1 and RRM2 are shown in Table 5. A marked difference in the genetic correlation estimates was observed between the two models, which might be due to the assumption in RRM1 that the permanent environmental effect was constant throughout the lactation period.
As shown in Table 5, negative associations were observed between the initial milk yield ($a_1$) and both the rate of decrease in milk yield after peak lactation ($a_2$) and the rate of increase in milk yield until peak lactation ($a_3$), whereas a positive association was found between the rate of decrease in milk yield after peak lactation and the rate of increase until peak lactation. This indicates that cows with lower rates of increase in production until peak lactation tend to present lactation curves with greater persistency (a smaller rate of decline). Random regression models also permit inferences regarding the genetic aspects of the lactation curve. However, selection based on components related to different phases of the lactation curve is complex, because the association between these components and the phases of the lactation curve is not well understood (Rekaya et al., 1999). Alternatively, random regression coefficient functions provide the genetic merit of animals during the various lactation periods (Jamrozik et al., 1997b).
The estimates of genetic correlations between regression coefficients obtained by the RRM1 model were close to the values (-0.79, -0.65 and 0.43) reported by Jamrozik et al. (1997a).

Variance components for milk yield
The estimates of genetic and environmental variances (sum of the variances of permanent environmental and residual effects), referring to test-day yields obtained during the period from day 6 to day 305 after the beginning of lactation, were calculated from the values shown in Table 6. The genetic variances estimated by the RRM1 model were greater than those estimated by the RRM2 model and were similar to those reported by Rekaya et al. (1999).
The behavior of the genetic and environmental variance estimates throughout lactation, obtained by the RRM1 and RRM2 models, is shown in Figures 1 and 2, respectively. In general, the values obtained by the RRM1 model were overestimated. Genetic variance tended to be greater at the beginning and at the end of the lactation period (Figure 1). However, a marked decrease in the course of the genetic variance curve was observed for the first 30 days of lactation, suggesting that the models were less robust in describing the genetic variance in milk yield during this period. A marked decrease in the course of the genetic variance curve during the first days of lactation was also reported by Jamrozik et al. (1997b), Rekaya et al. (1999) and Kettunen et al. (2000). The shape of the genetic variance curve throughout lactation obtained by the RRM1 and RRM2 models was similar to that observed by Jamrozik et al. (1997b), Olori et al. (1999) and Rekaya et al. (1999).
Va (RRM1), Va (RRM2), Ve (RRM1) and Ve (RRM2): additive genetic and environmental variances obtained by RRM1 and RRM2, respectively.
The variations in the environmental variance estimates (sum of permanent environmental effects and residual variances) throughout lactation obtained by the RRM2 model (Figure 2) were not observed when the RRM1 model was applied, since in this model the permanent environmental and residual variances were considered to be homogeneous throughout the lactation period.
The results obtained with the RRM2 model show that environmental factors were more pronounced at the beginning and at the end of lactation (Figure 2), in agreement with the observations of Ludwick and Petersen (1943) that non-genetic factors tend to influence milk yield more strongly during the first weeks of lactation.
Comparison of the behavior of genetic and environmental variances estimated by the RRM1 and RRM2 models (Figures 1 and 2, respectively) showed that the absence of permanent environmental random regression coefficients in the RRM1 model did not permit the differentiation between variance estimates for genetic and environmental effects, i.e., part of the genetic variability obtained with the RRM1 model was overestimated due to environmental factors.

Genetic parameters
The heritability estimates for test-day milk yield obtained for selected periods of lactation by the RRM1 and RRM2 models are shown in Table 7. Graphic representations of these estimates throughout lactation are illustrated in Figure 3.
The heritability estimates obtained with the RRM1 model ranged from 0.56 (first and last test day) to 0.34 (sixth and seventh test day), corresponding to 150 and 180 days of lactation (Table 7). A marked decrease in the heritability estimates was observed between day 6 and day 30 after the beginning of lactation, followed by an increase up to day 60, and remaining unchanged up to day 240, when an increase was again observed. Variations in heritability estimates throughout lactation have also been reported by Jamrozik and Schaeffer (1997) and Rekaya et al. (1999) for Holstein breeds and by Kettunen et al. (2000) for the Ayrshire breed, with the authors considering permanent environmental variances to be constant during lactation.
High heritability estimates for milk yield during different lactation periods were also observed by Jamrozik and Schaeffer (1997), Kettunen et al. (1997, 1998, 2000) and Costa et al. (2002). According to Costa et al. (2002), the overestimation of heritability has been one of the main problems in fitting test-day milk yields by random regression models.
Heritabilities were always higher when estimated by the RRM1 model (Table 7), probably due to incorrect partition of the genetic and environmental components, based on the assumption of variance homogeneity for the permanent environmental effect.Similar results were obtained by Rekaya et al. (1999) in Holstein herds.
The heritability estimates obtained with the RRM2 model ranged from 0.15 to 0.31 (Table 7), with a gradual increase throughout lactation (Figure 3). This finding is in contrast to those of Rekaya et al. (1999) for Holstein cattle, and of Costa et al. (2002) for the Gir breed, considering residual variance heterogeneity between milk yield records throughout lactation. On the other hand, the present findings are similar to those reported by Olori et al. (1999) for Holstein cows, using Legendre polynomials to model permanent environmental effects. Ferreira (1999), using data from the Milk Recording Organization of the Minas Gerais Association of Breeders of Holstein Cattle collected between 1989 and 1998 in a multiple-trait analysis, obtained heritability estimates for monthly test-day milk yields ranging from 0.11 to 0.21. A gradual increase in heritability estimates was observed up to the eighth test day (240 days of lactation), followed by a decrease on the two subsequent test days.
Genetic correlation estimates between test-day milk yields during the selected lactation periods obtained by the RRM1 and RRM2 models are shown in Table 8. In general, genetic correlations between individual test days tended to decrease at the extremes of the lactation trajectory, showing values close to unity for adjacent test days. These results agree with those reported by Rekaya et al. (1999) and Olori et al. (1999) for Holstein cows, and by Kettunen et al. (2000) and Costa et al. (2002) for Ayrshire and Gir breeds, respectively.
Negative genetic correlation estimates between test-day milk yields measured during the selected lactation periods were obtained by the RRM1 model (Table 8), a fact also observed by Rekaya et al. (1999), Liu et al. (2000), Kettunen et al. (2000) and Costa et al. (2002).
Permanent environmental correlation estimates between milk yields during selected lactation periods obtained by the RRM2 model are shown in Table 9. As with the genetic correlations, permanent environmental correlations were greater between adjacent test days and tended to decrease between test-day pairs at the extremes of the lactation trajectory.
Milk yields at the beginning (DIM30) and at the end of lactation (DIM270) and midlactation yields (DIM150) were chosen to represent the character of the genetic correlations for milk yield between different lactation periods, estimated by the RRM1 and RRM2 models (Figures 4 and 5). The lowest genetic correlation estimates were obtained at the beginning and at the end of lactation.

Conclusions
In view of the capacity of random regression models to provide mechanisms for the estimation of individual lactation curves, it seems feasible to predict the genetic merit of animals, using random regression coefficients.We therefore recommend the inclusion of random regression coefficients in random regression models to describe permanent environmental effects, in order to define more precisely the genetic and non-genetic effects that influence milk yield.

Figure 1 - Additive genetic variances during the different lactation periods estimated by the RRM1 and RRM2 random regression models.

Figure 2 - Environmental variances (sum of permanent environmental and residual variances) during the different lactation periods estimated by the RRM1 and RRM2 random regression models.

Figure 3 - Heritability estimates for milk yield during the different lactation periods estimated by the RRM1 and RRM2 random regression models.

Figure 4 - Genetic correlations between daily milk yield at 30, 150 and 270 days and the other lactation periods obtained by the RRM1 model.

Figure 5 - Genetic correlations between daily milk yield at 30, 150 and 270 days and the other lactation periods obtained by the RRM2 model.

Table 1 - Summary of the information used in this study.

Table 3 - Means and standard deviations of milk yield, number of test-day records and percentage of cows with the respective number of test-day records per lactation (given in parentheses).

Table 5 - Estimates of genetic and permanent environmental correlations between the random regression coefficients obtained by the RRM1 and RRM2 models.
$a_1$, $a_2$ and $a_3$ - additive genetic regression coefficients; $p_1$, $p_2$ and $p_3$ - permanent environmental regression coefficients, corresponding to (co)variables $Z_1 = 1$, $Z_2 = t$ and $Z_3 = \exp(-0.05t)$ of the Wilmink function.

Table 6 - Estimates of genetic and environmental variances for selected DIM of daily yields obtained by the RRM1 and RRM2 models.

Table 7 - Estimates of heritability for selected DIM of daily yields obtained by the RRM1 and RRM2 models.

Table 9 - Permanent environmental correlation estimates between selected DIM of daily yields obtained by the RRM2 model.