Estimation of genetic parameters for test ‐ day milk yield in Khuzestan buffalo

The objective of this work was to estimate covariance functions for additive genetic and permanent environmental effects, as well as to obtain genetic parameters for buffalo test‐day milk yield using random regression models on Legendre polynomials (LPs). A total of 2,538 test‐day milk yield (TDMY) records from 516 first lactation records of Khuzestan buffalo, calving from 1993 to 2009 and belonging to 150 herds located in the state of Khuzestan, Iran, were analyzed. The residual variances were modeled through a step function with 1, 5, 6, 9, and 19 classes. The additive genetic and permanent environmental random effects were modeled by LPs of days in milk using quadratic to septic polynomial functions. The model with additive genetic and animal permanent environmental effects adjusted by cubic and third order LP, respectively, and with the residual variance modeled through a step function with nine classes was the most adequate one to describe the covariance structure. The model with the highest significant log‐likelihood ratio test (LRT) and with the lowest Akaike information criterion (AIC) and Bayesian information criterion (BIC) was considered to be the most appropriate one. Unexpected negative genetic correlation estimates were obtained between TDMY records of the twenty‐fifth and thirty‐seventh week (‐0.03). Genetic correlation estimates were generally higher, close to unity, between adjacent weeks during the middle of lactation. Random regression models can be used for routine genetic evaluation of milk yield in Khuzestan buffalo.


Introduction
According to climatic conditions, Iranian water buffalo can be classified into three main groups: Azari ecotype, in Western and Eastern Azerbaijan; North ecotype, in Guilan and Mazandaran; and Khuzestan ecotype, in Khuzestan (Ghavi Hossein-Zadeh, 2016), which show some similarities to Iraqi buffalo (Tavakolian, 2000).Khuzestan, a province in the southwest of Iran, is one of the important regions for raising buffalo.More than 22% of the buffalo population in the country is found in this area, with a herd size of 5 to 300 animals (Naderfard & Qanemy, 1997).
The estimate of daily yield production with test-day models has several advantages over the traditional procedures of evaluating lactation records, such as the ability to account for environmental effects on each test-day and to model individual lactation curves (Schaeffer et al., 2000).Random regression models (RRM) have been proposed as an alternative methodology for the analysis of longitudinal data or repeated measures records.For these reasons, RRM were recommended for analyses of test-day models in dairy cattle (Schaeffer & Jamrozik, 2008).
RRM allow obtaining breeding values for milk yield at any day of lactation in a continuous manner or for functions of lactation curve, when compared to finite dimensional models that only give punctual predictions of breeding values.Moreover, RRM provide estimates of breeding values with higher accuracies than the conventional finite dimensional models, because all lactation and short-length lactation records can be used in the genetic evaluation (Jamrozik et al., 2000;Schaeffer et al., 2000).
The majority of random regression analyses fitted polynomials of time or age as basic functions at recording.In particular, Legendre polynomials (LPs) have been widely applied to estimate covariance functions for growth traits in beef cattle and production traits in dairy cattle.The order of LPs in RRM is important in that estimates of genetic parameters can differ with the order (Misztal et al., 2000).High-order polynomials were found to be the most adequate way of modeling changes in the mean and variance over time; however, these orders of polynomials might lead to errors in the estimates of genetic parameters, mainly due to oscillations at the extremes of the curve (Meyer, 2005).One alternative that is currently being studied to reduce the order of these polynomials and to minimize estimation problems is the application of segmented polynomials or spline functions (Laureano et al., 2014).
In Iran, the genetic analyses for milk yield of buffalo are carried out using a finite dimensional model, as described in other studies (Rosati & Van Vleck, 2002).However, worldwide, RRM are currently being used for national genetic evaluations of dairy cattle in several countries, such as the United States, Canada, Australia, and Brazil (Sesana et al., 2014).Therefore, it is crucial to apply these models in a genetic evaluation program for milking buffalo in Iran.Madad et al. (2013a) estimated the genetic parameters for milk and fat yields in Khuzestan buffalo of Iran with the multivariate restricted maximum-likelihood (REML) procedure; however, there are no estimates of genetic parameters for test-day milk yield in Khuzestan buffalo using RRM.
The objective of this work was to estimate covariance functions for additive genetic and permanent environmental effects, as well as to obtain genetic parameters for buffalo test-day milk yield using RRM on LPs.

Materials and Methods
A total of 2,538 test-day milk yield (TDMY) records from 516 first lactation records of Khuzestan buffalo, calving from 1993 to 2009 and belonging to 150 herds located in the state of Khuzestan, Iran, were analyzed.The age of the evaluated cows varied from 24 to 60 months.TDMY was considered in weekly classes, from 1 to 37 weeks.The number of animals with records, number of sires, and number of dams were 516, 151, and 685, respectively.Contemporary groups (CGs) were defined according to the effects of herd, year, month, and day of milk test, as well as to year and season of calving.
The choice of fixed effects to be considered was made after testing whether the effects were statistically significant with the general linear model procedure of SAS (SAS Institute Inc., Cary, NC, USA).All of the fixed effects were significant (p<0.05) and included in the final model of analysis.Only records of buffalo with at least four tests and belonging to CGs of at least four animals were kept.
Residual variances were modelled through a step function with the following classes: 1, 5 (1, 2, 3, 4, and from 5-39 weeks), 6 (1, 2-3, 4, 5-13, 14-35, and 36-39 weeks), 9 (1-5, 6-9, 10-13, 14-17, 18-21, 22-25, 26-29, 30-33, and 34-39 weeks), and 19 (1, 2, 3, 4, 5, 6, 7, 8, 9, 10-13, 14-18, 19-24, 25-28, 29, 30, 31-32, 33-35, 36-37, and 38-39 weeks).The models Pesq.agropec.bras., Brasília, v.51, n.7, p.890-897, jul.2016 DOI: 10.1590/S0100-204X2016000700012 of analysis included the fixed effects of CGs and age of cow at calving as linear and quadratic covariables, respectively.The additive genetic and animal permanent environmental effects were considered as random effects.Additive genetic effects (a) were modeled through quadratic, cubic, quartic, quintic, and sextic polynomial functions, involving k a = 3, 4, 5, 6, and 7 random regression coefficients, respectively.Animal permanent environmental effects (p) were modeled through quadratic, cubic, quartic, quintic, sextic, and septic polynomial functions, involving k p = 3, 4, 5, 6, 7, and 8 random regression coefficients, respectively.The following random regression model was used for the analysis: in which, Y imnptv is the test-day record i obtained at dim t of buffalo p calved at the n th age in herd-test date m; F imnptv are fixed effects related to Y mnptv (herd, year, month, and day of milk test, and year-season of calving); Cf is the f th fixed regression coefficient for calving age; age n is the n th calving age; k is the order of fit for fixed regression coefficients (k=2); βr is the r th fixed regression coefficient; ka is the order of fit for additive genetic random regression coefficients; kp is the order of fit for permanent environmental random regression coefficients; αpr is the r th random regression coefficient of additive genetic value of buffalo p; γpr is the r th random regression coefficient of permanent environmental effect of buffalo p; Φr(dimt ) is the r th coefficient of LPs evaluated at days in milk t th ; and εmnpt v is the random residual error.It was assumed that distributions for random genetic and residual effects were multivariate normal distribution with mean 0 and variances A a σ 2 and I e σ 2 , respectively, in which A and I are the additive numerator relationship matrix and identity matrix, respectively; and σ a 2 and σ e 2 are direct genetic and residual variances, respectively.The (co)variance components and genetic parameters for productive traits were estimated using the average information (AI) REML algorithm of the Wombat program (Meyer, 2006).The following formulas were used to estimate the additive genetic, permanent environmental, and residual variances, as well as the heritabilities and genetic and permanent environmental correlations between different days in milk: ; are the direct additive and permanent environmental covariances between days in milk i th and j th ; h i 2 is the heritability for days in milk i th ; R a(i,j) and R p(i,j) are the direct genetic and permanent environmental correlations between days in milk i th and j th , respectively; K a is the order of fit for additive genetic random regression coefficients; K p is the order of fit for permanent environmental random regression coefficients; and φ ir and φ' jr are r th coefficient of LPs evaluated at days in milk i th and j th , respectively.Results from different models of analyses were compared by the REML form of the Akaike information criterion (AIC) (Akaike, 1974) and of Schwarz's Bayesian information criterion (BIC) (Schwarz, 1978), and by inspecting the variance component and genetic parameter estimates.A model with the highest significant (p <0.05) LRT and with the lowest AIC and BIC was considered to be the most appropriate.The information criteria are as follows: AIC = -2log + 2p and BIC = -2log + p log (N -r(x)).
In addition, LRT for models i and j was calculated with the following formula: LRT ij = 2 × (Log L i -Log L j ).
For the abovementioned equations, p denotes the number of parameters estimated; N is the sample size; r(x) is the rank of the coefficient matrix of fixed effect in the model of analysis; and Log L is the REML maximum log likelihood.The polynomial order and type of residual variance in different RRM are similar: k a .kp .hety or k a .kp .hom, in which: k a is the order of the covariance function for the additive genetic effect; k p is the animal permanent environmental effect; hom is the homogeneity of residual variances; and het are the residual variances modeled by a step function with y classes.

Results and Discussion
The mean, standard deviation, and coefficient of variation for TDMY during lactation were 8.96 kg, 2.69 kg, and 3.3%, respectively.The highest TDMY, i.e., 10.36 kg, was observed at the nineteenth week of lactation and then decreased to 7.05 kg until the end of lactation (Figure 1).Breda et al. (2010) found that the peak TDMY of Murrah buffalo occurred around the eleventh week of lactation, whereas Hurtado-Lugo et al. (2006) reported the greatest values of TDMY close to the middle of lactation in North Colombia buffalo.According to Madad et al. (2013b), the peak milk yield of Iranian buffalo occurred on the ninetieth day of lactation.
The order of fit for additive genetic and permanent environmental effects was kept constant to define the best variance structure to model the residual variances.The results of LRT, AIC, and BIC indicated a significant improvement in the level of fit when residual variance was considered heterogeneous (Table 1).This shows that residual variances had different behavior along lactation; therefore, it is necessary to consider a heterogeneous variance structure for the residuals.Several studies on dairy cattle have found a heterogeneous variance structure for the residual over lactation (Brotherstone et al., 2000;Bignardi et al., 2009).In the present study, only models with step function fit were The AIC results indicated that a step function with 19 classes was the most adequate to model the residual variances.However, the results of LRT and of BIC that  (1) k a , order of fit of additive genetic effect; k p , order of fit of permanent environmental effect; e, residual effect with heterogeneous (het) or homogeneous (hom) classes; P, number of parameters; Log L, log likelihood value; AIC, Akaike information criterion; BIC, Bayesian information criterion; and LRT, likelihood ratio test.**Significant at 1% probability.ns Nonsignificant.
penalized more parameterized models showed that a step function with nine classes was sufficient to model the residual variances.After having chosen the most adequate structure to model the residual variances, with nine classes of residual variances, several models were compared varying in the order of covariance functions for additive genetic (k a ) and permanent environmental effects (k p ).According to the AIC, the 3.3.het19model was the most adequate one to describe the covariance structure of data, whereas the BIC pointed out that the 3.3.het9model was the best to fit the data.Since BIC tends to choose more parsimonious models and is more rigorous than AIC, the 3.3.het9model was selected as the most adequate one to describe milk yield variation during lactation.
Additive genetic variances had the highest values in the early weeks of lactation, whereas permanent environmental variances were the highest in the twenty-fifth week (Figure 2).Likewise, Sesana et al. (2010) compared different structures of permanent environmental variances using RRM and found higher variance estimates at early lactation in milking buffalo.Madad et al. (2013b), using RRM in dairy buffalo, also reported high additive genetic variance in the early months of lactation, despite observing lower variances.
Phenotypic variance estimates were higher during the first two weeks of lactation, which decreased at the ninth week and then reached the maximum value at the thirteenth week of lactation.Based on the results of Bignardi et al. (2009) and Sesana et al. (2010), using RRM in dairy cattle and in milking buffalo, respectively, the highest phenotypic variances for dairy traits occurred in the first days of lactation.Residual variance estimates were higher at the middle of the lactation period and lower at the end.Sesana et al. (2010) compared different structures of residual variances using RRM and also found higher residual variance estimates at early lactation in dairy buffalo.The heritability estimates were higher at the beginning of the lactation period and then decreased along lactation (Figure 3), probably due to the limited production of data attributed to the small number of animals and milk yield records.Heritability estimates obtained by the 3.Even though higher heritability estimates were obtained at the beginning of lactation in the present study, RRM are probably weak to describe variance components at the extreme of the trajectory, when the number of TDMY decreased.In addition, the highest heritability estimates at initial lactation might be due to the fact that milk yield during the first test-days is critical to calf survival in terms of both volume and content, and, therefore, could have a large genetic component (Geetha et al., 2007).According to Meyer (2005), LPs were susceptible to "end-range" problems, including implausible variance estimates at the extreme of the trajectory, mainly when the number of records decreased during this period, as observed in the present work.Furthermore, Cobuci et al. (2005), who used the exponential function of Wilmink (1987), found higher heritability estimates at the end of lactation but lower estimates at the beginning.
Estimates for milk yield ranged from -0.03 to 1.0 for genetic correlations and from -0.94 to 1.0 for the permanent environmental ones (Table 2).Genetic

Figure 1 .
Figure 1.Number of records (lines) and test-day milk yield (bars) for each week of lactation of Khuzestan buffalo.

Figure 2 .
Figure 2. Additive genetic (A), permanent environmental (B), phenotypic (C), and residual (D) variance estimates for test-day milk yield for each week of lactation of Khuzestan buffalo, obtained with a model with nine classes of heterogeneous residual effect and orders of fit for direct additive and permanent environmental effects equal to 3.

Figure 3 .
Figure3.Heritability estimates for test-day milk yield for each week of lactation of Khuzestan buffalo, obtained with a model with nine classes of heterogeneous residual effect and orders of fit for direct additive and permanent environmental effects equal to 3.

Table 1 .
Order of fit for the polynomial used in the different models and information criteria(1).
Hurtado-Lugo et al. (2006).53 in the first week to 0.04 in the thirty-third week.When the estimates of heritabilities for milk yield at early lactation were ignored, heritability for milk yield was higher than that reported byHurtado-Lugo et al. (2006)for Murrah buffalo in Colombia (0.01 to 0.20), but similar to that obtained by Chakraborty et al.

Table 2 .
Estimated genetic correlations (below diagonal) and permanent environmental correlations (above diagonal) among different lactation weeks for test-day milk yield in Khuzestan buffalo.