Random regression models with B-splines to estimate genetic parameters for body weight of young bulls in performance tests

The objective of this study was to estimate genetic parameters for body weight of beef cattle in performance tests. Different random regression models with quadratic B-splines and heterogeneous residual variance were fitted to estimate covariance functions for body weights of Nellore and crossbred Charolais × Nellore bulls. The criteria −2 residual log-likelihood (−2RLL), Akaike Information Criterion (AIC), and consistent AIC (CAIC) were used to choose the most appropriate model. For Nellore bulls, residual variance was modeled with six classes of age, and direct additive genetic and permanent environment effects were modeled with quadratic B-splines with two and one intervals, respectively. For crossbred bulls, quadratic B-splines with one interval fitted direct additive genetic and permanent environment effects and nine classes of age were needed to fit residual variance. Pooling classes of age with up to 40% in difference of residual variances does not compromise the fit of the model. Heritability for body weight in performance tests are moderate (>0.25, for crossbred bulls) to high (>0.5, for Nellore bulls) and genetic correlation between weights over the test are also high (>0.65). Then, selection of young bulls in performance test is an efficient tool to increase body weight in beef cattle.


Introduction
Using genetically superior young bulls is important for the increase in efficiency of beef cattle production.Genetic differences among young bulls can be measured in performance tests.In such tests, candidates for selection are maintained in controlled environments and are later ranked according to specific criteria.Typically, body weight is an important selection criterion for beef cattle breeding programs (Campos et al., 2014), which is measured several times over the life of the animal.Random regression models are often more appropriate for genetic evaluation of body weight because they describe phenotypic and genetic changes over time.
The growth of Bos indicus (Riley et al., 2007;Boligon et al., 2012;Lopes et al., 2012;Scalez et al., 2014;Bertipaglia et al., 2015), Bos taurus (Meyer, 2005a,b;Mota et al., 2013b), and crossbred beef cattle (Sánchez et al., 2008;Baldi et al., 2010;Mota et al., 2013a;Scalez et al., 2014) has been modeled by random regression models.Legendre polynomials are the typical choice for most researchers for fitting mean growth trajectories and also additive and permanent environmental effects.Despite the advantages observed in random regression analyses, its use with Legendre polynomials can lead to problems when the variances increase at extremes of age intervals (Meyer, 2005b).Overcoming these issues involves increasing the order of Legendre functions or using more robust (e.g., spline) functions that allow low-degree polynomials to fit over short segments of the growth trajectory.
The identification of the best-fit models is essential because inadequate modeling of genetic and non-genetic effects affects selection response.Since the use of B-spline polynomials in random regression models for beef cattle genetic evaluations is recent, it remains important to study this type of function.Therefore, the current study estimated genetic parameters for body weight of beef cattle in performance tests using random regression models with quadratic B-spline functions.

Material and Methods
The weight and age data of two genetic groups of young bulls (Nellore and MA) were recorded during performance tests.The genetic group MA (21/32 Charolais + 11/32 Nellore, approximately) came from a rotational crossbreeding of Canchim (5/8 Charolais + 3/8 Zebu) and Charolais bulls with Nellore cows.The Nellore breed database consisted of 16,291 observations from 3,356 animals across 37 performance tests conducted by Grupo Provados a Pasto.The weaned animals were kept in a single management group on pasture with mineral supplementation throughout the evaluation period.The young bulls were weighed after an adaptation period of 70 days and at regular intervals of 56 days up to 224 days after the end of the adaptation period.Approximately 15 and 85% of the animals had four and five records each, respectively.For Nellore, age ranged from 262 to 642 days.
The data for MA group were from the evaluation program of young bulls of Agropecuária Ipameri.The growth of bulls was evaluated in 10 performance tests between weaning (225 days of age, on average) and approximately 18 months of age.The animals were kept on pasture and received protein-energetic supplementation during the dry season and mineral supplementation in the whole evaluation period.The database consisted of 3,997 observations of 884 MA bulls.Approximately 3, 63, 13, and 21% of the animals had three, four, five, and six records each, respectively.For MA group, age ranged from 166 to 707 days.Information on the average weight and the number of observations for each age was reported in Scalez et al. (2014).
Random regression models were fitted to each genetic group separately.Each model included the fixed effects of the contemporary group and the average growth trajectory (quadratic B-spline polynomial with four intervals) nested in the performance test.The contemporary groups of Nellore bulls were composed of animals born in the same year and month and evaluated in the same performance test.The contemporary groups of MA bulls included animals born in the same year and season (i.e., January to March, April to June, July to September, and October to December), and weighed and evaluated in the same performance test.Quadratic B-splines polynomials were used to model the random additive genetic and permanent environmental effects.
The coefficients of the quadratic B-spline functions were generated for the k interval defined by the points T k and T k+1 , with T k ≤ T k+1 , according to De Boor (2001) and Meyer (2005b).The polynomial functions considered in the present study were those with up to four intervals of the same size.The division into m intervals requires the specification of m−1 internal knots and two external knots (T 0 and T m ).This approach generates m+1 knots and m+p non-null functions φ k,p , in which p is the degree of the basis function (Meyer, 2005b).The general random regression model with quadratic B-splines was described as follows: , in which CG i represents the effect of the i-th contemporary group; is the k-th quadratic B-spline covariate at age j, which was nested in i*-th performance test to fit the average growth trajectory; β k(i*) is k-th coefficient of the fixed regression nested in fixed effect i*; α km and ρ km are the k-th random regression coefficients for the additive genetic and permanent environmental effects of animal m, respectively; and M a and M p represent the number of intervals used to model the random additive genetic and permanent environmental effects, respectively.
In matrix notation, the quadratic B-spline general random regression models were as follows: , in which represents the vector of observations, X is the incidence matrix of fixed effects (i.e., contemporary groups and average growth trajectories), is the vector of fixedeffect solutions, Z 1 is the incidence matrix of random additive genetic effects, is the vector of random additive genetic effect solutions, Z 2 is the incidence matrix of random permanent environmental effects, is the vector of random permanent environmental effect solutions, and is the vector of random errors.The covariance of random effects was defined as follows: , in which K a represents the covariance matrix of random regression coefficients for additive genetic effects, A is the additive genetic relationship matrix, K p is the covariance matrix of random regression coefficients for permanent environmental effects, I is the identity matrix, and R is the covariance matrix of random error.The relationship matrices were composed of 4,283 and 1,491 animals for Nellore and MA groups, respectively.
The covariance components and genetic parameters were estimated by the Restricted Maximum Likelihood method using the Wombat software (Meyer, 2007).Models with one, two, four, eight, and sixteen classes of residual variance were evaluated.After analyzing the model with 16 classes, the adjacent residual variance classes with differences less than 10, 20, 30, 40, and 50% (relative to the smallest value of variance) were pooled, yielding new models with fifteen, ten, nine, six, and four age classes for Nellore and fourteen, thirteen, thirteen, ten, and nine age classes for MA group, respectively.In these analyses, the additive genetic effect and the direct permanent environmental effects were modeled using a quadratic Bspline polynomial with four intervals to compare models with different numbers of classes of residual variance.The criteria −2 restricted likelihood function (−2RLL), Akaike Information Criterion (AIC), and consistent Akaike Information Criterion (CAIC) were used to choose the most appropriate number of classes of age for residual variance (Bozdogan, 1987;Wolfinger, 1993).
After defining the most suitable residual variance structure, the models with differences in the number of intervals for the covariance functions of random effects were compared.Models with one, two, three, and four intervals were evaluated (sixteen models).The knots were equidistant and the criteria used to assess the best model were the same as those defined in the previous stage.The quadratic B-spline models were named according to the number of non-null functions.Thus, the quadratic B-splines with one, two, three, and four regular intervals for additive genetic and permanent environment effects were named QBS33, QBS44, QBS55, and QBS66, respectively.
After estimating the covariance functions, the breeding values of selection candidates were predicted from the solutions of the random regression coefficients based on Mrode (2005).The predicted breeding values were calculated for weight at 300, 365, 450, and 550 days for Nellore group and at 225, 365, 450, and 550 days for MA group.These predictions were used to rank the top 10% young bulls within each performance test.The percentage of commonly selected animals was obtained using different models indicated by −2RLL, AIC, and CAIC.In addition, Pearson and Spearman correlations were computed between the breeding values obtained using the models indicated by the above criteria and the breeding values obtained using the homogeneous residual variance model.

Results
The values of −2RLL, AIC, and CAIC were higher in the model with homogeneous residual variance for both genetic groups (Table 1).Thus, these models had the worst fit.The AIC showed the lowest values for models with nine age classes for Nellore, and ten and thirteen age classes of residual variances for MA group (Table 1).The CAIC values were lower for models with six and nine age classes of residual variance (Table 1).Therefore, these results indicate that such models fitted better the records of Nellore and MA groups, respectively.In general, models with low information criteria values were derived by pooling adjacent age classes of residual variances.The estimates of residual variance increased from the beginning to 550-600 days of age (Figure 1).The models that considered homogeneous residual variance showed the lowest percentage of individuals selected in common compared with models with heterogeneous residual variance (Table 2).Pearson and Spearman correlations involving the breeding values predicted by the models with heterogeneous residual variance within each test were above 0.97.Considering the models with heterogeneous residual variance, the average percentage of individuals selected in common were above 98 and 96% for Nellore and MA groups, respectively, with minimum values equal or greater than 80% (Table 2).
After assessing different quadratic B-splines functions to adjust the additive and permanent environmental effects for growth of Nellore bulls, the QBS65 model displayed the best fit according to −2RLL and AIC.The variance estimates obtained by the models QBS43 and QBS65 were similar for Nellore breed, especially at initial ages and after 550 days of age (Figure 2).These estimates were also similar to those obtained using the most complete model, QBS66.On the other hand, the simplest model (QBS33) showed high values for the genetic additive variance and low estimates for the permanent environmental variance (Figure 2).The estimates of permanent environmental variance were constant in most of the age intervals for the Nellore breed (Figure 2).
In the case of MA genetic group, the models QBS66, QBS63, and QBS33 exhibited the best fit based on −2RLL, AIC, and CAIC criteria, respectively (Table 3).The trends of the direct additive genetic and the direct permanent environmental variances were similar for the QBS33 and QBS66 models (Figure 2).However, the permanent environmental variance was underestimated in the QBS63 model, most likely due to the overestimation of the direct additive genetic variance.The estimates of permanent environmental variances increased from the beginning to approximately 500 days and were constant after this point for the MA group (Figure 2).
The −2RLL, AIC, and CAIC indicated different models.To assess any disparities in the selection of young bulls, the percentage of common selected bulls were compared among models (Table 4).
The lowest value for the average percentage of young Nellore bulls commonly selected among the top 10% within each performance test using the QBS43 (lowest CAIC value) or QBS65 (lowest −2RLL and AIC values) models was 85%.Despite this high average, the minimum value was only 50% for one of the performance tests.The lowest value for the average percentage of MA young bulls selected in common in the top 10% rank using QBS63, QBS33, and QBS66 models was 67.3%, with a minimum value below 50% for some performance tests (Table 4).Furthermore, the estimates of genetic and permanent environmental variances obtained with the QBS63 model were higher and lower, respectively, than the estimates of the other models involved in this comparison (Figure 2).
The estimates of heritability obtained by QBS65, QBS43, and the most complete model (QBS66) were similar for the Nellore breed (Figure 3).However, in the simplest model (i.e., QBS33), the estimates of heritability were lower than those of the other models included in the comparison (Figure 3).The estimates of heritability for body weight of MA bulls showed similar results with QBS33 and QBS66 models.On the other hand, heritability estimates with QBS63 model were higher for nearly all

Discussion
Heterogeneous residual variance is more appropriate for modeling body weight data at different ages because the temporary environment does not affect equally the entire  1 QBSk a+2 k ap+2 : quadratic B-spline, in which k a and k ap specify the number of intervals of additive genetic and permanent environmental effects, respectively.
ages, most likely due to the overestimation of the direct additive genetic variance (Figure 3).The genetic correlation between weights at different ages was high, with values above 0.92 and 0.65 for Nellore and MA groups, respectively (Figure 4).
1 QBSk a+2 k ap+2 : quadratic B-spline, in which k a and k ap specify the number of intervals of additive genetic and permanent environmental effects, respectively.animal growth trajectory (Bohmanova et al., 2005;Dias et al., 2006;Toral et al., 2009).However, the use of age classes in the adjust of residual variance increases the number of parameters in the models (El Faro and Albuquerque, 2003).Alternatively, some classes can be clustered to reduce the number of parameters in the model without compromising its fit (Toral et al., 2009).This study supports this hypothesis because models that generally showed lower information criteria values were derived from the pooling of adjacent age classes.Thus, this approach reduces the number of classes and highlights the possibility of working with different size of age classes (Table 1 and Figure 1).The estimates of residual variances for the first and last classes of age were lower (Figure 1), especially for MA group.This result is partially justified by the small number of records at the extremes of the average growth curve and also due to a better fit of the mean trajectories at these regions.In the MA group, classes with more ages showed the worst estimates of residual variance.Moreover, this finding can be due to the combination of small number of observations within each class that inadequately fits the average growth curve.For instance, this situation was observed in the class that represents the interval from 642 to 675 days of age in the MA group for all heterogeneous residual variance models.
The CAIC indicated models with six and nine classes of residual variance for Nellore and MA group, respectively.These models co-selected several young bulls that were chosen based on the use of more parametrized residual variance structure as suggested by −2RLL and AIC.Therefore, this finding indicates that both models were as efficient as the models with more residual variance classes for the two genetic groups.This result confirms the possibility of pooling age classes with up to 40% differences in variance as a viable approach to reduce the number of model parameters without compromising its fit or the overall evaluation quality.
The QBS65 and QBS43 were indicated by the goodness-of-fit criteria, after testing different quadratic B-spline functions for genetic additive and permanent environmental effects.The estimates of variance and heritability were similar for both models within Nellore breed, especially at the beginning and end of the age interval.Regarding the MA group, the trends of additive genetic and permanent environmental variances were similar for the simplest (QBS33) and the most complete model (QBS66) (Figure 2).However, the permanent environmental variance was underestimated in the intermediate model ( QBS63), most likely because of the overestimation of the direct additive genetic variation (Figure 2).According to Meyer (2005b), few and dispersed records at the upper end of the curve caused problems when estimating the variance components.Particularly, the direct permanent environmental effect showed remarkable dependency between the variance estimates and the order of the polynomials.
The phenotypic variance trends were similar for all models (Figure 2), which suggest that differences between models are related to the partition of the variances.
The differences between estimates of heritability for body weight of MA bulls in performance tests (Figure 3) reflected differences in the estimates of additive genetic variances of the models QBS33, QBS63, and QBS66.The use of different models can result in disparities between variance components (Albuquerque and Meyer, 2001;Toral et al., 2009).Changes in the estimates of heritability can have serious implications for the design of animal breeding programs because this parameter is essential for defining selection criteria and methods of selection.
Regarding random regression models, studying statistical fits is crucial to choose suitable models that are applicable to real datasets.In the present study, the different models indicated by the goodness-of-fit criteria did not necessarily result in different variance component.This finding supports the need for more detailed studies on the choice of statistical models for the genetic evaluation of beef cattle for body weight and, possibly, for evaluation of other traits in different species.
In general, the estimates of direct additive genetic and phenotypic variances increased with age for the two genetic groups.These results are in agreement with those obtained by Albuquerque and Meyer (2001), Baldi et al. (2010), andLopes et al. (2012) with Nellore and Canchim (5/8 Charolais + 3/8 Zebu), and reflects changes in genetic and phenotypic variances over time.
The present study showed oscillations in the estimates of permanent environmental variance, including the models with three and four intervals.However, these variations were lower for shorter periods than those reported by Dias et al. (2006), who used Legendre polynomials to model random effects for growth from birth to 550 days of age of Tabapuã animals.According to these authors, this increase reflects the difficulties in modeling this effect.Meyer (1999) also pointed the issues of using Legendre polynomials for cattle growth, primarily because of the large emphasis that is placed on the observations at the extremes of growth curve.Meyer (2005b) suggested using B-spline functions to correct this problem because these functions might improve the fit at the extreme of the age intervals given that the fit of each part of the trajectory is based on the number of intervals.These oscillations for R. Bras.Zootec., 47:e20150300, 2018 short periods might be the result of the split of the age interval using B-splines, given that sharp variations in the estimates were not observed in the model with only one interval (MA group, QBS33 model) (Figure 2).Another factor which might have contributed to obtaining nearly constant estimates of permanent environmental variance is the high average number of observations per animal, which was above 4.5 for the two datasets.
In general, the estimates of residual, genetic, and phenotypic variances increased because of the increase in the mean of the trait.This finding occurred in this study to some extent, with the residual variances increasing until 550-600 days.However, modelling the average growth trajectory with different intervals and classes of residual variances might result in a better fit of this curve decreasing the values of residues.
The estimates of heritability (Figure 3) suggest that it is possible to obtain genetic gain for body weight in Nellore and MA groups through the selection.Moreover, performance tests are conducted in homogeneous environments, which provides a better partitioning of the environmental and genetic effect, and these effects can be estimated with more accuracy.
The best model for the Nellore breed (QBS65) displayed heritability estimates that increased over the age interval.The QBS33 and QBS66 models showed a decreasing trend from beginning up to 365 days of age for MA group.The same trend was observed in Nellore breed evaluated after weaning (Lopes et al., 2012).This trend might have happened because of the confounding of genetic effects (maternal and direct), given that the MA young bulls were included in tests shortly after weaning.Dias et al. (2006) estimated that the maternal effect has the greatest magnitude until approximately 240 days of age.Therefore, heritability might have been overestimated for the first ages evaluated in the MA group because maternal effects were omitted.In the case of the Nellore animals, the heritability continuously increased, perhaps by not confusing the direct and maternal effects in the initial ages and because the animals underwent an adaptation period between weaning and the beginning of the test.
The percentage of animals commonly selected by different models indicates the disparities among them regarding animal ranking, which is possible and allows to choose models properly when the database is real.When the percentage of commonly selected individuals is low, the best model can be chosen based on comparisons involving the most complete model because such model is the one that best approximates the true model if the convergence of the analysis is guaranteed.
The comparison involving the two concurrent models for evaluation of body weight of Nellore bulls showed that the model with lower values of −2RLL and AIC (QBS65) provided the best fit (aside from the more complex model).Therefore, B-splines with at least four and three intervals, respectively, are recommended to fit additive and permanent environmental effects in genetic evaluation of Nellore bulls in performance tests.
The model that better fits the records of MA genetic group (CAIC, QBS33) resulted in parsimonious estimates of genetic parameters.Therefore, the choice of models for genetic evaluation should also consider the simplicity of the model and the ease of convergence.
In general, the correlations of body weight after 490 days of age were higher in both genetic groups (Figure 4), which, along with the estimates of heritability (Figure 3), suggests the possibility of direct responses and favorable correlations when the animals are selected based on their weight at the end of the performance tests.

Conclusions
B-spline functions with different residual variance structures can be applied to estimate the variance components for body weight of young bulls in performance tests.Pooling classes of age with similar residual variances reduces the number of estimated parameters and does not compromise the fit of the model.Selection of young bulls evaluated in performance test is an efficient tool to increase body weight in beef cattle.

Figure 1 -
Figure 1 -Estimates of residual variance for body weight of purebred Nellore (left) and crossbred Charolais × Nellore (right) young bulls in performance tests, according to the number of classes of age for residual variance.

Figure 4 -
Figure 4 -Estimates of genetic correlations between body weights in different ages of purebred Nellore (left; model QBS65) and crossbred Charolais × Nellore (right; model QBS33) young bulls in performance tests using quadratic B-spline 1 models.

Table 1
−2RLL -value of the restricted likelihood function; AIC -Akaike Information Criterion; CAIC -consistent Akaike Information Criterion. 1 Number of classes of age after pooling the classes of age from model with 16 classes.

Table 2 -
Mean and minimum (Min)values for the percentages of purebred Nellore and crossbred Charolais × Nellore young bulls among the top 10% for expected breeding value for body weight in standard ages within each performance test, according to the models with different number of classes of age (NCA) for residual variance

Table 4 -
Mean and minimum (Min)values for the percentages of purebred Nellore and crossbred Charolais × Nellore young bulls among the top 10% for expected breeding value for body weight in standard ages within each performance test, according to the number of intervals to model additive genetic and permanent environmental effects