Modeling of average growth curve in Santa Ines sheep using random regression models

Polynomial functions of age of different orders were evaluated in the modeling of the average growth trajectory in Santa Ines sheep in random regression models. Initially, the analyses were performed not considering the animal effect. Subsequently, the random regression analyses were performed including the random effects of the animal and its mother (genetic and permanent environment). The linear fit was lower, and the other orders were similar until near 100 days of age. The cubic function provided the closest fit of the observed averages, mainly at the end of the curve. Orders superior to this one tended to present incoherent behavior with the observed weights. The estimated direct heritabilities, considering the linear fit, were higher to those estimated by considering other functions. The changes in animal ranking based on predicted breeding values using linear fit and superior orders were small; however, the difference in magnitude of the predicted breeding values was higher, reaching values 77% higher than those obtained with the cubic function. The cubic polynomial function is efficient in describing the average growth curve.


Introduction
Random Regression Models (RRM) have become the standard procedure for the genetic evaluation of longitudinal data in animal improvement.Several studies have presented advantages of these models over the traditional models of analysis, according to Meyer (2004).
For the usage of the RRM in genetic evaluations of growth curve, it is necessary to fit a fixed continuous function to represent the growth tendency of average body weight of the population, which is called the average trajectory of growth (Kirkpatrick et al., 1990).The curve of each animal is obtained as a deviation of the average trajectory of growth (fixed regression).Therefore, it is R. Bras. Zootec., v.40, n.2, p.314-322, 2011 essential to model correctly the average trajectory in random regression models.
Even though some studies have been carried out using meat sheep, they directed efforts in an attempt to find the best modeling of the random part, paying little attention to the modeling of the fixed part.Lewis & Brotherstone (2002), by using the random regression technique on genetic growth evaluation of Suffolk sheep, used a continuous function of age of order five to describe the average growth curve of the population.Fischer et al. (2004), describing the growth curve of Poll Dorset sheep, using the same methodology, used age orthogonal polynomials of order three (quadratic).However, the authors reported that this decision was taken based on preliminary analyses performed by using ordinary least squares, with the help of the GLM procedure of SAS, ignoring the animal effect.Later, Sarmento et al. (2006a), using the methodology mentioned above for genetic evaluation of the growth curve of Santa Inês sheep, used Legendre orthogonal polynomials of orders three and four on the model fit as a whole, that is, from the average trajectory of growth and random regressions; the authors concluded that the model using a cubic continuous function provided a better fit.However, fixed curve modeling was the same attributed to the random part, which may not be the best analysis strategy.
In beef cattle, some authors investigated the order of fit in polynomial functions to represent the average curve of growth in random regression models.Meyer (1999), studying post-weaning growth of beef cows in two cattle breed in Australia, compared orthogonal polynomials of age of several orders by the analyses of ordinary least squares ignoring the animal effect.The author compared functions with orders varying from two to eight and concluded that the animal growth was better described by a cubic regression.In Brazil, Sakagutti et al. (2002) investigated the most appropriate order of polynomial function of age to describe the average growth curve in Tabapuã cattle.These authors also compared different orders in ordinary least squares analysis and concluded that the average growth curve should be represented by polynomials of, at least, order four.Arango et al. (2004) used the SAS MIXED procedure to find the best order of fit of the fixed regression over the age for modeling the average trajectory in beef cattle.According to these authors the quadratic regression of age was the highest polynomial order with significant effect, which was crucial for the authors to assume it as the best fit order.
Thus, this study has the objective of evaluating the most appropriate modeling to describe the average growth trajectory in Santa Inês sheep by polynomial functions of age of different orders, in random regression models, as well as investigating the influence of this fit on the estimation of the variance components and prediction of breeding values.The initial file consisted of 23,198 weight records measured at different ages.It was considered for analysis information on lambs weighed up to 196 days of age; born to single and twin birth; average daily weight within the average of all animals of the same contemporary group ± three standard deviations; and belonging to contemporary groups with at least five animals.The definition of contemporary group included: herd, year and season of weighing, sex of lamb, birth type and age classes of the animal.Two weighing seasons were chosen: rainy (from March to July) and season (from August to February).

A
The initial file consisted of 23,198 records of measured weights at different ages.Lambs weighed up to 196 days of age, born to single and dual delivery, with average daily weight within the average of all animals of the same contemporary group ± three standard deviations, and belonging to contemporary groups with at least five animals were considered for analysis of information.The contemporary group included: herd, year and season of weighing, sex of lamb, birth type and age classes of the animal.We defined two weighing stations: the rainy season (March-July) and dry season (August to February).
The age classes, grouped every 28 days (for a total of eight classes), were included in the contemporary group as suggested by Meyer (2005) to reduce the amplitude of ages of animals that were compared directly within each group.This decision was made based on the observation in previous studies of high amplitude of ages within contemporary groups that were formed not considering the classes, which directly reflected in an exacerbated increase of phenotype variance.After the assembling of the data and pedigree files, 17,767 weight records at different ages from 4,210 lambs born to 130 rams and 1,552 ewes were analyzed, totaling 5,537 animals in the relationship matrix.
As the age ranged from birth to 196 days, with weight records at average intervals of 28 days, the number of animals at some ages was small.Thus, to increase the number of animals at each age, they were grouped every three days, from 112 days, for a total of 151 age classes, including birth.
Initially, the average trajectory of the population was fitted by a fixed regression under orthogonal polynomials of age with orders varying from two to eight (k=2,…, k=8).The analyses were done by the ordinary least square method, using the GLM procedure of SAS (1999), and not considering the animal effect.The other fixed effects considered in this analysis were those of the contemporary groups and the age of the ewe at lambing, as linear and quadratic covariate.
In order to evaluate the fitting of the functions of different orders, the following criteria were used: significance of the regression coefficients; residual mean square (RMS), calculated as following: , where n is the number of observations, y i and y ^i are the observed and estimated values, respectively; the percentage of squared bias (PSB), proposed by Ali and Schaeffer(1987), which is given by ; mean absolute deviation (MAD), which was obtained as: , proposed by Sarmento et al. (2006b); and the coefficient of determination (R 2 ) -calculated as the square of the correlation between the observed and estimated weights by each function, which is equivalent to , where RSS is the residual sum of squares and TSS is the total sum of squares corrected by the mean.
In a second stage of analyses, a phenotypic random regression model of third order was fitted, including only the animal global effect as random, disregarding the relationship among the animals, using the program DXMRR (Meyer, 1998).For modeling the fixed regression, it was used Legendre polynomial function of age with orders varying according to what was previously described.The criteria used for the comparison of different orders were the same used in the previous analysis.
Later on, in order to evaluate the influence of the order of fit used to represent the average trajectory of the population over the estimated variance components and the predicted breeding values, seven animal models of different random regression were fitted, varying the order of fit of the fixed regression from two to eight.Both fixed and random regressions were represented by continuous functions, whose ages were described in terms of Legendre orthogonal polynomials.Generalizing, the studied models can be described as the following: in which: y ij = the weight on day j of lamb i; F = a set of fixed effects, constituted by the contemporary group (1,112 subclasses) and by the covariate age of ewe at lambing, linear and quadratic effects; β m = the fixed regression coefficient m of the weight over the Legendre Polynomial, with k β , polynomial order for the average growth curve, varying from two to eight (k β = 2, ..., k β =8) for modeling the average growth curve of the population; α im , γ im , δ im e ρ im = direct additive genetic regression coefficients, maternal additive genetic, maternal permanent environment and animal permanent environment regression coefficients, respectively, for the lamb i; k a =3, k m =3, k q =3 and k c =3 are orders of fit of corresponding Legendre Polynomial; φ m is the Legendre Polynomial function m of standardized age (-1 < age < 1); and ε ij denotes the residual random effect.
In the matrix form, the previous model, with its respective assumptions, can be described as: and in which y = a vector of N observations referring to N d animals; b = a vector that contains fixed effect and the b m coefficients of the fixed regression; a = a k a × N D vector of random regression coefficients for direct additive genetic, where N D > N d denotes the total number of animals in the analysis, that is, in the relationship matrix (5,357); c = k c × N d vector of random regression coefficients for animal permanent environment; m = a k m × N D vector of random regression coefficients for maternal additive genetic; q = k q × N m vector of random regression coefficient for maternal permanent environment, in which N m = the same number of ewes with offsprings; e = a vector of random errors; X, Z 1 , Z 2 , Z 3 and Z 4 = the incidence matrices of the fixed regression coefficients, random direct additive genetic, animal permanent environment, maternal additive genetic, maternal permanent environment regression coefficients, respectively.K a , K c , K m and K q = the (co)variance matrices among random direct additive genetic, animal permanent environment, maternal additive genetic and maternal permanent environment regression coefficients, respectively; A = the coefficient numerator of the relationship matrix among individuals; I Nd = an identity matrix of N d dimension; I Nm = an identity matrix of N m dimension; ⊗ = the Kroenecker product; R = diagonal matrix of residual variance.
The residual variance was considered homogeneous throughout the growth curve.The covariance between direct and maternal additive genetic effects was assumed to be zero.
The (co)variances among direct and maternal additive genetic, animal and maternal permanent environment random regression coefficients, according to the model fitted, were estimated by the restrict maximum likelihood model (REML), using the DXMRR program from the DFREML software (Meyer, 1998).
Changes occurred in variance components and genetic parameters estimated throughout the growth curve were evaluated, as well as the ranking of the animals (Spearman's correlation), based on predicted breeding values at weaning and at 196 days of age in relation to the fixed regression modeling.

Results and Discussion
The fixed effects of the contemporary group and the age of the ewe at birthing, linear and quadratic effects, considered in the analysis model of the growth curve were all significant (P<0.01,results not presented), in accordance with Sarmento et al. (2006b), when studying the growth curve of Santa Inês sheep.When evaluating the order of fit of the polynomial function of age, it was observed that the higher order with significant effect (P<0.01) was the third (k=3), suggesting that a quadratic function would be sufficient for describing the average growth curve.
Initially comparing the order of the fitted functions by the ordinary least square method, it was observed that the biggest change occurred when passing from a linear fit (k=2) to quadratic (k=3), as indicated by the criteria used for evaluating fit quality (Table 2).It can be observed that, according to RMS, MAD, R 2 and PSB, little improvement was obtained with the superior orders to the third (quadratic), in accordance with the significance of the regression coefficients in the analysis of variance.A slight decrease in the RMS was verified with the fit of order four and on RMS and PSB with the orders seven and eight.
It was observed that the linear fit was the worst (Table 2), while the others had similar orders of fit.The functions differed (Figure 2), mainly, from mid to the end of the growth curve, with the exception of the linear fit, which differed from the others at the beginning, in the middle, and at the end of curve, besides being further from the averages observed, confirming its inferiority.
The functions of order three, four and five tended to estimate curves closer together, although the one estimated by the quadratic function had moved away from most dots on the observed curve and the estimated curve by the polynomial of fourth and fifth order, after 100 days of age.Those estimated by orders four and five seem to have presented a higher flexibility, allowing it to estimate curves closer to the weights over the age.The difference between orders four and five was small, tending to increase with the proximity of the end of the curve.Thus, although not indicated by the criteria presented in Table 2, the cubic function provides a fit representing reality better.
It is also observed (Figure 2) that functions with orders superior to five, although not differing from the others, based on the criteria of Table 2, tended to present a less coherent behavior in terms of fit when compared to those of order three, four and five.The function of order six described a growth rate reduction when approaching to the end of the curve, estimating weights inferior to those  observed on the analyzed data.The functions of order seven and eight presented irregular behavior, at the end of the curve also, decelerating near the 160 th day and increasing again at 196 days of age, tending to increase the difference between these two estimated curves at the end of the trajectory.Similar behavior was verified by Sakaguti et al. (2002) with the function of order seven for modeling the growth curve in beef cattle.
A function of at least third order (quadratic), should be used (Table 2; Figure 2).However, an order four function (cubic) better reflected the reality of the changes than those of order three, that is, the cubic function fitted the average growth curve according to the body development of the studied sheep, mainly because it presented higher flexibility from mid to the end of the curve, at points that the functions most diverged.
The random regression analyses at the phenotypic level, including the global effect of the animal as random, were conducted only to examine the behavior of the average growth trajectories, estimated by Legendre polynomial functions of different orders, through the restrict maximum likelihood methodology, to compare with the analyses performed by ordinary least squares method.Fit improvement of the fixed regression was noticed through the mean absolute deviation, coefficient of determination and percentage of square bias by using the orders superior to two (Table 3).Mean absolute deviation considerably reduced with the increase from order two to four, tending to increase with superior orders, except with the eighth order which obtained the lowest MAD value.Similar behavior was verified with the PSB and R 2 .
Despite the best values of MAD, R 2 and PSB been obtained with the function with k=8, it was observed in Figure 2 that this order estimated a sinuous trajectory at the end of the period studied, which does not representing reality.However, it is evident that the polynomial function of fourth order (cubic) may be used to suitably represent the average growth curve of the Santa Inês sheep population under this study (Tables 2 and 3; Figure 2).These results contradict those obtained by Fischer et al. (2004), however they are in agreement with those obtained by Sarmento et al. (2006a and2010).
In an attempt to assess the influence of the order of fit of the fixed regression on the estimation of the variance components and on the resulting genetic parameters, they were estimated by the models which used the functions of different orders for the fixed regression.It is observed that the greatest influence of the order of fit of the functions to represent the average growth trajectory was on the animal random effect, with a greater proportion on the direct additive genetic effect (Figure 3).
The additive genetic variances estimated by the model that used the linear fit were superior to those estimated with functions of higher orders throughout nearly all the growth curve (Figure 3).Those estimated with the quadratic function, of order three, were slightly higher than those of order two from the middle to the end of the curve; the other orders estimated very close variances, in accordance with    the statistics used to evaluate the fit, indicating that orders that are higher than four virtually provide no improvement on the fit.For the animal permanent environment effect, it was verified a small difference between the linear fit and others, whereas the effect obtained with the linear fit was slightly inferior to those of superior orders (Results not presented).Maternal variance (genetic and permanent environment) was not influenced by the order used on the fit of the average growth curve.The models estimated that the maternal additive genetic and maternal permanent environment variances were very close, regardless of the order of fit of the function used in the fixed part of the model.
A behavior similar to that one seen with the estimates of the variance components was obtained with direct and maternal heritability estimates and with the proportions of phenotypic variance due to the animal and maternal permanent environment effects.Direct heritability estimated by the fit which used the linear function was superior to that estimated by the other functions, except at both ends of the curve (Figure 4), where all the estimates were similar.The quadratic function estimated heritability  slightly superior to the other functions, which were nearly the same.The highest heritabilities were estimated between 100 and 150 days of age, varying according to the used fit.In the linear fit, the highest direct heritability was 0.17, in the quadratic, it was 0.14 and in the others it was near 0.13.The genetic correlations estimated by the models that used different orders to fit the fixed regression hardly differed, except those estimated with linear fit.The correlations estimated with functions of orders superior to two were of high magnitude, near the unit.Those estimated with linear fit were slightly smaller on both ends of the curve, reaching 0.96.
It was observed that at weaning and at 196 days of age (Table 4), the lowest correlations were obtained between the breeding values predicted by the linear function and those predicted by the other functions, however the correlation obtained by the models that considered the linear and the quadratic fit was slightly higher (0.96).Despite the changes in ranking, they were not higher than 5% at weaning and than 4% at 196 days of age.Thus, except for the linear fit, the others virtually provided no change in the classification of the animals.
As the influence of the order of fit of the fixed regression over the rank of animals based on predicted breeding values by each one of the functions was moderate, that is, around 5% change when the fit was linear, it was also evaluated the possibility of change in the magnitude of the predicted breeding value due to the order of the function used.Thus, the animals were ranked based on the predicted breeding value at 112 days of age, which was obtained based on the model with the average growth curve fitted by the cubic function, and comparing the obtained breeding values of the eight best animals when it was used each of the functions (Table 5).
It is observed that there was, as expected, a great variation in the magnitude of the predicted breeding values for the weights at two ages (Table 5), although this higher variation occurred when the fitted function was linear or quadratic, when compared to the other functions.By observing the deviation, expressed in percentage, in relation to the predicted breeding value when the fit was cubic (Table 5), it was noticed that when the fit was linear, there was an increase of the predicted breeding value in more than 77% for the weight at weaning and more than 54% at 196 days of age.Smaller deviations, despite being considerable, were observed when it was a quadratic function, more than 11% for the weight at 112 days of age and around 13% for the weight at 196 days of age.With orders superior to four (cubic function), the deviations were smaller in magnitude, close to 5%.Based on the order correlations shown in Table 4, the quadratic fit hardly altered the classification when compared to the other fits, although the change of the values has been considerable (Table 5).When the fit was linear, there was some changes on the classification of about 5%, however, the change in breeding values ranged from 23% to 77%, for weight at 112 days of age, and from 10% to 54% for weight at 196 days of age.
Thus, it was observed that the modeling of the fixed curve changed with higher intensity the magnitude of the breeding values, that is, while altering the breeding value in a higher proportion, the change in classification did not happen with the same intensity.That might have occurred because certain fit provides changes to the breeding values in similar intensities to all the animals, providing little change in the classification, although the magnitude of values was modified.
The practical importance of this result is that even if an animal had high probability of being indicated as the best one based on predicted breeding value by a model using a quadratic or cubic function, the expected progeny differences (EPDs) of the same animal would be higher, which would result in erroneous estimates of annual genetic gains, in case the most suitable function is not the one used.

Conclusions
The use of a polynomial function of order four (cubic) proves to be efficient to describe the average growth curve, described by the order of fit of the function, of the population of Santa Inês sheep under this study.The estimates of direct additive genetic variance and heritability are influenced by the order of fit, which also influences with more intensity the magnitude of predicted breeding values than the classification of the animals.
total of 17,767 weight data obtained from weight records measured at different ages from 4,210 lambs born to 130 sire and 1,552 dams which belong to three experimental Santa Ines sheep flocks of the Paraíba State Company of Agricultural Research (EMEPA-PB) and Brazilian Agricultural Research Company (EMBRAPA Caprinos and EMBRAPA Tabuleiros Costeiros) were used.

Figure 2 -
Figure 2 -Dispersion of the average weights observed according to age ( ) and estimated average trajectory of the population (fixed regression under orthogonal polynomials of age) by least square according to the order of fit (k=2, ..., k=8).

Figure 3 -
Figure 3 -Components of direct additive genetic variances estimated by models that used functions of different orders to represent the average trajectory of the population.

Figure 4 -
Figure 4 -Estimates of direct heritability obtained by models that used functions of different orders (k=2, ..., k=8) to represent the average trajectory of the population.

Table 2 -
Fit of the average trajectory of the population by a polynomial function with different orders(k=2, ..., k=8)

Table 3
MAD = Mean absolute deviation; R 2 = Coefficient of determination; PSB = Percentage of squared bias.

Table 4 -
Spearman's correlation between the breeding values predicted for weights at weaning (above the diagonal) and at 196 days of age (below the diagonal) by models that used functions of different orders to represent the average trajectory of the population

Table 5 -
Breeding value and deviation (%) in relation to the predicted cubic function, for weight at 112 and 196 days of age for each of the fitted functions