Selection of models of lactation curves to use in milk production simulation systems 1

The objective of this study was to select models of lactation curves with a better adjustment to the observed data in models of milk production simulation systems. A data base on 6,459 recordings of daily milk production was used. These data were obtained from monthly and fortnightly controls of milk between 2004 and 2007, from 472 lactations of animals from ten different milking cow herd farms. Based on rolling averages of milk production (MP-L/day) per cow, the ten herd farms were divided into low (L < 15), medium (15  20). Data were also divided according to the lactation numbers in first, second, third or greater. Eight lactation curve models commonly used in literature were compared. The models were individually adjusted for each lactation. The goodness of fit used for comparison of those models was the coefficient of determination, mean square error, mean square prediction error and the Bayesian information criterion. The values for the goodness of fit obtained in each model were compared by using 95% probability confidence interval. Wilmink (1987) model showed a better adjustment for cows of the first lactation numbers, whereas the Wood (1967) model showed a better adjustment for cows of the third or greater lactations numbers for the low milk production groups. Wood model showed a better adjustment for all the lactation numbers for the medium milk production group. Dijkstra (1997) model showed a better adjustment for all lactation numbers for the high milk production group. Despite of being more recent, the model by Pollott (2000), mechanist based and with a higher number of parameters, showed a good convergence for the used data.


Introduction
The term lactation curve refers to a graphic representation of the ratios between milk production and lactation time starting at calving (Papajcsik & Bodero, 1988).Equations that describe milk production in time can be very useful in genetic breeding programs, herd nutritional management, decision taking on the culling cows and milk production simulation systems.
The lactation curve is also important because its wide characterization of the animal production throughout lactation allows to estimate the peak yield, lactation persistency and days in milk (Ferreira & Bearzoti, 2003).
The first attempts to mathematically represent the lactation curve were made by Brody et al. (1923) and Brody et al. (1924).However, only after the development of the model by Wood (1967) did the use of lactation curve models become more popular.Since then, many researchers have attempted to develop lactation curve models from empirical conceptions (Cobby & LeDu, 1978;Wilmink, 1987) or mechanist conceptions (Rook et al., 1993;Dijkstra et al., 1997;Pollott, 2000).
The objective of this study was to select models of lactation curves for analysis of milk production data to be used in milk production simulation systems.

Material and Methods
A database on 6,459 recordings of daily milk production (MP) was used.These data were obtained from monthly and fortnightly milk records from 2004 to 2007, from 472 complete lactations from ten different milking cow herd farms in the region of Viçosa, Minas Gerais, Brazil.All the farms are assisted by the Programa de Desenvolvimento da Pecuária de Leite (PDPL -RV) an agreement between Universidade Federal de Viçosa and Dairy Partner of America.
Based on rolling average milk production (MP), the ten herd farms were divided into low (MP < 15.0), medium (15.0 < MP < 20.0) and high (MP > 20.0).Except for herd farms 4 and 8, in which the cows were corn silage fed throughout the year, the other farms use the pasture production system, with bulk supplementation in the dry season of the year.All animals in the ten herd farms were either crossbred Holstein × Zebu (HZ) or purebred (PB) Holstein animals.The animals in the herd farms classified as low, medium and high milk production were predominantly crossbred 1/2 HZ and 15/16 HZ; crossbred 3/4 HZ and PB and crossbred 7/8 HZ to 31/32 HZ, respectively.
After classification, the data were ranked according to the lactation number, into first, second, third or greater, as reported by Val-Arreola et al. (2004) and Dematawewa et al. (2007).This procedure resulted in the formation of nine subgroups: L1 = low production and 1st lactation number; L2 = low production and 2nd lactation number; L3 = low production and 3rd or greater lactation number; M1 = medium production and 1st lactation number; M2 = medium production and 2nd lactation number; M3 = medium production and 3rd or greater lactation number; H1 = high production and 1st lactation number; H2 = high production and 2nd lactation number; and H3 = medium production and 3rd or greater lactation number.
Eight lactation curve models were compared and they ranged in the number of parameters and from empirical to mechanistic models (Table 1).
1 Yt = milk production per cow (L/day), t = days in milk, a, b, c, d, g, h = (all > 0) = parameters that define the scale or shape of curves.

Author
Lactation curve model 1  The parameters of the models were adjusted considering the set of information from each subgroup, using the interactive Gauss-Newton method modified for nonlinear regression, Proc NLIN procedure of SAS 9.0.After obtaining the fitted parameters, the following were estimated for each model: peak yield (PY), the peak day (Pday) and the 305 days corrected milk production (305MP).The calculations were made by using interactive methods in the STELLA 8.0 simulation program.The program uses stocks, flows and auxiliary variables to calculate the total accumulated value (305MP), the maximum point (Pday) and the value at the maximum point (PLP) of the lactation curve models.
The model was individually adjusted for each lactation within each subgroup, and the goodness of fit of the model parameters were calculated using the Proc MODEL procedure of SAS 9.0.The goodness of fit used to compare those models were the determination coefficient (R2), mean square error (MSE), mean square prediction error (MSPE) and the Bayesian information criterion (BIC).The values for the goodness of fit obtained in each model were compared by using the 95% probability confidence interval.
It was considered non-converged each lactation curve model that completed 100 interactions without reaching the reduction of the sum-of-squares error (SSE) or whose parameters converged to unreal values.In each subgroup, the percentage of convergence in each lactation curve model was calculated.When the convergence percentage for a certain lactation curve model was less than 50%, the model was considered of low convergence and discarded.

Results and Discussion
The production per cow ranged from 10.50 to 14.70 and from 16.14 to 19.26 L/day in five low milk production milk farm (L) and four medium milk production herd farms (M), respectively (Table 2).In the single herd farm of high production (H), the production was 22.91 L/day.
Within each group ranked by milk production (L, M, H), the production mean of the cows ranked three or greater (L3, M3 and H3) was higher than the cows of the second lactation number (L2, M2 and H2) which was greater than the first lactation number (L1, M1 and H1) (Table 3).By comparing milk production (L/day) among the groups, the production from the L3 subgroup, order of the greatest production in the L group, was smaller than that of M1, order of the smallest production in the M group.Furthermore, the production of the M3 subgroup, order of the greatest production in the M group, was smaller than that of H1, the order of the smallest production in the H group.
The lactation curve models by Cobby & Le Du (1978) estimated peak day values considerably closer to the beginning of lactation than the models by Wilmink (1987), Wood (1967), Rook et al. (1993) and by Dijkstra et al. (1993), in all the lactation numbers (Table 4).Considering the model by Brody et al. (1924), there was no need to show the peak yield and peak day values in the tables, because this model cannot represent increase in production at the beginning of lactation, and the estimate of maximum milk production was at the intercept, whose value corresponds to the a parameter.Although this characteristic did not permit the model to describe lactation curves peculiar to herd farms specialized for milk production, Thornley & France (2007) considered that that model can be adequately fitted for low milk production crossbred animals reared in tropical conditions.
The models by Rook et al. (1993), Dijkstra et al. (1993) and Pollott (2000) were excluded from the analysis of low milk production herd farms because they showed less than 50% convergence for the data (Table 5).The model by Pollott (2000) did not converge in any of lactations in the L1 and L2 subgroups.It is possible that those more recent   Table 4 -Parameters and estimates of the lactation curve models for the data of low milk production herd farms models, with a mechanist basis, did not adjust properly to lactations data of cows with low milk production, whose lactation curves do not present evident milk production peaks.Problems of convergence have been previously reported for the models by Rook et al. (1923), (Pérochon et al., 1996, Vargas et al., 2000) and Pollott (2000), Val-Arreola et al., (2000), but not for the model by Dijkstra et al. (1993).
There was no difference (P>0.05) in the L1 subgroup among the four lactation curve models that presented convergence greater than 50% neither for any of the goodness of fit measurements.In absolute values, the model by Brody et al. (1923) showed a low adjustment in all the goodness of fit measurements.Although the model by Brody et al. (1923) presented greater coefficient of determination, the model by Wilmink (1987) was selected to represent the lactations in the L1 subgroup because it presented, in absolute values, the best goodness of fit for the mean square error, mean square prediction error and Bayesian information criterion goodness of fit measurements.
The Wilmink (1987) model is a modification of the model by Cobby & Le Du (1978) where a is associated to the production level, b is related to milk production before lactation peak, d represents the decrease in production after the lactation peak and c is associated to the peak day.
Although the Wilmink (1987) model has a one more parameter than the model by Wood (1967), the value obtained with a-b corresponds exactly to milk production at the beginning of lactation, and variations in the value of the parameter of a scale do not produce alterations in the shape of the curve.The same is not true for the model by Wood (1967), therefore in this case, it cannot be considered precisely as a parameter of scale.
Regarded to the L2 subgroup, from the five lactation curve models that presented convergence greater than 50%, the model by Brody et al. (1923) showed an adjustment Table 5 -Model adjustment statistics for the data of low milk production herd farms lower (P<0.05)than the one by Wood (1967) according to the determination coefficient and mean square error goodness of fit measurements.The other models did not differ (P>0.05) in any of the goodness of fit measurements.
Although, in numerical terms, the model by Wilmink (1987) showed a better adjustment than the others for the mean square error and mean square prediction error goodness of fit measurements, the model by Wood (1967) was selected to represent the L2 subgroup because it showed a better adjustment for the coefficient of determination and Bayesian information criterion goodness of fit measurements, in addition to have presented a higher percentage of convergence than the model by Wilmink (1987).
The model by Wood (1967) had a greater advantage of producing a good fit measurement with only three parameters.Concerned to milk production simulation systems, this means a computer resource saving through the use of a fewer number of constants, although nowadays, computer resources do not present a very limiting factor.However, some criticisms have been made regarding to the difficulty of precise biological interpretation of the parameters and the tendency to under-and overestimate milk production in the middle and at the end of the lactation, respectively (Cobby & Le Du, 1978;Rook et al., 1993).Perhaps the most undesirable characteristic of this model was the nil of the intercept model.Nevertheless, for the use in milk production simulation systems, such as those by Rotz et al. (1999), Rotz et al. (2005), Rennó et al. (2008a, b, c), in which the time interval used in each interaction is greater than one day, this may not present serious problems.
In the L3 subgroup, considering the five lactation curve models that showed convergence greater than 50%, the models by Wood (1967) and Wilmink (1987) showed a better adjustment (P<0.05)than that one by Brody et al. (1923) for the determination coefficient, mean square error and mean square prediction error, but not for the Bayesian information criterion goodness of fit measurement.The fact that the model by Brody et al. (1923) has only two parameters helped to improve its adjustment by the Bayesian information criterion goodness of fit measurement.The other models did not present differences (P>0.05) for any of the goodness of fit measurements.The model by Wood (1967) was selected to represent the lactations in the L3 group because, in absolute values, it presented the best adjustment in all the goodness of fit measurements.Furthermore, except for the model by Brody et al. (1923), the model by Wood (1967) showed better convergence percentage than the others (Figure 1).
Although considered as low milk production, the lactations of the animals in the L group presented production peak after the beginning of lactation.Even though Val-Arreola et al. ( 2004) observed good adjustment of the model by Brody et al. (1923) for lactations of low milk production animals in Mexico, and Thornley & France (2007) considered it could be used for low milk production crossbred animals reared in tropical conditions, the model was shown not to be suitable to represent the lactations of animals in group L in any of the lactation number.
Once more, the models by Brody et al. (1924) and by Cobby & Le Du (1978) estimated peak day values considerably closer to the ones at the beginning of lactation (Table 6) than the models by Wilmink (1987), Wood (1967), Rook et al. (1993) and Dijkstra et al. (1993), in all the lactation numbers.For this group of farms, the models by Rook et al. (1993), Dijkstra et al. (1993) and by Pollott (2000) presented less than 50% convergence again and were not considered in the comparison of the goodness of fit measurements (Table 7).The model by Wood (1967) presented the best adjustment for all the lactation numbers of the medium milk production group (Figure 2).For the M1 subgroup, the model by Wood (1967) showed a better adjustment for the coefficient of determination and Bayesian information criterion goodness of fit measurements and greater percentage of convergence than the model by Wilmink (1987).For the M2 and M3 subgroups, in numerical terms, the model by Wood (1967) showed a better adjustment than the others for all the goodness of fit measurements except Bayesian information criterion in the M2 subgroup, for which the model by Brody et al. (1924) showed the best adjustment.The model by Wood (1967) has been widely used in several types of studies, such as for new models assessment (Cobby & Le Du, 1978;Rook et al., 1993), genetic breeding (Schneeberger, 1982;Grossman et al., 1984;Ferris et al., 1985;Faro & Albuquerque, 2002), milk production simulation systems (Rotz et al., 1999;Rotz et al., 2005;Rennó et al., 2008a;   Rennó et al., 2008b;Rennó et al., 2008c) and nutrition (Fox et al., 2003), because of its recognized capability allied to its simplicity.Models by Brody et al. (1924) and by Cobby & Le Du (1978) estimated, once more, peak day values remarkably closer to the beginning of lactation than the ones by Wilmink (1987), Wood (1967), Rook et al. (1993) and by Dijkstra et al. (1993) in all the lactation numbers (Table 8).The obtained results indicated the possibility that the models by Brody et al. (1924) and Cobby & Le Du (1978) have the characteristic of underestimating peak day.Concerning to the high milk production group, the model by Dijkstra et al. (1993) showed a better adjustment than the others in all the lactation numbers.Models by Rook et al. (1993) and Pollott (2000) were excluded from the analysis because they presented less than 50% convergence (Table 9).
For the H1 subgroup, from the six models that converged in more than 50% of the lactations, the model by Brody et al. (1924) showed a poorer adjustment (P>0.05)than the others for the coefficient of determination, mean square error and mean square prediction goodness of fit measurements and a poorer adjustment (P>0.05)than the model by Wood (1967), Cobby & Le Du (1978), Wilmink (1987) and by Dijkstra et al. (1993)  Values in parenthesis = standard deviation.a, b, c, d, g and h = parameters of the lactation curve models; PY = peak yield (L/day), Pday = peak day and 305MP = 305 days corrected milk production (L).
Table 8 -Parameters and estimates for the high milk production farm dat model by Dijkstra et al. (1993) showed a better fit (P>0.05)than the one by Brody et al. (1924) for the mean square error and mean square prediction error goodness of fit measurement and in absolute terms, presented a better fit than the others for all the goodness of fit measurements used.
In the H2 subgroup, the models by Wood (1967), Cobby & Le Du (1978), Wilmink (1987) and by Dijkstra et al. (1993) did not differ (P>0.05) in the adjustment of any of the goodness of fit measurements, while the model by Brody et al. (1924) showed a poorer adjustment (P>0.05)than the others in all goodness of fit measurements.The model by Dijkstra et al. (1993) showed a better adjustment than the model by Brody et al. (1924) (P>0.05) in the R2s, mean square error and mean square prediction error goodness of fit measurements, but not in Bayesian information criterion.The model by Brody et al. (1924) did not differ from the models by Wood (1967), Cobby & Le Du (1978) and by Wilmink (1987) (P>0.05) in the adjustment of any of the goodness of fit measurements.Numerically, the model by Dijkstra et al. (1993) showed a better adjustment than the others in all the goodness of fit measurements.
In the H3 subgroup, from the six models that converged in more than 50% of the lactations, the model by Brody et al. (1924) showed a poor adjustment (P>0.05) in all the goodness of fit measurements while the others did not differ (P>0.05) for any of the goodness of fit measurements.In absolute values the model by Dijkstra et al. (1993) showed a better adjustment than the others in all the goodness of fit measurements.
Val-Arreola et al. ( 2004) also observed the better adjustment of the model by Dijkstra et al. (1993) over the models by Brody et al. (1923), Wood (1967), Rook et al. (1993) and by Pollott (2000) for lactations of animals on more technological farms.The model by Dijkstra et al. (1993) showed a higher percentage of convergence in the high production subgroups than the low and medium production subgroups.This may have occurred because of the smaller number of data per lactation in the low and medium production subgroups compared to the high production subgroups (10 vs 20 data, respectively).Another reason that may have led to the lower convergence of the model by Dijkstra et al. (1993) in low and medium milk production subgroups was that the peak day was closer to the beginning of lactation.The model by Dijkstra et al. (1993) showed higher estimates of the peak day than those of the other models, in all the subgroups.A similar result was reported by Val-Arreola et al. (2004).Therefore, it is possible that the model by Dijkstra et al. (1993) is more suitable to represent lactations whose peak day occurs a day farther from the beginning of lactation (Figure 3).
At first, the model by Dijkstra et al. (1993) has two advantages over the model by Wood (1967), namely, the precise biological meaning of the parameters and the value of the intercept that is not nil.The model by Dijkstra et al. (1993) was conceived under a mechanistic basis similar to that of the model by Brody et al. (1924), in which milk production is the function of the number of producer cells present in the mammary gland parenchyma during lactation, which is a result of the dynamic relationship between the new cell differentiation and differentiated cell death.
In the model by Dijkstra et al. (1993), the a parameter represents the theoretical initial milk production and is a product of the number of differentiated parenchyma cells and milk production per cell that is assumed as being constant during lactation.The b parameter represents the cell proliferation rate at birth; the c parameter the rate of decrease in cell proliferation during lactation, and the d parameter, the cell death rate during lactation.
Although the value of the a parameter in the model by Dijkstra et al. (1993) represents exactly the initial milk production, it cannot be precisely considered a parameter of scale, unlike the parameter in the model by Wilmink (1987) because modifications in its value produce alterations in the shape of the lactation curve.

Conclusions
In groups of herd farms of low milk production, the model by Wilmink (1987) shows a better adjustment for cows of the first lactations numbers, while the model by Wood (1967) shows a better fit for cows of the second and third or greater lactation numbers.The two lactation curve models can be applied to milk production simulation systems for cows of low milk production.The model by Wood (1967) shows a better adjustment for cows of all the lactation numbers of medium milk production group, therefore it can be applied to milk production simulation systems with cows with medium milk production.The model by Dijkstra et al. (1993) shows a better adjustment for cows of all lactation numbers in the high milk production group, being able to be applied for milk production simulation systems with cows of high milk production.

1
Values in parenthesis = standard deviation.a, b, c, d, g and h = parameters of the lactation curve models; PY = peak yield (L/day), Pday = peak day and 305MP = 305 days corrected milk production (L).

Figure 1 -
Figure 1 -Lactation curves and lactation curve models selected that best fit the data of the L1, L2 and L3 subgroups.L1 = low milk production (MP) first lactation number (LN), L2 = low MP second LN, L3 = low MP third or greater LN, y = milk production (L/day), x = days in milking.

1
Values in parenthesis = standard deviation.a, b, c, d, g and h = parameters of the lactation curve models; PY = peak yield (L/day), Pday = peak day and 305MP = 305 days corrected milk production (L).

Figure 2 -
Figure 2 -Lactation curves and lactation curve models selected, that best fit the data of the M1, M2 and M3 subgroups.M1 = medium milk production (MP) and first lactation order (LN); M2 = medium MP and second LN; M3 = medium MP and third or greater LN; y = milk production (L/day), x = days in milking.

Figure 3 -
Figure 3 -Lactation curve and selected lactation curve models that showed a good adjustment for the data of the H1, H2 and H3 subgroups.H1 = high milk production (MP) and first lactation number (LN); H2 = high MP and second LN; and H3 = high MP and third or greater LN, y = milk production (L/day), x = days in milking.

References
BRODY, S.; RAGSDALE, A.C.; TURNER, C.W. The rate of decline of milk secretion with the advance of the period of lactation.The Journal of General Physiology, v.5, p.442-444, 1 9 2 3 .BRODY, S.; RAGSDALE, A.C.; TURNER, C.W. The relation between the initial rise and the subsequent decline of milk

Table 1 -
Lactation curve models

Table 2 -
Classification of the herd farms by milk production and descriptive statistics regarded to the used the database

Table 3 -
Descriptive statistics and data classification according to milk production mean and the lactation number of the cows

Table 6 -
Parameters and estimates for the data of medium milk production herd farms for the Bayesian information criterion goodness of fit measurement.The

Table 9 -
Goodness fit model for the data of the high milk production herd farms