INTRODUCTION:
Brazil, a major global producer of milk, is ranked fourth in world production and produces 34.1 million tons of milk per year (FAO, 2013). Considering the growth in the demand for milk in the foreign market and the potential of Brazil to meet a large part of this demand, achieving the internationally required milk quality standard is extremely important.
However, the Brazilian supply chain has significant heterogeneity in its production systems in all federal units; thus, meeting the current demand for milk has been problematic (^{ALEIXO et al., 2007}). Awareness of the heterogeneity of these systems is becoming increasingly important for effective communication with rural producers and for improvement of the quality of national milk (^{HOSTIOU and DEDIEU, 2012}).
To assist producers with the pricing of milk and to direct technical assistance at the rural property level, the dairy industry has constructed a database of the monthly collections of the physical-chemical and microbiological characteristics of milk. In terms of milk composition, the following two main aspects are considered: the centesimal composition, which includes the fat, protein (PROT), lactose (LACT), total dry extract (TDE) and defatted dry extract (DDE) content, and the hygienic-sanitary component, which includes the somatic cell count (SCC) and total bacterial count (TBC). However, according to ^{BODEN MÜLLER FILHO et al. (2010)}, the appropriate use of the monthly collection database requires analysis tools that simplify its use. Furthermore, according to ^{ALEIXO et al. (2007)}, multivariate data analysis techniques such as principal component analysis (PCA) combined with cluster analysis (AAG) are statistical tools that could potentially be used to resolve these problems.
The analysis of milk production groups and milk typology has become an indispensable tool for the dairy industry. However, interpretation of the data is difficult for rural producers. Providing more specific technical assistance for data analysis at this level could improve the quality of this raw material. The objective of this study was to form homogeneous groups of bovine milk production units based on the chemical and microbiological quality of milk via multivariate statistical techniques.
MATERIALS AND METHODS:
Milk samples from 1,706 milk producing units (MPUs) were collected on a monthly basis from June 2008 to December 2011. Samples were analyzed to determine their fat, PROT, LACT, DDE and TDE content and their SCC and TBC, which resulted in 54,696 records. To preserve the individual characteristics of each MPU, results from collective expansion tanks were excluded from the statistical analyses. Records were grouped into monthly classes; properties with fewer than four controls and records more than three standard deviations above or below the monthly mean were excluded. After the original database was edited, 44,089 records from 1,541 MPUs were used in the statistical analyses. These data were from 15 municipalities of the Lajeado-Estrela microregion, in the east-central mesoregion of Rio Grande do Sul.
The SCC data were transformed into somatic cell linear scores (SCLS) using the following equation: SCLS = [log_{2}(SCC/100)] + 3 (^{SHOOK, 1993}). The TBC variable was defined as the natural logarithm of the initial TBC.
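The two transformations can be sketched in a few lines of Python. This is only an illustration, not the code used in the study: the function names are hypothetical, and the sketch assumes that SCC enters the equation in thousands of cells per mL, which the text does not state explicitly.

```python
import math


def scc_to_scls(scc_thousands: float) -> float:
    """Somatic cell linear score: SCLS = log2(SCC / 100) + 3.

    Assumption: SCC is expressed in thousands of cells per mL.
    """
    return math.log2(scc_thousands / 100.0) + 3.0


def scls_to_scc(scls: float) -> float:
    """Inverse transform (de-transformation) back to SCC in thousands of cells per mL."""
    return 100.0 * 2.0 ** (scls - 3.0)


def log_tbc(tbc: float) -> float:
    """Natural logarithm of the total bacterial count."""
    return math.log(tbc)


def exp_tbc(value: float) -> float:
    """Inverse of log_tbc."""
    return math.exp(value)


if __name__ == "__main__":
    # Each doubling of SCC raises the score by one unit.
    for scc in (100, 200, 400, 800):
        print(scc, scc_to_scls(scc))
```

Because the score is linear in log_{2}(SCC), means computed on the transformed scale have to be de-transformed with the inverse function before they are compared with limits expressed in cells mL^{-1}.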
Subsequently, multivariate analysis of variance (MANOVA) was performed using the general linear model (GLM) procedure with the MANOVA statement (^{SAS, 2002}) according to the following statistical model: Y_{ijk} = µ_{k} + H_{ik} + e_{ijk}, in which Y_{ijk} = the observed value of the k-th variable in the j-th replicate of the i-th MPU; µ_{k} = the overall mean of the k-th variable; H_{ik} = the fixed effect of the i-th MPU on the k-th variable; and e_{ijk} = the random error associated with observation Y_{ijk}.
Because of the correlation between DDE and the other variables analyzed, DDE was excluded from the statistical model. In the multivariate analysis, the hypothesis that the treatment (MPU) mean vectors were equal, H_{0}: µ_{1} = µ_{2} = … = µ_{1541}, was tested with the Wilks test using the following equation:
Λ = |E| / |A|
where |E| is the determinant of the matrix E, which contains the residual sums of squares and cross-products, and |A| is the determinant of the matrix A, which contains the total sums of squares and cross-products. Afterward, the PRINCOMP procedure was used to perform PCA (^{SAS, 2002}).
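As an illustration of the Wilks statistic, the sketch below computes the ratio of the determinant of the residual matrix E to the determinant of the total matrix A for a small data set. It is not the SAS implementation: the function names and the toy data are hypothetical, and only the Python standard library is used.

```python
def mean_vector(rows):
    """Column-wise mean of a list of observation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sscp(rows, centers):
    """Sums of squares and cross-products of rows about the given centers."""
    p = len(rows[0])
    m = [[0.0] * p for _ in range(p)]
    for row, center in zip(rows, centers):
        d = [x - c for x, c in zip(row, center)]
        for a in range(p):
            for b in range(p):
                m[a][b] += d[a] * d[b]
    return m


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [r[:] for r in matrix]
    n = len(a)
    result = 1.0
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(a[r][i]))
        if abs(a[pivot][i]) < 1e-12:
            return 0.0
        if pivot != i:
            a[i], a[pivot] = a[pivot], a[i]
            result = -result
        result *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return result


def wilks_lambda(groups):
    """groups: dict mapping a unit label to its list of observation vectors."""
    all_rows = [row for rows in groups.values() for row in rows]
    grand = mean_vector(all_rows)
    total = sscp(all_rows, [grand] * len(all_rows))      # matrix A
    rows, centers = [], []
    for members in groups.values():
        m = mean_vector(members)
        rows.extend(members)
        centers.extend([m] * len(members))
    residual = sscp(rows, centers)                        # matrix E
    return det(residual) / det(total)


if __name__ == "__main__":
    # Hypothetical toy data: two units, two variables, three records each.
    toy = {"a": [[1, 2], [2, 1], [3, 3]], "b": [[6, 7], [7, 6], [8, 8]]}
    print(wilks_lambda(toy))
```

Values of the statistic close to zero indicate that most of the total variation lies between units, which is evidence against equal mean vectors; a value of one means the unit means coincide.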
Subsequently, the first three principal components were used to group the milk production units according to their similarities (cluster analysis). The number of homogeneous MPU groups was determined using the cubic clustering criterion (CCC), pseudo-F and pseudo-t^{2} statistics, computed on the first three principal components. The CLUSTER procedure was used to perform the cluster analysis (^{SAS, 2002}).
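To make one of these criteria concrete, the sketch below computes a pseudo-F statistic for a candidate partition of principal component scores, using the common definition as the ratio of the between-cluster to the within-cluster mean squares. It is an illustration only: the function name and the scores are hypothetical, and SAS computes this statistic internally during clustering.

```python
def pseudo_f(points, labels):
    """Pseudo-F for a partition: (B / (g - 1)) / (W / (n - g)).

    B and W are the between- and within-cluster sums of squares,
    g is the number of clusters and n the number of points.
    """
    n = len(points)
    dims = len(points[0])
    grand = [sum(p[d] for p in points) / n for d in range(dims)]

    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    g = len(clusters)

    within = 0.0
    between = 0.0
    for members in clusters.values():
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        within += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dims))
        between += len(members) * sum((centroid[d] - grand[d]) ** 2 for d in range(dims))

    return (between / (g - 1)) / (within / (n - g))


if __name__ == "__main__":
    # Hypothetical scores on (PC1, PC2, PC3) for four production units.
    scores = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (11, 0, 0)]
    print(pseudo_f(scores, [0, 0, 1, 1]))  # compact, well-separated clusters
    print(pseudo_f(scores, [0, 1, 0, 1]))  # poorly separated clusters
```

Larger values indicate more compact and better separated clusters, so comparing the statistic across candidate numbers of groups helps to choose where to stop merging.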
The SAS^{®} System for Windows^{™} version 9.0 (SAS Institute Inc., Cary, NC, USA) was used to perform the statistical analyses.
RESULTS AND DISCUSSION:
The Wilks test showed a significant difference (P < 0.05) among the mean vectors of the different MPUs. The first three principal components explained 81.38% of the total variation in the data: principal component 1 (PC1), PC2 and PC3 explained 38.66%, 25.02% and 17.70% of the variation, respectively. In the three-dimensional space of the principal components, production units were grouped by similarity, which reduced the 1,541 MPUs to 15 groups; these were represented in three two-dimensional graphs (PC1 x PC2, PC1 x PC3 and PC2 x PC3) (Figure 1).
PC1 and PC2 exhibited a cumulative variability of 63.68%, demonstrating a smaller loss of information than the results in the literature, including those reported by ^{ALEIXO et al. (2007)} and ^{BODEN MÜLLER FILHO et al. (2010)}. These authors reported cumulative variabilities of 45.00%, 45.70% and 56.51%, respectively, for the first two principal components.
However, the set of dimensions retained should account for at least 70% of the total variance of the data, a threshold that the first two components (63.68%) did not reach. Thus, in this study, the first three principal components were considered.
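The arithmetic behind this choice can be checked with a short sketch. The percentages are those reported for PC1, PC2 and PC3; the function name and the way the 70% threshold is applied are illustrative assumptions.

```python
from itertools import accumulate

# Percentage of total variance explained by PC1, PC2 and PC3, as reported.
explained = [38.66, 25.02, 17.70]


def components_needed(explained_pct, threshold=70.0):
    """Smallest number of leading components whose cumulative share reaches the threshold."""
    for k, total in enumerate(accumulate(explained_pct), start=1):
        if total >= threshold:
            return k
    return None  # threshold not reached with the components supplied


cumulative = [round(c, 2) for c in accumulate(explained)]

# Share of variance represented by each two-dimensional projection.
plane_share = {
    "PC1 x PC2": round(explained[0] + explained[1], 2),
    "PC1 x PC3": round(explained[0] + explained[2], 2),
    "PC2 x PC3": round(explained[1] + explained[2], 2),
}

if __name__ == "__main__":
    print(cumulative)
    print(components_needed(explained))
    print(plane_share)
```

With these inputs the cumulative shares are 38.66%, 63.68% and 81.38%, so the threshold is first reached with three components.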
In addition, through visual inspection and according to ^{SMITH et al. (2002)}, the angle between the loading vectors indicates the correlation between the variables: if this angle is near zero, the correlation is positive; if this angle is near 180°, the correlation is negative; and, finally, if this angle is near 90°, the variables are mostly unrelated.
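This reading rule can be sketched as follows. The loading vectors in the example are hypothetical, chosen only to illustrate the three cases, and the 30° tolerance used to decide what counts as "near" is an arbitrary assumption.

```python
import math


def angle_degrees(u, v):
    """Angle between two loading vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


def describe(angle, tolerance=30.0):
    """Qualitative reading of the angle; the tolerance is an arbitrary choice."""
    if angle <= tolerance:
        return "positive"
    if angle >= 180.0 - tolerance:
        return "negative"
    if abs(angle - 90.0) <= tolerance:
        return "mostly unrelated"
    return "intermediate"


if __name__ == "__main__":
    # Hypothetical loadings on (PC1, PC2), for illustration only.
    loadings = {"A": (0.2, 0.6), "B": (0.3, 0.7), "C": (-0.2, -0.8), "D": (0.7, -0.2)}
    for first, second in (("A", "B"), ("A", "C"), ("A", "D")):
        angle = angle_degrees(loadings[first], loadings[second])
        print(first, second, round(angle, 1), describe(angle))
```

The cosine of the angle equals the correlation only when the plotted components capture most of the variance of both variables, so the rule is a visual guide rather than a substitute for the correlations themselves.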
The first two-dimensional representation provided evidence that the SCC and TBC are directly correlated with each other and, in turn, are inversely correlated with the LACT content (Figure 1). Likewise, ^{VARGAS et al. (2014)} and ^{VARGAS et al. (2013)} observed the same inverse relationship with LACT as the SCC and TBC, respectively, increased. The SCC and TBC are indicative of the hygienic-sanitary quality of milk. The high negative correlation between these microbiological indicators and LACT might be due to the decreased synthesis of this constituent caused by alterations in the mammary gland as well as to the loss of LACT through its absorption into the bloodstream. In addition, the use of this carbohydrate by mammary gland pathogens might reduce the LACT content in milk (^{VARGAS et al., 2013}; ^{VARGAS et al., 2014}). Therefore, in addition to the SCC and TBC, the LACT content could function as a variable indicative of the sanitary quality of milk.
Interpretation of the principal components was performed by determining the correlations between the variables and the components. Thus, in the dimensions of the Cartesian plane, the variables that explained the variability on the x-axis (PC1) were fat (r = 0.84, P < 0.001), PROT (r = 0.75, P < 0.001) and TDE (r = 0.99, P < 0.001). Variables that explained the variability on the y-axis (PC2) included LACT (r = -0.86, P < 0.001), the SCC (r = 0.50, P < 0.001) and the TBC (r = 0.62, P < 0.001). Variables that explained the variability on the z-axis (PC3) were the SCC (r = 0.78, P < 0.001) and TBC (r = -0.67, P < 0.001). Therefore, the existing correlations showed the first principal component (x-axis) was linked to the chemical quality of the milk and the second (y-axis) and third (z-axis) principal components were linked to microbiological quality (Figure 1).
Furthermore, the first (PC1 x PC2) and second (PC1 x PC3) Cartesian planes, which accounted for the largest shares of the variability in the data (63.68% and 56.36%, respectively), enabled the placement of the clusters in relation to the different quality parameters of the milk. In both two-dimensional projections, groups 4, 6, 7, 8, 10, 12, 13 and 14 were located in quadrants 1 and 4, and groups 1, 2, 3, 5, 9, 11 and 15 were located in quadrants 2 and 3. Because PC1 was positively correlated with fat, PROT and TDE, the former and latter sets of groups exhibited the highest and lowest values, respectively, of these variables (Figure 1). However, considering the averages of these variables, none of the groups formed disagreed with the regulatory standards for the chemical quality of milk established by the Ministry of Agriculture, Livestock and Food Supply (MAPA) Normative Instruction No. 62 (IN 62, ^{BRASIL, 2011}) (Table 1).
Group | No. of producers | Fat (%) | PROT (%) | LACT (%) | Minerals (%) | TDE (%) | DDE (%) | SCC (cells mL^{-1})^{1} | TBC (CFU mL^{-1})^{2} |
1 | 127 | 3.46 | 3.01 | 4.32 | 0.94 | 11.74 | 8.27 | 579,000 | 4,426,712 |
2 | 74 | 3.73 | 3.07 | 4.22 | 0.97 | 11.99 | 8.26 | 792,000 | 5,473,356 |
3 | 74 | 3.48 | 3.04 | 4.41 | 0.95 | 11.87 | 8.40 | 595,000 | 1,239,746 |
4 | 173 | 3.96 | 3.23 | 4.29 | 1.00 | 12.48 | 8.52 | 894,000 | 4,688,176 |
5 | 70 | 3.38 | 3.00 | 4.33 | 0.94 | 11.65 | 8.26 | 889,000 | 725,000 |
6 | 203 | 3.61 | 3.08 | 4.32 | 0.97 | 11.98 | 8.37 | 897,000 | 2,580,787 |
7 | 43 | 3.50 | 3.17 | 4.52 | 0.97 | 12.16 | 8.66 | 385,000 | 2,107,026 |
8 | 144 | 3.55 | 3.10 | 4.37 | 0.97 | 12.00 | 8.44 | 433,000 | 4,623,804 |
9 | 113 | 3.17 | 2.96 | 4.34 | 0.93 | 11.39 | 8.22 | 547,000 | 2,663,781 |
10 | 89 | 3.62 | 3.15 | 4.40 | 0.98 | 12.15 | 8.53 | 945,000 | 819,000 |
11 | 123 | 3.51 | 2.96 | 4.18 | 0.94 | 11.58 | 8.08 | 968,000 | 4,370,221 |
12 | 105 | 3.45 | 3.09 | 4.46 | 0.96 | 11.96 | 8.51 | 628,000 | 528,000 |
13 | 95 | 3.88 | 3.27 | 4.44 | 1.02 | 12.60 | 8.73 | 677,000 | 1,786,089 |
14 | 73 | 3.88 | 3.23 | 4.44 | 1.00 | 12.55 | 8.67 | 423,000 | 4,114,262 |
15 | 35 | 3.05 | 3.10 | 4.46 | 0.96 | 11.57 | 8.51 | 184,000 | 3,909,601 |
Mean | - | 3.55 | 3.10 | 4.37 | 0.97 | 11.98 | 8.43 | 655.93 | 2,799,042.18 |
Standard deviation | - | 0.25 | 0.10 | 0.09 | 0.03 | 0.37 | 0.19 | 0.02 | 0.03 |
CV^{3} (%) | - | 7.05 | 3.14 | 2.15 | 2.64 | 3.06 | 2.21 | 5.17 | 7.56 |
Total observations | 1,541 | 44,089 | 44,089 | 44,089 | 44,089 | 44,089 | 44,089 | 44,089 | 44,089 |
^{1}Data de-transformed from the somatic cell linear score (SCLS = [log_{2}(SCC/100)] + 3). ^{2}Data de-transformed from the natural logarithm of the TBC. ^{3}CV: coefficient of variation.
With regard to the microbiological quality indicators, the TBC variable correlated differently with PC2 and PC3: TBC and PC2 were directly correlated (r = 0.62, P < 0.001), whereas TBC and PC3 were inversely correlated (r = -0.67, P < 0.001). Thus, the strengths and weaknesses of the microbiological quality of the productive strata were identified via the joint analysis of the two-dimensional planes PC1 x PC2 and PC1 x PC3, which allows more specific technical assistance than decisions based on the first Cartesian plane alone.
In PC1 x PC2, the clusters located in the first (4, 6 and 10) and second quadrants (1, 2, 5 and 11) exhibited low microbiological quality. However, when the data were arranged in PC1 x PC3, these groups were distinguished with regard to the nature of their hygienic-sanitary deficiencies. Groups 1, 5, 6 and 10 were considered inferior due to their high SCC (quadrants 1 and 2), while groups 2, 4 and 11 were considered inferior due to their high TBC values (quadrants 3 and 4). Thus, to obtain the desired improvements, producers in the groups that presented high SCC values (1, 5, 6 and 10) should have this parameter monitored periodically and continuously in individual cows. Such monitoring helps to identify the animals responsible for the high counts in the expansion tank and, thus, to direct the actions of the producers more appropriately. In this sense, according to ^{MAIA et al. (2013)}, microbiological cultures of milk would allow the identification of the pathogens that cause mastitis and, consequently, aid in determining treatment strategies.
Due to their high TBC values, productive groups 2, 4 and 11 require increased care regarding milk contamination by microbiota from outside the udder. ^{WINCK and THALER NETO (2012)} showed that cleaning udders before milking affects the TBC: the producers who pre-immersed teats in disinfectants obtained better results. In addition, the study showed that care is required regarding personal hygiene, milker training and water, which is a potential source of milk contamination because of its importance in milking activities.
The groups located in quadrants 3 (3, 9 and 15) and 4 (7, 8, 12, 13 and 14) in the two-dimensional projection PC1 x PC2 demonstrated high microbiological quality. These groups were further distinguished when placed in the PC1 x PC3 projection: groups 8, 14 and 15 (quadrants 3 and 4) exhibited low SCC values and groups 3, 7, 9, 12 and 13 (quadrants 1 and 2) exhibited a desirable TBC. Among these groups, in the two-dimensional projection PC2 x PC3, which explained 42.72% of the variability in the data and described only the microbiological quality of the milk, groups 12 and 15 stood apart most clearly from the other strata. Group 12, comprising 105 MPUs, exhibited the lowest values of TBC (higher scores in relation to PC3), and group 15, comprising 35 MPUs, exhibited the lowest values of SCC (lower scores in relation to PC2). Therefore, these productive strata could be used as a positive reference in the technical assistance provided to milk producers in order to reduce the high SCC and TBC observed in other units.
Notably, all groups failed to meet the SCC limit (400,000 cells mL^{-1}) and/or the TBC limit (100,000 colony-forming units [CFU] mL^{-1}) set by the final phase of IN 62, which was implemented on July 30, 2016. Furthermore, six groups (2, 4, 5, 6, 10 and 11) and 13 groups (1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 13, 14 and 15) exceeded the SCC and TBC limits, respectively, of the now-extinct IN 51 (750,000 cells mL^{-1} and 750,000 CFU mL^{-1}) (Table 1).
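The comparison with the regulatory limits can be reproduced from the group means in Table 1 with a short sketch. It assumes that the SCC and TBC columns are whole counts per mL written with thousands separators; the dictionary and function names are hypothetical.

```python
# (SCC in cells/mL, TBC in CFU/mL) per group, transcribed from Table 1.
# Assumption: the table entries are whole counts with thousands separators.
TABLE_1 = {
    1: (579_000, 4_426_712),
    2: (792_000, 5_473_356),
    3: (595_000, 1_239_746),
    4: (894_000, 4_688_176),
    5: (889_000, 725_000),
    6: (897_000, 2_580_787),
    7: (385_000, 2_107_026),
    8: (433_000, 4_623_804),
    9: (547_000, 2_663_781),
    10: (945_000, 819_000),
    11: (968_000, 4_370_221),
    12: (628_000, 528_000),
    13: (677_000, 1_786_089),
    14: (423_000, 4_114_262),
    15: (184_000, 3_909_601),
}


def groups_above(limit, column):
    """Groups whose mean in the given column (0 = SCC, 1 = TBC) exceeds the limit."""
    return [g for g, values in sorted(TABLE_1.items()) if values[column] > limit]


def groups_failing_any(scc_limit, tbc_limit):
    """Groups that exceed at least one of the two limits."""
    return [
        g
        for g, (scc, tbc) in sorted(TABLE_1.items())
        if scc > scc_limit or tbc > tbc_limit
    ]


if __name__ == "__main__":
    print("IN 51, SCC > 750,000:", groups_above(750_000, 0))
    print("IN 51, TBC > 750,000:", groups_above(750_000, 1))
    print("IN 62, any limit exceeded:", groups_failing_any(400_000, 100_000))
```

Under these assumptions the check finds six groups above the IN 51 limit for SCC and 13 above the limit for TBC, and every group exceeds at least one of the IN 62 limits.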
Among the major obstacles to increasing Brazilian exports of dairy products are those related to sanitary embargoes. Thus, at the industrial level, a review of the payment system based on SCC and TBC quality is needed. In addition, at the producer level, more educational actions are needed to increase knowledge and awareness of the issues involved in improving milk quality.
In this sense, at the industrial level, the placement of the groups in the Cartesian plane provides data that can support reward and penalty systems by identifying the main problems of raw material quality. This method would facilitate the logistics of collecting milk with good chemical and microbiological characteristics and would dilute production costs, as these indicators are directly related to industrial yield.
CONCLUSION:
Multivariate statistical techniques were used to form 15 homogeneous groups based on the chemical and microbiological data of the bovine milk of 1,541 production units. These data were obtained from 15 municipalities of the Lajeado-Estrela microregion, in the east-central mesoregion of Rio Grande do Sul, Brazil. The first three principal components accounted for 81.38% of the total variation in the data. The two-dimensional representations showed that the SCC and TBC were directly correlated with each other and inversely correlated with the LACT content. In addition, the different productive strata were characterized by their quality attributes, and the positive and negative microbiological characteristics of the milk were identified. One group of 105 MPUs showed the lowest TBC values, and one group of 35 MPUs showed the lowest SCC values. Thus, the multivariate statistical techniques used herein were useful for generating hypotheses concerning heterogeneous groups. Variation was reduced by categorizing the data into similar groups using the quality variables of milk, and this type of information can aid in the technical assistance of rural producers.