PERFORMANCE ANALYSIS AND CARCASS CHARACTERISTICS OF SANTA INÊS SHEEP USING MULTIVARIATE TECHNIQUES

The objective of this study was to apply multivariate analysis techniques, namely principal component and canonical discriminant analyses, to a set of performance and carcass data of Santa Inês sheep, in order to identify relationships among the variables, select those that best explain the total variation of the data, and quantify the association between performance and carcass characteristics. Principal component analysis reduced the 25 correlated original variables to four linear combinations, which together explained 80% of the total variation of the data. The first two principal components together explained approximately 65% of the total variation. In these first two linear combinations, the characteristics with the highest factor loading coefficients were cold carcass weight (CCW), hot carcass weight (HCW), empty body weight (EBW), average weight (AW), croup width (CW), cold carcass yield (CCY), and hot carcass yield (HCY). The variables selected in the canonical discriminant analysis, in order of importance, were total carbohydrate intake (TCI), total digestible nutrient intake (TDNI), dry matter intake (DMI), non-fibrous carbohydrate intake (NFCI), and neutral detergent fiber intake (NDFI). The first canonical root had a correlation coefficient of approximately 0.82, showing a high association with the performance variables. The classification errors in the discriminant analysis were less than 5% and were probably due to the similarity between individuals for the studied traits. The multivariate techniques were adequate and efficient in simplifying the sample space and classifying the animals in their original groups.


INTRODUCTION
The good adaptability of sheep to semi-arid conditions was a determining factor for the growth of sheep farming in Brazil (SILVA, 2017b). This favorable condition allows advances in animal productivity, which is the main focus of farmers seeking to meet market demands (SALES, 2017). In parallel with this growth, the demand for lamb has increased, further boosting the consumer market. The value of animals destined for meat production is estimated by the carcass yield, which expresses, as a percentage, the relation between the carcass weight and the animal's weight (LEÃO et al., 2012; SAÑUDO et al., 2012). However, carcass characteristics depend on the hereditary, nutritional, and environmental factors to which animals are subjected (ESTEVES et al., 2018).
Most consumer markets require standards for different commercial cuts (LEONIR et al., 2011). Therefore, the sheep carcasses made available to the market must be standardized, allowing greater consumer acceptance (MAYER et al., 2017; MAYSONNAVE et al., 2018). Carcass measurements allow comparisons between weight and age at slaughter, in addition to supporting the choice of the most viable feeding systems (SILVA; PIRES, 2000; SILVA et al., 2010). Under these market conditions, and given the essential requirements for good production performance, the selection of animals is necessary to guarantee a high growth rate, good reproduction, meat quality, and satisfactory carcass yield (MAYER et al., 2017).
Carcass characteristics are essential in the meat production system and, despite the efforts of sheep meat farmers, most of the meat offered in Brazil still originates from animals with low carcass quality (RODRIGUES et al., 2015). Because performance and carcass characteristics are associated with numerous variables, methods have been developed that allow the simultaneous use of all variables and a theoretical interpretation of the data set (KIRKPATRICK; MEYER, 2004). Multivariate analysis techniques are efficient because they make it possible to combine a larger amount of information to predict phenomena and to simplify experimental analyses based on a database with variables linked to the experimental plan (MELO, 2017).
There is still no practical limit to indicate the number of characteristics necessary for evaluation in an experiment. However, it is relevant to examine those that could be disregarded, either for contributing little to explain the variation in the data or for being considered less discriminating or redundant (CRUZ; REGAZZI, 1997; CÂNDIDO et al., 2015; VELOSO et al., 2015).
Thus, the objective of this study was to apply principal component and canonical discriminant analyses to a Santa Inês sheep performance and carcass database, aiming to reduce the dimensionality of the sample space, to identify and select the variables that best explain the total variation of the data and those that best explain the variation between groups and, consequently, to assess the power to discriminate and classify animals into their groups of origin.

MATERIAL AND METHODS
The database was obtained from three scientific experiments carried out by Cardoso et al. (2016), Silva (2017a), and Santos (2017) at the sheep sector of the Federal Rural University of Pernambuco. The data consisted of information on 112 uncastrated male Santa Inês sheep, with an average age of five months. From the total data, 26 variables were selected, with neutral detergent fiber intake (NDFI) being the variable used to group the others according to neutral detergent fiber (NDF) levels. NDF levels were divided into low, 14 to 26%; intermediate, 27 to 50%; and high, above 51%. The other variables were divided into two distinct groups.
In the first group, eight variables related to intake were selected: dry matter intake (DMI), organic matter intake (OMI), crude protein intake (CPI), ether extract intake (EEI), total carbohydrate intake (TCI), non-fibrous carbohydrate intake (NFCI), total digestible nutrient intake (TDNI), and total digestible nutrient (TDN). In the second group, there were 17 variables related to carcass yield, namely: average weight (AW), external length (EL), internal length (IL), chest width (CW), croup perimeter (CP), croup width (CW), leg length (LL), leg perimeter (LP), chest depth (CD), chest perimeter (CP), cold carcass weight (CCW), carcass compactness index (CCI), hot carcass weight (HCW), hot carcass yield (HCY), cold carcass yield (CCY), empty body weight (EBW), and rib eye area (REA). The data were submitted to statistical analysis using Statistica software, version 13.3. All variables in the database were standardized by their mean and standard deviation, according to Mingoti (2005). This step aimed to eliminate differences in the units of measurement, avoiding redundancies in the application of the principal component and canonical discriminant analyses.
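The standardization by mean and standard deviation described above can be illustrated with a minimal sketch. The numbers are hypothetical and are not data from the study; the use of the sample standard deviation (n - 1 denominator) is also an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores: (x - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1); assumed, not stated in the text
    return [(x - m) / s for x in values]

# Hypothetical values for two traits measured in different units
hcw_kg = [14.2, 15.8, 13.5, 16.1, 15.0]    # hot carcass weight, kg
hcy_pct = [44.9, 46.3, 43.8, 47.0, 45.5]   # hot carcass yield, %

z_hcw = standardize(hcw_kg)
z_hcy = standardize(hcy_pct)

# After standardization both traits are unitless,
# with mean 0 and standard deviation 1
print([round(z, 2) for z in z_hcw])
print([round(z, 2) for z in z_hcy])
```

After this transformation, a trait measured in kilograms and a trait measured in percent contribute on the same scale to the multivariate analyses.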
Initially, the correlation matrix of the original variables X was used: DMI, OMI, CPI, EEI, TCI, NFCI, TDNI, TDN, AW, EL, IL, CW, CP, CW, LL, LP, CD, CP, CCW, CCI, HCW, HCY, CCY, EBW, and REA. With the principal component technique as a starting point, the 25 × 25 correlation matrix P of the set of variables X (X1, X2, ..., X25) was transformed into a set of k principal components (Y1, Y2, ..., Yk), each consisting of a linear combination of the standardized variables, uncorrelated with the others and arranged in decreasing order of variance.
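The transformation of the correlation matrix into uncorrelated components can be sketched as an eigendecomposition. This is an illustration with NumPy on simulated data, not the procedure implemented in Statistica; the function name, the five simulated variables, and the random seed are all assumptions made for the example.

```python
import numpy as np

def principal_components(X):
    """PCA on the correlation matrix of X (rows = animals, columns = variables)."""
    # Standardize each variable by its mean and sample standard deviation
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Correlation matrix of the original variables
    R = np.corrcoef(X, rowvar=False)
    # eigh is appropriate because R is symmetric; it returns ascending order
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # decreasing order of variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                       # component scores per animal
    return eigvals, eigvecs, scores

# Simulated data: 112 animals, 5 variables forming two correlated pairs
rng = np.random.default_rng(0)
base = rng.normal(size=(112, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.3 * rng.normal(size=112),
    base[:, 1],
    base[:, 1] + 0.3 * rng.normal(size=112),
    rng.normal(size=112),
])

eigvals, eigvecs, scores = principal_components(X)
print(np.round(eigvals, 3))
```

Each column of `eigvecs` holds the weighting coefficients of one component, and the variance of each column of `scores` equals the corresponding eigenvalue.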
The relative importance of a component was assessed through the percentage of the total variance it explains, obtained as the ratio of its eigenvalue to the sum of the eigenvalues of all components. The components retained to explain the variance of the data set were those with eigenvalues equal to or greater than one, according to the criterion of Kaiser (1960). The first principal component aggregates the maximum variance among all linear combinations. The second component is not correlated with the first and explains the second largest variance, and the same applies to the remaining components (HAIR et al., 2009).
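The two rules in this paragraph, the proportion of variance per component and the Kaiser criterion, reduce to simple arithmetic on the eigenvalues. The sketch below uses hypothetical eigenvalues for six standardized variables; they are not the eigenvalues obtained in the study.

```python
def explained_variance(eigenvalues):
    """Return the proportion and cumulative proportion of variance per component."""
    total = sum(eigenvalues)
    proportions = [ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for p in proportions:
        running += p
        cumulative.append(running)
    return proportions, cumulative

def kaiser_selection(eigenvalues):
    """Indices of the components with eigenvalue >= 1 (Kaiser criterion)."""
    return [i for i, ev in enumerate(eigenvalues) if ev >= 1.0]

# Hypothetical eigenvalues; for a correlation matrix they sum to the
# number of variables (six here)
eigenvalues = [2.7, 1.5, 1.0, 0.4, 0.3, 0.1]

proportions, cumulative = explained_variance(eigenvalues)
retained = kaiser_selection(eigenvalues)

for i in retained:
    print(f"PC{i + 1}: {proportions[i]:.1%} of variance, "
          f"{cumulative[i]:.1%} cumulative")
```

With these hypothetical values, the first three components are retained, since the third eigenvalue is exactly one and the criterion includes equality.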
Canonical discriminant analysis is a technique that seeks to identify the linear combinations that best separate groups. It also makes it possible to describe the relationship between two groups of variables by calculating the combinations with maximum correlation. Statistica software, version 13.3, was used. To select the variables for the discriminant model, the stepwise method was used, in which the variables were inserted one by one.
As a criterion for the entry and removal of variables in the stepwise method, according to Johnson (1998), the recommended level of significance is 25% to 50% for entering variables and 15% for removing variables. The data were all standardized by the Standard procedure, a command present in the software whose purpose is to eliminate the effect of the different units of the variables present in the database.
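The core computation of a canonical discriminant analysis can be sketched as follows. This is a simplified illustration on simulated data and is not the Statistica routine: it omits the stepwise selection, and the function name, group means, and group sizes are assumptions. It uses the standard result that each canonical root is an eigenvalue of the product of the inverse within-group matrix and the between-group matrix, with canonical correlation sqrt(lambda / (1 + lambda)).

```python
import numpy as np

def canonical_discriminant(X, groups):
    """Canonical roots, coefficient vectors, and canonical correlations."""
    labels = np.unique(groups)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    W = np.zeros((p, p))   # within-group sums of squares and cross-products
    B = np.zeros((p, p))   # between-group sums of squares and cross-products
    for g in labels:
        Xg = X[groups == g]
        mg = Xg.mean(axis=0)
        centered = Xg - mg
        W += centered.T @ centered
        d = (mg - grand_mean).reshape(-1, 1)
        B += len(Xg) * (d @ d.T)
    # Eigenvalues of W^-1 B; the matrix is not symmetric, so use eig
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(W, B))
    eigvals, eigvecs = eigvals.real, eigvecs.real
    n_roots = min(p, len(labels) - 1)
    keep = np.argsort(eigvals)[::-1][:n_roots]
    roots = np.clip(eigvals[keep], 0.0, None)
    canon_corr = np.sqrt(roots / (1.0 + roots))
    return roots, eigvecs[:, keep], canon_corr

# Simulated data: three NDF groups, 30 animals each, three variables
rng = np.random.default_rng(1)
means = {"low": [0.0, 0.0, 0.0],
         "intermediate": [2.0, 0.5, 0.0],
         "high": [4.0, 1.0, 0.0]}
X = np.vstack([rng.normal(loc=m, scale=1.0, size=(30, 3))
               for m in means.values()])
groups = np.repeat(list(means.keys()), 30)

roots, vectors, canon_corr = canonical_discriminant(X, groups)
print(np.round(canon_corr, 3))
```

With three groups there are at most two canonical roots. The squared canonical correlation of a root equals the share of the variation in its canonical scores that lies between groups.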

RESULTS AND DISCUSSION
The analyzed variables were subjected to standardization by mean and standard deviation, to eliminate the differences between units of measurement and thereby avoid redundancies in the application of the aforementioned multivariate techniques. Means and standard deviations (SD) of the 25 variables analyzed are shown in Table 1.
DMI - dry matter intake, OMI - organic matter intake, CPI - crude protein intake, EEI - ether extract intake, TCI - total carbohydrate intake, NFCI - non-fibrous carbohydrate intake, TDNI - total digestible nutrient intake, TDN - total digestible nutrient, AW - average weight, EL - external length, IL - internal length, CW - chest width, CP - croup perimeter, CW - croup width, LL - leg length, LP - leg perimeter, CD - chest depth, CP - chest perimeter, CCW - cold carcass weight, CCI - carcass compactness index, HCW - hot carcass weight, HCY - hot carcass yield, CCY - cold carcass yield, EBW - empty body weight.
Of the 25 principal components obtained, only the first four were selected (Table 2). The components with the highest absolute values were selected from those obtained (KHATTREE; NAIK, 2000; MUNIZ et al., 2014). Together, the four principal components selected explain approximately 80% of the total variation of the data, which corresponds to a loss of about 20% in the explanation of the total variation. Applying the principal components technique, Guedes, Ribeiro and Carvalho (2018) managed to explain 80.43% of the total variation of the data with the first five principal components, representing a loss of less than 20% in the explanation of the total variation. Bonvillani et al. (2010) found that the first six components were able to explain approximately 85% of the total variation of carcass quality in their work to characterize Cordobêz kid carcasses.
The accumulated total variation in the first two principal components was 65.07%. These two linear combinations contain the characteristics with the highest weighting coefficients, in absolute value. In the first component (Table 3), the characteristics with the highest weighting coefficients were CCW, HCW, and EBW, all with 0.27, suggesting that this component can be interpreted as an index of carcass conformation. In the second component, on the other hand, the characteristics with the highest factor loadings were AW (0.34), CW (0.32), and HCY (0.30), suggesting that these traits are the most suitable for composing an index for consumption and carcass characteristics.
These characteristics allow some measures to be used as indicators of the growth and development of the animal (TEIXEIRA NETO et al., 2016; PINHEIRO; JORGE, 2010). The variables with the most significant influence in constituting the discriminant model are shown in Table 4. For the subset of selected variables to have a higher percentage of correct classifications, the Mahalanobis distance was applied to ensure the maximization of the distance between the closest groups as each variable was inserted in the Fisher equation (RAUSCH; KELLEY, 2009). The most important variables to discriminate groups according to NDF levels were TCI, with a partial R² of 0.8954, and TDNI, with a partial R² of 0.8947. The classification errors observed were minimal, and approximately 95% of the animals were classified in their corresponding group, similar to a study by Ribeiro et al. (2015). Of all the animals, only four individuals were not classified in their group, as shown in Table 5. These misclassifications probably occurred due to the greater proximity of the NDF levels in the target groups of the four animals. The classification errors depend on the variables used in the study (YAKUBU et al., 2012; CORREA et al., 2013), so that the highest percentage of classification errors observed is due to the similarity between the variables in those groups.
Table 5. Percentage of animals classified in each group according to the mechanism studied.
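The classification step discussed above can be sketched as assigning each animal to the group whose centroid is nearest in Mahalanobis distance. This is a simplified illustration on simulated data, assuming a pooled within-group covariance matrix and equal prior probabilities; the classification rule used by the software may differ. The last lines only redo the arithmetic for the reported four misclassified animals, assuming that all 112 animals entered the classification.

```python
import numpy as np

def classify_by_mahalanobis(X, groups):
    """Assign each row of X to the group with the nearest centroid."""
    labels = np.unique(groups)
    n, p = X.shape
    pooled = np.zeros((p, p))
    centroids = []
    for g in labels:
        Xg = X[groups == g]
        mg = Xg.mean(axis=0)
        centroids.append(mg)
        pooled += (Xg - mg).T @ (Xg - mg)
    pooled /= (n - len(labels))          # pooled within-group covariance
    inv = np.linalg.inv(pooled)
    d2 = np.empty((n, len(labels)))      # squared distances to each centroid
    for j, mg in enumerate(centroids):
        diff = X - mg
        d2[:, j] = np.einsum("ij,jk,ik->i", diff, inv, diff)
    return labels[np.argmin(d2, axis=1)]

def hit_rate(actual, predicted):
    """Proportion of animals classified in their group of origin."""
    return float(np.mean(actual == predicted))

# Simulated data: three well-separated groups, 30 animals each
rng = np.random.default_rng(2)
centers = {"low": [0.0, 0.0], "intermediate": [4.0, 1.0], "high": [8.0, 2.0]}
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(30, 2))
               for c in centers.values()])
groups = np.repeat(list(centers.keys()), 30)

predicted = classify_by_mahalanobis(X, groups)
print(f"simulated hit rate: {hit_rate(groups, predicted):.3f}")

# Arithmetic for the reported result, assuming all 112 animals were classified
reported_hit_rate = (112 - 4) / 112      # 108 / 112, about 0.964
print(f"reported hit rate: {reported_hit_rate:.3f}")
```

Under that assumption, four misclassified animals correspond to an error rate of about 3.6%, which agrees with the reported classification error of less than 5%.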
The graph for the canonical discriminant analysis of the groups studied, based on all variables tested, is shown in Figure 1. There was an evident distinction between the groups, consistent with the low classification error. However, it is possible to perceive the proximity between the groups, which contributed to the classification errors that occurred. According to Castanheira et al. (2010), performance variables are good indicators of animal health; however, it is necessary to know the influence of factors linked to animal production, which range from the physiological state to the environmental conditions to which the animals are subjected. The first canonical root showed a canonical correlation coefficient of approximately 0.82 (Figure 1), indicating a high association with the performance variables. In a study analyzing carcass characteristics of Morada Nova sheep, Guedes, Ribeiro and Carvalho (2018) identified body weight at slaughter and posterior width as the main canonical variables; these variables had the highest standardized canonical coefficients in absolute value and, therefore, the greatest discriminatory power. This shows the influence that physiological and environmental factors have on the variables that contribute most to explaining the study outcomes.

CONCLUSION
The principal component analysis was efficient in reducing the data set to four linear combinations. The variables with the greatest power to explain the total variation of the data were CCW, HCW, EBW, average weight, croup width, and HCY, which are the characteristics suggested for consideration in future studies. The variables with the greatest power to discriminate between the groups, selected by their standardized canonical coefficients, were TCI and TDNI.