Discriminant analysis based on sheep carcass conformation and finishing scores

ABSTRACT Carcass classification consists of grouping animals with similar carcass characteristics. When the groups are defined a priori, as in the case of conformation and finishing scores, the interest is to identify the contribution of each variable used in separating the groups. Therefore, discriminant analysis was used to discriminate Santa Inês animals according to the carcass conformation and finishing scores (score 2 = regular, score 3 = good) and to identify the variables that contribute most to the differentiation. The carcass conformation and finishing scores vary from 1 to 5. This study used scores 2 and 3, since the evaluated animals fell within these two scores. The database consisted of information from 122 uncastrated Santa Inês sheep kept in confinement, for which 24 carcass-related variables were recorded. Data were submitted to the Mardia test to verify multivariate normality, followed by the nonparametric k-nearest neighbor (K-NN) test. The stepwise procedure selected a subset of variables, and the Mahalanobis Distance (D²) was used to assess the separation of groups (p-value < 0.05). The variables with the highest discriminatory power for the carcass conformation scores were cold carcass weight (CCW), external carcass length (ECL), and neck (NEC), and for the carcass finishing scores were live weight at slaughter (LWS), ECL, and thoracic perimeter (TP). The multivariate discriminant analysis proved efficient in allocating the animals to their groups of origin.


INTRODUCTION
In animal production, the carcass classification process consists of grouping carcasses with similar characteristics, such as weight, finish, and conformation, into classes. Finishing and conformation are necessary to classify and predict carcass quality. Conformation indicates the shape of the carcass in terms of the desired profile (convex or concave), which represents the proportion of muscle and fat relative to the bone. Finishing quantifies the amount of subcutaneous fat in the carcass visible to the evaluator and is used to select the destination market, compatible with consumer preference (JONES et al., 2021).
When groups are defined a priori, as in the case of conformation and finishing scores, the interest is to identify the contribution of each variable in separating the groups (scores) (CEZAR; SOUSA, 2007). Discriminant analysis is used to classify and differentiate objects or groups and to select variables with greater discriminatory power (ALKARKHI; ALQARAGHULI, 2020). Therefore, it is used to select variables that discriminate between two or more groups and to determine linear combinations of variables (JEON et al., 2013) that provide maximum discrimination between groups.
Discriminant analysis is also helpful in predicting which group an observation belongs to, based on knowledge of quantitative variables in a set of linear combinations of these variables (PARK et al., 2002). Considering that the conformation and finishing scores are primarily based on the visual assessment of the musculature and subcutaneous fat, a higher incidence of incorrectly classified observations may occur, which makes discriminant analysis a helpful evaluation tool.
Given the above, the study aimed to distinguish groups of carcass conformation and finishing of Santa Inês sheep, using a set of variables that best characterize or separate the different scores through discriminant analysis.

MATERIAL AND METHODS
The main information on the experiments is presented in Table 1. From the total database, 24 independent characteristics related to the carcass of the animals and two dependent variables (conformation and finishing) were selected (Table 2). The carcass conformation and finishing scores range on a scale from 1 to 5 (score 1 = poor, score 2 = regular, score 3 = good, score 4 = very good, and score 5 = excellent) (CEZAR; SOUSA, 2007). In this study, scores 2 and 3 were used, since the evaluated animals fell within these two scores.

Statistical analysis
For the discriminant analysis, the groups were formed according to the conformation and finishing scores (2 and 3) of the carcasses. Sample size adequacy was assessed as suggested by Hair Jr et al. (2009), who recommend 20 observations for each dependent variable (Table 2) and that the smallest group exceed the number of independent variables (24 in this study).
Multivariate normality was assessed using the Mardia test, which is based on deviations in skewness and kurtosis (Table 3). The sheep carcass variables did not meet the multivariate normality assumption (p-value < 0.05). When the assumption of multivariate normality is not met, nonparametric methods can be used to estimate the group-specific densities, such as the method adopted in this study, k-nearest neighbor (K-NN) (ROSENBLATT, 1956).
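The two quantities behind Mardia's test can be sketched in a few lines of Python. This is an illustrative sketch, not the SAS macro used in the study: the function names are hypothetical, the covariance is the maximum-likelihood estimate (divisor n), and only the kurtosis p-value is computed because the skewness p-value needs a chi-square distribution that the standard library does not provide.

```python
import math

def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance(rows, ddof=0):
    """Covariance matrix; ddof=0 is the maximum-likelihood estimate."""
    n, mu = len(rows), mean_vector(rows)
    p = len(mu)
    return [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in rows) / (n - ddof)
             for b in range(p)] for a in range(p)]

def invert(m):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    p = len(m)
    aug = [list(row) + [float(i == j) for j in range(p)]
           for i, row in enumerate(m)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("singular covariance matrix")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(p):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[p:] for row in aug]

def mardia(rows):
    """Mardia's multivariate skewness and kurtosis with their test statistics."""
    n, mu = len(rows), mean_vector(rows)
    p = len(mu)
    s_inv = invert(covariance(rows))
    centered = [[x - m for x, m in zip(r, mu)] for r in rows]
    # proj[j] = S^-1 (x_j - mean)
    proj = [[sum(s_inv[a][b] * c[b] for b in range(p)) for a in range(p)]
            for c in centered]

    def g(i, j):
        return sum(ci * pj for ci, pj in zip(centered[i], proj[j]))

    b1 = sum(g(i, j) ** 3 for i in range(n) for j in range(n)) / n ** 2
    b2 = sum(g(i, i) ** 2 for i in range(n)) / n
    kurt_z = (b2 - p * (p + 2)) / math.sqrt(8 * p * (p + 2) / n)
    return {
        "skewness": b1,
        "skew_stat": n * b1 / 6,                 # compare with chi-square
        "skew_df": p * (p + 1) * (p + 2) // 6,   # its degrees of freedom
        "kurtosis": b2,
        "kurt_z": kurt_z,                        # compare with N(0, 1)
        "kurt_p": math.erfc(abs(kurt_z) / math.sqrt(2)),
    }
```

A small p-value for either statistic points to a departure from multivariate normality, which is the situation that motivated the nonparametric approach here.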
The K-NN test does not require the data to follow a normal distribution to produce an effective model (PAN et al., 2020). It assumes that all variables correspond to points in n-dimensional space (BARBON et al., 2016). For a new variable, the K-NN classifier selects the k nearest neighbors in the data set according to a distance metric (TAHERI-GARAVAND et al., 2019). After the k neighbors are found, an average value is calculated among the neighbors and assigned as the prediction value of the unknown instance (BARBON et al., 2017).
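The neighbor search described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the carcass measurements in the usage note are hypothetical, Euclidean distance is used as a simple stand-in for the distance metric, and for class labels the "average among neighbors" is expressed as the share of neighbors that belong to the winning group.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train, labels, query, k=3, distance=euclidean):
    """Return (predicted group, share of the k neighbors in that group)."""
    # Rank all training observations by their distance to the query point.
    ranked = sorted(zip(train, labels),
                    key=lambda pair: distance(pair[0], query))
    nearest = ranked[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / len(nearest)
```

For example, with hypothetical (CCW in kg, ECL in cm) pairs, `knn_classify([(14.0, 55.0), (14.5, 56.0), (15.0, 57.0), (17.0, 60.0), (17.5, 61.0), (18.0, 62.0)], [2, 2, 2, 3, 3, 3], (14.2, 55.5))` assigns the query to score 2. An odd k avoids ties when there are two groups, and passing a different `distance` function lets the same routine work with a covariance-scaled metric.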
To identify the contribution of each measured variable and its importance in separating the groups, a linear combination of k variables, called the discriminant function or canonical discriminant function, was used. It is calculated by the following formula:

$$Z = a + W_1 X_1 + W_2 X_2 + \dots + W_k X_k$$

where $Z$ is the discriminant score, $a$ is the intercept, $W_i$ is the discriminant weight of independent variable $i$, and $X_i$ is independent variable $i$.
The stepwise procedure was used to select the independent variables with greater discriminatory power over the dependent variables. This procedure is a data mining tool that uses statistical significance to select the independent variables used in a given mathematical model (SMITH, 2018). The selection process for adding or removing variables was based on Wilks' Lambda statistical test (p-value < 0.05).
The Mahalanobis Distance (D²) was used to separate the groups (p-value < 0.05). D² considers the correlation of the data; it is calculated using the inverse of the variance-covariance matrix of the data set (MAESSCHALCK; JOUAN-RIMBAUD; MASSART, 2000). D² is calculated by the following formula:

$$D^2_{ij} = (\bar{x}_i - \bar{x}_j)^{\top} S^{-1} (\bar{x}_i - \bar{x}_j)$$

where $S^{-1}$ is the inverse of the covariance matrix $S$ between $\bar{x}_i$ and $\bar{x}_j$, the mean vectors of groups $i$ and $j$. This matrix is calculated using the weighted average of the covariance matrices of the groups (OLATUNJI et al., 2019). In K-NN, the pooled covariance matrix is used to calculate D².
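The distance between two group centroids under a pooled covariance matrix can be sketched as below. This is an assumption-laden illustration rather than the CANDISC computation itself: the function names are hypothetical, the pooling weights each group by its degrees of freedom, and a linear solve replaces the explicit matrix inverse.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter(rows):
    """Sums of squares and cross-products about the group mean."""
    mu = mean_vector(rows)
    p = len(mu)
    return [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in rows)
             for b in range(p)] for a in range(p)]

def pooled_covariance(groups):
    """Within-group covariance pooled over groups (divisor N - g)."""
    p = len(groups[0][0])
    dof = sum(len(g) for g in groups) - len(groups)
    total = [[0.0] * p for _ in range(p)]
    for g in groups:
        s = scatter(g)
        for a in range(p):
            for b in range(p):
                total[a][b] += s[a][b]
    return [[v / dof for v in row] for row in total]

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    p = len(rhs)
    aug = [list(row) + [v] for row, v in zip(matrix, rhs)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("singular covariance matrix")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(c + 1, p):
            f = aug[r][c] / aug[c][c]
            aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    x = [0.0] * p
    for r in reversed(range(p)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, p))
        x[r] = (aug[r][p] - tail) / aug[r][r]
    return x

def mahalanobis_d2(group_a, group_b):
    """Squared Mahalanobis distance between the two group centroids."""
    diff = [a - b for a, b in zip(mean_vector(group_a), mean_vector(group_b))]
    weights = solve(pooled_covariance([group_a, group_b]), diff)  # S^-1 * diff
    return sum(d * w for d, w in zip(diff, weights))
```

Because the difference between centroids is scaled by the pooled covariance, variables measured in different units (kilograms, centimeters) contribute on a comparable footing.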
Statistical analyses were performed using the SAS® Studio software. Mardia's multivariate normality test was performed with the %multnorm macro. The nonparametric K-NN test was performed using the METHOD=NPAR option. The STEPDISC procedure, through stepwise selection, was used to find the subset of variables that best reveals differences between the groups, that is, the variables with greater discriminatory power. The Mahalanobis D² was calculated using the CANDISC procedure, which was also used to find the linear combinations of variables that best summarize the differences between groups. The DISCRIM procedure was used to calculate the discriminant functions and classify the observations.

RESULTS AND DISCUSSION
The independent variables selected by the stepwise procedure for the carcass conformation scores (2 and 3) and D² are shown in Table 4. In general, of the 24 original variables used in the study, three variables (CCW, ECL, and NEC) showed greater power of discrimination between the evaluated groups (p-value < 0.05).
In the set of independent variables, there may be variables that have little influence on the dependent variables, and in this sense, the stepwise procedure is used to select those variables with the greatest contribution to the study (ALVES; LOTUFO; LOPES, 2013). According to Senra et al. (2007), this procedure is based on the observation that some variables contribute little to the average efficiency of the model, so, once identified, they can be removed.
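The ranking step behind stepwise selection can be sketched with Wilks' Lambda, the ratio of the within-group to the total generalized variance. This is a simplified sketch with hypothetical function names: it only picks the candidate that lowers Lambda the most, whereas a full stepwise procedure also applies significance tests for entry and removal, which are omitted here.

```python
def scatter(rows, cols):
    """Sums of squares and cross-products for the chosen columns."""
    n = len(rows)
    mu = [sum(r[c] for r in rows) / n for c in cols]
    return [[sum((r[a] - ma) * (r[b] - mb) for r in rows)
             for b, mb in zip(cols, mu)] for a, ma in zip(cols, mu)]

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(row) for row in m]
    p, out = len(m), 1.0
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            out = -out  # a row swap flips the sign
        for r in range(c + 1, p):
            f = m[r][c] / m[c][c]
            m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
        out *= m[c][c]
    return out

def wilks_lambda(groups, cols):
    """det(W) / det(T); values near 0 indicate well-separated groups."""
    within = None
    for g in groups:
        s = scatter(g, cols)
        within = s if within is None else [
            [x + y for x, y in zip(ra, rb)] for ra, rb in zip(within, s)]
    total = scatter([r for g in groups for r in g], cols)
    return det(within) / det(total)

def forward_step(groups, selected, candidates):
    """Pick the candidate column whose addition gives the lowest Lambda."""
    return min(candidates,
               key=lambda c: wilks_lambda(groups, selected + [c]))
```

Calling `forward_step` repeatedly, moving the chosen column from `candidates` to `selected` each time, reproduces the order in which variables would enter under this simplified criterion.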
Between the two groups of carcass conformation scores evaluated (2 and 3), a significant difference (p-value < 0.05) can be observed based on D² (Table 4). An individual is assigned to a specific group if its discriminant score is lower than the cut-off value, obtained by calculating the weighted average distance between the centroids of the groups (MARDIA; KENT; BIBBY, 2000). That is, the centroids of the groups are calculated, the distances from each individual to the centroids are evaluated, and the individual is assigned to the group with the nearest centroid. On this basis, the distance between the evaluated conformation score groups (2 and 3) was 1.45, resulting in the differentiation between groups (p-value < 0.05). The canonical coefficients of the discriminant functions are linear combinations of the original variables in which the coefficients maximize the separation between groups (ALONZO; ROTH; ROBERTS, 2013).
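For two groups of unequal size, the weighted cut-off described above is often written as follows. This is a textbook-style sketch rather than the exact expression used in the study; $Z_A$ and $Z_B$ (group centroids on the discriminant function) and $N_A$ and $N_B$ (group sizes) are assumed notation.

```latex
Z_{CS} = \frac{N_A Z_B + N_B Z_A}{N_A + N_B}
```

When the two groups are the same size, this reduces to the midpoint $(Z_A + Z_B)/2$ between the centroids.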
Table 5 shows the standardized canonical coefficients for the canonical variable (CAN1), referring to the three variables with the highest discriminatory power selected by the stepwise procedure for the carcass conformation scores (CCW: cold carcass weight; ECL: external carcass length; NEC: neck).
Researchers often employ standardized canonical coefficients to help interpret the contribution of each response variable, especially when the variables are not proportional (ALKARKHI; ALQARAGHULI, 2020). Since the purpose of using the canonical coefficients is to determine the linear combinations that provide the maximum differentiation between groups, the coefficients of the linear combinations are the canonical coefficients that indicate the partial contribution of each original variable in the composition of the CAN (DIMAURO et al., 2013).
The next phase interprets the discriminant variables identified and the discriminant function described. The three variables of the discriminant function exceed the discriminant loading of ± 0.40 recommended by Hair Jr et al. (2009), which ensures their inclusion in the interpretation process, since they are considered robust.
For CAN1, with a canonical correlation of 0.50, the characteristic that contributed most to composing the linear combinations was the independent variable CCW. Carcass cooling aims to delay or prevent microbial, chemical, and physical changes that reduce meat quality. When the carcass temperature is reduced (between 2 and 4 °C is recommended), the product changes its physical state, and the chemical and enzymatic reactions slow down. This process can be influenced by the amount and distribution of muscle mass and fat present in the carcass, which characterize the conformation from a biological point of view. Confinement and aptitude for meat production lead to greater fat deposition and, consequently, a lower muscle:fat ratio (SANTOS et al., 2010), as well as heavier animals. The high carcass weight associated with the breed, sex, and diet provides better fat coverage, making the CCW variable a useful tool for choosing the correct conservation method and for meat quality and, consequently, the final marketing value.
The second independent variable to enter the model was ECL. This result reinforces what Caro et al. (2018) described on the importance of carcass length in the variability of carcass conformation. In that study, principal components were analyzed using variables of length, width, and depth of the carcass, and component 1 was responsible for 64% of the variability, with the highest scores corresponding to carcass length and loin length (scores > 0.40). Finally, the independent variable NEC also contributes to the variability of carcass conformation. The Santa Inês breed is a large animal whose neck is proportional to the body. As it is an animal with an aptitude for meat production, and in this study the animals were kept in confinement, the amount and distribution of muscle and fat on the bone base may be higher. As described by Cezar and Sousa (2007), neck cuts, due to their later development, increase in weight as the carcass weight increases.
Discriminant analysis also aims to assign observations to previously defined groups. ElMasry et al. (2011) reported that prior knowledge of the predefined groups of the tested samples is an essential prerequisite for discriminant analysis to differentiate between groups. Table 6 summarizes how many observations were classified correctly or incorrectly in the conformation score groups.
It can be seen that the groups were correctly classified in their origin group, indicating that the first canonical function presented good discrimination results for the separation of the groups.
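A classification summary of this kind (counts and row percentages per group of origin) can be sketched as below. The function name and the labels in the usage note are hypothetical and do not reproduce the values in Table 6.

```python
from collections import Counter

def classification_table(actual, predicted):
    """Map each group of origin to {assigned group: (count, row percentage)}."""
    counts = Counter(zip(actual, predicted))
    groups = sorted(set(actual) | set(predicted))
    table = {}
    for origin in groups:
        row_total = sum(counts[(origin, g)] for g in groups)
        table[origin] = {
            g: (counts[(origin, g)],
                100.0 * counts[(origin, g)] / row_total if row_total else 0.0)
            for g in groups
        }
    return table
```

For instance, `classification_table([2, 2, 2, 3, 3], [2, 2, 3, 3, 3])` reports that two of the three score-2 carcasses stay in their group of origin and one is reassigned to score 3. The diagonal entries give the share of correctly classified observations per group.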
Table 7 shows the independent variables selected by the stepwise procedure and D² for the carcass finishing scores (2 and 3). Three of the 24 original variables proposed for the general model of finishing scores (LWS, TP, and ECL) showed greater discriminatory power between the evaluated groups (p-value < 0.05). Therefore, these variables presented a relevant discriminatory power in the estimation process, constituting the discriminant function. Regarding the finishing scores, the LWS variable was the first to enter the discriminant model, as it had the greatest significant difference between groups. The second variable to enter the model was TP, which improved the discrimination between groups, as evidenced by the decrease in Wilks' Lambda from 0.90 to 0.85. The addition of the third variable, ECL, to the discriminant function improved the quality of the model, as evidenced by the decrease in Wilks' Lambda (from 0.85 to 0.81). For D² (Table 7), a statistically significant difference (p-value < 0.05) between groups can be observed, suggesting that the groups of finishing scores (2 and 3) behave differently.
Table 8 shows the standardized canonical coefficients for CAN1, referring to the three most discriminating variables selected by the stepwise procedure for the finishing scores (LWS: live weight at slaughter; TP: thoracic perimeter; ECL: external carcass length).
As observed for the conformation scores, the three variables included in the discriminant function exceed the discriminant loading of ± 0.40 recommended by Hair Jr et al. (2009), confirming the excellent fit of the model.
CAN1 presented a canonical correlation of 0.43; the characteristic that contributed most to composing the linear combinations related to finishing was the independent variable LWS. It is known that LWS affects carcass characteristics and meat quality. As Jones et al. (2021) stated, heavier lambs are significantly more likely to have leaner and more muscular meat at weaning. However, this can interfere with quality, since carcasses with low subcutaneous coverage can suffer cold shortening of the muscles and, consequently, yield less tender meat. This reinforces the importance of this independent variable in separating groups related to muscle coverage. The independent variables TP and ECL were the subsequent variables composing the discriminant function model. Restle et al. (2006) described that longer animals with a greater thoracic perimeter, as is the case of Santa Inês sheep, are preferable in the finishing stage. In summary, animals with these characteristics are consequently heavier and deposit a greater amount of covering fat (ROSA et al., 2014), which justifies the importance of these independent variables in the variability of carcass finish.
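With only two groups there is a single canonical function, so Wilks' Lambda and the canonical correlation are linked by a standard identity. The check below is a sketch; $r_c$ is assumed notation for the canonical correlation of CAN1, and the input is the rounded value reported in the text.

```latex
\Lambda = 1 - r_c^{2}
\qquad\Rightarrow\qquad
\Lambda_{\text{finishing}} \approx 1 - 0.43^{2} = 1 - 0.1849 \approx 0.82
```

This is close to the final Wilks' Lambda of 0.81 reported for the finishing model, and the small gap is within the rounding of the canonical correlation to two decimals.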

As with principal component analysis, canonical variables are linear combinations of the original variables, reducing the amount of redundant information and allowing the first canonical functions to retain as much information from the original variables as possible. In this study, CAN1 accounted for 100% of the total variance for both the carcass conformation and finishing scores (Table 5 and Table 7). This indicates that the analyzed database can be simplified by reducing the number of variables, eliminating those with little discriminatory power and thus facilitating the interpretation of results.
As observed for the carcass conformation scores, the carcass finishing groups were also correctly classified in their group of origin, indicating that the first canonical function presented good discrimination results for separating the groups (Table 9).
Based on the above, the evaluation of sheep carcass variables generates a large amount of data and information, which is usually complex. Transforming them into knowledge requires more reliable assessments; in this context, discriminant analysis effectively analyzed the patterns of complex correlations between the variables, reducing the redundancy between them. Reducing the dimensionality of the data by using fewer variables shows that the variability captured is very similar to that obtained with all the variables, which optimizes the time and resources needed to evaluate the carcasses.

CONCLUSION
The multivariate discriminant analysis proved efficient in allocating the animals to their groups of origin (carcass scores). The variables with the highest discriminatory power for the carcass conformation scores were cold carcass weight, external carcass length, and neck, and for the carcass finishing scores were live weight at slaughter, thoracic perimeter, and external carcass length. These results will serve as a basis for future studies with the same objective.
The database comes from three experiments conducted in the Goat and Sheep sector of the Center for Human, Social and Agrarian Sciences of the Federal University of Paraiba, located in Bananeiras, State of Paraiba, microregion of Brejo Paraibano. The database was composed of carcass information from 122 uncastrated Santa Inês sheep kept in confinement. Experiment 1 evaluated different levels of inclusion of forage cactus in the diet and restriction of voluntary water consumption on the performance, carcass characteristics, and meat quality of Santa Inês sheep. Experiment 2 evaluated carcass characteristics and meat quality in Santa Inês sheep fed increasing levels of guava agro-industrial residue (GAR). Experiment 3 evaluated the performance and meat quality of Santa Inês sheep submitted to feed restriction and refeeding. The Ethics Committee of the Federal University of Paraiba approved the research protocol for the three experiments.

Table 1. Primary information from the experiments.

Table 4. Independent variables selected by the stepwise procedure for the carcass conformation scores (2 and 3) and D² (Mahalanobis distance).

Table 5. Standardized canonical coefficients for the canonical variable (CAN1) for the carcass conformation scores (2 and 3).

Table 6. Number of observations and classification percentage of the carcass conformation score groups (2 and 3).

Table 7. Independent variables selected by the stepwise procedure for the carcass finishing scores (2 and 3) and D² (Mahalanobis distance).

Table 9. Number of observations and classification percentage of the finishing score groups (2 and 3).