Cluster evaluation of Brazilian and Moroccan goat populations using physical measurements

The aim of this study was to compare the genetic diversity of 12 populations of goats in Brazil and Morocco (n = 796) using physical measurements and different multivariate techniques. Traits measured included wither height (WH), distance from the brisket to the ground (BH) and ear length (EL). The standardized Euclidean distance (D) was adopted as the dissimilarity measure. The D values were submitted to clustering analysis using hierarchical methods (nearest neighbor and UPGMA, Unweighted Pair Group Method with Arithmetic Mean), and the number of clusters was determined using the Tocher optimization method. The population clustering differed depending on the method of analysis used. Among the hierarchical methods, UPGMA showed the best fit (CCC = 0.82). The Tocher method formed four distinct clusters. Although the hierarchical and Tocher methods resulted in different cluster formations, both contributed to the interpretation of the genetic divergence among clusters. The results obtained through UPGMA and Tocher optimization support their use in future studies that may include a larger number of biometric variables, more individuals and additional populations.


Introduction
Goats can be classified according to type, use (milk, meat or dual-purpose) and geographical distribution. The most popular of the European dairy breeds are the Saanen, Alpine, Toggenbourg and Anglo-Nubian (Mason, 1988). In South Africa, the most popular breed is the Boer (Casey & Van Niekerk, 1988; Almeida & Schwalbach, 2000), and in Morocco the most popular are the Drâa and Rhâali (Hossaini-Hilari & Mouslish, 2002). There are three types of goats in northeastern Brazil, the Marota (Freitas, 1941; Santiago, 1944), Azul (Machado, 1995) and Nambi (Barros, 1987; Santos, 1987), but the goats in this region consist mainly of undefined breeds (UDB) which, as the name implies, include a vast variety of coat patterns and conformation.
Physical measurements are important in studies of genetic diversity because they capture the variation existing among breed groups and also allow breed identification (Epstein, 1953; Mason, 1988). This methodology has been enhanced by the simultaneous measurement of several characteristics and the establishment of indexes between two physical measurements (Bourzat et al., 1993; Bouchel et al., 1997). The identification of different genotypes, including crossbred ones, has been successfully achieved by multivariate analysis techniques (James & McCulloch, 1990; Franci et al., 2001). Among the most used methods in genetic divergence studies are the hierarchical and optimization methods (Cruz & Carneiro, 2006).
The most dissimilar genetic groups identified by these types of analyses can be used, for example, in the formation of so-called composite cattle, in which mating is performed so as to maintain heterosis and genetic diversity. In small populations it is possible to use these results to suggest matings that may result in greater heterozygosity. The study of diversity among and within populations can therefore be useful to genetic improvement and conservation, guiding public policies and private initiatives.
The objectives were to evaluate, using physical measurements, some methodologies for assessing variability among and within goat populations, to determine the most effective clustering method, and to contribute to the knowledge of divergence among the populations studied.

Material and Methods
The data used in this study were obtained from 796 female goats, of more than two years of age, from various herds in Brazil and Morocco. In Brazil, the data were from commercial breeds in Minas Gerais State and the Federal District. A total of 34 Toggenbourg, 86 Saanen, 28 Anglo-Nubian, 78 Alpine and 26 Boer goats were sampled. In Piauí State, in the northeast region of Brazil, we sampled local types including 29 Azul, 32 Marota, 35 Nambi and 123 Undefined Breed or UDB-PI goats.
In Morocco, the goats evaluated included: 102 local Drâa goats at the Center for Goat Research and in Sidi Flah village, Skoura; 34 goats called Zagora (locally considered to be crossbreds of Drâa) in Demnate, Ouarzazate, Center for Goat Research in Tahnnaout and Marrakech; and 189 Rhâali goats in Zagora.
The physical measurements were taken using a tape measure while the animal was kept in a correct vertical position (Figure 1): wither height (WH) is the distance from the highest point of the withers to the distal extremity of the front leg; the brisket height to the ground (BH) is the distance from the brisket to the ground; and the ear length (EL) is the distance from the base of the ear to its end. Thoracic depth (TD) was calculated as the difference between WH and BH. Although TD could be measured directly in the field, it is not easy to do so. Indexes were also calculated between the various measures, namely TD/WH, EL/TD and EL/WH.
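The derived trait and the three indexes follow directly from the field measurements. The sketch below is illustrative only: the function name and the example values are hypothetical and do not come from the study data.

```python
def derived_measures(wh, bh, el):
    """Return thoracic depth (TD) and the three indexes for one animal.

    wh, bh and el are wither height, brisket height and ear length,
    all in the same unit (cm in the study).
    """
    td = wh - bh  # thoracic depth is the difference between WH and BH
    return {
        "TD": td,
        "TD/WH": td / wh,
        "EL/TD": el / td,
        "EL/WH": el / wh,
    }


# Hypothetical animal, values chosen only for illustration
m = derived_measures(wh=60.0, bh=30.0, el=12.0)
print(m)
```

Because TD and the indexes are functions of WH, BH and EL, they carry no information that is independent of the three field measurements.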
The absence of valid birth date records made it necessary to estimate age by dental chronology, according to the methodology described by Quittet (1978).
The data were analyzed by simple descriptive statistics, analysis of variance (ANOVA) and the Student-Newman-Keuls (SNK) test (P<0.05) to compare averages of different populations (goat breeds and types) through the GLM procedure of SAS® (Statistical Analysis System, version 8.0). The analysis of variance examined the effect of population on the physical measurements. The effect of multicollinearity (linear dependence between variables), which can lead to the formation of singular or poorly conditioned matrices, was also examined.
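The link between linear dependence and singular matrices can be made explicit for the derived trait TD. The notation below is introduced here for illustration and is not taken from the original text: $\lambda_{\max}$ and $\lambda_{\min}$ are the largest and smallest eigenvalues of the correlation matrix $R$ of the variables, and the condition number CN is their ratio.

```latex
\mathrm{CN} = \frac{\lambda_{\max}}{\lambda_{\min}},
\qquad
\mathrm{TD} = \mathrm{WH} - \mathrm{BH}
\;\Rightarrow\;
\operatorname{Var}(\mathrm{WH} - \mathrm{BH} - \mathrm{TD}) = 0
\;\Rightarrow\;
\lambda_{\min} = 0 .
```

In exact arithmetic, a set of variables that contains WH, BH and TD together therefore has a singular correlation matrix and an unbounded condition number. The ratio indexes are nonlinear functions of the same measurements, so they are expected to produce near-dependence rather than exact dependence.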
In the multivariate analysis of variance (MANOVA), the criterion of Wilks (Λ), quoted by Johnson & Wichern (1998), was adopted to evaluate the difference among the vectors of means of goat genetic groups; i.e., when there are significant differences among populations, it is expected to always have Λ<1, and the lower the estimate value, the more significant it is (Ferreira & Souza, 1997).
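For reference, the Wilks criterion in its usual textbook form is given below. The symbols $E$ and $H$ are introduced here as assumptions: $E$ is the within-population (residual) matrix of sums of squares and cross-products, and $H$ is the corresponding between-population matrix.

```latex
\Lambda = \frac{\lvert E \rvert}{\lvert E + H \rvert},
\qquad 0 \le \Lambda \le 1 .
```

When the population mean vectors are far apart, $H$ is large relative to $E$ and $\Lambda$ approaches zero, which is why smaller values indicate stronger evidence of differences among populations.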
Clustering analysis was conducted by adopting the standardized Euclidean distance (D) as a dissimilarity measure. Three clustering methods were used: two hierarchical methods (nearest neighbor and UPGMA, Unweighted Pair Group Method with Arithmetic Mean) and the Tocher optimization method. To circumvent both the problem of scale and the influence of the number of characters when using the mean Euclidean distance, it is recommended to standardize the data. This standardization was carried out according to Cruz & Carneiro (2006), by $X_{ij} = x_{ij} / S(x_j)$, where $x_{ij}$ is the mean of characteristic j in population i and $S(x_j)$ is the standard deviation of characteristic j. Thus, $D_{ii'} = \left[ \frac{1}{v} \sum_{j=1}^{v} (X_{ij} - X_{i'j})^2 \right]^{0.5}$ is the standardized Euclidean distance (D) for the populations i and i', where v is the number of characteristics analyzed.
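A minimal sketch of this distance is shown below. It assumes that the standard deviation of each characteristic is the sample standard deviation taken across the population means; the original analysis was run in GENES, which may estimate it differently. The function name and the demonstration values are hypothetical.

```python
import math
import statistics


def standardized_mean_euclidean(means):
    """means maps a population name to its list of trait means.

    Returns a dict mapping each pair of populations to its distance D.
    """
    names = list(means)
    v = len(means[names[0]])  # number of characteristics
    # Assumption: S(x_j) is the sample SD of trait j across population means
    sds = [statistics.stdev([means[p][j] for p in names]) for j in range(v)]
    z = {p: [means[p][j] / sds[j] for j in range(v)] for p in names}
    d = {}
    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            ss = sum((z[a][j] - z[b][j]) ** 2 for j in range(v))
            d[(a, b)] = math.sqrt(ss / v)  # mean over the v traits
    return d


# Hypothetical trait means (WH, BH, EL) for three made-up populations
demo = {"P1": [60.0, 30.0, 12.0], "P2": [70.0, 35.0, 20.0], "P3": [65.0, 32.0, 7.0]}
print(standardized_mean_euclidean(demo))
```

Dividing the sum of squares by v keeps D comparable when the number of characteristics changes, which is the motivation given in the text for using the mean distance.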
In the hierarchical method of nearest neighbor, the populations were clustered by means of smaller distances D through a process that is repeated at several levels until the dendrogram or tree diagram is established. In this case, there was some concern about the optimal number of groups, since the interest is in the "tree" and in the ramifications that are obtained (Cruz & Carneiro, 2006).
In the "average linkage" method or UPGMA (Unweighted Pair Group Method with Arithmetic Mean), the arithmetic average of D values was used to avoid characterizing the dissimilarity by extreme values (maximum or minimum) between the populations concerned.
The cophenetic correlation coefficient (CCC), proposed by Sokal & Rohlf (1962), was used in the hierarchical methods (nearest neighbor and UPGMA). The higher the value obtained for CCC, the lower the distortion caused by the population clustering. According to Rohlf (1970), in practice, dendrograms with CCC lower than 0.7 indicate an inadequacy of the clustering method to summarize the information from the data set.
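The sketch below illustrates how average linkage (UPGMA) and the cophenetic correlation fit together. It is a plain-Python illustration under stated assumptions, not the GENES implementation: the function names and the toy distance matrix are hypothetical, and the cluster-to-cluster distance is recomputed from the original pairwise distances at every step for clarity rather than speed.

```python
import itertools
import math


def upgma_cophenetic(dist):
    """Run average linkage on a symmetric matrix and return the
    cophenetic matrix (the height at which each pair is first joined)."""
    n = len(dist)
    clusters = [[i] for i in range(n)]
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        best = None
        for a, b in itertools.combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            d = sum(dist[i][j] for i, j in pairs) / len(pairs)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = d
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return coph


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)


def cophenetic_correlation(dist):
    """Pearson correlation between original and cophenetic distances."""
    coph = upgma_cophenetic(dist)
    pairs = list(itertools.combinations(range(len(dist)), 2))
    return pearson([dist[i][j] for i, j in pairs],
                   [coph[i][j] for i, j in pairs])


# Toy example: four hypothetical populations placed on a line
pts = [0.0, 1.0, 5.0, 6.0]
toy = [[abs(p - q) for q in pts] for p in pts]
print(round(cophenetic_correlation(toy), 3))
```

Replacing the average in `upgma_cophenetic` with the minimum of the pairwise distances would give the nearest neighbor (single linkage) method, so the same CCC routine can be used to compare the two dendrograms.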
In the Tocher optimization method, quoted by Rao (1952), the criterion is that the average measure of dissimilarity within each group must be lower than the average distances between any groups. This method differs from the hierarchical methods by forming mutually exclusive groups.
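A simplified sketch of a Tocher-style procedure is given below. It follows a common textbook description and is an assumption rather than the exact GENES routine: the inclusion limit `theta` is taken as the largest of the nearest-neighbor distances, each group is seeded with the closest remaining pair, and a population joins a group only if its average distance to the current members does not exceed `theta`. All names and the toy data are hypothetical.

```python
import itertools


def tocher(dist):
    """Partition populations 0..n-1 into mutually exclusive groups."""
    n = len(dist)
    # Assumed limit: the largest distance from any population to its nearest neighbor
    theta = max(min(dist[i][j] for j in range(n) if j != i) for i in range(n))
    remaining = set(range(n))
    groups = []
    while remaining:
        if len(remaining) == 1:
            groups.append(sorted(remaining))  # leftover singleton group
            break
        # Seed the new group with the closest remaining pair
        a, b = min(itertools.combinations(sorted(remaining), 2),
                   key=lambda p: dist[p[0]][p[1]])
        group = [a, b]
        remaining -= {a, b}
        while remaining:
            # Candidate with the smallest total distance to the group
            cand = min(sorted(remaining),
                       key=lambda k: sum(dist[k][g] for g in group))
            increase = sum(dist[cand][g] for g in group) / len(group)
            if increase > theta:
                break  # no remaining population is close enough
            group.append(cand)
            remaining.remove(cand)
        groups.append(sorted(group))
    return groups


# Toy example: four hypothetical populations placed on a line
pts = [0.0, 1.0, 5.0, 6.0]
toy = [[abs(p - q) for q in pts] for p in pts]
print(tocher(toy))
```

Unlike a dendrogram, the output is a flat partition, so the number of groups is a result of the procedure rather than a choice made by cutting a tree.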
The relative importance of the physical measurements and indexes for the divergence was evaluated according to the methodology of Singh (1981). To attach a measure of reliability to the dendrograms constructed through hierarchical methods, a bootstrap analysis (1000 repetitions) was performed, giving the percentage of replicates in which each cluster of the original dendrogram was recovered.
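For a Euclidean distance, the contribution of each trait in the sense of Singh (1981) can be sketched as the share of the total squared standardized differences that the trait accounts for. This reading of S.j is an assumption made for illustration; the function name and the toy values are hypothetical.

```python
import itertools


def relative_contribution(z):
    """z is a list of standardized trait vectors, one per population.

    Returns the percentage contribution of each trait to the total
    squared distance summed over all pairs of populations.
    """
    v = len(z[0])
    # S.j: sum over all population pairs of the squared difference in trait j
    s = [sum((a[j] - b[j]) ** 2 for a, b in itertools.combinations(z, 2))
         for j in range(v)]
    total = sum(s)
    return [100.0 * sj / total for sj in s]


# Three hypothetical populations measured on two standardized traits
print(relative_contribution([[0.0, 0.0], [1.0, 0.0], [2.0, 3.0]]))
```

A trait that varies mostly between populations, rather than within them, will dominate this share.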
Analyses were performed using SAS for Windows NT (Statistical Analysis System, version 8.0), licensed by Universidade Federal de Viçosa, and GENES (version 6.0).

Results and Discussion
The results of ANOVA for the measurements and biometric indexes assessed in 12 goat populations revealed significant differences through F tests (P<0.05) among populations for all measures and indexes evaluated (Table 1). The coefficient of variation (CV) of the characteristics and indexes indicated adequate precision of the estimations. The biometric measurements presented CV values lower than 13%. The variables that had the highest coefficients of variation were EL (12.8%) and the indexes containing EL: EL/TD and EL/WH (18.2 and 13.0%, respectively). This higher variability of EL in the sample can be explained by the contrast between the Nambi breed, which exhibited very small ears, and the groups that presented medium and long ears. The CV values for the variables were similar to those found by Dossa et al. (2007) in a study of goat populations in Western Africa (6.3% for WH; 10.5% for EL; and 11.1% for TD).
The occurrence of significant differences between goat populations (Table 1) was expected since the sample included populations that present some highly contrasting phenotypic characteristics. This situation is a favorable indication of substantial genetic divergence (Cruz & Carneiro, 2006).
As the ANOVA was significant (P<0.05) for all physical measurements (Table 1), the SNK test was applied. Five to nine groups of averages were formed (Table 2). According to the mean biometric data of the different goat populations, compared by the SNK test (Table 2), the Saanen and Alpine dairy breeds only diverged in WH, in which the Alpine was the highest of all the populations. Toggenbourg goats were similar to Saanen and Alpine for EL and differed from both in TD, in which they were inferior to both but superior to all the other groups. The average WH for the Anglo-Nubian was similar to that found by Mello & Schmidt (2008) in a study of purebred goats (PB) of this same breed in Rio Grande do Sul State, Brazil (75.4 cm). The Anglo-Nubian differed from the European dairy breeds with respect to BH, EL and EL/TD, in which they were the highest of all the populations. Boer and Drâa were similar only in WH and EL/TD. Boer and Anglo-Nubian were only similar for TD and TD/WH. Moroccan Drâa, Zagora and Rhâali differed completely among themselves with respect to WH, BH, EL and EL/WH. Among the Northeast Brazilian goats, the UDB-PI were long-legged, with a lower, statistically different TD (24.8 cm) when compared with the other breeds, and also had longer ears. The Nambi showed 7.3 cm of EL, which was shorter than in all of the other populations sampled. This ear length is very similar to those found in French goats with small ears, of 6.4 and 6.7 cm, reported, respectively, by Audiot et al. (1985) and Martrès & Benadjaoud (1986), and larger than the 2 cm reported by Paredes (1952) in Spanish goats. The American LaMancha (a breed with short ears) is divided into two types according to ear length, the first with 2.5 cm and the other with 5.1 cm, according to ADGA (2008). The EL of the Marota does not differ from that described for the Moxotó breed and is very close to that described for the Canindé breed (Machado, 1995; Rocha et al., 2007). The UDB-PI had the lowest TD among the populations studied and an EL similar to that of the UDB from Ceará (Machado, 1995).
Comparing the Brazilian northeastern populations under study, the Azul goat had the highest WH (62.6 cm) and the deepest chest (TD = 30.2 cm). The Marota had the lowest BH (29.6 cm), the lowest WH (59.3 cm), and the second lowest EL (12.7 cm), after the Nambi. The comparison between an old naturalized Brazilian type such as the Marota (a breed type currently involved in preservation efforts) and the current UDB from Piauí State (a mixed population including multiple crossbred types) showed that they diverged from each other for all the variables studied except TD/WH. These data suggest a possible influence of long-legged, long-eared animals with low chest depth on these UDB animals. Such characteristics are noticed in the Indian breeds Jamnapari and Bhuj, which have been used in crossbreeding programs in northeastern Brazil (Machado, 2001). Also, there have been reports of other breeds of taller stature, such as Mambrina and Anglo-Nubian, in the Brazilian northeast since the 1940s (Pinheiro Júnior, 1985).
Through MANOVA, significant differences (P<0.05) were also observed among population mean vectors. Thus, the rejection of the hypothesis that the population mean vectors are equal justifies the use of other multivariate techniques aimed at dimension reduction or the discarding of variables.
An evaluation of multicollinearity, performed using the Montgomery & Peck (1981) procedure, indicated that the condition number (CN) of the correlation matrix between the measured traits (WH, BH, EL and TD) and the indexes (TD/WH, EL/TD and EL/WH) was excessively high (CN ≥ 1000). The variables that caused the greatest problems regarding multicollinearity were TD and the indexes (TD/WH, EL/TD and EL/WH), because they are calculated by combining other variables and, therefore, naturally have high correlations with them. Thus, the measures considered in the cluster analysis were WH, BH and EL, because they presented low multicollinearity (CN < 100) in this sample. A similar result was found in Spanish herds of local goats (Herrera et al., 1996), in which the chest depth was not discriminating and, therefore, should not be taken into consideration when evaluating breed differences.
In the Euclidean distance matrix D (Table 3), the maximum value was between the Anglo-Nubian and Nambi populations (2.54), making them the most divergent breeds. The lowest value was between Rhâali and Azul (0.21), the most similar populations for the characteristics considered. It is important to remember that, for any dissimilarity measure, the value is only comparable within the same study; a comparison with an individual or sample not involved in its determination would not be valid (Silveira Neto, 1986). The values of D in this study could not, therefore, be numerically compared to others mentioned in the literature.
From the Euclidean distances among populations and the nearest neighbor clustering method, a dendrogram was generated from the D values (Figure 2). The relative proportions of distances (D) are expressed in the line below the dendrogram. Only bootstrap values above 50% were utilized, because they are trustworthy for the node formation of the dendrogram presented. This dendrogram showed the formation of groups using the dissimilarity measure D, which represents the genetic similarity/dissimilarity of populations. The first group included the European dairy breeds (Saanen, Alpine and Toggenbourg) along with the Moroccan Drâa population, with a bootstrap over 95%; the second group included the Rhâali and Azul populations, with a bootstrap over 95%; the third group included the second group and joined the Moroccan populations (Rhâali and Zagora) with populations from the Brazilian Piauí State (Azul and UDB-PI), with a 67% bootstrap; and the fourth group included the Anglo-Nubian and Boer breeds, with a bootstrap over 70%.
The similarities found among dairy goat breeds from Europe using biometric markers corroborate the clustering of Saanen with Alpine, and of these two with Toggenbourg (Figure 2), observed both by electrophoresis of serum proteins and erythrocytes and by using microsatellite markers (Igarashi et al., 2000; Oliveira et al., 2007).
To assess the degree of adjustment between the dissimilarity matrix and the matrix resulting from the clustering for the formation of the dendrogram, the cophenetic correlation coefficient (CCC) was estimated.
In the present study, the CCC was 0.73 for the method of nearest neighbor through D. From this coefficient, it can be concluded that the standardized mean Euclidean distance was adequate to summarize the information from the data set, the second criterion mentioned by Rohlf (1970).
The UPGMA clustering method (Figure 3) showed a better CCC (0.82) than the method of nearest neighbor. Therefore, UPGMA may provide a more reliable graphical representation of the clustering. The dendrogram formed similar clusters to those obtained through the nearest neighbor method (Figure 2). Most of the bootstrap values obtained in the formation of clusters from both the nearest neighbor method and the UPGMA were above 50%, which supports the reliability of the inferences.
According to the groups formed in this study, there is evidence of a high similarity between the European dairy populations and the Moroccan Drâa population. In studies of genetic diversity using microsatellite markers, Araújo et al. (2006) also obtained high similarity between pure and upgraded flocks of Saanen and Alpine. The formation of a stem that included Toggenbourg, Alpine and Saanen in both dendrograms (Figures 2 and 3) is consistent with the literature (Igarashi et al., 2000; Oliveira et al., 2007) and reflects a common origin among them (Igarashi et al., 2000).
The Anglo-Nubian population was isolated from European dairy breeds. This finding is justified in part because the Anglo-Nubian origin is not completely European, but also includes Middle Eastern and, probably, Indian ancestry (Jeffery, 1977; Mason, 1988). Additional studies with this breed would be useful as the sample of Anglo-Nubian goats in this study contained crossbred animals and only a small number of individuals were sampled. The Anglo-Nubian and Boer breeds were clustered with a bootstrap of 97%. The Boer breed was created in South Africa by crossing local goats with origins from eastern countries, including India (Eramus, 2000; Malan, 2000).
Population codes: 1 - Toggenbourg; 2 - Saanen; 3 - Anglo-Nubian; 4 - Alpine; 5 - Boer; 6 - Drâa; 7 - Zagora; 8 - Rhâali; 9 - Azul; 10 - Marota; 11 - Nambi; 12 - UDB-PI.
The clustering of the Moroccan populations Zagora and Rhâali in the dendrograms (Figures 2 and 3) indicates that the geographical proximity of these two was more important than the alleged kinship between Zagora and Drâa. The allele frequencies of visible characters also allowed Machado et al. (2000) to cluster Zagora and Rhâali, which were the closest to each other among the Moroccan goats, while the Drâa goats clustered with Mediterranean goats. Comparing French and Moroccan goats using INRA microsatellite markers and α-casein polymorphism, Oauli et al. (2002) observed that the sample of Drâa-Zagora clustered with Rhâali, and not with French goats. These authors also observed that goats from the Pyrenees formed a separate stem from Saanen, Alpine and Poitevine.
The Azul and UDB-PI were first clustered with the Zagora and Rhâali populations and later with Anglo-Nubian and Boer. The Nambi and Marota are isolated cases, and Azul and UDB-PI are the most similar to each other (Figure 3). These results may be related to genetic isolation of the Marota population by controlled mating, which possibly caused genetic drift to occur. As previously mentioned, the ecotype Nambi has a unique ear size as compared with the other populations considered.
Although dendrograms are easily interpreted, it was not possible to identify the optimal number of clusters from them. Therefore, the Tocher optimization method was used. Cluster analysis using the Tocher optimization method formed four distinct clusters using D (Table 4). It is known that the use of methods based on different dissimilarity measures may lead to different patterns of clustering (Cruz & Carneiro, 2006). Indeed, the Tocher method and the hierarchical methods were discordant in the partition of the groups, which corroborates the results found by other authors (Sakaguti et al., 1996; Barbosa et al., 2005).
The average distance within a group (intragroup) was calculated as the average of the distances between each pair of populations that compose it. Therefore, it cannot be computed for groups consisting of a single population (group IV). The average distances between groups were obtained by averaging the distances between pairs of populations belonging to different groups (Table 5). The intragroup average distance is always lower than the intergroup average distance. Higher variability between individuals within populations than among populations was also found by Spritze et al. (2003) and by Serrano et al. (2004) in Brazilian cattle; by Paiva et al. (2005) in sheep; and by Albuquerque et al. (2006) in buffaloes.
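The two averages described above can be sketched as follows. The function names and the toy distance matrix are hypothetical and serve only to show the arithmetic, including the undefined case of a single-population group.

```python
import itertools


def intragroup_distance(dist, group):
    """Average distance over all pairs of populations inside one group."""
    pairs = list(itertools.combinations(group, 2))
    if not pairs:
        return None  # undefined for a group with a single population
    return sum(dist[i][j] for i, j in pairs) / len(pairs)


def intergroup_distance(dist, g1, g2):
    """Average distance over all pairs with one population in each group."""
    total = sum(dist[i][j] for i in g1 for j in g2)
    return total / (len(g1) * len(g2))


# Toy example: four hypothetical populations placed on a line
pts = [0.0, 1.0, 5.0, 6.0]
toy = [[abs(p - q) for q in pts] for p in pts]
print(intragroup_distance(toy, [0, 1]))
print(intergroup_distance(toy, [0, 1], [2, 3]))
print(intragroup_distance(toy, [2]))
```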
The intragroup distance D (Table 5) ranged from 0.43 (group II) to 1.02 (group III). The presence of the Nambi ecotype in group III explains its greater intragroup heterogeneity. Group II was similar to that found through hierarchical methods (Figures 2 and 3). The intergroup distance ranged from 1.20 (groups I and IV) to 2.41 (groups III and IV). This greater distance between groups III and IV is due to the presence of the Nambi ecotype in group III and the Anglo-Nubian in group IV. The Anglo-Nubian and Nambi populations were again the most divergent (D = 2.54; Table 3).
All biometric characters measured (WH, BH and EL) contributed to the determination of genetic divergence among populations, to a greater or lesser extent. Using the methodology of Singh (1981) (Table 6), ear length had the highest relative contribution to divergence (61.9%), followed by WH and then BH. Ear length was the most discriminating because its variation occurred not only among individuals within all populations but especially among populations. The combined effect of the traits EL and WH contributed 92% of the assessed diversity among populations. Similar results were found in herds of Spanish goats (Herrera et al., 1996) and in West African goats (Dossa et al., 2007), in which WH was one of the most discriminating variables. The use of a different set of population groups or of other indicator traits/markers may lead to different results regarding divergence; thus, it is necessary to investigate which traits are of importance in each situation.

Conclusions
For the set of biometric measures assessed, both the nearest neighbor and UPGMA (Unweighted Pair Group Method with Arithmetic Mean) hierarchical clustering methods are satisfactory. The clusters formed through these methods are largely confirmed by the Tocher optimization method.

Figure 1 - Physical measurements collected from goats in study.

Figure 2 - Dendrogram obtained from the standardized mean Euclidean distance and the clustering method of nearest neighbor based on biometric data from 12 goat populations.

Figure 3 - Dendrogram obtained from the standardized mean Euclidean distance and Unweighted Pair Group Method with Arithmetic Mean clustering method, based on biometric data from 12 goat populations.

Table 1 - Variance analysis (ANOVA) for biometric data of goat populations in Brazil and Morocco. * Significant at 5% probability by the F test.

Table 2 - Average biometric data from different goat populations compared by the SNK test at 5% probability. Means followed by the same letters in the same column do not differ significantly (P>0.05) by the SNK test.

Table 3 - Euclidean distance matrix among goat populations assessed in Brazil and Morocco

Table 4 - Formation of goat population clusters through Tocher optimization method using standardized mean Euclidean distance

Table 5 - Standardized mean Euclidean distance within and between goat population groups through biometric data

Table 6 - Relative importance of the characteristics (S.j) for study of the genetic divergence in twelve goat populations