Genotypic variation of traits related to quality of cassava roots using affinity propagation algorithm

The conservation, sustainable evaluation and use of cassava (Manihot esculenta Crantz) genetic resources are essential to the development of new commercial varieties. This study aimed to evaluate the quality of cassava roots and to estimate genetic variation and clustering in cassava germplasm using the Affinity Propagation (AP) algorithm, which is based on the concept of "message passing" between data points. AP finds "exemplars", that is, members of the input set that are representative of clusters. The genotypic data of 474 cassava accessions were evaluated over a period of two years for starch yield (StYi), root dry matter content (DMC), amylose content (AML), and the level of cyanogenic compounds (CyC). The AP algorithm enabled the formation of nine diversity groups, a number that reflects the high genetic diversity of this germplasm. A high homogeneity of genetic distances was observed within all the groups, except for two groups in which there was a partial overlap caused mainly by a high variation of the CyC trait. In addition, no relationship between the genetic structure and CyC (sweet and bitter cassava) was observed. Analysis of variance of the nine clusters confirmed the presence of differences between the groups. Thus, the results of this study can be used in future breeding programs (hybridization or selection) to introduce new genetic variability into commercial cultivars, to avoid problems related to low genetic variation and to improve the quality of cassava roots.


Introduction
Cassava (Manihot esculenta Crantz) is a crop that is widely cultivated in many tropical countries in Africa, Latin America and Asia (Ceballos et al., 2007). Because it is adapted to low soil fertility and to marginal, irregular rainfall, and has relatively stable productivity and flexible harvesting, cassava has great potential both as a secure source of food and as a tool for reducing poverty, since it is used not only in traditional agriculture but also in highly technical, high-productivity industrial production systems.
Cassava originated in South America, more specifically in the Upper Amazon Basin (Olsen, 2004). Therefore, most of the genetic diversity currently used worldwide comes from Brazil. Cassava germplasm has several traits potentially useful to commercial varieties, such as differential starch characteristics (Ceballos et al., 2007), disease resistance (Raji et al., 2007) and root quality (Chávez et al., 2005). In general, breeders believe that most cassava varieties have low root productivity and low competitiveness as compared with improved varieties. However, this belief must be examined via an intensive evaluation of the data on cassava germplasm regarding its various attributes of agronomic and economic interest.
As these data are not fully available, the use of the genetic resources of M. esculenta in cassava breeding programs has been limited. Furthermore, while some progress has been achieved in traditional evaluations of agronomic, morphological and molecular traits (Benesi et al., 2010; Duraisamy et al., 2011), these evaluations do not necessarily reflect the diversity associated with the quality of cassava roots, which is basic to their use, whether for in natura consumption or for industrial purposes. One of the factors that has hindered the development of new cassava varieties is the lack of information about root quality and about the amount of genetic diversity in Brazilian germplasm.
Considering that both the evaluation of the genetic resources of cassava in Brazil and the precise evaluation of genetic variation are crucial in the development of optimal management strategies for sustainable conservation and for using germplasm to generate new varieties, this study aimed to evaluate the quality of the roots of germplasm accessions and to estimate the genetic variation to be used in cassava breeding programs.

Plant material
Four hundred and seventy-four germplasm accessions belonging to the Cassava Germplasm Bank (CGB) of Embrapa Cassava and Fruits (Cruz das Almas, state of Bahia, Brazil; 12º 48' S, 36º 06' W; 225 m a.s.l.), originating from several ecosystems in Brazil, Colombia, Venezuela and Nigeria, were evaluated. This collection consists of landraces and improved varieties resulting from conventional breeding procedures, such as crossing, and a selection of landraces identified by farmers or research institutions.

Experimental design
Two field trials were carried out over two years (2011 and 2012) in Cruz das Almas. A randomized block design with three replications and ten plants per plot was used in 2011, and an augmented block design was used in 2012, with the accessions evenly distributed in ten blocks with ten plants per plot. As experimental checks, not-yet-recommended improved clones (9624-09, 98150-06, and 9824-09) as well as landraces (Cigana and Eucalipto) and recommended varieties (BRS Aipim Brasil, BRS Dourada, BRS Tapioqueira, BRS Caipira, BRS Verdinha and BRS Gema de Ovo) were used.
The planting was carried out at the beginning of the region's rainy season (May-July) using 15-20-cm stem cuttings in a single row. The spacing was 0.9 m between rows and 0.8 m between plants, and all the recommended cultivation practices for cassava were followed. The plants were harvested 11 months after planting.

Traits evaluated
Dry matter content of the roots (DMC): this was obtained using 100-g root samples previously washed under running water, peeled, cut into quarters and crushed to form a homogeneous mass. Then, the samples were dried in an oven with forced air circulation at 60 ºC for 48 h, until constant weight was achieved.
Starch yield (StYi): the starch content was obtained by subtracting 4.65 % from the DMC, which corresponds to the ash, protein, lipid and fiber contents. Next, the starch content was multiplied by the average root yield of the accession to obtain the StYi in t ha⁻¹.
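The starch yield calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the example values (38 % DMC, 20 t ha⁻¹ of fresh roots) are hypothetical, and only the 4.65 % correction comes from the text.

```python
# Non-starch fraction of the dry matter (ash, protein, lipids, fibers), from the text.
NON_STARCH_PCT = 4.65


def starch_yield(dmc_pct, root_yield_t_ha):
    """Return starch yield (t/ha) from dry matter content (%) and fresh root yield (t/ha)."""
    starch_pct = dmc_pct - NON_STARCH_PCT          # starch content in %
    return starch_pct / 100.0 * root_yield_t_ha    # starch yield in t/ha


# Hypothetical accession: 38 % dry matter and 20 t/ha of fresh roots.
example_styi = starch_yield(38.0, 20.0)
print(round(example_styi, 2))
```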

Amylose content (AML): the starch extraction was performed manually using a 500-g root sample cut into pieces and ground in a blender with a non-cutting helix (1:1 water/root ratio) and then filtered through a 150-mesh sieve. The starch suspension was kept in a cold chamber at 5 °C for 12 h. The supernatant was then discarded, and the decanted starch was washed with 95 % ethanol and dried in an oven with forced air circulation at 40 °C for 48 h. The amylose content of the dried starch was determined according to the ISO (2005) protocol. The starch sample was dispersed in 95 % ethanol, gelatinized with sodium hydroxide and acidified with acetic acid. After the addition of the iodine solution, the blue complex formed was quantified by spectrophotometry at 620 nm (Biospectro, model SP 220).

Cyanogenic compounds (CyC): the determination of CyC, especially free cyanide, α-hydroxynitrile and cyanogenic glycosides, present in the samples was performed by extraction of these compounds followed by reaction with chloramine-T and isonicotinate/1,3-dimethyl barbiturate and spectrophotometric determination at 605 nm. Linamarase, extracted from the cassava root cortex according to Cooke (1979), was used to release the glycosidic cyanide.

Estimation of genotypic values
A combined analysis of the different experimental designs was carried out, using statistical models that incorporate block designs. All the blocks were set to represent random effects, while the design effects were treated as fixed, with one experiment fitted as a randomized complete block design and the other as an incomplete block design. The model therefore set the block effects within each design type.
The linear mixed model that was used to describe the data was y = Xb + Zg + Wp + e (Henderson, 1984), where y is the data vector, b is the vector of fixed effects associated with the mean and block effect, g is the vector of random genetic effects, p is the vector of the random effects of the plots, e is the vector of random errors, and X, Z and W are the incidence matrices that relate the unknown parameters b, g and p, respectively, to the data vector y.
The mixed model methodology allows for the estimation of b using a generalized least squares procedure and for the prediction of g and p using BLUP (Best Linear Unbiased Prediction), which predicts the random genetic effects and uncorrelated random effects not included in the model (Henderson, 1984).
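To make the estimation step concrete, the sketch below solves Henderson's mixed model equations for a reduced model y = Xb + Zg + e. It is only an illustration under stated assumptions: the plot term Wp is omitted for brevity, the data are simulated, the variance components are treated as known (in practice they would be estimated, for example by REML), and all names are hypothetical.

```python
import numpy as np


def solve_mme(y, X, Z, lam):
    """Solve Henderson's mixed model equations for y = Xb + Zg + e.

    lam is the variance ratio sigma_e^2 / sigma_g^2, assumed known here.
    Returns (b_hat, g_hat): GLS estimates of b and BLUPs of g.
    """
    q = Z.shape[1]
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + lam * np.eye(q)]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    p = X.shape[1]
    return sol[:p], sol[p:]


# Simulated balanced trial: 6 hypothetical accessions, 3 replications each.
rng = np.random.default_rng(0)
n_acc, n_rep = 6, 3
acc = np.repeat(np.arange(n_acc), n_rep)      # accession index of each record
X = np.ones((n_acc * n_rep, 1))               # overall mean only
Z = np.eye(n_acc)[acc]                        # incidence matrix of accessions
g_true = rng.normal(0.0, 2.0, n_acc)          # sigma_g = 2 (assumed)
y = 35.0 + Z @ g_true + rng.normal(0.0, 1.0, n_acc * n_rep)  # sigma_e = 1 (assumed)

lam = 1.0 / 4.0                               # sigma_e^2 / sigma_g^2
b_hat, g_hat = solve_mme(y, X, Z, lam)
print(b_hat, g_hat)
```

In this balanced case the predicted genotypic values are the accession means shrunk toward the overall mean by the factor n_rep / (n_rep + lam), which is the usual behavior of BLUP.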
Because the quality traits of cassava roots may be correlated with each other, and their statistical dependence must be taken into account when analyzing multivariate data, Pearson's correlation coefficients among those traits were estimated.

Genetic diversity and clustering
The cassava accessions' genotypic values, obtained by BLUP, were used to calculate the genetic distance matrix using the negDistMat() function of the APCluster package (Bodenhofer et al., 2011) in R version 3.0.1 (R Development Core Team, Vienna, AT). Negative Euclidean distances were calculated. For two accessions x = (x_1, ..., x_n) and y = (y_1, ..., y_n), the Euclidean variant of the distance d(x, y) is:
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
and the similarity between the two accessions is its negative, s(x, y) = -d(x, y).
The affinity propagation (AP) method was used to cluster the cassava accessions. The data were subjected to 100 independent runs to verify the consistency of the exemplar selection and of the clustering. This procedure identifies a set of centers (exemplars) from the data set, taking each accession as a network node, and recursively transmits real-valued messages along the edges of the network until a good set of exemplars and their corresponding clusters emerges. At any time, the magnitude of each message reflects the current affinity that one accession has for choosing another accession as its exemplar (Frey and Dueck, 2007).
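As an illustration of the similarity input, the following sketch computes a negative Euclidean distance matrix with NumPy. It is not the negDistMat() implementation itself, only an assumed equivalent of its Euclidean variant without any power transformation; the function name and the toy trait values are hypothetical.

```python
import numpy as np


def neg_euclidean_similarity(values):
    """Return the matrix s(i, k) = -d(i, k) of negative Euclidean distances.

    values: array of shape (n_accessions, n_traits).
    """
    diff = values[:, None, :] - values[None, :, :]   # pairwise trait differences
    return -np.sqrt((diff ** 2).sum(axis=2))


# Two hypothetical accessions described by two traits.
toy = np.array([[0.0, 0.0],
                [3.0, 4.0]])
S_toy = neg_euclidean_similarity(toy)
print(S_toy)
```

Whether the genotypic values were standardized before computing the distances is not stated in the text, so no scaling is applied here.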
The messages exchanged between data points can be of two types: "responsibility" r(i,k) and "availability" a(i,k) (Sakellariou et al., 2012). "Responsibility" reflects the accumulated evidence of how appropriate point k is to serve as an exemplar for point i, considering other potential exemplars for this same point. In contrast, "availability" reflects the accumulated evidence of how appropriate it would be for point i to choose point k as its exemplar, taking into account the other points for which point k may be an exemplar. Initially, availabilities are set to zero. The AP was implemented with the update rules of Frey and Dueck (2007):
$$r(i,k) \leftarrow s(i,k) - \max_{k' \neq k} \{ a(i,k') + s(i,k') \}$$
$$a(i,k) \leftarrow \min \Big\{ 0,\; r(k,k) + \sum_{i' \notin \{i,k\}} \max\{0, r(i',k)\} \Big\}, \quad i \neq k$$
$$a(k,k) \leftarrow \sum_{i' \neq k} \max\{0, r(i',k)\}$$
where s(i,k) is the similarity between the two nodes i and k, and the diagonal of this matrix represents the preferences for each node. The AP algorithm is iterated until a good set of exemplars emerges from the equations above. Each node i can then be assigned to the exemplar k that maximizes the sum a(i,k) + r(i,k), and if i = k, then i is an exemplar. AP can be applied to problems where the similarities are not symmetric or do not satisfy the triangle inequality (Frey and Dueck, 2007).
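The message-passing scheme of Frey and Dueck (2007) can be sketched in a few lines of NumPy. This is a didactic sketch and not the APCluster implementation used in the study: the damping factor, the fixed number of iterations, the median default for the preference and the toy data are all assumptions made for the example. Exemplars are taken as the points with a(k,k) + r(k,k) > 0, and the remaining points are assigned to their most similar exemplar, a common practical variant of the assignment rule.

```python
import numpy as np


def affinity_propagation(S, preference=None, damping=0.7, n_iter=300):
    """Minimal affinity propagation on a similarity matrix S (n x n)."""
    S = np.array(S, dtype=float)
    n = S.shape[0]
    if preference is None:
        # Assumed default: median of the off-diagonal similarities.
        preference = np.median(S[~np.eye(n, dtype=bool)])
    np.fill_diagonal(S, preference)
    R = np.zeros((n, n))   # responsibilities r(i, k)
    A = np.zeros((n, n))   # availabilities a(i, k), initially zero
    rows = np.arange(n)
    for _ in range(n_iter):
        # r(i,k) <- s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1.0 - damping) * R_new
        # a(i,k) <- min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) <- sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0.0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new.diagonal().copy()
        A_new = np.minimum(A_new, 0.0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1.0 - damping) * A_new
    exemplars = np.flatnonzero((A + R).diagonal() > 0)
    labels = S[:, exemplars].argmax(axis=1)       # nearest exemplar
    labels[exemplars] = np.arange(len(exemplars))  # exemplars label themselves
    return exemplars, labels


# Toy data: two well-separated groups of five hypothetical accessions.
points = np.array([[0.0, 0.0], [0.3, 0.1], [-0.2, 0.4], [0.1, -0.3], [0.5, 0.2],
                   [10.0, 10.0], [10.4, 9.8], [9.7, 10.3], [10.2, 10.5], [9.9, 9.6]])
diff = points[:, None, :] - points[None, :, :]
S = -np.sqrt((diff ** 2).sum(axis=2))              # negative Euclidean distances
exemplars, labels = affinity_propagation(S)
print(exemplars, labels)
```

Lower (more negative) preference values tend to produce fewer exemplars, which is how the input preference influences the number of groups.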
The K-means method, which has similar properties for grouping data (partition analysis), was used to compare the relative stability of the clusters obtained by the AP method. The genotypic values of cassava accessions were used to obtain 100 independent clustering runs using the kmeans() function in R version 3.0.1 (R Development Core Team, Vienna, AT). A plot of the total within-groups sums of squares against the number of clusters was used to find the best number of clusters.
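The within-groups sum of squares criterion can be illustrated with a small K-means written in NumPy. This is a sketch under assumptions, not the R kmeans() routine: it uses Lloyd's algorithm with random initial centers, keeps the best of several random starts for each number of clusters, and runs on simulated data with three groups.

```python
import numpy as np


def kmeans(X, k, rng, n_iter=100):
    """Lloyd's K-means with random initial centers; returns (labels, within_ss)."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    return labels, d2[np.arange(len(X)), labels].sum()


def elbow_curve(X, max_k, n_starts=20, seed=0):
    """Best total within-groups sum of squares for k = 1..max_k."""
    rng = np.random.default_rng(seed)
    return [min(kmeans(X, k, rng)[1] for _ in range(n_starts))
            for k in range(1, max_k + 1)]


# Simulated data: three groups of ten points each (hypothetical values).
rng = np.random.default_rng(1)
group_means = [np.array([0.0, 0.0]), np.array([10.0, 0.0]), np.array([0.0, 10.0])]
X = np.vstack([m + rng.normal(0.0, 0.5, (10, 2)) for m in group_means])
curve = elbow_curve(X, max_k=5, n_starts=50)
print([round(v, 1) for v in curve])
```

The bend in the curve (here expected at three clusters) suggests the number of groups. Because each start uses random initial centers, single runs can end in different partitions, which is the source of the run-to-run variation examined in this study.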

Trait correlation
Pearson's correlations among the quality traits of cassava roots were overall low to intermediate (Table 1). The correlations among the traits AML, DMC, CyC and StYi indicated that the selection of populations expressing multiple desirable quality traits is possible with this germplasm. However, the demand for cassava roots (for food or industrial purposes) has rapidly increased in recent times, and high AML is a new trait that should be improved by cassava breeders. Therefore, the negative correlation between DMC and AML (-0.41) could be an issue, since the ideal cassava would have the highest AML and DMC levels. Even with low to intermediate correlations, the statistical dependence of these quality traits makes multivariate analysis an appropriate approach for analyzing the data.

Number of groups
Unlike other clustering methods, the strategy implemented by AP does not require a prior estimate of the number of groups. Instead, the AP method defines the number of exemplars in the analysis that are representative of the sample. AP takes as input a real value s(k,k) for each data point k, and data points with higher s(k,k) values are more likely to be selected as cluster exemplars. These values are known as "preferences". In this case, the number of exemplars identified, which corresponds to the number of groups, was influenced by both the genetic distance input and the message-passing procedure (Frey and Dueck, 2007).
Considering that it was possible to deduce the definition of groups based on the input value of "preference", the formation of nine diversity groups was observed when using a genetic input distance of 0.35, which was calculated on the basis of the genotypic data from the four quality traits of cassava roots analyzed in this study (Figure 1). Table 2 shows the classification of the 474 cassava accessions according to their cluster. In contrast, the K-means clustering suggested that the six-cluster solution was a good fit to the data, based on the bend in the plot of the total within-groups sums of squares against the number of clusters (data not shown). This difference in the indicated optimal number of clusters between the methods is due to AP's tendency to discover the more representative objects in the dataset, since it performs clustering without additional parameters.
The determination of the optimal or acceptable number of groups is an important factor in determining the reliability of the groups. This process basically involves establishing the criteria used to separate groups of two or more accessions, such that the genetic distance within the groups is lower than the overall average distance and the distance between groups is greater than the distance within the groups involved in the analysis (Brown-Guedira et al., 2000).
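The criterion described above, with smaller distances within groups than between groups, can be checked with a short calculation. The sketch below is only an illustration with hypothetical data and names; it compares the mean pairwise Euclidean distance within groups with the mean distance between groups.

```python
import numpy as np


def within_between_distance(X, labels):
    """Return (mean within-group distance, mean between-group distance)."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))            # pairwise Euclidean distances
    same = labels[:, None] == labels[None, :]       # pairs in the same group
    off_diag = ~np.eye(len(X), dtype=bool)          # exclude self-distances
    return D[same & off_diag].mean(), D[~same].mean()


# Four hypothetical accessions in two groups.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
labels_demo = np.array([0, 0, 1, 1])
within, between = within_between_distance(X_demo, labels_demo)
print(within, between)
```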
In general, the AP method allowed for the creation of a large number of groups, which reflects the high genetic diversity of cassava germplasm in Brazil. In principle, this statement could contradict what would be expected from a species that is predominantly vegetatively or asexually propagated, in which case a great reduction in genetic diversity would be expected over time, due to the accumulation of systemic pathogens and preferences for more vigorous and adapted varieties with a higher capacity to produce stem cuttings. However, throughout the evolution of cassava, its frequent outcrossings allowed for the production of a large number of spontaneous seeds, which, under natural and directed selection by conventional farmers, gave rise to new varieties and have thereby maintained a high level of genetic diversity, which still persists in many traditional communities. In fact, despite the high levels of outcrossing and sexual propagation, cassava accessions are predominantly maintained by vegetative propagation, which allows for high heterozygosity and extensive plasticity in the expression of phenotypic characteristics, such as those relating to the quality of the roots observed in the present study.

Comparison between K-means and AP
AP and K-means analyses were carried out to compare the performance of both algorithms in terms of effectiveness and accuracy when analyzing agronomic data on the quality of cassava roots. Considering six diversity groups by the K-means method, the groups formed were inconsistent in 16 % of the independent analyses, i.e., cassava accessions were allocated to different clusters in different runs (Figure 2). Conversely, all analyses performed using the AP algorithm selected the same individuals as exemplars and allocated the same accessions to the nine clusters.
A hypothesis to explain this inconsistency in the cluster analysis is that K-means has a weakness in the determination of the initial exemplar (center point), which is obtained randomly. Therefore, if the initial exemplar is not appropriate, the resulting cluster will not be optimal. Consequently, the K-means algorithm could show a high error rate and may not provide the best clustering results (Refianti et al., 2012). In contrast, AP simultaneously considers all accessions as possible exemplars (center points), and each message reflects the current interest of one data point in selecting another data point as its exemplar. Through this message-passing process, the algorithm tests the possibility of every data point becoming the center of a cluster, so each accession has the same opportunity to become an exemplar.
Frey and Dueck (2007) compared the reconstruction errors of AP and K-center clustering for each number of clusters, aiming to detect putative exons comprising genes from mouse chromosome 1. The AP method achieved higher true-positive rates (TPR), especially at low false-positive rates (FPR), with regard to how well these methods detect bona fide gene segments. According to Frey and Dueck (2007), at an FPR of 3 %, AP achieved a TPR of 39 %, whereas the best K-center clustering result was 17 %.
Considering i) the lack of knowledge of the true genetic structure of the cassava accessions evaluated on the basis of root quality traits, ii) the presence of strong inconsistencies in the clusters formed by the K-means method, which has been supported by other authors, and iii) the low reliability and repeatability of the clusters formed by the K-means method, the genetic diversity analysis was performed based on the AP algorithm only.

Genetic diversity grouping from the AP method
According to the hierarchical clustering (Figure 1), there was great homogeneity in the genetic distances within all the groups, except for Groups 8 and 9. In the latter two groups, there was partial overlap. The hierarchical clustering using a heatmap form was based on the matrix of the genetic distance (negative Euclidean distance) (Figure 1). Therefore, the heatmap classified the accessions, taking into account the distance profile of each accession in relation to the others, where the reddish color refers to a larger divergence.
Group 1 consisted of 40 germplasm accessions (mostly landraces) and the sweet variety BRS Dourada. According to the boxplot of this group (Figure 3), the most striking traits of this group were its low level of cyanogenic compounds, moderate starch yield and dry matter content. In a study conducted in Nigeria with landraces and improved varieties for 18 traits (agronomic and quality of cassava roots), Raji et al. (2007) observed that landraces showed better root quality and superior agronomic characteristics compared to improved varieties as well as lower cyanogenic compounds, with 40.0 mg kg⁻¹ being found for the landrace 'Isunikankiyan' and 128.6 mg kg⁻¹ for an improved cultivar (TMC30572). In general these values were higher than those reported in this study; however, they confirm that the preference for cassava with a wide range of cyanogenic compounds is quite common, particularly among small farmers, despite a general preference for sweet varieties.
In Group 2, 18 accessions of the germplasm were grouped and their main distinguishing traits were high starch yield and dry matter content. Moreover, a concentration of cyanogenic compounds below 40 mg kg⁻¹ was observed. Therefore, in addition to being of industrial interest due to their high starch yield and dry matter content, there is also the possibility of using germplasm accessions from this group either as parental varieties or directly for fresh consumption because of the low level of cyanogenic compounds.
Starch yield is a crucial point to be considered when choosing the variety to be used by a large-scale production system. Because of its use in the starch industry, it is necessary to maximize production efficiency per hectare. Therefore, it is important to consider the productive potential of the variety and its starch content, which can be translated into starch yield. If we consider the Brazilian national average yield to be approximately 13.0 t ha⁻¹ (IBGE, 2013), which is associated with an average dry matter content of 35 %, it results in an average yield of 4.5 t ha⁻¹ of starch. Therefore, the use of the germplasm accessions of Group 2 may contribute to an increased production of more than 3 times the current Brazilian national average starch yield.
Group 3 consisted of 53 germplasm accessions together with the improved varieties BRS Tapioqueira, BRS Caipira and BRS Aipim Brasil, while Group 4 included 23 germplasm accessions, and Group 5, 59 germplasm accessions plus the variety BRS Verdinha. The close relationship found between BRS Tapioqueira and BRS Caipira was based on the origin of these genotypes, which share a common parentage. In general, it was observed that the traits which distinguished these groups the most were the dry matter content, cyanogenic compounds and starch yield (Figure 3). Additionally, the amylose content of these groups was quite homogeneous.
Group 6 consisted of only four germplasm accessions, each characterized by a higher content of amylose and cyanogenic compounds (above 60 mg kg⁻¹) and lower dry matter content and starch yield. Recently, breeding programs have focused not only on developing cultivars with high yield and phenotypic stability but also on adding root qualities that best meet the different needs of the starch industry. In this regard, the functional properties of starch (viscosity, solubility, and swelling index) change according to its amylose content. Cassava starch, composed exclusively of amylopectin, has several advantages for commercial purposes, like waxy maize starch (Ceballos et al., 2007). In contrast to maize, cassava starch with its high amylose content is important in the food industry for the development of products with lower digestibility. In general, the amylose content in cassava roots ranges from 13.6 to 25.0 % (Rickard et al., 1991; Defloor et al., 1998; Moorthy, 2004; Ceballos et al., 2007). In the present study, the genotypic values of the amylose content ranged from 16.39 to 18.92 %, and this variation was from 17.30 to 18.15 % in Group 6 (Figure 3). Therefore, the genotypes in Group 6 can be used in crosses to increase the amylose content in recurrent selection programs.
Group 7 consisted of 39 germplasm accessions characterized by the grouping of individuals with medium amylose content, dry matter and starch yield, but with low cyanogenic compound content. In contrast, the largest group (Group 8) was composed of 121 germplasm accessions, along with clones 9624-09, 98150-06, 9824-09 and the Cigana landrace. A highlighted trait of this group was the elevated content of dry matter in the roots, reaching almost 44 % in some accessions (Figure 3). However, even with elevated amounts of dry matter, the average starch yield for this group was considered to be medium, which is possibly due to the lower average yield of fresh roots. In addition, there was high variation in the cyanogenic compounds, which certainly was not a predominant trait for the grouping of these accessions.
Group 9 consisted of 106 germplasm accessions, including one improved variety (BRS Gema de Ovo) and one landrace (Eucalipto). The most noticeable trait of this group was the high average dry matter content and the higher range for this variable, in addition to a wide range of cyanogenic compounds and lower starch yield.
According to Jansz and Uluwaduge (1997), based on the cyanogenic compounds in cassava roots, cassava can be divided into three classes: low toxicity or sweet cassava (< 50 mg kg⁻¹), medium toxicity (between 50 and 100 mg kg⁻¹), and high toxicity or bitter cassava (> 100 mg kg⁻¹). The accessions clustered in both Groups 8 and 9 fell into all three toxicity classes mentioned above, and there was no relation between the genetic structure and the cyanogenic compounds (sweet and bitter cassava). Overall, these results confirm the observations of other authors indicating that this relationship is weak because of the polygenic nature of this trait (Benesi et al., 2010).
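The three toxicity classes of Jansz and Uluwaduge (1997) can be written as a simple rule. The sketch below is illustrative only: the function name and class labels are hypothetical, and values of exactly 50 or 100 mg kg⁻¹ are assigned to the medium class, which is one reading of "between 50 and 100".

```python
def toxicity_class(cyc_mg_per_kg):
    """Classify a cassava root by its cyanogenic compound level (mg/kg)."""
    if cyc_mg_per_kg < 50:
        return "low (sweet)"
    if cyc_mg_per_kg <= 100:
        return "medium"
    return "high (bitter)"


# Values reported by Raji et al. (2007) and cited in the text above.
landrace_class = toxicity_class(40.0)    # landrace 'Isunikankiyan'
improved_class = toxicity_class(128.6)   # improved cultivar
print(landrace_class, improved_class)
```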

Group Significance
The analysis of variance of the nine groups identified by the AP method indicated the presence of at least one difference between the groups (p < 0.001) for all four traits evaluated (Table 3), confirming the data from the boxplot (Figure 3). Furthermore, a multivariate analysis of variance (MANOVA) of the nine groups of cassava germplasm diversity against the four quantitative variables produced a Wilks' lambda of 0.22 (F(32, 1619) = 25.00, p < 0.001). Therefore, considering the quality traits of the root, the nine groups identified by AP cluster analysis are consistently different.
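For reference, Wilks' lambda in a one-way MANOVA is the ratio det(E) / det(E + H), where E is the within-group and H the between-group sums of squares and cross-products matrix. Values close to zero indicate well-separated groups. The sketch below computes it with NumPy on simulated data; the data, group sizes and names are hypothetical and unrelated to the values reported in this study.

```python
import numpy as np


def wilks_lambda(X, groups):
    """Wilks' lambda = det(E) / det(E + H) for a one-way MANOVA."""
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    E = np.zeros((p, p))   # within-group SSCP matrix
    H = np.zeros((p, p))   # between-group SSCP matrix
    for g in np.unique(groups):
        Xg = X[groups == g]
        group_mean = Xg.mean(axis=0)
        dev = Xg - group_mean
        E += dev.T @ dev
        d = (group_mean - grand_mean)[:, None]
        H += len(Xg) * (d @ d.T)
    return np.linalg.det(E) / np.linalg.det(E + H)


# Simulated example: three groups of 20 observations on two traits.
rng = np.random.default_rng(2)
sim_means = [np.array([0.0, 0.0]), np.array([5.0, 5.0]), np.array([10.0, 0.0])]
X_sim = np.vstack([m + rng.normal(0.0, 1.0, (20, 2)) for m in sim_means])
groups_sim = np.repeat(np.arange(3), 20)
lam_sim = wilks_lambda(X_sim, groups_sim)
print(round(lam_sim, 4))
```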
This information has important implications for the conservation of germplasm collections, since the limited financial resources invested in most gene banks' routine activities make the curators prioritize the activities of germplasm characterization and evaluation. In such cases, evaluations can therefore be conducted in accessions from different groups. Moreover, cassava germplasm classification based on the quality of the roots may contribute to the selection of accessions to be used in cassava breeding programs, especially by optimizing opportunities for transgressive segregation from crosses between genotypes belonging to different groups with wide divergence, wherein there is a greater likelihood that the unrelated genotypes belonging to different clusters may contribute unique and desirable alleles from different loci (Beer et al., 1993).

Considerations for breeding
With the increasing number of improved genotypes and germplasm accessions used in cassava breeding programs, the use of algorithms for ordering and classifying genetic variability has gained prominence in pre-breeding actions and in the preparation of parents for use in crosses. In general, multivariate algorithms that allow for the simultaneous analysis of multiple agronomic characteristics, regardless of the data set (morphological, agronomic, biochemical or molecular), are widely employed for germplasm classification, for ordering the genetic variability of a large number of accessions, or for the analysis of the genetic relationship between improved genotypes (Mohammadi and Prasanna, 2003).
Among the various multivariate algorithms, cluster analysis, principal component analysis (PCA), principal coordinates analysis (PCoA), and multidimensional scaling (MDS) are commonly employed and seem to be particularly useful in plants (Mohammadi and Prasanna, 2003). Moreover, despite being relatively unknown in plant breeding, perhaps because of its fairly recent development, the AP method has very interesting properties for dealing with multiple traits.
The high potential of the AP algorithm for data clustering has been demonstrated in a number of areas of knowledge, from human face image analysis to gene expression in many organisms (Frey and Dueck, 2007; Sumedha and Weigt, 2007; Borile et al., 2011), including plants (Kiddle et al., 2010). In general, the AP algorithm effectively reveals the hierarchical grouping structure present in the various types of data sets.
Unlike other clustering methods such as K-means, the choice of initial exemplar is not a step that undermines the groups when using the AP method, because all accessions are potential exemplars to be tested. Therefore, the AP algorithm tends to produce a low error rate as compared with K-means, as a consequence of its high robustness and invariance in the assignment of accessions within each cluster (Refianti et al., 2012).
In cassava, the information obtained from the AP cluster analysis indicated a wide variation, especially for traits such as cyanogenic compounds, dry matter content, and starch yield, which provides extensive scope for improving this crop through hybridization and selection. Studies assessing the quality of cassava roots and the genetic variation of large numbers of germplasm accessions have not been previously undertaken in Brazil. This hampers engagement with the current policy of sustainable agricultural systems, in which it is necessary to use the components of diversity in a proper way and avoid medium- and long-term diversity reduction. Additionally, what is observed in Brazilian cassava production nowadays, especially in agricultural systems that use intensive technologies, is the use of a very limited number of cassava cultivars, which certainly share common ancestors and, therefore, reduce the allelic diversity available to the production system.
Brazil, being the center of cassava's origin and diversity, must characterize and evaluate its genetic resources appropriately so that they can be effectively used for developing new cultivars. Thus, breeding by hybridization and selection among accessions from the groups established in this study may contribute to the introduction of new genetic diversity to help avoid problems related to limited genetic variation and may contribute to an improvement in the quality of cassava roots.

Future perspectives
Estimating genetic variation among cassava accessions is a prerequisite for efficient germplasm management and its use in breeding programs. Agronomic traits, especially those related to root quality, are very important for grouping genetic resources and are also essential to the improvement of existing varieties by introducing novel genetic variation. However, clusters formed by agronomic traits were not in accordance with molecular marker data (Esmaeilzadeh et al., 2005; Garcia et al., 2007; Barakat et al., 2013).
Possible reasons for the lack of correlation between molecular and morpho-agronomic variation could be the wide genome coverage of molecular markers, including coding and non-coding regions, and the fact that molecular markers are less subject to artificial selection in comparison with morpho-agronomic markers (Semagn, 2002). Therefore, the contradiction of the results from these types of data indicates that germplasm clustering and selection for crossing in breeding programs should not rely on a single measurement. Hereafter, these cassava accessions will be genotyped using high-throughput approaches, such as genotyping by sequencing (GBS), for future studies focusing on germplasm characterization and aiming to carry out a full analysis of genetic diversity (morpho-agronomic and molecular data).
Moreover, in general, clustering methods such as AP produce and maintain very distinct groups of accessions, with a reduced level of genetic variation within each group, but with high genetic variability among clusters. In theory, these clusters could be used to obtain a maximum heterotic response when hybrids are produced by crossing genotypes from different groups. Therefore, potential heterotic groups based on the quality of cassava roots and the AP algorithm can be used in crosses to test the robustness of these clusters. In addition, clusters containing many accessions, i.e., Clusters 3, 5, 8 and 9 (Table 2), can be subdivided into heterotic groups based on other agronomic plant traits, such as stem growth habit, plant height, first branching height, plant shape, number of storage roots per plant, starch content, harvest index and resistance to postharvest deterioration.
These potential heterotic groups in cassava could help breeders to meet the changing needs of modern agriculture, which lead farmers, for economic reasons, to accept only a few of the highest yielding cultivars with good root quality.Thus, we hope that our focus on germplasm preservation and characterization using different approaches could ensure that farmers have access to the best varieties of cassava.

Figure 1 - Heatmap and hierarchical clustering of the genetic distance among 474 cassava accessions based on quality-related traits of the roots. Yellow color indicates a low similarity between accessions, while orange indicates a high similarity.

Figure 2 - Distribution of cassava accessions in six clusters formed by the K-means method and percentage of clustering agreement for each accession in 100 K-means analyses. The predominant colors represent the differences among clusters.

Table 3 - Analysis of variance of the nine groups of genetic diversity based on the evaluation of root quality traits in 474 cassava germplasm accessions and varieties.