Genetic variability in Capsicum spp. accessions through multicategorical traits

Capsicum peppers originate from the American continent. Their culinary and medicinal potential, first exploited by indigenous peoples over 7000 years ago, is vast; however, more studies are needed. Despite the existence of germplasm banks that house hundreds of accessions of the genus, only a small part of this diversity reaches large markets. In this respect, the morpho-agronomic characterization of accessions allows the identification not only of the existing variability, but also of the agronomic and commercial potential of the genotypes. Therefore, this study aimed to characterize and estimate the genetic divergence between accessions of the Capsicum germplasm collection at the Universidade Federal Rural do Rio de Janeiro (UFRRJ) based on multicategorical traits. The species were identified and 29 accessions were characterized in a greenhouse, based on 31 multicategorical descriptors of the crop. The collected data were subjected to the Cole-Rodgers dissimilarity index and to hierarchical clustering (Nearest Neighbor) and optimization (Tocher’s) methods. Tocher’s method was considered the most suitable to group the accessions of the germplasm bank. Results indicate that the multicategorical descriptors, combined with the chosen method, adequately assess diversity and reveal high variability in the collection, which can be used in Capsicum spp. breeding programs.


INTRODUCTION
The genus Capsicum encompasses many varieties grown in different locations in the tropics. This is especially true in Latin America, where Brazil is both a primary and a secondary center of diversity and the center of origin of several Capsicum species (Carvalho et al., 2003). In breeding programs, genetic variability is a basic prerequisite for the development of new cultivars with greater adaptability, resistance and production potential. In this context, it is essential that germplasm collections contain accessions that diverge significantly from one another, so that they can be selected as parents when the aim is to improve a given trait (Lopes & Carvalho, 2008). Ferrão et al. (2011) used multivariate techniques to evaluate morpho-agronomic traits in Capsicum peppers and identified the best genotypes for breeding aimed at the making of preserves. Peppers for this purpose must be aromatic, crunchy and small, with low pungency and an intense color (Neitzke, 2012; Heinrich et al., 2015).
Genetic divergence can be estimated via statistical methods that produce a matrix of similarity or dissimilarity between samples, which can later be grouped by optimization and hierarchical clustering methods (Bezerra Neto et al., 2010). Tocher's optimization method forms groups whose intra-group distances are lower than any inter-group distance. This method is widely used in research on the genetic diversity of numerous crops such as safflower, sweet potato and Capsicum spp. (Vasconcelos et al., 2007; Lira et al., 2021; Vargas et al., 2018; Pessoa et al., 2018). Among the approaches used to facilitate data visualization in studies of genetic distance are agglomerative hierarchical methods that, after successive mergers, result in the formation of a dendrogram (Carvalho et al., 2009).
Rev. Ceres, Viçosa, v. 69, n.2, p. 195-202, mar/apr, 2022
According to Bento et al. (2007), the advantage of adopting multicategorical variables is that these traits are easy to observe and demand less manpower and time, which makes them ideal for collections with few human and financial resources. The classification of quantitative traits into multiple categories allows variables of different natures to be combined in a single dissimilarity matrix. In this way, diversity is visualized more accurately, because more groups are formed than in the clusters obtained for each set of variables separately.
The index of Cole-Rodgers et al. (1997) estimates dissimilarity for a set of multicategorical traits. It is established from the agreement and disagreement between the classes observed in each pair of accessions and allows the simultaneous analysis of qualitative and quantitative traits. The method has been used in studies of genetic divergence of several crops, e.g. guava (Psidium guajava) (Gomes Filho et al., 2010), araçá (Psidium cattleyanum Sabine) (Bremenkamp, 2015) and pepper (Capsicum spp.) (Oliveira et al., 2019), among others.
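As a minimal sketch of how an index of this kind can be computed, the Python code below assumes the simple form d = D / (C + D), where C and D are the numbers of descriptors on which two accessions agree and disagree. This form, the function names and the example scores are illustrative assumptions, not the exact routine of Cole-Rodgers et al. (1997) or of any particular software.

```python
def multicategorical_dissimilarity(a, b):
    """Proportion of descriptors on which two accessions disagree: D / (C + D)."""
    if len(a) != len(b):
        raise ValueError("accessions must be scored on the same descriptors")
    # Skip descriptors with a missing score (None) in either accession.
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("no descriptor was scored in both accessions")
    disagreements = sum(1 for x, y in pairs if x != y)
    return disagreements / len(pairs)


def dissimilarity_matrix(scores):
    """Symmetric matrix of pairwise dissimilarities for {name: class codes}."""
    names = list(scores)
    matrix = [[0.0] * len(names) for _ in names]
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            d = multicategorical_dissimilarity(scores[names[i]], scores[names[j]])
            matrix[i][j] = matrix[j][i] = d
    return names, matrix


# Hypothetical class codes for four descriptors (not the data of this study).
scores = {
    "acc_A": [1, 2, 3, 1],
    "acc_B": [1, 2, 3, 1],  # identical to acc_A: a candidate duplicate
    "acc_C": [2, 2, 1, 3],
}
names, matrix = dissimilarity_matrix(scores)
print(matrix[0][1])  # 0.0
print(matrix[0][2])  # 0.75 (three of the four descriptors disagree)
```

Under this assumed form, with 31 descriptors a dissimilarity of 83.87% would correspond to 26 disagreements (26/31), and a null value means that two accessions fall in the same class for every descriptor.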
In view of the above-described facts, the present study was conducted to characterize and estimate the genetic divergence between 29 accessions of Capsicum spp. from the Germplasm Bank of UFRRJ using multicategorical traits.

MATERIAL AND METHODS
The accessions were multiplied in the greenhouse of the Agronomy Institute of the Universidade Federal Rural do Rio de Janeiro (UFRRJ) (Georeferencing: 22°45' S and 43°41' W), in the municipality of Seropédica, Rio de Janeiro State, Brazil. Sowing was carried out in October 2019, in 128-cell polystyrene trays containing commercial substrate "Mecplant", at the rate of three seeds per cell. The trays were irrigated daily until the seedlings were transplanted into pots, using soil collected from the Horticulture section at UFRRJ.
Twenty-nine accessions from the germplasm collection of peppers of the genus Capsicum were evaluated (Table 1).
Characterization was achieved using the identification key proposed by Carvalho et al. (2006) and the multicategorical descriptors for the crop recommended by the International Plant Genetic Resources Institute (IPGRI, 1995), with modifications suggested by Junior & Silva et al. (2013). For the analysis of the eight quantitative traits, the average of ten fruits from ten plants per accession was considered. Subsequently, each trait was classified into different categories. As regards the qualitative descriptors, data were obtained based on the mode of each accession for each descriptor.
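The conversion of raw observations into multicategorical scores described above can be sketched as follows. The class limits, the measurements and the function names are hypothetical and serve only to illustrate the two steps (mean followed by classification for quantitative traits, mode for qualitative descriptors); they are not the limits of the descriptor list used in the study.

```python
from bisect import bisect_left
from statistics import mean, multimode


def classify(value, upper_limits):
    """Return a 1-based class code; a value equal to a limit stays in the lower class."""
    return bisect_left(upper_limits, value) + 1


def quantitative_code(measurements, upper_limits):
    """Average the measurements of one accession, then convert the mean to a class."""
    return classify(mean(measurements), upper_limits)


def qualitative_code(observations):
    """Most frequent state among the plants of one accession (first one on ties)."""
    return multimode(observations)[0]


# Hypothetical fruit widths (cm) of ten fruits and hypothetical class limits:
# class 1: up to 1 cm; class 2: above 1 and up to 2.5 cm; class 3: above 2.5 cm.
widths = [1.8, 2.0, 1.6, 2.2, 1.9, 2.1, 1.7, 2.0, 1.8, 1.9]
print(quantitative_code(widths, [1.0, 2.5]))  # 2

# Hypothetical growth-habit observations for ten plants of the same accession.
habit = ["intermediate"] * 6 + ["upright"] * 4
print(qualitative_code(habit))  # intermediate
```

Once every trait is expressed as a class code, quantitative and qualitative descriptors can enter the same dissimilarity matrix.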
After the multicategorical data of the morphoagronomic characterization based on 31 descriptors were collected, the dissimilarity index was obtained according to Cole-Rodgers et al. (1997). Then, the accessions were grouped using two methodologies: Tocher's Optimization and the "Nearest Neighbor" hierarchical approach. The validation of the clusters was determined by the cophenetic correlation coefficient (CCC) (Sokal & Rohlf, 1962). Analyses were performed using the Genes computer program (Cruz, 2016).
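To illustrate the validation step, the sketch below computes the cophenetic matrix of a Nearest Neighbor (single-linkage) dendrogram and its Pearson correlation with the original dissimilarities. It relies on the fact that, under single linkage, the level at which two items are first joined equals the smallest "largest step" along any chain of items connecting them. The matrix and function names are hypothetical, and the code is not the routine implemented in the Genes program.

```python
from itertools import combinations
from math import sqrt


def single_linkage_cophenetic(dist):
    """Cophenetic matrix of a nearest-neighbor (single-linkage) dendrogram.

    Two items first share a cluster at the smallest level h for which a chain
    of items links them with every step <= h (a minimax path).
    """
    n = len(dist)
    coph = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(coph[i][k], coph[k][j])
                if via_k < coph[i][j]:
                    coph[i][j] = via_k
    return coph


def cophenetic_correlation(dist, coph):
    """Pearson correlation between the upper triangles of the two matrices."""
    pairs = list(combinations(range(len(dist)), 2))
    x = [dist[i][j] for i, j in pairs]
    y = [coph[i][j] for i, j in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical dissimilarities among four accessions (not the data of this study).
dist = [
    [0.0, 0.1, 0.5, 0.6],
    [0.1, 0.0, 0.4, 0.7],
    [0.5, 0.4, 0.0, 0.2],
    [0.6, 0.7, 0.2, 0.0],
]
coph = single_linkage_cophenetic(dist)
ccc = cophenetic_correlation(dist, coph)
print(coph[0][3])  # 0.4: the pairs (0, 1) and (2, 3) are joined through the 0.4 link
print(round(ccc, 2))
```

A coefficient close to 1 indicates that the dendrogram distorts the original dissimilarities only slightly.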

RESULTS AND DISCUSSION
The dissimilarity index between accessions ENAS X, ENAS 5032 and ENAS 5049C was null (Table 2), indicating the presence of duplicates. One of the main functions of morphological characterization is to identify duplicates, which may be eliminated after being properly studied, thus reducing the maintenance costs of collections and germplasm banks (Burle & Oliveira, 2010). In a study on the genetic diversity between accessions of   Batista et al. (2011), on the other hand, found 100% similarity between two accessions in the studied collection of 30 accessions, so they were considered duplicates.
Once the dissimilarity matrix was estimated for multicategorical data, the accessions were grouped by Tocher's optimization method. Five groups were formed, as shown in Table 3.
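The logic of a Tocher-type grouping can be sketched as below. This is a simplified variant written for illustration: the inclusion limit is taken as the largest of the nearest-neighbor distances, each group is seeded with one member of the closest remaining pair, and an item joins a group while its mean distance to the group does not exceed the limit. These rules, the function names and the matrix are assumptions and may differ in detail from the procedure implemented in the Genes program.

```python
from itertools import combinations


def mean_distance(dist, item, group):
    """Mean dissimilarity between one item and the members of a group."""
    return sum(dist[item][g] for g in group) / len(group)


def tocher(dist):
    """Simplified Tocher-type grouping on a symmetric dissimilarity matrix."""
    n = len(dist)
    # Inclusion limit: the largest distance from any item to its nearest neighbor.
    theta = max(min(dist[i][j] for j in range(n) if j != i) for i in range(n))
    remaining = set(range(n))
    groups = []
    while remaining:
        if len(remaining) == 1:
            groups.append([remaining.pop()])
            break
        # Seed the group with one member of the closest remaining pair.
        seed, _ = min(combinations(sorted(remaining), 2),
                      key=lambda pair: dist[pair[0]][pair[1]])
        group = [seed]
        remaining.remove(seed)
        while remaining:
            # Candidate with the smallest mean distance to the current group.
            best = min(sorted(remaining), key=lambda c: mean_distance(dist, c, group))
            if mean_distance(dist, best, group) > theta:
                break  # no remaining item is close enough: close this group
            group.append(best)
            remaining.remove(best)
        groups.append(group)
    return theta, groups


# Hypothetical dissimilarities among six accessions (not the data of this study).
dist = [
    [0.0, 0.1, 0.2, 0.7, 0.8, 0.9],
    [0.1, 0.0, 0.3, 0.8, 0.9, 0.9],
    [0.2, 0.3, 0.0, 0.6, 0.7, 0.8],
    [0.7, 0.8, 0.6, 0.0, 0.3, 0.7],
    [0.8, 0.9, 0.7, 0.3, 0.0, 0.6],
    [0.9, 0.9, 0.8, 0.7, 0.6, 0.0],
]
theta, groups = tocher(dist)
print(theta)   # 0.6
print(groups)  # [[0, 1, 2], [3, 4], [5]]
```

In this toy example the last item is far from all the others, so it sets the inclusion limit and ends up alone in its own group.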
Group 1 was the largest of the clusters, comprising 13 accessions, namely, ENAS X, 5049C, 5032, 5049B, 5030, 5047, 5047B, 5031, 5051, 5049, 5045, Y and 5048, all of which belong to the species C. chinense. Some of the main traits of this group were: fruits with a width of 1 to 2.5 cm (Figure 1), wall thickness of 1 to 2 mm and placenta over half their length; flowers with violet anthers and light violet filaments; and fruiting occurring between 91 and 120 days. The plants of most of these accessions showed an intermediate growth habit, except for ENAS X, 5049C, 5032, 5049B, 5049 and 5048, which showed upright growth. Most accessions exhibited oval-shaped leaves and flowers in all positions, except ENAS X, 5049C, 5032 and 5048, whose flowers are pendant, and ENAS 5047B, with erect flowers.
Group 2 consisted of three accessions, namely, ENAS 5020, 5015 and 5028, all of which are of the species C. baccatum. Overall, they have one flower per leaf axil, in different positions; yellow anthers with white filaments; greenish-yellow corolla; inserted stigma; pendant, pointed fruits of elongated shape, with obtuse shoulder and entire calyx; low pungency; moderate aroma; equal fruiting time; placenta over half the length of the fruit; intermediate persistence between fruit and pedicel; plant height from 65 to 85 cm; and fruit length between 4 and 8 cm.
The third group contained five accessions, namely, ENAS 5023, 5024, 5002, 5035 and 5009, all of which also belong to the species C. baccatum. These are characterized by having an intermediate growth habit, forming tall plants (over 85 cm); leaves deltoid-shaped; flowers erect (anthers yellow and filament white); fruiting time from 91 to 120 days; fruits red, containing three locules, with the placenta reaching over half of their length, and calyx entire and wide, in the range of 1 to 2.5 cm.
ENAS 5002 and 5009 produce two flowers per leaf axil, whereas the others produce one to two flowers.
Group 4 comprised seven accessions, namely, ENAS 5041, 5043, 5007, 5044, 5019, 5017 and 5018, which belong to three distinct species. Accession ENAS 5017 belongs to the species C. frutescens; accession ENAS 5044, to C. chinense; and all the others, to C. annuum. The plants of these accessions have an intermediate growth habit and produce fruits weighing 3 to 9 g, with a peduncle of 2 to 4 cm in length, and flowers with white corolla and exserted stigma. Only one accession had oval-shaped leaves (ENAS 5019), whereas the others had lanceolate leaves. All showed only one flower per axil, except ENAS 5041 (one and two) and ENAS 5044 (two).
The only component of group 5 was ENAS 5010, of the species C. annuum, one of the few genotypes to bear fruit more than 120 days after planting. This genotype has only 15 seeds per fruit, short stalks (up to 2 cm) and fruit width and weight of up to 1 cm and 1 g, respectively. Based on the dissimilarity analysis, the genotype showed high divergence in relation to the others, with values ranging from around 52% to 84%.
Groups 2 and 5 were the most distant (0.774) (Figure 2), which is explained by the high divergence between accessions ENAS 5028 and 5010 (83.87%). The former has the largest size and weight in the collection, in addition to low pungency, whereas the latter produces fruits that, despite being highly pungent, are light, small and thin-walled, with low yield for the production of sauces and preserves and for fresh consumption, but with ornamental potential due to their erect position and vibrant color. Because of the high divergence between them, these accessions can be selected as parents in pepper breeding programs aimed at heterotic gains and the production of new cultivars (Alvares et al., 2012). On the other hand, the shortest distance was between groups 1 and 5 (57.80%). However, because this distance was greater than 50%, the two groups diverge in too many variables to belong to the same group.
The CCC found was 0.84, which suggests good agreement between the dissimilarity matrix and the dendrogram values (Sokal & Rohlf, 1962). Therefore, it is acceptable for the morphological characterization of Capsicum. Vasconcelos et al. (2012), Villela et al. (2014) and Neitzke et al. (2010) conducted characterization studies of the genus and obtained similar CCC values: 0.83, 0.79 and 0.91, respectively. Four distinct groups were formed in the dendrogram (Figure 3). Both clustering methods (Tocher's Optimization and Nearest Neighbor) isolated accession ENAS 5010, due to the traits previously mentioned. Previous studies of the genus Capsicum also describe similar agreement between Tocher's and the Nearest Neighbor clustering methods (Sudré et al., 2005; Sudré et al., 2006; Bento et al., 2007). However, in the Nearest Neighbor method, groups 3 and 4 from Tocher's analysis were merged. The new group was characterized by intermediate plant growth habit and white anther filaments. For all other characteristics, however, the accessions differed from one another. Moreover, the distance within the merged group (0.579) was greater than the intra-group distances of groups 3 (0.313) and 4 (0.419), indicating a decrease in homogeneity within the new group (Cruz et al., 2014).
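The intra-group and inter-group distances discussed above can be illustrated with the sketch below, which assumes that both are simple averages of the pairwise dissimilarities involved. This averaging rule, the function names and the matrix are assumptions made for illustration only.

```python
from itertools import combinations, product


def intra_group_distance(dist, group):
    """Mean dissimilarity over all pairs of members of one group."""
    pairs = list(combinations(group, 2))
    if not pairs:
        return 0.0  # a single-member group has no internal pair
    return sum(dist[i][j] for i, j in pairs) / len(pairs)


def inter_group_distance(dist, group_a, group_b):
    """Mean dissimilarity over all pairs formed by one member of each group."""
    pairs = list(product(group_a, group_b))
    return sum(dist[i][j] for i, j in pairs) / len(pairs)


# Hypothetical dissimilarities among four accessions (not the data of this study).
dist = [
    [0.0, 0.2, 0.6, 0.8],
    [0.2, 0.0, 0.7, 0.9],
    [0.6, 0.7, 0.0, 0.4],
    [0.8, 0.9, 0.4, 0.0],
]
group_a, group_b = [0, 1], [2, 3]
print(round(intra_group_distance(dist, group_a), 3))            # 0.2
print(round(intra_group_distance(dist, group_b), 3))            # 0.4
print(round(inter_group_distance(dist, group_a, group_b), 3))   # 0.75
print(round(intra_group_distance(dist, group_a + group_b), 3))  # 0.6
```

In this toy example, merging the two groups raises the mean internal distance from 0.2 and 0.4 to 0.6, which is the kind of loss of homogeneity described for the merged group in the dendrogram.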
According to Manfio et al. (2012), the greater the dissimilarity between the parents selected for breeding programs, the greater the chances of obtaining favorable allelic recombinations, so that the progeny of the crosses expresses hybrid vigor. ENAS 5047B (C. chinense) and 5043 (C. annuum), from the first and third groups of the dendrogram, respectively, have a distance of 55% and possess interesting traits for the making of pepper preserves, such as small, thick-walled, red fruits with low pungency, with emphasis on the intense aroma of ENAS 5047B. Thus, the results of the present study may assist in the selection of parents in Capsicum spp. breeding programs.
In a study on the production and the molecular, morphological and reproductive characterization of hybrids between Capsicum species, the hybrid combination C. chinense × C. annuum var. annuum showed 98.8% pollen viability (Campos, 2006). The author concluded that the species belong to the same gene complex, which facilitates crossing and the fertility of the hybrid. Monteiro et al. (2011) undertook a reproductive characterization of interspecific hybrids obtained between Capsicum species and found that the C. chinense × C. frutescens and C. annuum × C. baccatum crosses generated fertile hybrids. As a result, the researchers recommended using accessions of the species C. chinense as the female parent to obtain good hybrid fertility.

CONCLUSIONS
Morpho-agronomic characterization provided greater knowledge about the materials available in the germplasm collection at UFRRJ.
The Cole-Rodgers dissimilarity index was adequate to determine the genetic divergence between the genotypes using multicategorical data.
The clustering methods used partially agreed with each other. Tocher's method stood out due to its greater discriminatory capacity as compared with the Nearest Neighbor clustering method.