Microsatellites for detecting inconsistencies in Capsicum cultivars registration in Brazilian database: more than meets the eye

In Brazil, cultivars are registered in the National Register of Cultivars (RNC), which, besides enabling the commercialization of cultivar propagative material, also guarantees producers the genetic purity and identity of propagules. However, the information about the registration and commercialization of some cultivars may be inaccurate. This study aimed to analyze the use of microsatellite markers to detect inconsistencies in data on Capsicum spp. cultivars obtained from the official database (CultivarWeb). Seven cultivars were evaluated: three were submitted to genetic identity analysis (Amarela Comprida, De Cayenne and Cayenne Long Slin), and the others were used as standards for the species C. annuum, C. frutescens and C. chinense. Thirty-three microsatellite loci were polymorphic and presented 76 alleles (an average of 2.3 alleles/locus). The Fixation Index (F) showed high homozygosity, and the estimators of genetic diversity (Ho and I) indicated low genetic diversity among cultivars. The molecular analysis, represented in a dendrogram and in a Principal Coordinate Analysis (PCOA) chart, showed that the investigated cultivars belong to C. annuum, contrary to what is registered in CultivarWeb, which indicates that they belong to C. frutescens. Thus, the authors recommend that the data in CultivarWeb be checked and improved.

Through a database maintained on its website, called CultivarWeb, the Ministério de Agricultura, Pecuária e Abastecimento [(MAPA) Ministry of Agriculture, Livestock and Supply] allows users to view the cultivars registered in Brazil according to the genus or species of interest. In June 2019, the system indicated that 39,331 cultivars were registered in the country. However, as the available information is obtained by formal declaration from the maintainer through the cultivar application documentation, some inaccuracies or inconsistencies may occur.
This study was carried out because a suspicion of inconsistency arose regarding the species attributed to three cultivars, Amarela Comprida, De Cayenne and Cayenne Long Slin, which were to be used in a DNA fingerprinting assay together with cultivars developed by the Capsicum spp. breeding program at Universidade Estadual do Norte Fluminense Darcy Ribeiro (UENF). The suspicion arose because both the seed packaging and the official database (CultivarWeb) present these cultivars as belonging to Capsicum frutescens, whereas their agronomic traits indicate that they actually belong to Capsicum annuum.
Molecular markers, especially microsatellites, are useful tools for characterizing germplasm of Capsicum spp. and other vegetable species. Moreover, a growing application of molecular characterization is to test or prove the identity and genetic purity of commercial cultivars, in support of intellectual property and of the rights of the breeder or maintainer. Regarding the protection of intellectual property rights, molecular characterization studies are available for cultivars of pumpkin (Sim et al., 2015), potato (Favoretto et al., 2011), and pepper and sweet pepper (Kumar et al., 2001; Kwon et al., 2005).
The aim of this study was to use microsatellite markers to characterize Capsicum spp. genotypes in order to verify, at the DNA level, to which species of the genus Capsicum the cultivars Amarela Comprida, De Cayenne and Cayenne Long Slin belong, and thus to resolve questions about the species information available for these cultivars in CultivarWeb.
Cultivars UENF Campista and Cascadura Ikeda were used as standards for Capsicum annuum, since there is no doubt that they belong to this species. Cultivars Malagueta and UENF 2154 were used as standards for C. frutescens and C. chinense, respectively.
Plants were grown for 45 days in 500 mL pots in a greenhouse at Unidade de Apoio à Pesquisa of Campus Leonel Brizola from Universidade Estadual do Norte Fluminense Darcy Ribeiro, in the municipality of Campos dos Goytacazes, Rio de Janeiro, Brazil.
Then, leaf samples were collected for DNA extraction. In order to check the level of homozygosity of the sampled cultivars and the genetic purity of the seeds, each cultivar was represented by a bulk composed of five plants.

DNA extraction
For DNA extraction, approximately 300 mg of leaf tissue from each of the seven cultivars was macerated, transferred into 1.5 mL tubes and immersed in liquid N2; DNA was then extracted according to the Doyle & Doyle (1990) protocol.
DNA was quantified using a Qubit 3.0 fluorometer (Invitrogen). Afterwards, samples were diluted and standardized at 5 ng μL⁻¹ for use in polymerase chain reactions (PCR).

Amplification reactions
Eighty microsatellite markers available in the literature (Lee et al., 2004; Minamiyama et al., 2006; Yi et al., 2006) were selected based on information about polymorphism level, specificity for C. annuum and position in the genome (Table 1).
Microsatellite markers were chosen because they are consistently recommended by the International Union for the Protection of New Varieties of Plants (UPOV, 2010).
The reactions were carried out in a Veriti thermocycler (Applied Biosystems) with the following program: 4 min at 94°C for initial denaturation; 38 cycles of 94°C for 1 min, 52-60°C for 1 min (depending on the primer used) and 72°C for 3 min; and a final extension at 72°C for 7 min.
Amplified DNA fragments were separated on 4% high-resolution agarose gel in a horizontal electrophoresis system. Before electrophoresis, the PCR products were stained with a 1:1 solution of Blue Juice and GelRed. The gels were then photodocumented under ultraviolet light (MiniBIS Pro photodocumenter, Bio-Imaging Systems).
Afterwards, only the microsatellite markers that were polymorphic for the material under evaluation were used to build a numerical spreadsheet based on the band patterns observed in the gel images.

Data analysis
First, the programs GenAlEx (Peakall & Smouse, 2012) and PowerMarker (Liu & Muse, 2005) were used to calculate the following genetic diversity estimators for the evaluated accessions: number of alleles, number of effective alleles, number of loci with private alleles per accession, fixation index, observed heterozygosity and Shannon's Index.
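As an illustration of the estimators listed above, the following minimal sketch computes them for a single codominant locus using their standard textbook definitions (Ne = 1/Σp², He = 1 − Σp², F = 1 − Ho/He, I = −Σp ln p). It is not the GenAlEx or PowerMarker code; the function name and the genotype data are hypothetical and only show how the quantities relate to one another.

```python
from collections import Counter
from math import log


def locus_diversity(genotypes):
    """Diversity estimators for one codominant locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid sample.
    """
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())              # 2n allele copies
    freqs = [c / total for c in counts.values()]

    sum_p2 = sum(p * p for p in freqs)
    na = len(counts)                          # number of alleles (Na)
    ne = 1.0 / sum_p2                         # number of effective alleles (Ne)
    ho = sum(1 for a, b in genotypes if a != b) / n   # observed heterozygosity
    he = 1.0 - sum_p2                         # expected heterozygosity
    shannon = -sum(p * log(p) for p in freqs)  # Shannon's index (natural log)
    f = 1.0 - ho / he if he > 0 else float("nan")     # fixation index
    return {"Na": na, "Ne": ne, "Ho": ho, "He": he, "I": shannon, "F": f}


# Hypothetical data: seven fully homozygous samples carrying three alleles.
example = [(150, 150), (150, 150), (150, 150),
           (160, 160), (160, 160), (170, 170), (170, 170)]
result = locus_diversity(example)

# Hypothetical data: four samples that are all heterozygous for two alleles.
all_het = locus_diversity([(1, 2)] * 4)
```

With fully homozygous samples, Ho is 0 and F reaches its maximum of 1; when every sample is heterozygous for the same two alleles, F reaches its minimum of -1.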
Then, genetic diversity among cultivars was analyzed using the Genes program (Cruz, 2013), with the data processed as the complement of the weighted similarity index. This analysis generated a matrix of dissimilarity measures between genotypes, which was used for clustering analysis by the hierarchical average-linkage method [(UPGMA) Unweighted Pair Group Method with Arithmetic Means] in the Genes program and for Principal Coordinate Analysis in the GenAlEx program.
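The sketch below shows one common formulation of a locus-weighted similarity index and its complement, in which each locus is weighted by its share of the total number of alleles and the similarity of a pair is half the weighted count of shared alleles. Whether this matches the exact computation performed by the Genes program is an assumption; the function name, cultivar labels and genotypes are hypothetical.

```python
from collections import Counter


def weighted_dissimilarity(genotypes):
    """Complement of a locus-weighted similarity index (assumed formulation).

    genotypes: dict mapping a name to a list of (allele, allele) tuples,
    one tuple per locus, with loci in the same order for every entry.
    """
    names = list(genotypes)
    n_loci = len(next(iter(genotypes.values())))
    # a_j: number of distinct alleles at locus j; weight p_j = a_j / A
    alleles_per_locus = [
        len({a for g in genotypes.values() for a in g[j]})
        for j in range(n_loci)
    ]
    total_alleles = sum(alleles_per_locus)
    weights = [a / total_alleles for a in alleles_per_locus]

    def similarity(x, y):
        s = 0.0
        for j in range(n_loci):
            # number of allele copies shared at locus j (0, 1 or 2)
            shared = sum((Counter(x[j]) & Counter(y[j])).values())
            s += weights[j] * shared
        return s / 2.0

    matrix = [[1.0 - similarity(genotypes[a], genotypes[b]) for b in names]
              for a in names]
    return names, matrix


# Hypothetical genotypes at two loci for three hypothetical entries.
example = {
    "A": [(150, 150), (200, 200)],
    "B": [(150, 150), (210, 210)],
    "C": [(160, 160), (220, 220)],
}
names, matrix = weighted_dissimilarity(example)
```

In this toy example, A and B share both allele copies at the first locus (weight 2/5) and none at the second, so their dissimilarity is 0.6, while A and C share nothing and have a dissimilarity of 1.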

RESULTS AND DISCUSSION
Among the 80 microsatellite markers tested, only 33 showed polymorphism for the evaluated material; for this reason, only these markers were scored and used for data analysis (Table 2).
These 33 microsatellite markers generated 76 alleles (Na), an average of 2.3 alleles per locus. This low number of alleles per locus reflects the fact that the evaluated genotypes are inbred lines with a high level of homozygosity. The number of effective alleles (Ne) ranged from 1.153 (CAMS-451) to 3.000 (Hpms E016), with an average of 1.691. These values are in accordance with what is expected when the investigated genotypes belong to autogamous species and are genetically closely related.
Considering the estimates of genetic diversity (Ho and I), the genetic diversity among the accessions can be considered medium, which is consistent with the fact that the evaluated genotypes belong to autogamous species of the same gene complex (the C. annuum complex). This complex comprises C. annuum, C. frutescens, C. chinense and C. chacoense. Therefore, the loci were expected to show a low level of polymorphism and, as a result, a low level of genetic diversity was expected among cultivars of different species as well as among cultivars of the same species.
The Fixation Index (F) values (F = -0.750 to 1.000) show medium to high homozygosity of the cultivars per se for the investigated loci. The Fixation Index may range from -1 to +1. Values close to zero indicate random mating; negative values indicate an excess of heterozygosity, due to selection favoring heterozygotes or to negative assortative mating; and high positive values indicate strong inbreeding (Peakall & Smouse, 2012). In addition, the Fixation Index also provides an estimate of differentiation among cultivars and helps diagnose the variability of each locus in terms of its level of homozygosity or heterozygosity.
In another analysis, private alleles (Ap) were detected for each cultivar among the evaluated loci. 'De Cayenne' showed the largest number of private alleles (Ap = 18), whereas 'Malagueta' showed the lowest (Ap = 6) (Table 3). C. annuum cultivars showed a much larger number of private alleles than that observed for C. frutescens and C. chinense.
The detection of private alleles, besides being an indicator of the occurrence or not of gene flow, reflects the level of genetic relationship between the evaluated accessions or populations (Szpiech & Rosenberg, 2011). Thus, the greater the number of private alleles, the lower the gene flow and, consequently, the carrier of the largest number of private alleles tends to be the most genetically distant from the others of the same species, genus or population.
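The counting of private alleles described above can be sketched as follows: an allele at a given locus is private when exactly one cultivar carries it. This is a minimal illustration, not the GenAlEx routine; the function name, the labels X, Y and Z, the locus names and the allele sizes are all hypothetical.

```python
from collections import defaultdict


def private_alleles(alleles_by_cultivar):
    """Count private alleles per cultivar.

    alleles_by_cultivar: dict name -> dict locus -> set of alleles observed.
    An allele is private when it occurs in a single cultivar at that locus.
    """
    carriers = defaultdict(set)
    for name, loci in alleles_by_cultivar.items():
        for locus, alleles in loci.items():
            for allele in alleles:
                carriers[(locus, allele)].add(name)

    counts = {name: 0 for name in alleles_by_cultivar}
    for names in carriers.values():
        if len(names) == 1:                  # carried by one cultivar only
            counts[next(iter(names))] += 1
    return counts


# Hypothetical band scores for three hypothetical cultivars at two loci.
example = {
    "X": {"L1": {150}, "L2": {200}},
    "Y": {"L1": {150}, "L2": {210}},
    "Z": {"L1": {160}, "L2": {210, 220}},
}
ap = private_alleles(example)
```

Here X has one private allele (200 at L2), Y has none, and Z has two (160 at L1 and 220 at L2).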
Considering the analysis of private alleles, we observed great genetic divergence among the cultivars known to belong to C. frutescens ('Malagueta') and C. chinense (UENF 2154), the cultivars Amarela Comprida, De Cayenne and Cayenne Long Slin, and the cultivars used as standards for C. annuum ('Cascadura Ikeda' and 'UENF Campista').
Given the considerable number of private alleles detected per genotype, these markers can be recommended for studies on the molecular characterization of Capsicum spp. accessions or for other trials aiming to obtain DNA fingerprints.
The analysis of genetic diversity among accessions generated a dissimilarity matrix whose correlation with the cophenetic matrix was 0.99. The closer the cophenetic correlation coefficient (CCC) is to 1, the smaller the distortion introduced by clustering with the UPGMA method (Silva & Dias, 2013). Thus, the high CCC observed corresponds to high consistency and reliability of the clusters observed in the dendrogram.
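To make the cophenetic correlation concrete, the sketch below runs UPGMA on a small dissimilarity matrix, records the height at which each pair of entries is first joined (the cophenetic distance), and computes the Pearson correlation between the original and cophenetic distances. It is a plain-Python illustration rather than the Genes or MEGA implementation, and the 4 x 4 matrix is hypothetical.

```python
from itertools import combinations
from math import sqrt


def upgma_cophenetic(dist):
    """Cophenetic distance matrix obtained by UPGMA clustering of `dist`."""
    n = len(dist)
    clusters = {i: [i] for i in range(n)}    # cluster id -> member indices
    coph = [[0.0] * n for _ in range(n)]

    def avg(a, b):
        # UPGMA: mean of all pairwise distances between the two clusters
        pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
        return sum(dist[i][j] for i, j in pairs) / len(pairs)

    next_id = n
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: avg(*p))
        height = avg(a, b)
        # every cross-cluster pair is first joined at this merge height
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return coph


def cophenetic_correlation(dist, coph):
    """Pearson correlation between original and cophenetic distances."""
    pairs = list(combinations(range(len(dist)), 2))
    x = [dist[i][j] for i, j in pairs]
    y = [coph[i][j] for i, j in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical dissimilarity matrix with two tight pairs: (0, 1) and (2, 3).
d = [[0.0, 0.2, 0.8, 0.9],
     [0.2, 0.0, 0.7, 0.9],
     [0.8, 0.7, 0.0, 0.3],
     [0.9, 0.9, 0.3, 0.0]]
coph = upgma_cophenetic(d)
ccc = cophenetic_correlation(d, coph)
```

In this toy matrix the two tight pairs merge first and are then joined at the mean of their four cross distances (0.825), so the dendrogram distorts the original distances very little and the coefficient is close to 1.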
Using the dissimilarity matrix, a dendrogram was generated by clustering analysis with the UPGMA method in MEGA software. Three groups of cultivars were formed (Figure 1), which correspond to the species. The groups were established subjectively, based on sharp changes in clustering levels, together with prior knowledge of the material under study.
The first group consisted of cultivars related to C. annuum: UENF Campista, Cascadura Ikeda, Amarela Comprida, De Cayenne and Cayenne Long Slin. The composition of the first group was expected, since cultivars UENF Campista and Cascadura Ikeda were used as standards for C. annuum and given the hypothesis that cultivars Amarela Comprida, De Cayenne and Cayenne Long Slin also belong to this species.
Clustering analysis, specifically the composition of the first group, shows that cultivars Amarela Comprida, De Cayenne and Cayenne Long Slin do not belong to C. frutescens, as stated in the records of these cultivars in the CultivarWeb database, but in fact belong to C. annuum.
The Principal Coordinate Analysis (PCOA) (Figure 2), like the dendrogram, indicated that the investigated cultivars belong to C. annuum.
In some studies on morphological characterization, differentiation among genotypes of different Capsicum species has been obtained (Campos et al., 2016). For that purpose, the morphological and agronomic descriptors proposed for Capsicum by IPGRI (International Plant Genetic Resources Institute, now Bioversity International) were used. However, this methodology requires more time and resources than the use of molecular markers.
The results obtained in this study show a probable inaccuracy in the species assigned to the tested cultivars (Amarela Comprida, De Cayenne and Cayenne Long Slin) in the CultivarWeb database. The situation is even worse when one searches CultivarWeb for the expression "Cayenne long": two results are returned, apparently for different cultivars. In one, the cultivar Cayenne Long Slin (evaluated in this study) is associated with C. frutescens, whereas in the other the cultivar Cayenne Long Slim is associated with C. annuum. When the search is restricted to the keyword "Cayenne", seven results are displayed, distributed between C. frutescens (cultivars Ardida Cayenne, Ardida Vermelha Cayenne, Cayenne, Cayenne Long Slin and De Cayenne) and C. annuum (cultivars Cayenne Long Slim and Dedo de Moça - Cayenne).
The CultivarWeb search results discussed in the previous paragraphs show incorrect information, mainly because it is recognized in the literature that cayenne peppers belong exclusively to the species C. annuum (Barbero et al., 2014). Specifically, when comparing the cultivars Cayenne Long Slin and Cayenne Long Slim, the question remains whether there is any difference between them besides the misspelling of the word "slim" (thin) in the name of cultivar Cayenne Long Slin. In addition, cultivar Amarela Comprida, although registered as belonging to C. frutescens, is recognized and marketed as belonging to C. annuum (Pimentas artesanais, 2017).
These reports, and the results of this paper, are just a small sample of the confusion found in the seed market in Brazil. Likewise, other inconsistencies may exist in the registered cultivar information provided by MAPA in CultivarWeb for Capsicum cultivars, as well as for other genera and species. The detection of inconsistencies of this nature undermines the confidence of both farmers and breeders in the guarantees offered to them by government agencies, namely the reliability of the cultivar's genetic identity and quality, and the genetic purity of the propagative material to be marketed.
We also highlight that many cultivars of Capsicum and other vegetables registered in CultivarWeb are imported from other countries and are not the result of breeding programs carried out in Brazil. This could be one of the reasons for the inconsistencies in the available information, as the incorrect identification of the species could be a mistake originating in the country where the cultivar was developed. Especially in the case of Capsicum plants, it is not uncommon to observe misnaming and misidentification of species. We suggest that MAPA establish stricter mechanisms for checking and certifying the information provided by the applicant in the cultivar registration application form, in order to avoid the disclosure of inaccurate information. We also recommend a thorough review of the information contained in all cultivar registration processes that have already been completed and whose information is already available on the CultivarWeb platform.