Use of microsatellites for evaluation of genetic diversity in cherry tomato

Nelson Ceballos Aguirre1*, Walter López2, Martha Orozco-Cárdenas2, Yacenia Morillo Coronado1, Franco Vallejo-Cabrera3
1. Universidad de Caldas, Producción Agropecuaria, Manizales (Caldas), Colombia.
2. University of California, Riverside, Riverside (CA), United States.
3. Universidad Nacional de Colombia, Palmira (Valle del Cauca), Colombia.
*Corresponding author: nelson.ceballos@ucaldas.edu.co
Received: Apr. 8, 2016; Accepted: Aug. 16, 2016
Plant Breeding, Article

Much of the diversity of tomato is found in wild forms, the most important being Solanum lycopersicum L. var. cerasiforme and S. pimpinellifolium. The objective of this research was to assess the genetic diversity of 30 introductions of cherry tomato with 36 microsatellite molecular markers. The study was conducted at the Plant Transformation Research Center (PTRC) of the University of California. A dendrogram was built using the Dice-Nei and Li similarity index and the UPGMA clustering method, in which the introductions were differentiated without following a distribution pattern that reflects the geographical area of origin. A coefficient of genetic differentiation was found (Fst = 0.3474), showing a high genetic differentiation of the introductions; those from Brazil, Ecuador, and Peru were the most genetically diverse, presenting 100% of polymorphic loci. The molecular variance analysis indicated a variation of 11% between the groups and 89% within them. The broad genotypic variability of the evaluated introductions favors the possibility of selecting them for genetic improvement and sustainable use of the species.


INTRODUCTION
Solanum lycopersicum L. is a dicot species belonging to the Solanaceae family and the genus Solanum. All wild species related to tomato originated in the Andean region of Chile, Bolivia, Peru, Ecuador, and Colombia, including the Galapagos Islands. The most promising wild forms of Solanum are S. lycopersicum var. cerasiforme and S. pimpinellifolium (Vallejo 1999; Nuez 1999), due to the ease with which crosses are obtained. It is a self-pollinating species with high genetic variability (Pratta et al. 2003). According to Miller and Tanksley (1990), most of the diversity of tomato is found in its wild forms, which show variability for fruit quality characteristics such as flavor, aroma, color, and texture.
It is globally estimated that 80% of germplasm has no characterization data, and 95% has no agronomic evaluation data (Xu 2010). The collection and conservation of plant genetic resources without accompanying information about agronomic characteristics and genetic potential turns the collections into simple deposits of material without much utility (Abadie and Berretta 2001). Molecular biology is a powerful tool to study genetic diversity, which allows a better understanding of the relationships between species within the same genus, successful taxonomic classification, and greater ability to identify species and cultivars. The tools provided by molecular biology have enabled new genetic diversity studies by breeders, so it is important to investigate the genetic variation of wild species that have not been characterized or evaluated (Graham and McNicol 1995).
Microsatellites or simple sequence repeats (SSRs) are genetic markers with tandemly repeated motifs of 2 to 6 bp; they are used in many cultivated plants and are found in all prokaryotic and eukaryotic genomes (Zane et al. 2002). The main criterion for their use is that they are highly polymorphic, codominant, and evenly distributed throughout the genome (Azofeifa-Delgado 2006; Aranguren-Méndez et al. 2005).
The genetic diversity and relationship of 42 tomato varieties sourced from different geographic regions was examined with 29 EST-SSR markers (Korir et al. 2014). Zhou et al. (2015a) evaluated a total of 29 cultivated tomatoes (S. lycopersicum), 14 wild tomatoes, and 7 introgression lines (ILs) developed from a cross between S. pennellii and S. lycopersicum with 9 morphological traits and 15 genomic-SSR and 13 EST-SSR primers. Zhou et al. (2015b) compared the genomic simple sequence repeat (gSSR) and EST-derived SSR (EST-SSR) markers when analyzing genetic variability of 48 tomato cultivars from different countries. Miskoska-Milevska et al. (2015) studied the applicability of 8 DNA microsatellite loci in genetic differentiation of 6 morphologically different tomato varieties of Lycopersicon esculentum Mill. Kumar et al. (2016) studied the genetic variation of 19 genotypes of tomato (Solanum lycopersicum L.) with 11 polymorphic SSR markers. They concluded that these markers may be used for a wide range of practical applications, including varietal identification and prescreening for distinctiveness of tomato genotypes.
The gradual replacement of primitive varieties by hybrid cultivars has reduced the number of cultivars. Genetic erosion can have serious consequences for farming, particularly regarding its phytosanitary vulnerability (Ceballos and Vallejo 2012). This process can be mitigated by establishing gene banks, which are a vital component of improvement programs. Once the available genetic resources have been characterized, it is possible to decide which crosses would contribute most to the expansion of the genetic base (Carrera et al. 2010). Therefore, it is necessary to assess the genetic diversity of cherry tomato germplasm in order to select promising introductions for the market, with good yield, fruit quality, and high antioxidant content. The objective of this research was to assess the genetic diversity of cherry tomato and the population structure of the introductions by country of origin.

Extraction and quantification of DNA
The research was carried out at the Genetic Engineering Center (Plant Transformation Research Center, PTRC) of the University of California, Riverside, USA. Thirty introductions of cherry tomato were evaluated and sown in the greenhouse of the PTRC Experimental Center (Table 1). The extraction of genomic DNA was performed using the protocol reported by Chetty et al. (2013). The DNA was quantified and analyzed in a Thermo Scientific NanoDrop 2000c UV-Vis spectrophotometer; 36 microsatellite markers were used, taken from the Solanum Genomic Network (2011) gene database and reported by Kwon et al. (2009) (supplementary file).

Application of the high resolution melting technique in real time PCR
Real-time PCR for amplification of the microsatellite markers and the control gene was performed in a MyiQ2 thermocycler (Bio-Rad) in 96-well plates, using the Løvdal and Lillo (2009) protocol as follows: initial denaturation at 95 °C for 10 min, followed by 40 cycles of denaturation at 95 °C for 10 s and annealing at 55 °C for 60 s. The melting temperature was determined over 81 readings, starting at 55 °C and finishing at 95 °C. Fluorescence data were recorded at the end of each annealing step during the PCR cycles. The PCR was performed using a reaction mixture (24 μL) which contained 12.5 μL of SYBR Green, specific forward and reverse primers at 10 μM for each microsatellite (0.5 μL), 5 μL of DNA template (5 ng·μL⁻¹), and water; the control gene used for the reaction was elongation factor 1-alpha (EF1). The expression levels for each sample were calculated based on 3 analytical replicates.
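The 81 melting readings between 55 °C and 95 °C are consistent with evenly spaced increments of 0.5 °C. The step size is not stated in the text, so the short sketch below treats it as an inference; the function name `melt_steps` is hypothetical.

```python
def melt_steps(start_c=55.0, end_c=95.0, n_readings=81):
    """Return the temperatures of an evenly spaced melt curve.

    Assumption: readings are equally spaced, which the text does not state.
    """
    step = (end_c - start_c) / (n_readings - 1)
    return [start_c + i * step for i in range(n_readings)]


temps = melt_steps()
step = temps[1] - temps[0]  # implied increment between consecutive readings
print(len(temps), temps[0], temps[-1], step)
```

Under that assumption the increment works out to 0.5 °C per reading.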

Analysis of the information
The results were obtained with the software iQ5 Optical System 2.1 (Bio-Rad); based on the average melting temperature (Tm) reached in 3 replicates per introduction, a matrix was constructed in Microsoft® Office Excel, assigning a row to each of the alleles found, and each individual was given a maximum of 2 values per locus, depending on the introduction (homozygote or heterozygote). A locus was considered polymorphic when the frequency of its most common allele was lower than 95%. From this matrix, statistical analyses were performed using the programs Numerical Taxonomy System for Personal Computer version 2.02 (NTSYS-pc), Tools For Population Genetic Analysis version 1.3 (TFPGA), and Population Genetic Analysis version 1.31 (POPGENE).
The similarity index of Nei and Li (1979), according to Leung et al. (1993), also known as the Dice similarity (Sneath and Sokal 1973; Sorensen 1948), was used to estimate the degree of similarity between individuals. The discriminating power of the microsatellites was measured based on the polymorphism information content (PIC), evaluated as follows: PIC values above 0.5 indicate highly polymorphic loci; PIC values between 0.25 and 0.5 are considered moderately informative; and PIC values less than 0.25 are considered uninformative (Botstein et al. 1980). A cluster analysis was carried out by the Unweighted Pair Group Method with Arithmetic Mean (UPGMA). The dendrogram indicating the grouping of introductions was built with the TREE program of NTSYS-pc (version 2.02). Bootstrap values after 1,000 replicates were obtained to measure the strength of the clusters in the dendrogram.
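The 95% criterion for a polymorphic locus can be illustrated with a minimal sketch. The genotype calls below are hypothetical Tm-based allele labels, not data from the study, and the function names are illustrative.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """genotypes: list of (allele_a, allele_b) pairs, one pair per individual."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def is_polymorphic(genotypes, threshold=0.95):
    """A locus is polymorphic if its most common allele is below the threshold."""
    return max(allele_frequencies(genotypes).values()) < threshold


# Hypothetical loci scored on 30 individuals (alleles labeled by Tm in °C)
locus_a = [(78.5, 78.5)] * 29 + [(78.5, 80.0)]       # one rare allele copy
locus_b = [(78.5, 78.5)] * 20 + [(80.0, 80.0)] * 10  # two common alleles

print(is_polymorphic(locus_a), is_polymorphic(locus_b))
```

In `locus_a` the common allele accounts for 59 of 60 gene copies, so the locus fails the criterion; in `locus_b` it accounts for 40 of 60 and the locus passes.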
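As an illustration of the similarity index and clustering step, the sketch below computes the Dice (Nei and Li 1979) coefficient between allele sets and clusters them by UPGMA on the distance 1 − similarity. It is a plain-Python stand-in and not the NTSYS-pc procedure; the introduction names and allele labels are hypothetical.

```python
from itertools import combinations


def dice_similarity(alleles_x, alleles_y):
    """Dice / Nei and Li coefficient: 2 * shared / (n_x + n_y)."""
    x, y = set(alleles_x), set(alleles_y)
    if not x and not y:
        raise ValueError("both allele sets are empty")
    return 2 * len(x & y) / (len(x) + len(y))


def upgma(profiles):
    """Cluster {name: allele set} by UPGMA on distance = 1 - Dice.

    Returns a list of (cluster_a, cluster_b, merge_distance) tuples.
    """
    sizes = {name: 1 for name in profiles}
    dist = {
        frozenset((a, b)): 1 - dice_similarity(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }
    merges = []
    while len(sizes) > 1:
        pair = min(dist, key=dist.get)  # closest pair of clusters
        a, b = sorted(pair)
        height = dist.pop(pair)
        merged = f"({a},{b})"
        for other in sizes:
            if other in (a, b):
                continue
            d_a = dist.pop(frozenset((a, other)))
            d_b = dist.pop(frozenset((b, other)))
            # Size-weighted mean equals the arithmetic mean over all leaf pairs
            dist[frozenset((merged, other))] = (
                sizes[a] * d_a + sizes[b] * d_b
            ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
        merges.append((a, b, height))
    return merges


# Hypothetical allele calls (locus:allele labels), not data from the study
profiles = {
    "intro_A": {"SSR26:1", "SSR47:1", "SSR19:1", "SSR86:1"},
    "intro_B": {"SSR26:1", "SSR47:1", "SSR19:1", "SSR86:2"},
    "intro_C": {"SSR26:2", "SSR47:1", "SSR19:3", "SSR86:4"},
}
merges = upgma(profiles)
print(merges)
```

Here `intro_A` and `intro_B` share 3 of 4 alleles (similarity 0.75) and join first; `intro_C` joins the pair at the average distance to both.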

RESULTS AND DISCUSSION
Of the 36 markers tested, 13 were polymorphic, which allowed the analysis of genetic diversity of the introductions from different countries. Kwon et al. (2009) found 54 polymorphic markers in 10 varieties of tomato out of 250 SSRs selected from the Solanum Genomic Network (2011) genomic database. The number of alleles observed in the introductions was between 2 and 8, with an average of 4.92 alleles per locus, the most informative markers being SSR26 and SSR47, with 8 alleles, followed by SSR19, SSR128, and SSR86, with 6 alleles each. The microsatellites with the lowest number of observed alleles were SSR288 and SSR942, with 3 alleles each (Table 1). Muñoz et al. (2010) reported that a significant number of alleles with microsatellite markers can be related to the origin of the material or its genetic nature, either because of its genetic diversity or geographical distance, so a low number of alleles can be explained by a narrow geographical collecting area and the analysis of the materials. On the other hand, Kwon et al. (2009) evaluated 63 varieties of tomato with 33 SSR markers, finding 8 SSRs with 2 alleles, 6 with 2 alleles, 8 with 4 alleles, and 11 markers with 5 alleles; in total, 132 alleles were identified, with an average of 4 alleles, versus 64 found in this study, with an average of 4.92 alleles per locus. In this case, the wide distribution of introductions, ranging from Bolivia to Mexico, going through Brazil, French Polynesia, Peru, Ecuador, and Cuba, could explain the observed number of alleles and their relationship with the number of effective alleles, averaging 4.92 and 3.54, respectively.
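The gap between observed (4.92) and effective (3.54) alleles reflects uneven allele frequencies. A minimal sketch of the effective number of alleles, written as the reciprocal of the sum of squared allele frequencies (the Kimura and Crow 1964 form), is given below; the frequency vectors are hypothetical.

```python
def effective_alleles(freqs):
    """Effective number of alleles: 1 / sum(p_i ** 2)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 / sum(p * p for p in freqs)


# Hypothetical loci, both with 4 observed alleles
even = effective_alleles([0.25, 0.25, 0.25, 0.25])  # equal frequencies
skewed = effective_alleles([0.7, 0.1, 0.1, 0.1])    # one dominant allele
print(even, skewed)
```

With equal frequencies the effective number equals the observed number; a dominant allele pulls it down toward 1.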

Indices of genetic diversity: observed and expected heterozygosity and polymorphic information content
The average observed heterozygosity (Ho) was 0.1128; only 3 of the 13 microsatellites evaluated presented heterozygotes, indicating that the species being evaluated has more diversity because the introductions are facultatively self-pollinating wild forms with a natural cross-pollination percentage (PCN) greater than 15% (Rick 1958), which, through the crossing of individuals, can lead to heterozygotes and the generation of genetic diversity (Table 2). Tomato and its ancestors are reported as predominantly autogamous plants, that is, they have a percentage of cross-pollination equal to or lower than 5% and self-pollination of 95% or greater, so very few crosses occur and heterozygotes rarely form naturally. Salmerón (2000) described that, if inbreeding is strict, there will be no heterozygotes. The average expected heterozygosity (He) was 0.6946, being higher in microsatellites SSR26, SSR47, SSR19, and SSR86, with values above 0.8; this indicates that loci with a high number of alleles are more likely to have heterozygotes. Kwon et al. (2009) reported markers SSR47 and SSR86 as the most informative, which can be very useful for other studies evaluating tomato germplasm. These results allow generating groups with high genetic diversity from crosses of wild materials from a wide geographical distribution.
The average polymorphic information content (PIC) of the markers was 0.6304; 77% of the markers evaluated had PIC values higher than 0.5, 2 were close to 0.5, and 1 was at the limit of 0.25. The highest PIC values (greater than 0.5) were found for markers SSR19, SSR26, SSR45, and SSR47 (0.7822, 0.8154, 0.7997, and 0.7342, respectively), which are considered highly polymorphic; markers SSR9 and SSR94 showed values of 0.4562 and 0.4200, respectively, considered moderately informative; only marker SSR288 was uninformative, with a value of 0.2498 (Table 2). Bredemeijer et al. (2002) obtained PIC values of 0.40 evaluating 500 varieties of tomato with SSR markers, and García-Martínez et al. (2006) reported PIC values between 0.035 and 0.775 for tomato germplasm evaluated with amplified fragment length polymorphism (AFLP). These results may be associated with the genetic relationships of the introductions in the study. Kwon et al. (2009) found a PIC of 0.628 evaluating genetic diversity of tomato with microsatellite markers, with a range of 0.210 to 0.880, values very similar to those obtained in the present study with markers from the same source (Solanum Genomic Network 2011); 90% of the polymorphic markers evaluated in cherry tomato were highly polymorphic and ideal for assessments aimed at understanding the genetic diversity of the introductions.
Wright index (Fst) values (Fst = 0.3474; SD = 0.0905; 10,000 replications) showed a high genetic differentiation among the groups (Wright 1978). The markers with the highest Fst values, which explain the total differentiation of the accessions, were, in order, SSR253, SSR86, SSR45, and SSR26, with values of 0.4783, 0.4371, 0.4219, and 0.4029, respectively; however, all values, except for that of microsatellite SSR288, were above 0.25 (Table 2), showing that the accessions grouped by country of origin, given their geographical remoteness, have no gene flow between groups and can therefore be genetically differentiated. Nakazato et al. (2008) used 11 AFLP primer pairs to assess polymorphism in S. lycopersicum var. cerasiforme, S. pimpinellifolium, and all entries, respectively. Fst among all populations and all accessions was significant (p < 0.01) but small (0.052 and 0.023 for S. lycopersicum var. cerasiforme and S. pimpinellifolium, respectively), indicating little genetic differentiation among the accessions of the populations assessed, despite their high phenotypic differentiation. These estimates were low compared with that reported by Nuez et al. (2004) for 6 microsatellite markers in S. pimpinellifolium (Fst = 0.17); the Fst found in this research was even higher than these values, indicating that the molecular markers used allow the detection of population structure and have a high power to differentiate groups and accessions of cherry tomato.
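The contrast between a low Ho and a high He is typical of a mostly self-pollinating species, and a small sketch makes it concrete. The genotypes are hypothetical, and the small-sample factor 2N / (2N − 1) is our reading of a Levene (1949)-type correction; the exact POPGENE computation is not described in the text.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at the locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes, small_sample=True):
    """Expected heterozygosity 1 - sum(p_i ** 2) from allele frequencies."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    n_genes = sum(counts.values())  # 2N gene copies
    he = 1.0 - sum((c / n_genes) ** 2 for c in counts.values())
    if small_sample:
        he *= n_genes / (n_genes - 1)  # assumed correction factor 2N / (2N - 1)
    return he


# Hypothetical locus: 10 individuals, only one heterozygote
genotypes = [("a1", "a1")] * 5 + [("a2", "a2")] * 4 + [("a1", "a2")]
ho = observed_heterozygosity(genotypes)
he = expected_heterozygosity(genotypes)
he_raw = expected_heterozygosity(genotypes, small_sample=False)
print(ho, he_raw, he)
```

With two alleles at similar frequency, random mating would predict about half of the individuals to be heterozygous, while only one in ten is observed here.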
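For reference, the PIC of Botstein et al. (1980) can be computed from allele frequencies as 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j². The sketch below applies that formula and the three classes used in this study; the function names are illustrative, and the classified values are the PIC values reported above for SSR26, SSR9, and SSR288.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content from a list of allele frequencies."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


def classify_pic(value):
    """Classes as described in the text: > 0.5, 0.25 to 0.5, < 0.25."""
    if value > 0.5:
        return "highly polymorphic"
    if value >= 0.25:
        return "moderately informative"
    return "uninformative"


two_equal_alleles = pic([0.5, 0.5])  # 1 - 0.5 - 0.125
print(two_equal_alleles)
for marker, value in [("SSR26", 0.8154), ("SSR9", 0.4562), ("SSR288", 0.2498)]:
    print(marker, classify_pic(value))
```

A locus with two equally frequent alleles reaches a PIC of only 0.375, which shows why the markers with 6 to 8 alleles dominate the highly polymorphic class.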

Descriptive analysis
At a similarity level of 0.40, the Dice (Nei and Li 1979) coefficient separated the introductions into 6 groups and these, in turn, into 31 haplotypes. The introductions from Brazil clustered in the dendrogram according to geographical area, forming 3 subgroups; the closest introductions, with the highest similarity coefficient, were IAC445 and IAC1685, with a value of 0.75 (Figure 1).
In general, the introductions were distributed in different ways; with the exception of those from Brazil, they did not follow a distribution pattern reflecting the geographical area of origin. This behavior may indicate the origin of the introductions reported for Brazil: since this country is not part of the center of origin of tomato, they most likely came from other countries such as Peru, Ecuador, and Mexico, among whose introductions those of Brazil are embedded (Figure 1).
These results show that the genetic diversity of the introductions reflects a good percentage of the total diversity of tomato S. lycopersicum var. cerasiforme, covering more than 90% and indicating the absence of duplicates within the collection; therefore, the introductions could be included in a core collection as a basis for further studies on the improvement of quality and yield. This indicates the high degree of genetic diversity of the introductions evaluated, which tend to cluster when the Dice (Nei and Li 1979) coefficient is 0.5.
Studies by Nakazato et al. (2008) detected no correlation between the coefficient of genetic differentiation and the geographic distance of the introductions of S. lycopersicum var. cerasiforme (r = 0.090, p = 0.388), but a significant association was identified in S. pimpinellifolium (r = 0.482, p = 0.002), indicating that the equilibrium model of gene flow is probably inappropriate for S. lycopersicum var. cerasiforme, possibly because gene flow between the countries of origin is independent of distance (e.g. due to anthropogenic dispersal) or because there has been insufficient time to reach equilibrium conditions in these species; this argument can explain the behavior found in the introductions evaluated here. Domingos (2011), evaluating S. esculentum var. cerasiforme, highlighted the greater genetic diversity of introductions from Mexico (HT = 0.32) compared to those from Ecuador (HT = 0.20), explaining that the result could be due to the origin of the introductions within each country. Thus, according to the author's evaluation, the 9 entries from Ecuador came from 3 different villages, while those from Mexico were from 8 different locations; however, the genetic diversity values obtained for S. esculentum were considered abnormally high and could be influenced by the small number of introductions included in the study. In our case, the introductions that reflected the greatest genetic diversity were from Peru, from 5 different provinces (San Martin, Arequipa, Apurimac, Cusco, and Molinopata). Bolivia, Cuba, and French Polynesia also showed high genetic diversity, although this can be attributed to the small number of individuals included in the evaluation for these countries and the genetic distance between them, which is consistent with Domingos (2011).
The clustering of the introductions did not follow a distribution pattern that reflects the geographical area of origin. This indicates the high degree of genetic diversity of the introductions evaluated, which tend to group when the Dice (Nei and Li 1979) coefficient is 0.5 (data not shown).
The analysis of molecular variance indicated a variation of 11% between the groups and of 89% within the groups (Table 3). The differences between groups were associated with the geographical distances between the origins of the introductions, and the differences within the groups may indicate genetic diversity originating from hybridization between plants of var. cerasiforme and tomato cultivars introduced in the same region, followed, after the initial crossing, by successive backcrosses to var. cerasiforme, which results in greater genetic diversity or variation within the groups (Table 2).
Fst values can range between 0 and 1; values close to 1 indicate that a majority of the total variation is distributed among populations, while values close to 0 indicate a dominant variation within the populations (Pérez de la Vega and García 2000).
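The two extremes described above can be reproduced with the textbook definition Fst = (HT − HS) / HT, where HS is the mean expected heterozygosity within subpopulations and HT that of the pooled population. The sketch assumes equally weighted subpopulations and hypothetical allele frequencies; it is not the estimator implemented in TFPGA or POPGENE, which the text does not describe.

```python
def heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i ** 2)."""
    return 1.0 - sum(p * p for p in freqs)


def fst(subpop_freqs):
    """Fst = (HT - HS) / HT for equally weighted subpopulations.

    subpop_freqs: one list of allele frequencies per subpopulation,
    with alleles in the same order in every list.
    """
    n = len(subpop_freqs)
    hs = sum(heterozygosity(f) for f in subpop_freqs) / n
    pooled = [sum(column) / n for column in zip(*subpop_freqs)]
    ht = heterozygosity(pooled)
    if ht == 0:
        raise ValueError("locus is monomorphic in the pooled population")
    return (ht - hs) / ht


# Hypothetical two-allele locus in two subpopulations
identical = fst([[0.5, 0.5], [0.5, 0.5]])  # all variation within groups
divergent = fst([[0.9, 0.1], [0.1, 0.9]])  # strong differentiation
fixed = fst([[1.0, 0.0], [0.0, 1.0]])      # all variation among groups
print(identical, divergent, fixed)
```

Identical subpopulations give 0, subpopulations fixed for different alleles give 1, and the intermediate case falls in between.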
An Fst value of 0.3474 indicates that, besides the great genetic differentiation found among the accessions, these were, in turn, more diverse within countries of origin than among groups. Rick (1958) gave evidence of marked cross-pollination levels, varying from one area to another, facilitated by variability in floral characters such as stigma exsertion; this author obtained cross-pollination values of 25.7 and 14.8% in Calana and Tacna, on the border with Chile. This has contributed not only to significant variability in cultivars of S. esculentum specific to this region, but has also allowed natural crosses between tomato and another sexually compatible species, S. pimpinellifolium, with introgression from the latter species having been detected in cultivated tomato (Rick and Fobes 1975).
These factors could explain the high variability found within countries in contrast to the variability between countries shown by the analysis of molecular variance (AMOVA).
Studies by Williams and Clair (1993) on S. esculentum and S. esculentum var. cerasiforme showed different levels of variability in different groups of cultivars according to their place of origin (higher variability among cultivars from South America compared to old cultivars and other improved ones). Rick and Holle (1990) suggested that some accessions from Brazil, Peru, Ecuador, and the Galapagos Islands originated by hybridization between plants of var. cerasiforme and tomato cultivars introduced in the same region, indicating that, after the initial crossing, successive backcrosses toward the cerasiforme variety native to these areas would account for the introgression of certain alleles responsible for morphological characters of the fruit. This helps to explain the contrast between the high genetic diversity generated within the countries of origin and that found among them. These results indicate that, for tomato pre-breeding assessments requiring highly variable accessions of cherry tomato, accessions can be taken individually from each country, covering the existing natural variability. The next step is to move to crop improvement, with the goal of producing hybrids based on traits of interest according to market requirements. Likewise, the results show the need for more accurate sampling to assess the genetic diversity present in tomato groups, that is, micro-geographical studies to explore more of the genetic potential of these materials.

CONCLUSION
Thirteen of the 36 markers evaluated were polymorphic; 90% of the polymorphic markers had polymorphic information content (PIC) values above 0.5, indicating a high level of polymorphism, which enabled the analysis of genetic diversity. The most informative markers were SSR47 and SSR26, with 8 alleles, followed by SSR19, SSR86, and SSR128, which showed 6 alleles each.
A coefficient of genetic differentiation was found (Fst = 0.3474), showing a high genetic differentiation of the introductions; those from Brazil, Ecuador, and Peru were the most genetically diverse, presenting 100% of polymorphic loci. The molecular variance analysis indicated a variation of 11% between the groups and 89% within them. The broad genotypic variability of the evaluated introductions favors the possibility of selecting introductions for genetic improvement and sustainable use of the species.

ACKNOWLEDGEMENTS
To the Plant Transformation Research Center (PTRC) of the University of California, the Tomato Genetics Resources Center of the University of California, Davis, Genebank and the National University of Colombia, Palmira (Unapal), as well as the Vice-Rector for Research and Graduate Studies of the University of Caldas.

Figure 1. Evaluation of the genetic diversity of cherry tomato through the high resolution melting technique. Dendrogram of 31 introductions based on the Nei and Li (1979) coefficient.

Table 1. Introductions of cherry tomato assessed for genetic diversity using microsatellite markers with the high resolution melting technique.

Table 2. Evaluation of the genetic diversity of cherry tomato through the high resolution melting technique.
**α = 0.05; 10,000 permutations; na and ne = Number of observed and effective alleles, respectively, according to Kimura and Crow (1964); Ho and He = Observed and expected heterozygosity, respectively, calculated based on Levene (1949); PIC = Polymorphic information content; Fst = Population structure according to Wright (1978).

Table 3. Analysis of molecular variance (AMOVA) in assessing genetic diversity of cherry tomato using microsatellites through the high resolution melting technique.