Molecular characterization of common bean accessions using microsatellite markers

ABSTRACT The common bean, a legume of significant economic importance, is renowned for its extensive genetic variability. It is crucial to comprehend genetic diversity, analyze population structure, and understand relationships among commercial classes of accessions to facilitate genetic improvement. This study aimed to molecularly characterize 143 common bean accessions by employing 25 SSR molecular markers. The objectives were to estimate genetic diversity, analyze genetic structure, and cluster populations using the UPGMA and PCoA methods. A total of 105 alleles were amplified using microsatellite loci, and the observed heterozygosity was lower than expected across all loci, indicating inbreeding within the populations. Among the loci, 22 were highly informative, demonstrating their effectiveness and polymorphism in detecting genetic diversity. The genetic variability within the population was found to be the highest, while variation between populations was the lowest. The analysis of population structure revealed the presence of three populations with a notable rate of gene introgression. The UPGMA analysis categorized the accessions into 15 groups, but they did not form distinct clusters based on their geographic regions or gene pool. The first two principal coordinates accounted for 13.95% of the total variation among the accessions. The SSR markers employed effectively detected genetic variability among the common bean accessions, revealing that their genetic diversity was not correlated with their geographic distribution in this study.


INTRODUCTION
The common bean (Phaseolus vulgaris L.) is indigenous to Central and South America, and its domestication occurred independently in the Andean and Mesoamerican regions, resulting in two distinct gene pools (Voysest, 2000; Gioia et al., 2013; Shamseldin; Velázquez, 2020). Among the legumes belonging to the Phaseolus genus, the common bean holds significant global importance. In Brazil, it is a staple in daily diets and a primary source of vegetable protein (Pitura; Arntfield, 2019; Melo; Paixão, 2020).
This crop is renowned for its morphological variability and adaptability to diverse environments, resulting in a wide range of local varieties or landraces, each with distinct seed properties (Perina et al., 2014; Pereira et al., 2019). It is essential to characterize the genetic variability of common bean accessions from Germplasm Banks and working collections. Such studies facilitate understanding of the genetic relationships among accessions, identification of redundancies and mixtures in the germplasm, and determination of optimal genetic distances between pairs of parents (Coelho et al., 2010; Delfini et al., 2021).
Molecular markers are valuable tools for characterizing species, especially when morphologically similar individuals with agronomic potential need to be distinguished (El Kadri; Mimoun; Hormanza, 2019). Molecular markers, including microsatellites or simple sequence repeats (SSRs), have become the markers of choice for studying genetic diversity, species identification, phylogenetic analysis, gene mapping, and classification of genetic resources in various crops, including the common bean (Alghamdi et al., 2019).
SSR markers are frequently employed in common bean studies due to their abundant and relatively uniform distribution in the genome, codominant inheritance, high polymorphism and reproducibility, and ease of analysis and comparison across studies and germplasm sets (Gioia et al., 2019). Previous studies conducted by Šajgalik et al. (2019), Mir et al. (2021), Oliveira et al. (2021), and Catarcione et al. (2023) have demonstrated that SSR markers are an effective genetic tool for assessing genetic diversity and subpopulation structure in common bean germplasm collections. Therefore, the objective of this study was to estimate the genetic diversity of common bean accessions using microsatellite markers.

MATERIAL AND METHODS
The research was carried out at the Laboratory of Genetic Resources and Biotechnology (LRG&B), located on the Cáceres campus (Mato Grosso State) of the State University of Mato Grosso "Carlos Alberto Reyes Maldonado" (UNEMAT), at 16°05'08.1" S latitude and 57°39'00.3" W longitude.
A comprehensive assessment was conducted on a total of 143 accessions of the common bean sourced from the germplasm collection of the LRG&B (Table 1).
The common bean accessions were planted in trays filled with commercial substrate in the LRG&B greenhouse. Once the seedlings reached the V2 stage, characterized by the partial opening of the first trifoliate leaves, leaf samples were collected into 2 mL microtubes. The collected plant samples were then taken to the LRG&B and stored in an ultra-freezer at -80 °C.
DNA extraction was conducted using the Promega Wizard® Genomic DNA Purification kit following the manufacturer's guidelines. The trifoliate leaves of the common bean were retrieved from the microtubes, and two beads were added to each tube. Approximately 600 µL of "Nuclei Lysis Solution" was then introduced into each microtube, and the tissue was macerated for 15 min using the TissueLyser.
After the maceration period, the beads were removed from each microtube, and the tubes were incubated in a water bath at 65 °C for 15 min. Subsequently, 3 µL of RNase was added to each microtube, the contents were mixed gently by inverting the tube five times, and the microtubes were further incubated in a water bath at 37 °C for 15 min. Following this, the tubes were allowed to cool for 5 min at room temperature, and 200 µL of "Protein Precipitation Solution" was added. The microtubes were then vortexed at high speed for 20 s.
The microtubes were placed in a centrifuge at -4 °C and 14,000 rpm for 5 min, and the resulting supernatant was transferred to new 1.5 mL microtubes containing 600 µL of isopropanol. The solution was mixed by inversion at room temperature.
Subsequently, the microtubes were centrifuged again at 22 °C and 14,000 rpm for 5 min, and the supernatant was discarded. Then, 600 µL of 70% ethanol was added to the precipitate, and the microtubes were centrifuged at room temperature for 3 min at 14,000 rpm. The supernatant was removed by decanting, leaving only the pellet attached to the microtube wall. The microtubes were placed on absorbent paper for 15 min to dry the precipitate.
DNA was extracted from all common bean accessions, and the precipitated DNA was resuspended in autoclaved ultrapure water and refrigerated overnight for future use. DNA quality was assessed by electrophoresis on a 1.0% agarose gel stained with Gel Red and Blue Juice 6X, with 1% Tris borate EDTA used as the running buffer. The gel was run at 60 V for 1 h to confirm that high-quality DNA had been extracted.
To assess the genetic variability of common bean accessions from the collection, a set of 25 primers (Table 2) was employed for amplifying the SSR loci, as previously established by Valentini et al. (2018). DNA samples for the PCR were diluted to a concentration of 10 ng µL⁻¹ using autoclaved ultrapure water. The PCR mixture was prepared with a final reaction volume of 25 µL, comprising 2 µL of DNA (10 ng µL⁻¹), 0.5 µL of deoxyribonucleotide mix (dATP, dCTP, dGTP, and dTTP) (10 mM), 1.25 µL of each forward and reverse primer (10 µM), 5 µL of buffer (5X) containing magnesium (7.5 mM), 0.2 µL of Taq polymerase (5 U), and 14.8 µL of autoclaved ultrapure water.
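As a quick consistency check on the PCR mixture, the listed component volumes can be summed and the final concentrations derived with C1·V1 = C2·V2. The sketch below is illustrative only: the dictionary keys are labels chosen here, and it assumes that the stated concentrations (10 mM, 10 µM, 7.5 mM) refer to the stock solutions rather than to the final reaction.

```python
# Component volumes (µL) exactly as listed in the text.
# Keys are illustrative labels, not names used by the authors.
PCR_MIX_UL = {
    "dna_template": 2.0,
    "dntp_mix": 0.5,
    "forward_primer": 1.25,
    "reverse_primer": 1.25,
    "buffer_5x_with_mg": 5.0,
    "taq_polymerase": 0.2,
    "ultrapure_water": 14.8,
}


def total_volume(mix):
    """Total reaction volume in µL, rounded to suppress float noise."""
    return round(sum(mix.values()), 2)


def final_concentration(stock_conc, stock_volume_ul, reaction_volume_ul):
    """C2 = C1 * V1 / V2, in the same unit as stock_conc."""
    return stock_conc * stock_volume_ul / reaction_volume_ul


reaction_ul = total_volume(PCR_MIX_UL)

# Assumption: the concentrations in the text are stock concentrations.
dna_ng = 10 * PCR_MIX_UL["dna_template"]                                  # ng of template
primer_um = final_concentration(10, PCR_MIX_UL["forward_primer"], reaction_ul)
dntp_mm = final_concentration(10, PCR_MIX_UL["dntp_mix"], reaction_ul)
mg_mm = final_concentration(7.5, PCR_MIX_UL["buffer_5x_with_mg"], reaction_ul)

print(reaction_ul, dna_ng, primer_um, dntp_mm, mg_mm)
```

Under these assumptions the volumes add up to the stated 25 µL, and each reaction would contain 20 ng of template DNA with each primer at 0.5 µM.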
The polymerase chain reaction (PCR) was conducted following the methodologies outlined by Williams et al. (1990), Gaitán-Solís et al. (2002), and Blair et al. (2003), using a Perkin Elmer Model 9600 thermocycler. For primers 1 to 18, the program consisted of an initial denaturation step at 94 °C for five minutes, followed by 35 cycles of denaturation at 94 °C for 30 s, annealing at the specific primer temperature for 30 s, and extension at 72 °C for 30 s. The final extension phase lasted five minutes at 72 °C.
For primers 19 to 23, the program began with an initial denaturation step at 92 °C for five minutes, followed by 30 cycles of denaturation at 92 °C for one minute, annealing at the specific primer temperature for one minute, and extension at 72 °C for two minutes. The final extension phase was conducted at 72 °C for five minutes. Primers 24 and 25 followed a program consisting of an initial denaturation step at 94 °C for two minutes, followed by 35 cycles of denaturation at 94 °C for 15 s, annealing at the specific primer temperature for 15 s, and extension at 65 °C for 15 s. The final extension phase lasted 15 s at 72 °C.
The amplified PCR products were visualized on a 3% agarose gel stained with 6X Gel Red and Blue Juice. Electrophoresis was conducted using 1% Tris borate EDTA as the running buffer, and the gel was run at 60 V for 5 h. The gels were photographed using the Locus Biotechnology/LPix Image photo-documentation system, version 2.7. The height of each allele (band) was measured using the LabImage 1D Revision 1.10 program (Locus Biotechnology), and DNA bands were compared to a 1000 base pair (bp) standard ladder.
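The three thermal cycling programs described in this section can be written down as a small data structure, which makes the differences between primer groups easy to compare. This is only a sketch: the key names are invented for illustration, annealing temperatures are primer-specific (Table 2) and are therefore left unset, and the computed durations cover programmed hold times only, ignoring ramping between temperatures.

```python
# Cycling programs transcribed from the text. Temperatures in °C, times in seconds.
# Annealing temperature is primer-specific, so it is recorded as None.
PROGRAMS = {
    "primers_1_to_18": {
        "initial": (94, 300),
        "cycles": 35,
        "steps": [("denature", 94, 30), ("anneal", None, 30), ("extend", 72, 30)],
        "final": (72, 300),
    },
    "primers_19_to_23": {
        "initial": (92, 300),
        "cycles": 30,
        "steps": [("denature", 92, 60), ("anneal", None, 60), ("extend", 72, 120)],
        "final": (72, 300),
    },
    "primers_24_and_25": {
        "initial": (94, 120),
        "cycles": 35,
        "steps": [("denature", 94, 15), ("anneal", None, 15), ("extend", 65, 15)],
        "final": (72, 15),
    },
}


def programmed_hold_minutes(program):
    """Sum of all programmed hold times in minutes (ramp times are not included)."""
    per_cycle_s = sum(seconds for _name, _temp, seconds in program["steps"])
    total_s = program["initial"][1] + program["cycles"] * per_cycle_s + program["final"][1]
    return total_s / 60


for name, program in PROGRAMS.items():
    print(name, programmed_hold_minutes(program))
```

With these settings the programmed hold times are 62.5 min, 130 min, and 28.5 min for the three primer groups, respectively; actual run times on a thermocycler would be longer because of ramping.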
Genetic diversity was estimated based on various parameters, including the number of alleles per locus (Na), observed (Ho) and expected (He) average heterozygosity, inbreeding coefficient (f), and polymorphic information content (PIC), following the methods of Weir and Cockerham (1984). These data were analyzed using the PowerMarker program, version 3.25 (Liu; Muse, 2005), and used to calculate the genetic distance matrix among the 143 accessions, following the approach of Nei, Tajima and Tateno (1983). The genetic distance matrix was then imported into the Genes program (Cruz, 2016) to construct a dendrogram using the Unweighted Pair Group Method with Arithmetic Average (UPGMA).
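For readers unfamiliar with the distance of Nei, Tajima and Tateno (1983), usually written D_A, the sketch below computes it between two accessions as D_A = 1 - (1/r) Σ_loci Σ_alleles sqrt(x · y), where x and y are the allele frequencies in each accession and r is the number of loci. It is a minimal illustration, not the PowerMarker implementation: the locus names and allele sizes are hypothetical, and it assumes diploid genotypes with within-accession allele frequencies of 1.0 (homozygote) or 0.5 (heterozygote).

```python
from collections import Counter
from math import sqrt


def allele_frequencies(genotype):
    """Allele frequencies within one accession at one locus, e.g. (150, 160) -> 0.5 each."""
    counts = Counter(genotype)
    n = len(genotype)
    return {allele: count / n for allele, count in counts.items()}


def nei_da_distance(acc_x, acc_y):
    """D_A distance over the loci genotyped in both accessions."""
    loci = [locus for locus in acc_x if locus in acc_y]
    if not loci:
        raise ValueError("the two accessions share no genotyped loci")
    shared = 0.0
    for locus in loci:
        fx = allele_frequencies(acc_x[locus])
        fy = allele_frequencies(acc_y[locus])
        shared += sum(sqrt(fx[a] * fy[a]) for a in fx if a in fy)
    return 1.0 - shared / len(loci)


# Hypothetical genotypes (allele sizes in bp) at two made-up loci.
acc_a = {"locus_1": (150, 150), "locus_2": (200, 200)}
acc_b = {"locus_1": (150, 150), "locus_2": (210, 210)}
acc_c = {"locus_1": (150, 160), "locus_2": (200, 200)}

print(nei_da_distance(acc_a, acc_b))
print(nei_da_distance(acc_a, acc_c))
```

In this toy case, accessions a and b share alleles at one of two loci, so their distance is 0.5, while a and c differ only by one heterozygous locus and are much closer.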
To analyze the distribution of genetic diversity between and within populations, the analysis of molecular variance (AMOVA) was performed, as described by Excoffier, Smouse and Quattro (1992). The significance of the differences between means was tested using 1,000 permutations with a 95% confidence interval. Additionally, genetic diversity was evaluated using principal coordinate analysis (PCoA) at the accession level, employing the GenAlEx 6.5 software (Peakall; Smouse, 2012).
The Structure program (Pritchard; Stephens; Donnelly, 2000), employing Bayesian statistics, was utilized to estimate the number of groups (K). The analyses involved 20 runs for each K value, with a burn-in period of 5,000 and 100,000 Markov Chain Monte Carlo (MCMC) simulations. The criteria proposed by Evanno, Regnaut and Goudet (2005) were applied to determine the most probable value of K based on the groups defined by the program.
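The ΔK criterion of Evanno, Regnaut and Goudet (2005) can be sketched as follows: for each K, the absolute second-order change in the mean log-likelihood, |L(K+1) - 2L(K) + L(K-1)|, is divided by the standard deviation of L(K) across runs, and the K with the largest ΔK is taken as the most probable. The version below works from the mean log-likelihood per K, a common simplification; the log-likelihood values are invented purely for illustration (three runs per K instead of the 20 used in the study) and are not results from this work.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both K-1 and K+1 available.

    lnp_by_k maps each K to the list of ln P(D) values obtained across runs.
    """
    ks = sorted(lnp_by_k)
    means = {k: mean(lnp_by_k[k]) for k in ks}
    delta = {}
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        if k - prev_k != 1 or next_k - k != 1:
            continue  # delta K needs consecutive K values
        second_diff = abs(means[next_k] - 2 * means[k] + means[prev_k])
        sd = stdev(lnp_by_k[k])
        delta[k] = second_diff / sd if sd > 0 else float("inf")
    return delta


# Hypothetical ln P(D) values, made up for this example only.
toy_runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4600.0, -4604.0, -4596.0],
    3: [-4300.0, -4302.0, -4298.0],
    4: [-4280.0, -4290.0, -4270.0],
    5: [-4270.0, -4300.0, -4240.0],
}

delta_k = evanno_delta_k(toy_runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

Note that ΔK cannot be evaluated for the smallest and largest K tested, which is why the range of K explored should extend beyond the expected number of groups.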

RESULTS AND DISCUSSION
Genotyping the 143 common bean accessions with 25 SSR primers amplified a total of 105 alleles, with an average of 4.20 alleles per primer. The maximum number of alleles per primer was seven (PV-cct001 and PV-BR185), while the minimum was three (BM143, BM187, PV-BR025, PV-BR60, PV-BR112, PV-BR167, SSR-IAC14, and BMd-20). Each evaluated accession demonstrated the presence of at least three alleles when assessed with all primers (Table 3). The observed heterozygosity was consistently lower than expected across all loci, ranging from 0.0000 to 0.0500, with a mean of 0.0026. In contrast, the expected heterozygosity ranged from 0.2677 to 0.7695, with a mean of 0.6406. These high expected heterozygosity values, coupled with low observed heterozygosity values, suggest substantial genetic diversity among the accessions and a notable inbreeding rate, as indicated by the elevated fixation indices estimated for each analyzed locus.
Gioia et al. (2019) assessed the degree of genetic variance and relatedness among 192 advanced cultivars of common bean, using 58 SSR markers. Their findings indicated that the expected heterozygosity exceeded the observed heterozygosity. In contrast, Savić et al. (2021) examined the genetic diversity of Serbia's common bean germplasm, employing 27 SSR markers. They found that for 17 primers the observed heterozygosity surpassed the expected heterozygosity, suggesting a surplus of heterozygotes within the population.
An apparent excess of homozygotes was observed among the common bean populations, as evidenced by the positive fixation coefficient across all loci, with an average of 0.9960. This statistic points to the occurrence of inbreeding within the populations. These findings align with those of Catarcione et al. (2023), who investigated the genetic diversity and population structure of common bean landraces in Italy's Lazio region and observed positive values ranging from 0.951 to 1.
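The link between the heterozygosity values and the fixation coefficient can be illustrated with the textbook relation f = 1 - Ho/He. The sketch below applies it to the mean values reported above; it is a simplified check, not the Weir and Cockerham (1984) estimator used in the analysis, and the function name is chosen here for illustration.

```python
def fixation_index(observed_het, expected_het):
    """Simple fixation index f = 1 - Ho / He (positive values indicate a heterozygote deficit)."""
    if expected_het <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - observed_het / expected_het


# Mean Ho and He reported in the text.
f_from_means = fixation_index(0.0026, 0.6406)
print(round(f_from_means, 4))
```

The value obtained from the means (about 0.996) is in line with the reported average of 0.9960; a small difference is expected because an average of per-locus estimates is not identical to an estimate computed from averaged heterozygosities.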
The PIC value varied between 0.2418 (PV-at006) and 0.7334 (PV-cct001), with an average of 0.5821. Out of the loci evaluated, 22 were deemed highly informative, whereas two were identified as moderately and slightly informative. Overall, the SSR loci under evaluation proved to be efficient and polymorphic, reflecting the genetic diversity within the common bean accessions. The PIC value depends on the number of alleles, which in turn is directly linked to the genetic divergence and the number of accessions evaluated in the study.
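To make the PIC values easier to interpret, the sketch below computes the uncorrected expected heterozygosity, He = 1 - Σp², and the PIC of Botstein et al. (1980), PIC = 1 - Σp² - Σ_{i<j} 2p_i²p_j², from a list of allele frequencies. The informativeness classes (above 0.5 highly, 0.25 to 0.5 moderately, below 0.25 slightly informative) follow the commonly used Botstein thresholds; that the present study applied exactly these cut-offs is an assumption, and the allele frequencies in the example are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Uncorrected gene diversity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphic information content (Botstein et al., 1980)."""
    n = len(freqs)
    sum_sq = sum(p * p for p in freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1.0 - sum_sq - cross


def informativeness(pic_value):
    """Classify a PIC value using the commonly cited thresholds (assumed here)."""
    if pic_value > 0.5:
        return "highly informative"
    if pic_value >= 0.25:
        return "moderately informative"
    return "slightly informative"


# Hypothetical locus with four equally frequent alleles.
toy_freqs = [0.25, 0.25, 0.25, 0.25]
print(expected_heterozygosity(toy_freqs), pic(toy_freqs))

# Classification of the extreme and mean PIC values reported in the text.
for value in (0.2418, 0.5821, 0.7334):
    print(value, informativeness(value))
```

Because the cross-product term is never negative, PIC is always less than or equal to He at a given locus, which matches the pattern in the reported means (0.5821 versus 0.6406).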
Comparable findings were reported by Vidak et al. (2021), who investigated the origin and diversity of the Croatian common bean germplasm. They employed phaseolin type, SSR and SNP markers, along with morphological traits, and reported PIC values spanning from 0.310 to 0.862, with an average of 0.497. These results are also consistent with those of Catarcione et al. (2023), who observed values ranging from 0.263 to 0.868, with an average of 0.572.
To measure the variation both between and within populations, AMOVA was undertaken. Most of the variability, accounting for 84% of the total, was detected within populations, whereas the smaller share, contributing 16% of the total, was found between populations. Additionally, the FST index, a measure of genetic differentiation between populations, was determined to be low (FST = 0.157), indicative of a relatively low degree of population structure (Table 4).
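The reported percentages and the FST value are two views of the same partition. In a single-level AMOVA, the differentiation statistic (strictly Φ_ST, reported here as FST) is the among-population share of the total variance; the symbols σ²_a (among populations), σ²_w (within populations) and σ²_T (total) are notation introduced here for illustration:

```latex
\Phi_{ST} = \frac{\sigma^2_{a}}{\sigma^2_{a} + \sigma^2_{w}} = \frac{\sigma^2_{a}}{\sigma^2_{T}},
\qquad
\frac{\sigma^2_{a}}{\sigma^2_{T}} \times 100 = 0.157 \times 100 \approx 16\%,
\qquad
\frac{\sigma^2_{w}}{\sigma^2_{T}} \times 100 = (1 - 0.157) \times 100 \approx 84\%
```

Under this reading, FST = 0.157 corresponds directly to the 16% of variation found between populations and the 84% found within them.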
These results are consistent with the relatively closed reproductive system of this species. Similar results were reported by Özkan et al. (2022) in a study of the genetic diversity and population structure of common bean landraces in Turkey using SSR markers, in which AMOVA revealed that the within-population variance (66%) was higher than the between-population variance (34%).
The Bayesian model-based clustering method was employed to evaluate the population structure of 143 common bean accessions. The number of subpopulations (known as the K value) was identified based on maximum likelihood and Delta K values. This approach divided the accessions into three distinct groups (K = 3) (Figure 1). Group I contained several Andean accessions, while Group III incorporated several other Mesoamerican accessions. Group II was composed of a mixture of gene pools.
Corroborating results were reported by Delfini et al. (2021), who explored the population structure, genetic diversity, and genomic selection within the Brazilian common bean germplasm. Their observations revealed a tripartite division of accessions (K = 3). Conversely, Vidak et al. (2021) studied the origin and diversity of Croatian common bean germplasm using phaseolin types, SSR and SNP markers, and morphological traits. Their population structure analysis yielded K values of 2 and 3 for the different molecular markers, respectively.
Introgression is indicated for accessions that are depicted by more than one color within the same bar of the plot (Figure 1). Among the 88 accessions evaluated for introgression, the most pronounced intensity was noted for accession 16 (Peruano) in Group II. Moreover, Mesoamerican accessions exhibited the highest rate of introgression.
As Blair et al. (2013) highlight, the notion of pure gene pools for the common bean germplasm from Brazil is not necessarily accurate, given that many accessions result from introgression. Thus, certain Brazilian varieties (jalo and carioca) cannot be considered representatives of a single gene pool. Almeida et al. (2020) assessed genetic diversity, population structure, and Andean introgression in common bean cultivars, noting the presence of introgression among evaluated accessions following population structure analysis. Here, Mesoamerican accessions displayed the highest rate of introgression relative to their Andean counterparts.
Hierarchical clustering analysis, employing the UPGMA method, unveiled the formation of 15 groups based on a similarity criterion of 78% (Figure 2). Group I was the largest, comprising 50 accessions from states such as Acre, Mato Grosso, Mato Grosso do Sul, Pará, Paraná, and Rio Grande do Sul. Within this group, 41 were Mesoamerican and 9 were Andean. This was followed by Group V, housing 25 accessions from Acre, Amazonas, Mato Grosso, Mato Grosso do Sul, and São Paulo populations, among which 17 were Andean and 8 Mesoamerican, as well as Group IV, with 14 accessions from Mato Grosso and Mato Grosso do Sul, including 5 Andean and 9 Mesoamerican.
Groups IX and XII each consisted of 10 accessions, with those of IX coming from Mato Grosso and Tocantins, and those of XII from Acre and Mato Grosso. All accessions of Group IX were Mesoamerican, while in Group XII, only one was Andean, with the rest being Mesoamerican. Group II encompassed 9 accessions from Mato Grosso do Sul and Paraná, 6 of which were Mesoamerican and 3 Andean, whereas the accessions within Group XI hailed from Mato Grosso and Mato Grosso do Sul populations, including 4 Andean and 2 Mesoamerican.
Groups X and XIV each contained 5 accessions. Those from Group X originated from Acre, Mato Grosso, and Santa Catarina, while those from Group XIV were from Mato Grosso and Mato Grosso do Sul. In both groups, 4 accessions were Mesoamerican, with just 1 being Andean. Groups VI and VIII comprised 4 accessions each, deriving from Goiás and Mato Grosso do Sul (all Mesoamerican), and Mato Grosso (1 Andean and 3 Mesoamerican), respectively. Of the 15 groups established, 3 were each made up of 3 accessions (III, VII, and XIII). Accessions within groups III and VII originated from Goiás and Tocantins populations and were all Mesoamerican. In contrast, those in Group XIII were from the states of Mato Grosso, Mato Grosso do Sul, and Santa Catarina, consisting of 1 Andean and 2 Mesoamerican accessions.
Group XV alone comprised a single Mesoamerican accession from Mato Grosso, suggesting its distinct divergence from the other evaluated groups. The formation of a group containing only a single accession implies greater divergence from the others, offering potential avenues for improvement programs.
It was noted that accessions did not cluster specifically based on their geographical regions of acquisition/collection or their gene pool. This observation aligns with the findings reported by Özkan et al. (2022), who found no significant correlation between geographical origin and genetic similarity in their cluster analysis of genotypes, indicating a recent introduction from a common source population.
Ekbiç and Hasancaoğlu (2019) carried out the morphological and molecular characterization of 33 local common bean genotypes from the Ordu province in Turkey using SSR markers. Their UPGMA cluster analysis, based on geography (city), revealed geographical separation of genotypes due to local bean producers typically producing their own seeds, with seeds from a single locality being supplied to various local markets.
The fit of the clustering was assessed with the cophenetic correlation coefficient (CCC), tested by the t-test. The UPGMA hierarchical clustering method yielded a significant value (r ≥ 0.66; P ≤ 0.01); the distortion of 1.86% was deemed minimal, and the stress level attained (13.65%) suggested a high degree of accuracy in the graphic projection of genetic distance among populations. These results established that the generated dendrogram suitably mirrored the genetic dissimilarity matrix, validating the statistical methods used and illustrating the dissimilarity among the studied common bean populations using SSR markers.
The CCC is used to assess the reliability of the generated clusters. Vaz Patto et al. (2004) posited that values exceeding 0.56 can be deemed ideal. Santos et al. (2022) evaluated genetic diversity among local varieties and common bean cultivars, observing a significant CCC value of 0.77 (P ≤ 0.01) based on the t-test, indicating that the dendrogram suitably mirrored the genetic dissimilarity matrix.
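The idea behind the CCC can be sketched in a few lines: UPGMA is run on the distance matrix, the height at which each pair of accessions is first joined gives the cophenetic distance, and the CCC is the Pearson correlation between the original and cophenetic distances. The code below is a plain-Python illustration on a hypothetical four-accession matrix; it is not the procedure implemented in the Genes program and does not reproduce the distortion or stress statistics.

```python
from itertools import combinations
from math import sqrt


def upgma_cophenetic(dist):
    """Run UPGMA on a symmetric distance matrix and return the cophenetic matrix."""
    n = len(dist)
    clusters = {i: [i] for i in range(n)}  # cluster id -> member accessions
    d = {(i, j): dist[i][j] for i, j in combinations(range(n), 2)}  # keys have i < j
    coph = [[0.0] * n for _ in range(n)]
    next_id = n
    while len(clusters) > 1:
        (a, b), height = min(d.items(), key=lambda item: item[1])
        # Every pair split between the two merged clusters joins at this height.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        size_a, size_b = len(clusters[a]), len(clusters[b])
        new_d = {}
        for c in clusters:
            if c in (a, b):
                continue
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            # Size-weighted average keeps the "unweighted" (per-accession) mean distance.
            new_d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        d = {key: value for key, value in d.items() if a not in key and b not in key}
        d.update(new_d)
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return coph


def cophenetic_correlation(dist, coph):
    """Pearson correlation between original and cophenetic pairwise distances."""
    pairs = list(combinations(range(len(dist)), 2))
    x = [dist[i][j] for i, j in pairs]
    y = [coph[i][j] for i, j in pairs]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    s_xx = sum((u - mean_x) ** 2 for u in x)
    s_yy = sum((v - mean_y) ** 2 for v in y)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical genetic distances among four accessions (illustration only).
toy_dist = [
    [0.0, 0.2, 0.6, 0.7],
    [0.2, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.3],
    [0.7, 0.8, 0.3, 0.0],
]

toy_coph = upgma_cophenetic(toy_dist)
ccc = cophenetic_correlation(toy_dist, toy_coph)
print(toy_coph, ccc)
```

A CCC close to 1 means the dendrogram preserves the pairwise distances well; lower values indicate that the tree distorts the original dissimilarities.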
Principal coordinate analysis (PCoA) was used to visualize the spatial distribution of the 11 populations and of the 143 common bean accessions within those populations. The first two coordinates accounted for 13.95% of the total variation among accessions, with dimensions 1 and 2 accounting for 8.18% and 5.77%, respectively (Figure 3).
Accessions from Paraná, Rio Grande do Sul, Santa Catarina, São Paulo, and Tocantins populations grouped together, signifying population homogeneity, while other populations appeared separately on the graph. This suggests internal divergence within the population itself, as some accessions lie at the top and others at the bottom of the axis, implying considerable genetic variation within populations. These results echo the findings of Gyang, Muge, and Nyaboga (2020), who investigated the genetic diversity and population structure of common bean germplasm in Kenya. They found that genotypes did not cluster into distinct groups, providing evidence of population intermingling in the scatter plot of the principal coordinate analysis (PCoA).
The PCoA results supported the genetic variability inferred from AMOVA and the UPGMA clustering analysis, illustrating a blend of populations. Furthermore, the populations Acre, Goiás, Mato Grosso, Mato Grosso do Sul, and Pará grouped together based on both gene pools (Andean and Mesoamerican), hinting at the possibility of these accessions originating from other geographical areas.

CONCLUSIONS
The examined common bean accessions displayed considerable genetic variability, which supports their use in crop improvement programs. The SSR molecular markers used in this study were effective in quantifying genetic variability; hence, they can serve as complementary tools for a comprehensive understanding of diversity among accessions from the BAG families and working collections.

Figure 1: Population structure of 143 common bean accessions evaluated based on 25 SSR molecular markers (K = 3). Each vertical bar represents an accession and its percentage of membership in each group.

Figure 2: Dendrogram for 143 common bean accessions obtained by the UPGMA hierarchical clustering method using SSR markers.

Figure 3: Analysis of principal coordinates obtained using 25 SSR markers in 143 common bean accessions.

Table 1: Identification of the 143 accessions of common bean from the germplasm collection of LRG&B-UNEMAT.

Table 2: SSR primers and sequence information used for genetic diversity analysis of common bean accessions.

Table 3: Genetic diversity of 143 common bean accessions assessed using 25 microsatellite markers.
Na = number of alleles; He = expected heterozygosity; Ho = observed heterozygosity; f = fixation index (inbreeding); PIC = polymorphic information content.

Table 4: Analysis of molecular variance (AMOVA) of 11 common bean populations using 25 microsatellite markers.
DF = degrees of freedom; SQ = sum of squares; CV = variance components; VT = total variance; P = probability of obtaining, at random, a variance component greater than the observed value. Probabilities were calculated based on 1,000 random permutations.