Genetic structure analysis of Mauritia flexuosa natural population from the Lençóis Maranhenses region using microsatellite markers

Buriti (Mauritia flexuosa L.f.), a palm tree native to South America and widely distributed in Brazil, has significant ecological, economic and biotechnological importance. However, disorderly extractivism and environmental degradation of the areas where it occurs are leading to reductions in, and/or extinction of, buriti populations, causing ecological imbalance, significant economic losses for rural communities and losses of genetic diversity. Consequently, population genetic diversity studies have become relevant as a strategy for conserving the species. This study therefore evaluated the genetic structure of 10 populations from the Lençóis Maranhenses region, in the state of Maranhão, Brazil, using microsatellite markers. Eight primer pairs displayed a high level of polymorphism in the populations evaluated (98.5 %). The genetic diversity estimates identified 220 alleles (average of 9.5 alleles per locus), and the mean observed heterozygosity (Ho) was lower than the mean expected heterozygosity (He) at both the population (0.16 and 0.64, respectively) and locus (0.15 and 0.65, respectively) levels. The mean Shannon Index (I) of 1.36 indicated high diversity and genotypic richness in the populations evaluated, while the fixation index among populations was low (FST = 0.05) and the fixation index pertaining to individuals was high (FIS = 0.79), showing a moderate population structure and pointing to greater diversity between individuals. Based on these results, the populations denominated PA and PB were the most genetically similar (0.219), while the populations denominated PF and PJ were the most genetically distant (0.519). These results can support the prioritization of conservation of this non-domesticated species, which is influenced by environmental characteristics, and suggest that the genetic diversity found should be conserved in a germplasm bank and subsequently exploited in breeding programs.


Introduction
Buriti (Mauritia flexuosa L.f.) is an undomesticated palm tree widely spread across the Amazon Forest and the Brazilian cerrado, where it plays a relevant economic and ecological role (Virapongse et al., 2017). Its fruit is used extensively in the production of candy and ice cream, while the oil is used in cooking and serves as raw material for the cosmetics industry (Freire et al., 2016; Milanez et al., 2016; Speranza et al., 2016). However, agribusiness expansion, unsustainable extractivism and tourism ventures are leading to a decrease in buriti populations, promoting environmental imbalance, affecting rural family income and causing losses of genetic diversity within this species (Ritter et al., 2017; Virapongse et al., 2017; Saura et al., 2018).
Genetic studies of populations, habitat fragmentation, and genetic diversity based on molecular markers reveal a loss of genetic diversity in native plant species both in the Amazon region (Gomes et al., 2011; Federman et al., 2012) and in the state of Mato Grosso (Rossi et al., 2014).
Therefore, data on M. flexuosa from different regions are relevant for understanding habitat fragmentation, natural isolation and the genetic diversity of its populations, considering that it is an undomesticated species with high agronomic and biotechnological potential (Freire et al., 2016; Milanez et al., 2016).
In this context, microsatellite markers (SSR, Simple Sequence Repeats) allow for efficient identification of polymorphisms for the genetic mapping and population analysis of species of economic, ecological and biotechnological interest (Soldati et al., 2014; Neri et al., 2015).
In the Lençóis Maranhenses region, buriti is an important source of income for rural communities due to the production of fiber and the extractivism of fruit. However, this region is subject to significant anthropogenic action promoting environmental degradation that impacts the sustainability of its genetic diversity (Mendes et al., 2017). Given the ecological and economic potential of M. flexuosa, studies are conducted to formulate strategies for environmental preservation, genetic conservation, and sustainable use of this species. Thus, this study aimed to analyze the diversity and genetic structure of populations of M. flexuosa from the Lençóis Maranhenses region (MA, Brazil) using microsatellite markers.

Plant materials
One hundred plants from ten M. flexuosa populations were selected from the Lençóis Maranhenses region (Figure 1). Each population was distributed in an area of approximately 40 km², and their geographic coordinates are presented in Table 1.

DNA extraction
Genomic DNA was extracted from 200 mg of young leaves macerated in liquid nitrogen, using a modified CTAB method described by Romano and Brasileiro (2003). DNA concentration was estimated by spectrophotometry and its integrity analyzed by electrophoresis in 1 % (w/v) agarose gel stained with ethidium bromide (Sambrook et al., 1989).

DNA amplification by PCR-SSR
The genetic diversity of M. flexuosa was analyzed with 13 pairs of SSR markers described by Federman et al. (2012), of which eight pairs were selected and used in the intra- and interpopulation analyses. Each amplification reaction was performed under the conditions described by Federman et al. (2012), with adaptations according to the manufacturer's instructions. Briefly, a reaction volume of 20 µL contained: 1X of buffer (10X); 0.2 mM of each dNTP (2.5 mM); 1.5 mM of MgCl2 (50 mM); 0.5 µM of primer (25 pmol); 2.5 units of Taq DNA Polymerase (5 U L-1); approximately 30 ng of template DNA; and Milli-Q water. Amplification was carried out in a thermocycler under the conditions proposed by Federman et al. (2012). Amplicons were visualized in 1 % agarose gel stained with ethidium bromide (Sambrook et al., 1989) before being submitted to capillary electrophoresis using a dsDNA 905 Kit in a Fragment Analyzer System for fragment visualization.
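The reagent volumes implied by the concentrations above can be checked with the dilution relation C1 × V1 = C2 × V2. The sketch below is only illustrative: it assumes that the values in parentheses are stock concentrations, covers only the three reagents for which both values are unambiguous in the text, and uses a hypothetical helper name.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul=20.0):
    """Volume of stock (in µL) that gives final_conc in the reaction.

    Uses C1 * V1 = C2 * V2; both concentrations must share the same unit.
    """
    return final_conc * reaction_volume_ul / stock_conc


# (final concentration, assumed stock concentration) as quoted in the text
reagents = {
    "buffer (X)": (1.0, 10.0),
    "dNTP (mM)": (0.2, 2.5),
    "MgCl2 (mM)": (1.5, 50.0),
}

volumes = {name: stock_volume_ul(final, stock)
           for name, (final, stock) in reagents.items()}

for name, volume in volumes.items():
    print(f"{name}: {volume:.2f} µL per 20 µL reaction")
```

Under these assumptions the buffer, dNTP and MgCl2 stocks contribute 2.0, 1.6 and 0.6 µL, respectively, and the remaining volume is shared by primers, enzyme, DNA and water.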

Data Analysis
The electrophoresis profile of each SSR gel was transformed into a data matrix built from the band patterns observed, according to the peaks detected in the capillary electrophoresis. These data were used to estimate genetic diversity. For cluster analysis, genetic similarity assessment (Sgij) and Principal Coordinates Analysis (PCoA) were performed using the GenAlex software program v. 6.5 (Peakall and Smouse, 2012). The clustering of individuals by the UPGMA method, with its correlation (r) and stress (E) values, as well as the Polymorphic Information Content (PIC), the heterozygosity values and the Shannon index, were estimated using the GENES software package (Cruz, 2008). The Analysis of Molecular Variance (AMOVA) was calculated from the Euclidean distance matrix (EDM) to estimate inter- and intrapopulation variance at two hierarchical levels, using GenAlex v. 6.5 (Peakall and Smouse, 2012). For genetic structure analysis, the Bayesian method implemented in the STRUCTURE software program was used (Pritchard et al., 2000). Multilocus genetic data were used to investigate the presence of different populations and estimate their allele frequencies. The Markov chain Monte Carlo (MCMC) algorithm was used to verify the presence of structure and to allocate individuals into clusters according to their genotypes at multiple loci, and the logarithm [ln Pr (X|K)] was used to calculate the probability of the genotypes (X) given the number of clusters (K) underlying the data under study (Ellegren and Galtier, 2016). This evaluation provided the posterior probability of K based on the genotypes, to estimate the number of populations with a preserved structure (K), taking as the most precise K the one for which [ln Pr (X|K)] is maximum (Manel et al., 2005).
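The criterion described above, choosing the K with the highest [ln Pr (X|K)], can be sketched as follows. This is a minimal illustration rather than the STRUCTURE workflow itself: averaging over replicate runs is an assumption added here, the function name is hypothetical, and the log-probabilities are invented values, not results from this study.

```python
def best_k(ln_prob_by_k):
    """Return the K whose mean ln Pr(X|K) over replicate runs is highest."""
    means = {k: sum(values) / len(values) for k, values in ln_prob_by_k.items()}
    return max(means, key=means.get)


# Hypothetical ln Pr(X|K) values from two replicate runs per K
# (illustrative only, NOT data from this study).
example_runs = {
    1: [-2510.2, -2511.0],
    2: [-2440.5, -2442.1],
    3: [-2401.3, -2399.8],
    4: [-2380.4, -2381.9],
    5: [-2395.0, -2390.6],
}

chosen_k = best_k(example_runs)
print("K with the highest mean ln Pr(X|K):", chosen_k)
```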

Results
A total of 220 alleles were identified, with 98.5 % polymorphism, grouped into 23 loci with an average of 9.5 alleles per locus. The reliability of the results was verified through the stress value (0.044) and the cophenetic correlation (0.985); stability had already been reached with the amplification of 15 loci (r = 0.9) in the analysis of all M. flexuosa individuals (Figure 2).
The efficacy of the markers was also evaluated through the average number of effective alleles per polymorphic locus (Ne). This value was 3.73, ranging from 3.23 (Mf 17F) to 4.22 (Mf 22F). He values varied between 0.55 (marker Mf 17F) and 0.72 (marker Mf 28F), with a mean of 0.65, and Ho values varied between 0.05 (marker Mf 30F) and 0.22 (marker Mf 28F) (Table 2).
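For readers less familiar with these per-locus statistics, the sketch below shows how Ho, He and Ne are commonly obtained from diploid genotypes, assuming the frequency-based definitions He = 1 - Σp² and Ne = 1/Σp². The genotypes are hypothetical fragment sizes (bp), not data from this study, and the function name is illustrative.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (Ho, He, Ne) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Ho = share of heterozygous individuals; He = 1 - sum(p_i^2);
    Ne = 1 / sum(p_i^2), where p_i are the allele frequencies.
    """
    n = len(genotypes)
    ho = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    sum_p_squared = sum((c / (2 * n)) ** 2 for c in counts.values())
    return ho, 1.0 - sum_p_squared, 1.0 / sum_p_squared


# Hypothetical genotypes: mostly homozygous, as in an inbred population
toy_genotypes = [
    (150, 150), (150, 150), (154, 154), (154, 154),
    (158, 158), (150, 154), (158, 158), (162, 162),
]

ho, he, ne = locus_stats(toy_genotypes)
print(f"Ho = {ho:.3f}, He = {he:.3f}, Ne = {ne:.2f}")
```

In this toy locus four alleles are present, but Ne is lower than four because the allele frequencies are unequal, and Ho is far below He because only one individual is heterozygous.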
All populations presented a Shannon index (I) higher than 1.0, varying from 1.24 (PG) to 1.48 (PJ) (Table 3). These results demonstrate great diversity or genotypic variety within each population analyzed. The PB, PH and PJ populations exhibited private alleles. F values were positive and different from zero in all populations; the population with the lowest genetic variability was PF (F = 0.78, Ho = 0.15 and PIC = 0.50) and the population with the highest genetic variability was PB (F = 0.71, Ho = 0.21 and PIC = 0.64).
The F statistics (Wright, 1951) made it possible to estimate the inter- and intrapopulation allele frequencies in the ten M. flexuosa populations studied. The overall fixation index (FIT) was 0.800 (p < 0.0001), the index for populations (FST) was 0.052 (p < 0.0001) and the index for individuals (FIS) was 0.789 (p < 0.0001), indicating a high level of genetic differentiation among individuals and a low level among populations (Yeh, 1999). Consistent with these coefficients, the AMOVA (Table 4) attributed most of the genetic variation to individuals (95 %) and 5 % to populations. The matrix generated from the Nei (1978) genetic dissimilarity coefficient (Table 5) made it possible to evaluate the genetic diversity among the ten populations under analysis. The PA and PB populations were the most genetically similar (0.219), while the PF and PJ populations were the least similar (0.519). This dissimilarity coefficient also allowed for the clustering of these M. flexuosa populations into four distinct categories based on the Principal Coordinates Analysis on genetic similarity (Figure 3).
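To make the pairwise population distances more concrete, the sketch below computes Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx × Jy)), from allele frequencies. Note that this is the simpler, uncorrected form; the Nei (1978) coefficient cited above adds a correction for sample size, which is omitted here. The frequencies and the function name are hypothetical.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard distance between two populations.

    pop_x, pop_y: lists with one dict per locus mapping allele -> frequency.
    Uncorrected form; no small-sample adjustment is applied.
    """
    jx = jy = jxy = 0.0
    for freq_x, freq_y in zip(pop_x, pop_y):
        alleles = set(freq_x) | set(freq_y)
        jx += sum(freq_x.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(freq_y.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(freq_x.get(a, 0.0) * freq_y.get(a, 0.0) for a in alleles)
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical single-locus allele frequencies (not data from this study)
pop_a = [{"150": 0.5, "154": 0.5}]
pop_b = [{"150": 1.0}]

d_same = nei_standard_distance(pop_a, pop_a)
d_diff = nei_standard_distance(pop_a, pop_b)
print(f"D(a, a) = {d_same:.3f}, D(a, b) = {d_diff:.3f}")
```

Identical frequency profiles give a distance of zero, and the distance grows as the two populations share fewer alleles at similar frequencies.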
The cluster analysis of the populations by the UPGMA method showed the formation of five main groups, which, in most cases, are in agreement with the dendrogram (Figure 4) and the distance matrix obtained. Group 5 exhibited the largest number of populations (PA, PB, PC and PD), and the other groups were allocated only the smaller populations of M. flexuosa (Figure 4). Thus, the populations of this species were easily separated using SSR markers, indicating genetic differentiation and correlations between the natural populations of this plant.
On the other hand, the Bayesian cluster analysis of the populations allowed for visualizing the formation of distinct groups based on the allele distribution in each individual (Figure 5). The separation into four population groups was further supported by the AMOVA results, which indicated considerable variability between individuals (Table 4).
Thus, the PF, PH and PI populations were more homogeneous in their individual allele distribution (with a predominance of a single colored bar, namely, one allele). The PI and PJ populations were similar to each other in allele distribution (with a predominance of the red bar), as were the PH and PI populations (with a predominance of the blue bar). The remaining populations exhibited wide-ranging variability in their alleles, shown by the mixture of bar colors. Thus, individuals in these populations presented a broader allele distribution.
Data from the analysis of the 100 M. flexuosa individuals were used to build a dissimilarity matrix based on the weighted average index, with a minimum value of 0.296 and a maximum of 0.984 for the pairs PD7 × PH7 and PC10 × PJ7, respectively, from different population groups (Table 6). From the genetic distance matrix, calculated using the GenAlex software program, a principal coordinates plot was built to evaluate genetic diversity among individuals (Figure 6).
In the analysis by the UPGMA clustering method, all individuals were separated by similarity into five different groups (Figure 7). The individuals PI10 and PI6 in group 1; PH7 and PD7 in group 2; PC4 and PH8 in group 3; PB1 and PF4 in group 4; and PC1 with PD3, and PF3 with PF7 in group 5 were the most similar to each other within their groups.

Discussion
[...] (Rossi et al., 2014) and SSR (Federman et al., 2012; Menezes et al., 2012). However, studies of this species are still scarce, with no evaluation of the genetic structure regarding the distribution of natural buriti populations from Lençóis Maranhenses. Therefore, this study is the first to evaluate the diversity, genetic structure and conservation status of M. flexuosa in this region. Its relevance stems from the marked environmental degradation of this region caused by advances in road infrastructure and the expansion of agribusiness and urban borders. Specific SSR markers for this species were developed and applied to populations in other areas (Federman et al., 2012; Menezes et al., 2012; Federman et al., 2014). In this study, the specific markers used, as described by Federman et al. (2012), and the reliability of the results obtained from the cophenetic correlation analyses (E = 0.044, r = 0.985) suggested optimal precision in the estimates, since E < 0.05 (Kruskal, 1964). Although only eight SSR markers were used, the mean number of alleles (9.5 alleles per locus) was higher than those reported in the literature for the same species in native Brazilian populations (Federman et al., 2012) and in other countries (Federman et al., 2014). These data indicate high genetic diversity, consistent with the high content of genetic information reflected in the average number of alleles. There are also reports in the literature, using other SSR markers, of values lower than those described in this work (Menezes et al., 2012). N values higher than the Ne shown in Table 2 may indicate an unequal distribution of allele frequencies among individuals, with rare alleles, which occur at frequencies < 0.05 in a population, present alongside common alleles with frequencies ≥ 0.05 (Whitlock et al., 2016). The identification of rare alleles is useful information in genetic conservation studies for singling out populations that deserve special management (Kalinowski, 2004). However, for a better inference of the rare alleles in M. flexuosa, larger population samples are required to observe whether these allele frequencies vary, since the detection of rare alleles is closely related to sample size (Marshall and Brown, 1975).
The values of He and Ho help in the evaluation of genetic diversity since these parameters reflect the proportions of heterozygous and homozygous genotypes at the loci analyzed. He values were higher than Ho values for all the markers analyzed, indicating a high proportion of homozygotes at these loci. Low heterozygosity with SSR markers is common in studies on this species (Menezes et al., 2012; Federman et al., 2014) and in studies on other species (Ramos et al., 2012). However, some reports have presented contrasting values, showing a slightly lower average for He than for Ho (Gomes et al., 2011). Mean He values higher than those of Ho indicate moderate genetic diversity in M. flexuosa populations. This is due to the excess of homozygotes resulting from marked inbreeding in the reproduction of these populations. This inbreeding level may be associated with pollination, seed dispersal, inter- and intra-specific competition as well as the heterogeneity of the environment of each population selected, which directly affects the spatial distribution of individuals and, consequently, the genetic diversity of their populations (Freeland et al., 2011). These data are relevant to the exploitation and domestication of M. flexuosa since this is a species exposed to ecological risk (Pickersgill, 2007).
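The link between the heterozygote deficit and the fixation index can be illustrated with the usual relation F = 1 - Ho/He. The sketch below applies it to the mean Ho and He values reported in this study; it is only a rough check, since an average of per-population F values need not equal the F computed from average heterozygosities, and the function name is illustrative.

```python
def fixation_index(ho, he):
    """Fixation index F = 1 - Ho/He (positive when heterozygotes are deficient)."""
    if he <= 0:
        raise ValueError("He must be positive")
    return 1.0 - ho / he


# Mean values reported at the locus level (Ho = 0.15, He = 0.65)
# and at the population level (Ho = 0.16, He = 0.64).
f_locus_level = fixation_index(0.15, 0.65)
f_population_level = fixation_index(0.16, 0.64)

print(f"F from locus-level means: {f_locus_level:.2f}")
print(f"F from population-level means: {f_population_level:.2f}")
```

Both values are close to the high fixation indices reported for the individual populations, in line with the homozygote excess discussed above.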
As regards the PIC values (Table 3), the PB and PA populations exhibited more polymorphic information, indicating their high genetic variability. This parameter is applied to assessments of genetic polymorphism, since low values reflect a low level of genetic variation per locus, while PIC values ranging from 0.25 to 0.50 are considered moderately informative (Botstein et al., 1980). This index showed that the eight microsatellite markers efficiently evaluated genetic diversity in the M. flexuosa populations. Thus, the mean PIC value of 0.548 denotes high variability in the ten populations analyzed, mainly in the PB population, which displayed the highest PIC (0.64). A number of reports have estimated PIC values with SSR markers that are lower than (Alves et al., 2013) or similar to (Loiola et al., 2016) those described in this study, indicating the effectiveness of these markers for diversity analysis. Furthermore, the Shannon Index (I) is also applied to determine diversity within a population, indicating genotypic richness through the certainty value of the genetic similarity estimate between individuals in each population (Holcomb et al., 1977). Consequently, I values ranging from 1.0 to 1.5 indicate moderate genetic diversity in individuals from a population, while a variation between 1.0 -1.5 reveals high diversity (Shannon and Weaver, 1963). I values ranging from 1.24 (PG) to 1.48 (PJ) were observed in all populations analyzed (Table 3), which indicates moderate diversity or genotypic richness of the M. flexuosa populations from the Lençóis Maranhenses region. The I value determined in this study is higher than those of other studies using ISSR markers in natural populations of this species (Rossi et al., 2014). Thus, the data indicate that natural populations from the Lençóis Maranhenses region exhibit more genotypic richness than those from the state of Mato Grosso.
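The two indices discussed above can be computed from allele frequencies as sketched below, assuming the PIC formula of Botstein et al. (1980) and the Shannon index with natural logarithms, I = -Σ p ln p. The allele frequencies are hypothetical and the function names are illustrative.

```python
import math


def shannon_index(freqs):
    """Shannon index I = -sum(p * ln p) over alleles with p > 0."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    sum_squares = sum(p * p for p in freqs)
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - sum_squares - cross_term


# Hypothetical locus with four equally frequent alleles
four_equal = [0.25, 0.25, 0.25, 0.25]

i_value = shannon_index(four_equal)
pic_value = pic(four_equal)
print(f"I = {i_value:.3f}, PIC = {pic_value:.3f}")
```

As a point of reference, a locus with four equally frequent alleles gives I = ln 4 ≈ 1.39, which is close to the mean I of 1.36 reported for the populations in this study.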
This result may be related to the extension of the Lençóis Maranhenses region and, consequently, the gene flow in the individuals of each population. Furthermore, the SSR marker specificity may contribute to careful evaluation in terms of the populations' genetic structures. Therefore, the results of this study indicate efficient discrimination in the genetic structure of the populations under analysis.
As regards the allele fixation index (F), data showed a decrease in genetic variability and an increase in homozygosity in the different subpopulations. Several factors influence the level of genetic diversity by shifting allele frequencies away from those expected under random mating, as a result of evolutionary processes such as effective population size, mutation and natural selection. The estimates of the correlation of inter- and intrapopulation allele frequencies of the ten M. flexuosa populations analyzed were 0.052 and 0.789, respectively, using the F statistics (Wright, 1951). F values (fixation index) were high for each population evaluated (average 0.77). This suggests a deviation from Hardy-Weinberg equilibrium due to homozygote excess, probably as a result of crossing between related individuals or self-fertilization among the individuals analyzed.
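The three F statistics are tied together by Wright's identity (1 - FIT) = (1 - FIS) × (1 - FST), which offers a quick consistency check on the values reported in this study. The sketch below applies it to FIS = 0.789 and FST = 0.052; the function name is illustrative.

```python
def fit_from(fis, fst):
    """Overall fixation index from Wright's identity:
    (1 - FIT) = (1 - FIS) * (1 - FST)."""
    return 1.0 - (1.0 - fis) * (1.0 - fst)


fit_check = fit_from(0.789, 0.052)
print(f"FIT implied by FIS and FST: {fit_check:.3f}")
```

The implied value agrees with the FIT of 0.800 given in the Results, which shows that almost all of the total fixation comes from the within-population component (FIS) rather than from differentiation among populations (FST).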
The inbreeding coefficient (F) estimated for the M. flexuosa populations from Lençóis Maranhenses differs from those reported in other genetic studies. These studies evaluated the genetic variation of populations in the Amazonian river regions, which have influenced the dispersal and distribution of the species (Melo et al., 2018; Sander et al., 2018). Nonetheless, the climatic conditions and the habitat discontinuity of the Lençóis Maranhenses region differ markedly from those regions, as it is geographically very isolated and under intense anthropogenic degradation, factors which are relevant to the M. flexuosa population.
As for the M. flexuosa populations assessed, the tendency to inbreeding may have resulted from crossing between related individuals (full-sibs or half-sibs), since the possibility of self-fertilization was discarded because this species is cross-fertilized (Storti, 1993). Therefore, the high inbreeding rate among individuals may be related to the geographic isolation of the populations, which decreases interpopulation gene flow. This geographical isolation is mainly due to the extent of the dunes, with heights of up to 60 m (Figure 1), and to the soil quality, which hinder pollen and seed dispersal and seed germination.
Buriti genetic structure analysis. Sci. Agric. v.79, n.1, e20200112, 2022
The moderate intrapopulation genotypic diversity found may indicate lower gene flow among the individuals. This feature must be carefully analyzed for conservation strategies since individuals and/or populations with divergent alleles are required for future crosses to increase genetic diversity. Therefore, Germplasm Banks should store accessions with lower F values (Alves et al., 2013; Loiola et al., 2016). Another parameter analyzed was the fixation index among populations (FST), with a low value of 0.052 considering all populations, which is a common feature in natural M. flexuosa populations (Federman et al., 2012; Menezes et al., 2012; Federman et al., 2014; Rossi et al., 2014). This gene flow limitation may be associated with the environmental conditions of the populations, such as swampy or dry and sandy regions, which can act as a barrier to gene flow.
As regards genetic variability, the AMOVA, as depicted in Table 4, showed more variability between individuals (95 %) than between populations (5 %). These results are in agreement with other studies of M. flexuosa in which the highest genetic diversity was observed at the intrapopulation level (Gomes et al., 2011; Rossi et al., 2014). These results may be related to the type of species analyzed since tree species with mixed breeding systems and efficient pollen and seed dispersal mechanisms exhibit greater intrapopulation genetic variation (Ellegren and Galtier, 2016). This is the result of long-distance allele flow, which decreases the variation between populations and increases the intrapopulation variation (Loveless and Hamrick, 1984).
Comparing the PCoA results to the geographic distance map (Figures 6 and 1, respectively), a correspondence was observed between the groups formed, suggesting a correlation between the genetic and the geographic distribution of the populations. The dendrogram also showed this correlation in the genetic similarity obtained with the Nei coefficient. Previously reported correlations between geographic and genetic distances indicated that low levels of genetic similarity among populations may be due to increased spatial fragmentation (Lemos et al., 2015). The spatial fragmentation in the Lençóis Maranhenses region is due to natural factors (extensive sandy bank areas) (Figure 1) and to anthropogenic actions (e.g. road construction, urban and agricultural advances) that hinder gene flow.
Genetic diversity analysis among individuals using the genetic dissimilarity matrix (Table 5) also allowed for estimating the presence of pairs of individuals with higher and lower genetic similarities. Based on this, no crossing was determined between pairs of individuals belonging to the same population with high genetic diversity. This reinforces the connection between this species' population with the dispersal system and its geographic distribution.
Data from the genetic diversity analysis between individuals are consistent with those from the genetic structure analysis (Figure 5) and the principal coordinate analyses (Figure 6). These results emphasize the genetic diversity of the individuals analyzed and the formation of four genetic clusters with different genetic profiles. Thus, geographically close populations tend to share alleles at the same rate. As previously described, this may be a consequence of the pollen dispersal of this species in the geographical environment analyzed and of the fragmentation process of the Lençóis Maranhenses region, which may be affecting reproductive patterns, increasing the inbreeding rate and reducing this species' reproductive capacity. Overall, the results show the need to conserve the natural M. flexuosa populations in this region to avoid losing their genetic variability.

Conclusions
The M. flexuosa samples from the Lençóis Maranhenses region exhibited moderate genetic diversity. Nevertheless, the low FST value in populations indicated the occurrence of a decrease in gene flow in these individuals, which was reflected in high F values. Therefore, considering the accelerated degradation of its habitat, there is an urgent need to preserve plant genetic diversity in ex situ conservation collections. Accordingly, the data suggest the inclusion of genetic material from the M. flexuosa populations of the Lençóis Maranhenses region in a Germplasm Bank for their conservation, aiming at application in breeding programs to preserve the genetic variability of the species.