Genetic diversity and population structure of Euterpe edulis by REML/BLUP analysis of fruit morphology and microsatellite markers

Euterpe edulis is an endemic species of the Atlantic Forest that is threatened by the unsustainable exploitation of palm heart. Fruit management is an alternative to overcome this problem, promoting income generation, preserving the trees in forest remnants and motivating the implementation of farms for commercial production. In this study, the genetic diversity and structure of four natural populations of E. edulis were evaluated using microsatellite markers and six morphological fruit traits, analyzed with and without the REML/BLUP method. The longitudinal diameter had the strongest influence on the differentiation of genotypes. The genetic differentiation among populations was low and inbreeding was detected within populations. Molecular and morphological data indicated high genetic diversity in the E. edulis populations. The REML/BLUP analysis increased the accuracy of morphology-based estimates of genetic diversity, thus contributing to improved breeding strategies for fruit quality and to genetic conservation by use in E. edulis.


INTRODUCTION
Euterpe edulis Mart., popularly known as juçara palm, is widely distributed in the Atlantic Forest (Leitman et al. 2015). However, as a result of the intense forest fragmentation (Joly et al. 2014), only 11.4-16.0% of the original Atlantic Forest cover in Brazil is now left (Ribeiro et al. 2009). Consequently, based on the International Union for Conservation of Nature (IUCN) criteria (reduction in population size and area of occurrence), E. edulis was classified in the category of endangered Brazilian species (Leitman et al. 2015). The high commercial value of the juçara palm heart is the major cause of its exploitation (Schulz et al. 2016).
Euterpe edulis is allogamous and protandrous (Mantovani and Morellato 2000), with an outcrossing rate of approximately 1.0 (Reis et al. 1996, Gaiotto et al. 2003, Conte et al. 2008). Within the Atlantic Forest, the species is a key food source for a large number of birds and mammals (Galetti et al. 2013). The juçara fruit is considered a "superfruit" owing to its high levels of functional compounds and antioxidants (Santos et al. 2014) and is used to produce pulp, juice, drugs, cosmetics and functional foods (Felzenszwalb et al. 2013). Moreover, the sensorial properties of juçara fruits are similar to those of the açaí fruits of Euterpe oleracea Mart. and Euterpe precatoria Mart., although their nutritional properties are better (Schulz et al. 2016, 2017). The sustainable exploitation of fruits can be an important alternative for conservation by use in Euterpe edulis. However, studies of fruit traits of the species are required to identify highly productive genotypes that are well-adapted to the regional cultivation conditions to ensure high fruit yields. This can be achieved by harvesting fruits of different genotypes and identifying desirable traits associated with fruit yield and quality (Manfio et al. 2011).
The genetic diversity indicated by morphological estimates may not reflect the true genetic variation, since these traits are influenced by environmental factors and the stage of plant development; in addition, some morphological traits may be controlled by polygenes (Last et al. 2014). Mixed-model methods, such as restricted maximum likelihood and best linear unbiased prediction (REML/BLUP), are efficient in estimating variance components and predicting genetic effects from morphological data, providing accurate results (Resende 2002). Studies on important native tree species, including some palm species such as E. edulis (Guilhen et al. 2019), E. oleracea (Navegantes et al. 2018), E. precatoria (Ramos et al. 2019) and Bactris gasipaes (Rodrigues et al. 2017), are hampered by the difficulties of collecting the plant material, illegal tree cutting and data imbalance. To overcome these problems and gain information about the true genetic values, the simultaneous application of REML and BLUP is recommended (Resende 2002). The REML/BLUP procedure is considered the best estimation or prediction method for unbalanced data sets in perennial plants.
Since it maximizes the probability of genetic variance after correction of the fixed effects, it enables an evaluation of the genotypic, or otherwise phenotypic, contributions (Rodrigues et al. 2017). The REML/BLUP methodology can be associated with mixed models with fixed and random genetic effects, maximizing the correlation between predicted and true genetic values, thus minimizing errors (Resende 2002).
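As a minimal sketch of the mixed model behind this description, assume a simple repeatability structure with one random permanent effect per tree. The symbols below (y, X, W, m, p, e, ρ and λ) are illustrative assumptions, not notation taken from the cited sources:

```latex
y = Xm + Wp + e, \qquad p \sim N(0,\, I\sigma_p^2), \quad e \sim N(0,\, I\sigma_e^2)

\begin{bmatrix} X'X & X'W \\ W'X & W'W + \lambda I \end{bmatrix}
\begin{bmatrix} \hat{m} \\ \hat{p} \end{bmatrix}
=
\begin{bmatrix} X'y \\ W'y \end{bmatrix},
\qquad
\lambda = \frac{\sigma_e^2}{\sigma_p^2} = \frac{1-\rho}{\rho}
```

Here y is the data vector, m the vector of fixed effects, p the vector of random permanent effects of trees, e the residual vector and ρ the repeatability. Under these assumptions, REML supplies the variance components that define λ, and solving the system gives the BLUP of p; trees with few measurements, or traits with low ρ, have their predictions shrunk more strongly toward the mean.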
Genetic diversity can also be characterized efficiently by molecular markers. In E. edulis, it has been assessed using microsatellite markers (Gaiotto et al. 2003, Conte et al. 2006) and isozymes (Conte et al. 2008). Morphology-based diversity studies on E. edulis have also been carried out and detected a wide variation in fruit morphology among natural populations (Cardoso et al. 2000, Oliveira et al. 2015, Marçal et al. 2016, Brancalion et al. 2018).
This study was carried out in four remnants of Atlantic Forest in the southern region of the state of Espírito Santo, Brazil, where only 11.0% of the original forest cover is left and secondary forests are abundant. However, native vegetation can be found in some difficult-to-reach areas (e.g., hilly terrains at high altitudes). The objective of this study was to evaluate the genetic diversity and structure of E. edulis populations using microsatellite markers and fruit morphological traits by the mixed-model method REML/BLUP.

MATERIAL AND METHODS
Sampling sites
Fruit and superficial cortex samples of E. edulis trees were collected in four forest remnants in six municipalities of the state of Espírito Santo, Brazil: Ibitirama (site 1); Guaçuí and Alegre (site 2); Mimoso do Sul, Jerônimo Monteiro and Alegre (site 3); and Mimoso do Sul and Muqui (site 4) (Figure 1A). The sampling sites were located on private land in uncultivated areas. For fruit morphology evaluation, 30, 68, 70 and 20 trees were sampled at sites 1, 2, 3 and 4, respectively. For molecular evaluations, 29, 58, 58 and 15 trees were sampled at sites 1, 2, 3 and 4, respectively.

Morphological analysis
Adult plants in the reproductive stage with at least one cluster containing mature fruits were sampled. Fruit clusters were collected using a pruner. The fruit longitudinal diameter (FLD; mm), fruit equatorial diameter (FED; mm) and fruit fresh weight (FFW; g) were measured in four replicates of 25 fruits per plant. The seed longitudinal diameter (SLD; mm), seed equatorial diameter (SED; mm) and seed fresh weight of depulped seeds (SFW; g) were measured.

MS Carvalho et al.
Fruit morphology data of the studied populations were analyzed using the REML/BLUP mixed-model method. The Selegen-REML/BLUP software was used with the mixed model for repeated measures no. 63, which assumes no experimental design (Resende 2002). Corrected means were calculated for the permanent phenotypic effect, corresponding to the sum of the genetic and permanent environmental effects. A box plot of corrected means was generated to characterize the morphological traits of fruits at each of the four sampling sites. Principal coordinate analysis (PCA) of the multiple morphological traits was performed using R software (R Core Team 2019) on two data sets, i.e., the set with corrected means (BLUE) and the collected (phenotypic) data. The PCA, also known as metric multidimensional scaling, consists of the decomposition of eigenvalues and eigenvectors calculated from the distance matrix, in this case the standardized mean Euclidean distance matrix. This analysis is relevant because multivariate data can be ordinated based on any distance function. The PCA for simple sequence repeat (SSR) data was performed in the same way as for the BLUEs and phenotypic data, differing only in the distance matrix, which was calculated by simple matching dissimilarity with 1,000 bootstrap iterations in the program DARwin 5.0 (Perrier and Jacquemoud-Collet 2006).
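The eigendecomposition step described above can be sketched as follows. This is an illustrative Python version under stated assumptions, not the R code used in the study: the function names, the z-score standardization and the simulated trait matrix are all hypothetical.

```python
import numpy as np

def standardized_mean_euclidean(traits):
    """Pairwise mean Euclidean distance computed on z-scored traits.

    traits: (n_trees, n_traits) array. Standardization by z-scores is an
    assumption about what "standardized" means here.
    """
    z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).mean(axis=2))

def pcoa(dist, n_axes=2):
    """Principal coordinate analysis (classical metric MDS).

    Double-centres the squared distance matrix and decomposes it into
    eigenvalues and eigenvectors, as described in the text.
    """
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigval, eigvec = np.linalg.eigh(gram)
    order = np.argsort(eigval)[::-1]           # largest eigenvalue first
    eigval, eigvec = eigval[order], eigvec[:, order]
    positive = np.clip(eigval, 0.0, None)      # drop tiny negative round-off
    coords = eigvec[:, :n_axes] * np.sqrt(positive[:n_axes])
    explained = positive[:n_axes] / positive.sum()
    return coords, explained

# Simulated data standing in for 12 trees x 6 fruit traits (not real data).
rng = np.random.default_rng(0)
traits = rng.normal(size=(12, 6))
dist = standardized_mean_euclidean(traits)
coords, explained = pcoa(dist, n_axes=2)
```

The `explained` vector gives the share of total variation carried by each retained coordinate, which is the quantity reported when the first two coordinates are plotted in a plane.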

Molecular analysis
A portion of superficial cortex was used for extraction of genomic DNA, as described by Carvalho et al. (2017). After DNA extraction, the quality and concentration were assessed spectrophotometrically (NanoDrop™ 2000/2000c), and DNA integrity was determined on 0.8% agarose gel.
Thirteen SSR loci developed by Gaiotto et al. (2001) were used. Polymerase chain reactions (PCRs) were performed in a total volume of 13 μL containing 30 ng of genomic DNA, 1× I0 buffer (Phoneutria, Brazil), 0.2 μM of each primer (forward and reverse), 1.5 mM MgCl₂, 0.25 mM dNTPs and 1.2 U of Taq DNA polymerase. The thermal cycler was programmed as follows: 94 °C for 4 min, followed by 30 cycles of 1 min each at 94 °C; primer annealing temperature for 1 min (Gaiotto et al. 2001); 72 °C for 1 min; and final extension at 72 °C for 7 min. The amplification products were separated by electrophoresis on 6.0% polyacrylamide gels. A 100-bp molecular weight marker was used as standard (Kasvi, K9-100 L). After electrophoresis, gels were stained with ethidium bromide (10 mg mL⁻¹) and visualized with an imaging system (Bio-Rad Gel Doc™ EZ Imager). Images were processed and bands analyzed using Image Lab 6.0 software (Bio-Rad Laboratories, Inc.) to estimate the PCR fragment sizes.
The null allele frequency was estimated and possible genotyping errors were identified with the software MICRO-CHECKER 2.2.3 (Van Oosterhout et al. 2004). Loci with a high mean frequency of null alleles (>20%) were removed, leaving seven SSR loci for the diversity and structure analyses. Microsatellite genetic diversity was measured using the parameters mean number of alleles per locus (A/locus), expected heterozygosity (He), observed heterozygosity (Ho) and fixation index (F) with the software Genetic Data Analysis (GDA) (Lewis and Zaykin 2001). The population subdivision was estimated by Wright's F-statistics, according to the methods described by Weir and Cockerham (1984), using the parameters fixation index within populations (FIS), total fixation index (FIT) and genetic differentiation among populations (FST). The bootstrap resampling method was applied with a 95% confidence interval and 1,000 permutations, using the software FSTAT v. 2.9.3.2 (Goudet 1995, Goudet 2001). Pairwise genetic differentiation (FST) was estimated to compare the degree of differentiation among populations (10,000 permutations). Population differentiation and the distribution of genetic variation among and within populations were evaluated by analysis of molecular variance (AMOVA) with the software Arlequin v. 3.0 (Excoffier et al. 2005), testing significance with 1,000 permutations. The standardized genetic differentiation (GST) and allelic richness were estimated using the software FSTAT (Goudet 1995, Goudet 2001).
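To make the per-locus diversity parameters concrete, the sketch below computes the number of alleles, observed and expected heterozygosity and the fixation index for one locus. It is a hypothetical illustration, not the GDA implementation: the function names and toy genotypes are invented, and expected heterozygosity is taken as 1 minus the sum of squared allele frequencies, without any small-sample correction that a given program may apply.

```python
from collections import Counter

def fixation_index(ho, he):
    """Fixation index F = 1 - Ho/He; positive values mean a heterozygote deficit."""
    return 1.0 - ho / he

def locus_diversity(genotypes):
    """Diversity parameters for one diploid locus.

    genotypes: list of (allele_1, allele_2) tuples, one per tree.
    Assumes He = 1 - sum(p_i^2), i.e. no small-sample correction.
    """
    n_trees = len(genotypes)
    counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n_trees
    freqs = [count / total_alleles for count in counts.values()]
    ho = sum(1 for a, b in genotypes if a != b) / n_trees
    he = 1.0 - sum(p ** 2 for p in freqs)
    return {"A": len(counts), "Ho": ho, "He": he, "F": fixation_index(ho, he)}

# Toy genotypes (allele sizes in bp) for four hypothetical trees.
toy = [(150, 150), (150, 154), (154, 154), (150, 150)]
result = locus_diversity(toy)
```

In this toy case one of four trees is heterozygous, so Ho is 0.25 while He is about 0.47, and F is positive. Applying the same formula to the mean values reported in the results (Ho = 0.48, He = 0.68) gives F of roughly 0.29, although the F of the means need not equal the mean of the per-population F values.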
The population genetic structure was determined using the software Structure v. 2.2 (Pritchard et al. 2000, Hubisz et al. 2009). In this analysis, the number of populations is predefined and populations are characterized by a set of allele frequencies at each locus, assuming Hardy-Weinberg equilibrium. The most likely number of populations (K) was determined by varying the K value from 1 to 8; analyses were performed 20 times for each K. Markov chain Monte Carlo (MCMC) algorithms were run for 10⁵ iterations and 10⁶ repetitions. The ancestry model "Admixture" as well as the independent allele frequency model were used. The most likely K value was determined as proposed by Evanno et al. (2005), using the software Structure Harvester (Earl and vonHoldt 2012).
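The ΔK criterion used to choose K can be sketched as below. This is one common way to compute it, using the mean and standard deviation of ln P(D) across replicate runs; the function name and the log-likelihood values are hypothetical, and the code is not the Structure Harvester implementation.

```python
import numpy as np

def evanno_delta_k(lnp_by_k):
    """Delta K from replicate Structure runs.

    lnp_by_k: dict mapping K -> list of ln P(D) values, one per run.
    Delta K = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd L(K).
    It is undefined for the smallest and largest K tested.
    """
    ks = sorted(lnp_by_k)
    mean = {k: float(np.mean(lnp_by_k[k])) for k in ks}
    sd = {k: float(np.std(lnp_by_k[k], ddof=1)) for k in ks}
    delta = {}
    for k in ks:
        if (k - 1) in mean and (k + 1) in mean and sd[k] > 0:
            second_diff = abs(mean[k + 1] - 2 * mean[k] + mean[k - 1])
            delta[k] = second_diff / sd[k]
    return delta

# Invented ln P(D) values for two runs per K (not results from this study).
runs = {
    1: [-4000.0, -4002.0],
    2: [-3900.0, -3904.0],
    3: [-3700.0, -3702.0],
    4: [-3690.0, -3700.0],
    5: [-3685.0, -3705.0],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
```

Because ΔK needs both neighbours of each K, testing K from 1 to 8 yields ΔK values only for K = 2 to 7; the K with the largest ΔK is taken as the most likely number of groups.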

RESULTS
Morphological analysis of fruits
Morphological traits varied among and within sites (Table 1, Figure 1B). In general, the fruit longitudinal (FLD) and equatorial (FED) diameters were similar (14.00-17.29 mm and 14.32-17.90 mm, respectively). Seed longitudinal (SLD) and equatorial (SED) diameters were also similar (11.62-14.37 mm and 11.91-14.81 mm, respectively). Fruit fresh weight (FFW) varied from 1.89 to 3.30 g and seed fresh weight (SFW) from 1.11 to 2.06 g. The trait with the greatest variation among populations was SFW, followed by FFW. Fruit equatorial diameter had the lowest coefficient of variation, whereas SFW had the highest (Table 1). Box plot analysis showed great variation in fruit and seed traits among genotypes within sites. In general, little variation was found among sites. The coefficient of variation was highest for all traits at site 2 (Alegre and Guaçuí), and lowest for all traits at site 4 (Mimoso do Sul and Muqui) (Table 1, Figure 1B).
Together, the first and second principal coordinates, represented in a two-dimensional plane, explained 92.61% of the total variation in the six quantitative fruit traits (Figure 1D). This analysis detected high variability, mostly between sites 1 and 2, and the populations were evenly distributed across all quadrants. Fruit longitudinal diameter (FLD) was the trait that contributed most strongly to the discrimination of genotypes. In general, most genotypes were clustered in the fourth quadrant, including trees from all sampling sites. The genetic divergence based on fruit morphology was highest at site 2 (red dots) and the genotypes were distributed in all quadrants. Genotypes from site 1 (blue dots) were distributed mainly in the fourth, and only one in the second quadrant. Trees from site 4 (yellow dots) were grouped in the third and fourth quadrants, confirming the box plot results, which showed a lower diversity at this site than at the others (Figure 1D). A comparison of the data set of morphological traits with the corrected means (BLUE) and the phenotypic data showed a distinct distribution of the trees across all quadrants (Figure 1D and 1E). The distribution of the trees across all quadrants based on SSR data is also provided (Online Resource 4).
Table 1. Minimum, maximum, and mean values (± standard deviation) from adjusted means, based on the REML/BLUP approach, for each morphological trait of E. edulis fruits: fruit longitudinal (FLD) and equatorial (FED) diameters (mm), fruit fresh weight (FFW) (g), seed longitudinal (SLD) and equatorial (SED) diameters (mm), and seed fresh weight (SFW) (g)

Microsatellite marker analysis
The analyses were performed with a total of 160 trees, based on seven SSR loci. The loci comprised a total of 43 alleles, with 6 to 7 alleles per locus and a mean of 6.14 alleles. The observed heterozygosity (Ho) varied from 0.36 to 0.68 and the expected heterozygosity (He) from 0.60 to 0.79. Fixation indices per locus varied from 0.14 to 0.34 (Online Resource 1), showing that the observed proportion of E. edulis homozygotes was higher than that predicted under Hardy-Weinberg equilibrium.
The mean number of alleles per population ranged from 4.29 (site 4) to 6.00 (site 2) and moderate allelic richness was observed, which was highest (4.96) at site 2 and lowest (4.07) at site 4. The mean He of the populations was 0.68 and the mean Ho was 0.48. The fixation index ranged from 0.22 (site 4) to 0.39 (site 3) (Table 2). The genetic differentiation among sites was low (FST = 0.061; G'ST = 0.074). A comparison of pairwise FST values showed that all pairs of sites were significantly divergent from each other (p < 0.05). The differentiation varied from low (FST = 0.025, site 4 × site 3) to moderate (FST = 0.112, site 1 × site 4) (Online Resource 3). By AMOVA, high variation was detected within populations (93.29%) and low variation among populations (6.11%) (Online Resource 2).
Population structure analysis using the software Structure identified three Bayesian groups (K = 3). All three groups and all populations were found to share alleles, with no evident cluster formation. However, site 1 had the highest proportion of blue dots, site 2 the highest proportion of red and sites 3 and 4 the highest proportion of green dots (Figure 1C).

DISCUSSION
Euterpe edulis has a fundamental ecological function, as the fruits produced by this palm species are essential for the survival of bird and mammal species (Galetti et al. 1999, Galetti et al. 2013). According to Carvalho et al. (2016, 2017), several factors threaten the survival of E. edulis trees: illegal extraction, Atlantic Forest fragmentation, climate change and extinction of large frugivorous birds. The loss of large dispersers leads to changes in the selection pressure on seeds and thus to a differentiation of phenotypic characteristics such as size reduction (Galetti et al. 2013), and to microevolutionary changes, e.g., in gene flow and allele turnover (Carvalho et al. 2016).
In this study, the mean longitudinal and equatorial diameters of E. edulis fruits and seeds were higher than those reported by Galetti et al. (2011), though fruit and seed weights were similar. The mean fruit diameter (12.72 mm) and seed diameter (11.7 mm) reported by Galetti et al. (2013) were also smaller than those found in this study. Based on the fruit traits, PCA detected variability among the populations, mostly at sites 1 and 2, although without differentiation among populations. At site 2, the genetic diversity among trees was wide and allelic richness the highest, while at site 4, diversity was lowest. Interestingly, the range of altitude above sea level of the trees sampled within the same population was wider for site 1 (altitude range: 601 to 815 m asl, difference of 214 m) and narrower for site 4 (altitude range: 365 to 404 m asl, difference of 39 m). It was observed that the greater the altitude range, the greater the genetic diversity of the traits of the evaluated juçara fruits. Great variability in morphological fruit traits has been observed among natural E. edulis populations (Galetti et al. 2013, Oliveira et al. 2015), and the fruit size traits analyzed in this study are regarded as important for the genotypic differentiation of natural E. edulis populations (Galetti et al. 2011). Here, a low to moderate differentiation among populations was detected by SSR markers, as well as low variation in morphological traits among sites. In addition, the morphological fruit traits analyzed by REML/BLUP yielded genetic diversity estimates different from those of the common phenotypic data approach. These results demonstrate the suitability of the REML/BLUP methodology to estimate genetic variance based on permanent phenotypic variance, indicating that the use of this methodology can improve the accuracy of genetic diversity estimates for Euterpe edulis.
Our results indicate that the genetic diversity of Euterpe edulis is lower than it could be, according to the allelic richness of the populations, as similarly found by Coelho et al. (2020), and lower than the diversity reported elsewhere (Conte et al. 2008, Carvalho et al. 2017). In general, the genetic differentiation between the four E. edulis populations was low and the highest percentage of variation was observed within sites (93.3%). In another analysis of E. edulis populations, Gaiotto et al. (2003) found an FST value of 0.06, similar to our result. This low to moderate differentiation among populations reflects the genetic diversity and historical gene flow of E. edulis populations, possibly explained by the climatic stability from the mid-Holocene until today (Carvalho et al. 2017). Nevertheless, the populations of the studied trees are undergoing a differentiation process, resulting from the degradation of the Atlantic Forest in the south of Espírito Santo.
In this research, the fixation index was positive, as also reported in other studies on E. edulis (Conte et al. 2006, 2008). The species has been reported to be allogamous and protandrous (Mantovani and Morellato 2000), and to have an outcrossing rate of approximately 1.0 (Reis et al. 1996, Gaiotto et al. 2003, Conte et al. 2008). Nevertheless, the heterozygosity deficiency may be related to ecological and anthropogenic factors, e.g., a decrease or inefficiency of dispersers and pollinators and/or reduction of the effective population size, which are related to habitat loss and fragmentation or illegal extraction of the species (Young et al. 1996, Browne et al. 2015, Carvalho et al. 2016, Ellegren and Galtier 2016, Carvalho et al. 2017). Inbreeding can also be attributed to a high fruit yield (Mantovani and Morellato 2000) and seed recalcitrance, leading to the formation of a seedling bank under the canopy of parent trees (Reis et al. 1996). When seedlings reach adulthood, the spatial proximity among related individuals (parents and siblings) increases the possibility of crosses among closely related plants (Gaiotto et al. 2003). These factors may promote inbreeding depression, with an increase in alleles that are homozygous (identical) by descent (Tambarussi et al. 2017), reducing the adaptation capacity and performance of the plants, increasing mortality rates and decreasing genetic diversity (Freeland 2005). In the long run, this situation may culminate in the extinction of the population. To increase the survival chances of the populations, the presence of pollinators in the respective areas is fundamental to make crosses between genetically distant trees possible. However, forest fragmentation negatively affects the movement of E. edulis pollinators and seed dispersers (Carvalho et al. 2016).
With regard to the determining factors for survival and growth, the inbreeding level resulting from crosses between related trees has practical implications for genetic conservation and breeding as well as environmental restoration (Tambarussi et al. 2017). Thus, in view of the possible impacts of forest fragmentation on several species, including the E. edulis populations at the sites evaluated in this study, strategies to intensify the gene flow in these areas must be developed and applied, consisting of tree planting, conservation of natural forest remnants and the creation of ecological corridors.

CONCLUSION
The REML/BLUP analysis contributed to a more accurate genetic diversity estimate, based on the fruit morphology of natural E. edulis populations of four Atlantic Forest remnants in the Southern region of Espírito Santo, Brazil.
High genetic diversity was detected in the populations, with low genetic differentiation among sites but inbreeding within sites.
The diversity in fruit traits and molecular markers was highest at site 2. This genetic diversity is an important resource for the use, maintenance and conservation of genetic resources in E. edulis, as well as for the recovery and preservation of the Atlantic Forest.