Genetic structure and gene flow in Eugenia dysenterica DC in the Brazilian Cerrado utilizing SSR markers

The “cagaita tree” (Eugenia dysenterica) is a plant widespread in the Brazilian Cerrado. Its fruit is consumed locally and used for industrial purposes. This study opens a new perspective for generating population genetic data and parameter estimates for devising sound collection and conservation procedures for Eugenia dysenterica. A battery of 356 primer pairs developed for Eucalyptus spp. was tested on the “cagaita tree”. Only 10 primer pairs were found to be transferable between the two species. Using polyacrylamide gels, an average of 10.4 alleles per locus was detected in a sample of 116 individuals from 10 natural “cagaita tree” populations. Seven polymorphic loci allowed estimation of genetic parameters, including the average expected heterozygosity (He = 0.442), the diversity among populations (RST = 0.268) and gene flow (Nm = 0.680). Results indicated the potential for transferring SSR loci developed for Eucalyptus to species of other genera, as in the case of the “cagaita tree”. The high genetic diversity among populations detected with SSR markers indicated that these markers are highly sensitive for detecting population structure. The estimated Nm values and the existence of private alleles indicated reduced gene flow and, consequently, possible damage to the metapopulation structure.


Introduction
Eugenia dysenterica, commonly known as the "cagaita tree", is a fruit species of the Myrtaceae family native to the Cerrado (Brazilian savannah) region, with potential for use in agricultural production systems (Almeida, 1998). It stands out for its social and economic potential, since it can be processed into many by-products and may help increase income and jobs in regional communities. Besides being an ornamental and honey-bearing plant, it can be used for cork extraction, in small buildings and in the manufacture of charcoal, and it provides good quality firewood; its bark is used in tanneries. Its leaves have anti-diarrhea and jaundice properties and its fruits are laxative (Heringer and Ferreira, 1974).
The fruit trees native to the Cerrado belong to several genera and families and produce fruits of interest for food and industrial processing. There is a potential and growing market for these fruits, which, however, is little exploited by farmers. Fruit harvesting is mostly extractive or predatory.
The Cerrado vegetation in Brazil has been fragmented by expanding agricultural frontiers, which have affected the population dynamics of many species, including the "cagaita tree". Alteration of these areas may reduce genetic variability through founder or bottleneck effects. Genetic drift and restricted gene flow increase inbreeding and also the genetic divergence among populations. Inbreeding can lead to the fixation of deleterious alleles, threatening certain populations in this habitat with extinction (Gilpin and Soulé, 1986; Young et al., 1996).
The diversity among populations, gene flow and other genetic parameters should be evaluated in any study of natural populations of native species. Knowledge of native species populations has been widened with the advent of molecular markers. Microsatellite markers (SSR) have been used in natural population studies (Collevatti et al., 1999; Dayanandan et al., 1997) as they are highly polymorphic when compared with other classes of markers. SSR markers have been widely used as a tool to answer several questions in population genetics, such as gene flow and paternity analysis (Wright and Bentzen, 1994).
However, advances in the use of microsatellites have been hindered by the high cost and time required to develop specific primers for each locus of a native species. The chance of success in the transferability (heterologous amplification) of DNA sequences by PCR is inversely related to the evolutionary distance between the two species. Many studies have shown the possibility of using primer pairs designed for one species in other species of the same genus (Cipriani et al., 1999; Isagi and Suhandono, 1997) or even in different genera (Roa et al., 2000; White and Powell, 1997).
In the present study the transferability to Eugenia dysenterica of primer pairs developed for Eucalyptus spp., which belongs to the same family, was investigated in order to identify microsatellite markers in the "cagaita tree" for studies of genetic variability, population structure, gene flow and reproductive system. The main objective of this research was to generate information for the domestication, breeding and conservation of the species.

Plant material
The study material was collected at ten locations in southeast Goiás state, forming ten populations represented by 116 trees (mother trees). Plant material (leaves) was collected from each of the 116 "cagaita trees" for genotypic characterization of the plants. Table 1 shows the locations where the populations were collected, including some characterization of the areas.

SSR locus transferability and amplification by PCR
In this study the transferability of primer pairs developed for Eucalyptus was assessed in Eugenia dysenterica, which belongs to the same family (Myrtaceae) but to a different genus.
Three hundred and fifty-six primer pairs developed for Eucalyptus spp. (Eucalyptus grandis x Eucalyptus urophylla) by Brondani et al. (1998) were used. These primer pairs were tested for amplification in Eugenia dysenterica.
For PCR amplification, 15 ng of genomic DNA were used in a 25 µL reaction volume containing 50 mM KCl; 20 mM Tris-HCl, pH 8.8; 1.5 mM MgCl2; 10 mM dNTPs; 0.2 µM of each primer (forward and reverse) and 1 U of Taq polymerase.
The PCR protocol consisted of an initial denaturation at 96 °C for 3 min, followed by 30 cycles of 94 °C for 30 s, 60 °C for 1 min and 72 °C for 1 min, and a final extension step at 72 °C for 7 min. The amplified fragments were separated in 4% polyacrylamide gel, run in 1X TBE at 2000 V for 2 h, and stained with silver nitrate.

Statistical analysis of the data
Allelic and genotypic frequencies for each locus were obtained from the gel readings. These frequencies were tested for goodness of fit to Hardy-Weinberg proportions with Fisher's exact test, as defined by Weir (1996), using the TFPGA program (Miller, 1997). Fisher's exact test was performed by the conventional Monte Carlo method using 10 batches with 1,000 permutations per batch.
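The Monte Carlo version of the exact test described above can be sketched as follows. This is a minimal illustration, not the TFPGA implementation: it assumes that alleles are shuffled among individuals within a population and that a permuted sample counts as "as extreme" when its conditional probability under Hardy-Weinberg proportions (Levene's formula) is no larger than that of the observed sample. The function names and the allele labels are hypothetical.

```python
import math
import random
from collections import Counter


def genotype_log_prob(genotypes):
    """Log-probability of a genotype sample, conditional on its allele counts,
    under Hardy-Weinberg proportions (Levene's formula)."""
    n = len(genotypes)
    geno_counts = Counter(tuple(sorted(g)) for g in genotypes)
    allele_counts = Counter(allele for g in genotypes for allele in g)
    n_het = sum(c for (a, b), c in geno_counts.items() if a != b)
    log_p = math.lgamma(n + 1) - math.lgamma(2 * n + 1) + n_het * math.log(2)
    log_p += sum(math.lgamma(c + 1) for c in allele_counts.values())
    log_p -= sum(math.lgamma(c + 1) for c in geno_counts.values())
    return log_p


def hwe_monte_carlo(genotypes, batches=10, perms_per_batch=1000, seed=0):
    """Return (mean p-value, per-batch p-values) for one locus in one population."""
    rng = random.Random(seed)
    observed = genotype_log_prob(genotypes)
    alleles = [allele for g in genotypes for allele in g]
    batch_p = []
    for _ in range(batches):
        hits = 0
        for _ in range(perms_per_batch):
            rng.shuffle(alleles)
            # Re-pair the shuffled alleles into diploid genotypes.
            permuted = list(zip(alleles[0::2], alleles[1::2]))
            if genotype_log_prob(permuted) <= observed + 1e-9:
                hits += 1
        batch_p.append(hits / perms_per_batch)
    return sum(batch_p) / batches, batch_p


# Hypothetical locus with two alleles scored by fragment size.
in_equilibrium = [(100, 100)] * 5 + [(100, 104)] * 10 + [(104, 104)] * 5
no_heterozygotes = [(100, 100)] * 10 + [(104, 104)] * 10
p_eq, _ = hwe_monte_carlo(in_equilibrium)
p_dev, _ = hwe_monte_carlo(no_heterozygotes)
```

Splitting the permutations into batches, as in the text, gives a rough idea of the sampling error of the p-value from the spread of the per-batch estimates.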
Genetic diversity and F statistics were estimated under a random model according to Weir (1996), in which the sampled populations are considered representative of the species and share a common evolutionary history. The allelic frequencies, the number of alleles per locus (A), the observed (Ho) and expected (He) heterozygosities and Wright's F statistics (FIS, FST and FIT) were estimated using the GDA program (Lewis and Zaykin, 2000). The mutation process at microsatellite loci does not match the expectations of an infinite alleles model with low mutation rates. Therefore an analogue of the FST statistic, the RST parameter (Slatkin, 1995), developed specifically for microsatellite data, was also used. RST and gene flow (Nm) were estimated using the RST Calc program (Goodman, 1997). The structure of the variability was visualized using dendrograms constructed from the matrix of Nei's genetic distances with the UPGMA clustering criterion, using the NTSYS program (Rohlf, 1989). The stability of the clusters was tested by a re-sampling procedure with 10,000 bootstraps.
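As an illustration of the distance underlying the dendrograms, the sketch below computes Nei's (1972) standard genetic distance, D = -ln(Jxy / sqrt(Jx Jy)), between two populations from their allele frequencies. It is a minimal example under stated assumptions, not the NTSYS routine: the data layout (one dictionary of allele frequencies per locus) and the function name are hypothetical, and no sample-size correction is applied.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's (1972) standard genetic distance between populations X and Y.

    Each argument is a list with one dict per locus mapping allele -> frequency.
    """
    jx = jy = jxy = 0.0
    for px, py in zip(freqs_x, freqs_y):
        jx += sum(p * p for p in px.values())    # homozygosity in X
        jy += sum(p * p for p in py.values())    # homozygosity in Y
        jxy += sum(p * py.get(allele, 0.0) for allele, p in px.items())
    if jxy == 0.0:
        return math.inf  # no shared alleles at any locus
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


# Hypothetical one-locus example.
pop_a = [{"A": 0.5, "B": 0.5}]
pop_b = [{"A": 1.0}]
distance_ab = nei_standard_distance(pop_a, pop_b)
```

A matrix of such pairwise distances is the input expected by a UPGMA clustering routine.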
The patterns of spatial variation were analyzed using Pearson's coefficient of correlation (r) between Nei's genetic distance matrix (Nei, 1972) and the matrix of geographic distances between populations. The significance of this correlation was tested through Mantel's Z statistic (Mantel, 1967), using 9,999 random permutations.
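The matrix correlation and its permutation test can be sketched as below. This is an illustrative implementation, not the program used in the study: it permutes rows and columns of the genetic matrix together and tests the correlation r one-sided, which ranks permutations in the same way as Mantel's Z because the means and variances of the matrix entries do not change under permutation. The function names and the toy distance matrices are hypothetical.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, geographic, permutations=9999, seed=0):
    """Return (r, one-sided p-value) for two symmetric distance matrices."""
    rng = random.Random(seed)
    n = len(genetic)
    geo = upper_triangle(geographic)
    r_obs = pearson(upper_triangle(genetic), geo)
    order = list(range(n))
    as_extreme = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        # Relabel populations: rows and columns are permuted together.
        permuted = [[genetic[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), geo) >= r_obs - 1e-12:
            as_extreme += 1
    return r_obs, as_extreme / (permutations + 1)


# Toy example: five populations on a line, genetic distance proportional to geography.
positions = [0, 1, 3, 6, 10]
geo_matrix = [[abs(a - b) for b in positions] for a in positions]
gen_matrix = [[0.1 * d for d in row] for row in geo_matrix]
r_value, p_value = mantel_test(gen_matrix, geo_matrix, permutations=999)
```

With only ten populations, as in this study, the number of distinct relabelings is still far larger than 9,999, so random permutations rather than full enumeration are the practical choice.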

Transferability of SSR Eucalyptus primers to Eugenia dysenterica
The 356 primers tested were classified according to the quality obtained in the PCR: 2.8% (ten pairs of primers) amplified clear SSR products, 30.0% presented non-specific band amplification and 67.2% did not amplify any band (Zucchi et al., 2002).
The ten selected primer pairs were used for the population genetic structure study. First, all primers were submitted to the basic program with 56 °C for primer annealing. Those that did not amplify satisfactorily were submitted to amplification cycles with lower annealing temperatures. EMBRA 17 and EMBRA 134 amplified satisfactorily at 52 °C.
Table 2 shows the amplification conditions used for the ten SSR loci assessed and their respective allelic amplitudes. The greatest allelic amplitude was 167 base pairs. The allelic frequencies are shown in Table 3.

Genetic Variation
The average number of alleles per polymorphic locus was 10.43, with a range from three alleles (EMBRA 73 locus) to 22 alleles (EMBRA 14 locus).
Table 4 shows that the observed heterozygosity ranged from 0.253 (population 1) to 0.599 (population 10), with a mean of 0.458. The lowest expected heterozygosity was 0.276 for population 1, whereas population 9 presented the greatest expected heterozygosity (0.670). The mean value obtained was 0.442.
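For readers unfamiliar with these quantities, the sketch below shows how the number of alleles (A), the observed heterozygosity (Ho) and the expected heterozygosity (He) can be obtained from the genotypes at one locus. It is a minimal illustration: the small-sample correction 2n/(2n - 1) is an assumption about the estimator (it may or may not match what the GDA program applies), and the genotypes and function name are hypothetical.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Return (Ho, He, A) for a list of diploid genotypes at one locus."""
    n = len(genotypes)
    observed = sum(1 for a, b in genotypes if a != b) / n
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    sum_sq = sum((count / total) ** 2 for count in allele_counts.values())
    # Expected heterozygosity with a small-sample correction (assumed here).
    expected = (total / (total - 1)) * (1.0 - sum_sq)
    return observed, expected, len(allele_counts)


# Hypothetical genotypes scored as fragment sizes in base pairs.
sample = [(100, 100), (100, 104), (104, 104), (100, 104)]
ho, he, n_alleles = locus_diversity(sample)
```

Population-level values such as those in Table 4 are then averages of the per-locus values over the polymorphic loci.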
Fisher's exact test (Table 5) showed that some populations were not in Hardy-Weinberg equilibrium for most of the studied loci. For population 10 the test was significant for six of the seven loci, for which the Hardy-Weinberg equilibrium condition was therefore rejected.
[...] where populations 9 and 10 diverged genetically from the other populations. This population structure is congruent with the results obtained by Telles et al. (2001) using isoenzymes, with progeny data from the same populations. The matrix correlation between Nei's genetic distances and the respective geographic distances was high and positive (r = 0.872) and significant at the 1% probability level (Figure 3). This result indicates that the pattern of genetic variability among the populations is spatially structured. A similar result, but with a lower correlation (r = 0.725), was reported by Telles et al. (2001) with isoenzyme markers.

Discussion
Knowledge of the genetic variability distribution between and within natural Eugenia dysenterica populations is essential to adopt efficient strategies for ex situ and in situ germplasm conservation.
SSR markers are a powerful tool for this type of study. However, progress in using microsatellite markers has been hindered by the high cost and time required to develop species-specific primers. The chance of success of heterologous amplification of any DNA sequence by PCR is inversely related to the evolutionary distance between the two species. Many studies have shown, however, that primer pairs designed for one species can be used in another species of the same genus (Cipriani et al., 1999) or even of a different genus (Roa et al., 2000). The transferability of microsatellites among related species is a consequence of the homology of the DNA sequences in the regions that flank the microsatellites. Other studies on tropical trees have demonstrated a high rate of SSR locus transferability among taxonomically related tree species, as occurs within Leguminosae (Dayanandan et al., 1997), Meliaceae (White and Powell, 1997) and among Eucalyptus species (Brondani et al., 1998). In this study, only 10 of the 356 primer pairs developed for Eucalyptus spp. were found to be transferable between the two species (Zucchi et al., 2002). Roa et al. (2000) studied transferability from cassava (Manihot esculenta) to six different wild species of the genus Manihot. Only two of the eight amplified loci (that is, two primer pairs) failed to amplify in the two most distant wild cassava species. Several authors have shown that many microsatellite primers can be used for heterologous amplification in different species and genera (Byrne et al., 1996; Katzir et al., 1996; Isagi and Suhandono, 1997; Smulders et al., 1997; Steinkellner et al., 1997).
Dayanandan et al. (1997) used primer pairs developed for a tropical tree, Pithecellobium elegans, to detect SSR loci that amplified in other species of the same family (Leguminosae). Thirteen species from the Leguminosae family were used, 12 from the Mimosoideae subfamily and one from the Papilionoideae subfamily. The six primer pairs developed for P. elegans amplified successfully in species of the same genus and of different genera.
Regarding genetic variation, a relatively high level of multiallelism was observed at all seven polymorphic loci in the present study. The average number of alleles per locus was 10.4 and the mean expected heterozygosity reached 0.442, which is greater than the value found in a similar study with the same "cagaita tree" populations using isoenzyme markers.
The FIS value found here for the Eugenia dysenterica populations was negligible, which suggests that the species is predominantly allogamous. The value obtained with SSR markers (FIS = -0.017) contrasted with the value obtained by Telles (2000) with isozymes from seedlings of the same populations (FIS = 0.243). The conflicting results may be due to the use of data from different stages of development, since seedlings and adult plants could show different isozyme patterns, or to differences in the nature of the genetic markers used. Enzymes may be related to adaptive traits and subject to natural selection, whereas microsatellites are non-coding regions of the genome and are therefore assumed to be selectively neutral. Proença & Gibbs (1994) studied the reproductive biology of E. dysenterica and concluded that it is pollinated by large bees. The flowers open in the morning for one day and, following the pattern called "big bang", the plants flower intensely for a relatively short period. The seeds are dispersed mainly by monkeys and humans, although some other animals also perform this function (Ferreira & Cunha, 1980).
The apparent cross-fertilization rate estimated here was high (t̂ = 1.08) and greater than that found with isozyme markers and seedling data (t̂ = 0.83). Differences in pollinators or dispersing agents may also be responsible for this discrepancy.
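The text does not state how the apparent cross-fertilization rate was computed. A commonly used equilibrium relation is t = (1 - f) / (1 + f), where f is the fixation index. The sketch below is only a consistency check under that assumption: taking f = 1 - Ho/He from the mean heterozygosities reported in the Genetic Variation section (Ho = 0.458, He = 0.442) gives a value that rounds to the reported 1.08. The function name is hypothetical.

```python
def apparent_outcrossing_rate(fixation_index):
    """Equilibrium relation between the fixation index f and the outcrossing rate t."""
    return (1.0 - fixation_index) / (1.0 + fixation_index)


mean_ho = 0.458  # mean observed heterozygosity reported in the text
mean_he = 0.442  # mean expected heterozygosity reported in the text

# Fixation index from the heterozygosity means (an assumption about the method used).
f_from_means = 1.0 - mean_ho / mean_he
t_from_means = apparent_outcrossing_rate(f_from_means)

print(round(f_from_means, 3))  # -0.036
print(round(t_from_means, 2))  # 1.08
```

Values of t above 1 have no biological meaning as a rate; they simply reflect an excess of heterozygotes relative to Hardy-Weinberg expectations, that is, a negative f.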
For example, the trees analyzed with SSR markers are probably plants that had been part of the natural Cerrado and were, in most cases, 50 years old or more. On the other hand, the seedlings (progenies genotyped with isoenzymes) are recent plants resulting from recent pollination events performed by insect populations that are probably different from the insect populations of 50 years ago. Govindaraju (1989) distinguished three levels of gene flow: high (Nm > 1), intermediate (0.25 < Nm < 0.99) and low (Nm < 0.25). The value found here (Nm = 0.68) was therefore intermediate. As this flow was estimated on the basis of the RST parameter, it cannot be considered contemporaneous; it is rather a consequence of the genetic history of these populations. The restricted gene flow can also be explained by the behavior of the main pollinators, large bees, and by the flowering pattern of the "cagaita tree": flowering is rapid and abundant, so the pollinators do not need many flights to supply themselves with pollen, and pollen movement is thus restricted to short distances.
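The link between RST and Nm can be illustrated with the usual island-model approximation, Nm = (1/RST - 1) / 4. This is an assumption about the estimator, not a description of the RST Calc program, but with the reported RST = 0.268 it gives approximately 0.68, in line with the reported Nm. The sketch also encodes the three classes of Govindaraju (1989) as quoted in the text; values falling exactly on a class boundary are treated as intermediate here, which is a simplification. The function names are hypothetical.

```python
def nm_from_differentiation(differentiation):
    """Island-model approximation: Nm = (1 / RST - 1) / 4 (also used with FST)."""
    if not 0.0 < differentiation <= 1.0:
        raise ValueError("differentiation must be in (0, 1]")
    return (1.0 / differentiation - 1.0) / 4.0


def gene_flow_level(nm):
    """Classes quoted in the text from Govindaraju (1989)."""
    if nm > 1.0:
        return "high"
    if nm < 0.25:
        return "low"
    return "intermediate"


nm_estimate = nm_from_differentiation(0.268)  # RST reported in the text
level = gene_flow_level(nm_estimate)

print(round(nm_estimate, 2), level)  # 0.68 intermediate
```

Under this approximation Nm = 1 corresponds to RST = 0.2, so any RST above 0.2 implies fewer than one migrant per generation.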
Regarding the measure of diversity among populations, the RST and FST estimates were very similar. More results of this nature would be necessary to ascertain any tendency in the difference between RST and FST. These values were higher than those obtained with isozyme markers for the same populations, suggesting that microsatellites are more sensitive than isozymes for measuring differentiation among populations. This was also evident from the detection of private or exclusive alleles in this study, which were not detected in the data reported by Telles (2001).
Although this is a species with a high degree of allogamy, the estimated gene flow among the populations was relatively low, possibly as a consequence of human settlement of the Cerrado. The high frequency of exclusive alleles in populations 9 and 10 is noteworthy and may have been caused by genetic drift and the absence of gene flow. In fact, population 9 is located in an urban region (Table 1) and population 10, although natural, is completely isolated from the others.
Of the localities studied, all except areas 1, 8 and part of area 7 were in areas more strongly altered by human settlement of the Cerrado, either because they are located inside the urban area of Goiânia (area 9) or because they are located in planted pastures. Some areas were fertilized when these grasses were planted, for example areas 2, 4 and 10, which presented higher calcium (Ca), magnesium (Mg) and phosphorus (K) values than natural areas (Silva, 1999).
This species is widely used by human populations for its wood and fruit, it has a preferentially allogamous reproductive system, and it is pollinated mainly by large bees (Bombus sp.). Its demographic and biological characteristics, associated with habitat fragmentation, tend to produce a relatively high amount of genetic divergence among local populations (Loveless and Hamrick, 1984; Proença and Gibbs, 1994). Owing to the recent expansion of agricultural activities in the Cerrado, its high rates of biodiversity loss and its endemic species, the region is considered one of the world's hotspots for conservation (Myers et al., 2002). The region is highly fragmented by increased agricultural activities, and there is no clear information about the spatial distribution of Eugenia dysenterica trees before recent human occupation (Diniz-Filho and Telles, 2002).
Another hypothesis to be considered regarding the high rate of exclusive alleles in population 9 is that its genetic constitution could have been altered by human introductions. Slatkin (1985) described a methodology to assess gene flow from rare (or private/exclusive) alleles. The distribution of the frequencies of these rare alleles (alleles that appear in a single population) is used to estimate the mean number of migrants exchanged among local populations. The logarithm of Nm is approximately linearly related to the logarithm of the mean frequency of the private/exclusive alleles (Slatkin, 1985; Slatkin and Barton, 1986).
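The log-linear relation mentioned above can be sketched as follows. The regression constants (slope -0.505, intercept -2.44, for samples of 25 individuals per population) are quoted from memory of Slatkin (1985) and should be checked against that paper before use; the sample-size adjustment (multiplying by 25 over the mean sample size) is likewise an assumption of this sketch. The function name and the input frequencies are hypothetical, and no estimate is computed for the present data.

```python
import math

# Assumed regression constants for ln(p1) = SLOPE * ln(Nm) + INTERCEPT,
# recalled from Slatkin (1985) for 25 individuals sampled per population.
SLOPE = -0.505
INTERCEPT = -2.44
REFERENCE_SAMPLE_SIZE = 25


def nm_from_private_alleles(mean_private_freq, mean_sample_size=REFERENCE_SAMPLE_SIZE):
    """Estimate Nm from the mean frequency of private alleles, p(1)."""
    if not 0.0 < mean_private_freq <= 1.0:
        raise ValueError("mean_private_freq must be in (0, 1]")
    nm_uncorrected = math.exp((math.log(mean_private_freq) - INTERCEPT) / SLOPE)
    # Rough adjustment for sample sizes other than the reference size (assumed).
    return nm_uncorrected * REFERENCE_SAMPLE_SIZE / mean_sample_size


# Hypothetical inputs: more frequent private alleles imply less gene flow.
nm_rare = nm_from_private_alleles(0.05)
nm_common = nm_from_private_alleles(0.20)
```

The direction of the relation is the point of interest for the discussion: populations that carry private alleles at high frequency, such as populations 9 and 10 here, are those expected to exchange few migrants.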
The high correlation coefficient between the genetic and geographic distance matrices suggested that there is a spatial pattern of genetic variability among the populations. This structure probably originated from a stochastic differentiation process, with higher levels of gene flow among closer populations and decreasing flow as distances increased (isolation by distance).
Results indicated the potential for transferring SSR loci developed for Eucalyptus to species of other genera, as in the case of the "cagaita tree". The high genetic diversity among populations detected with SSR markers indicated that these markers are highly sensitive for detecting population structure. Results of the present study were not entirely congruent with those obtained with isozyme markers, especially with respect to gene flow and diversity. However, the estimated Nm being less than 1.0 and the existence of private alleles call attention to damage to the metapopulation structure that may have occurred in these populations.

Figure 1 - Number of private/exclusive alleles among the 73 alleles obtained at seven polymorphic SSR loci in ten Eugenia dysenterica populations.

Figure 2 - Genetic divergence pattern among ten "cagaita tree" populations, defined by UPGMA clustering based on the genetic identity obtained from Nei's genetic distances (1972). Cophenetic correlation equal to 0.943.

Figure 3 - Correlation (r) between the geographic distance matrix and the genetic distance matrix obtained with SSR markers.

Table 1 - Locations in the state of Goiás, number of "cagaita trees" sampled and respective geographic position.

Table 2 - Sequences of primer pairs developed for Eucalyptus* that amplified microsatellite loci in Eugenia dysenterica, with allele size range, number of alleles per locus (A), expected heterozygosity (He), observed heterozygosity (Ho) and annealing temperature (Ta). *Brondani et al. (1998).

Table 3 - Allelic frequencies of the seven SSR loci, estimated from 116 individuals of 10 populations of Eugenia dysenterica.