GENETIC STRUCTURE AND DIVERSITY OF Copaifera langsdorffii Desf. IN CERRADO FRAGMENTS OF SÃO PAULO STATE, BRAZIL

The loss of large areas of Cerrado (Brazilian savanna) in Brazil can lead to reduced biodiversity and to the extinction of species. Therefore, the present study aimed to investigate the genetic fragility of populations of Copaifera langsdorffii Desf. exposed to different anthropic conditions in fragments of Cerrado in the state of São Paulo. The study was carried out in two Experimental Stations operated by the Forest Institute (Assis and Itirapina), in one fully protected conservation unit (Pedregulho) and on one private property (Brotas). Analyses were conducted using leaf samples from 353 adult specimens and eight microsatellite primer pairs. The number of alleles per locus ranged from 13 to 15 in all populations, but the mean number of effective alleles was approximately half this value (7.2 to 9.1). Observed heterozygosity was significantly lower than expected in all populations. Consequently, all populations deviated from Hardy-Weinberg expected frequencies. Fixation indexes were significant for all populations, with the Pedregulho population having the lowest value (0.189) and Itirapina having the highest (0.283). The analysis of spatial genetic structure detected family structures at distance classes of 20 to 65 m in the populations studied. No clones were detected in the populations. Estimates of effective population size were low, but the area occupied by each population studied was large enough for conservation in the medium and long term. Recent reductions or bottlenecks were detected in all four populations. Mean Gst' (genetic divergence) indicated that most of the variation was within populations. Cluster structure analysis based on the genotypes detected K = 4 clusters with distinct allele frequency patterns. The genetic differentiation observed among populations is consistent with the hypothesis of genetic and geographic isolation. Therefore, it is essential to adopt conservation strategies that increase gene flow between fragments.


INTRODUCTION
The Cerrado (Brazilian savanna) is the second largest biome in Brazil. It is a biodiversity hotspot and represents an important conservation target (MITTERMEIER et al., 2005). Although it once covered 2 million km² of the Brazilian territory, approximately 39% of its original area has been completely destroyed (MACHADO et al., 2004), and only 3.18% of the remaining vegetation is protected in 48 federal, state and municipal conservation units (BRASIL, 2006). In the state of São Paulo, the percentage of land covered by Cerrado vegetation has decreased from 14 to 0.74% (KRONKA et al., 2005). The remaining fragments of Cerrado are isolated by agriculture (sugarcane) and other types of forest, or surrounded by grazing land.
The fragmentation of the Cerrado biome could potentially lead to decreased biodiversity and to the extinction of species. According to Antiqueira (2013), the genetic consequences of habitat fragmentation have led the scientific community to question the genetic vulnerability of populations and to reevaluate the strategies used in the conservation of endangered species.
Copaifera langsdorffii Desf. (Caesalpinioideae), known in Brazilian Portuguese as "copaíba," is a tropical tree species with wide distribution in the Brazilian savanna and Atlantic rain forest (CARVALHO, 2003; QUEIROZ; SILVA, 2013). It is hermaphroditic, and is pollinated by Apis mellifera and Trigona sp. bees as well as other insects. It has a predominantly outcrossing mating system with up to 8% of selfing (SEBBENN et al., 2011). Its seeds are dispersed by zoochory (OLIVEIRA et al., 2002). Copaíba trees grow slowly, and can reach up to 40 m in height and 100 cm in diameter, with some individuals living for up to 400 years (CARVALHO, 2003). Copaíba trees have significant economic value as a source of wood and resin oil, which is widely used by the pharmaceutical industry to produce cosmetics and phytotherapeutic remedies.
In the state of São Paulo, C. langsdorffii specimens in both private properties and Conservation Units are limited to small fragments isolated by intensive agriculture and cattle pasture (CARVALHO et al., 2010).
The goal of the present study was to characterize the intra- and inter-population genetic variation of Copaifera langsdorffii Desf. from four Cerrado regions in the state of São Paulo and to verify whether population distance and isolation can affect genetic differentiation.

Regions Studied
The present study was conducted in four Cerrado regions in the state of São Paulo (Table 1). Three of these (Assis, Itirapina and Pedregulho) are operated by the Forest Institute of São Paulo, while one is on a private property (Marimbondo Farm, Brotas). The study site in Assis (22°35'14''S, 50°22'38''W, 550 m altitude) is surrounded by sugarcane plantations, grazing land and Pinus and Eucalyptus forests. Samples were collected in an area of 0.9 hectares. A similar environment surrounds the fragment located in Marimbondo Farm, in Brotas (22°22'18''S, 48°01'24''W, 650 m altitude, 400 ha). However, the anthropic pressure is particularly high in this area, and the fragment itself is often used as grazing land. The sampling area in Brotas covered approximately 1.6 hectares.
In Itirapina (22°13'03''S, 47°50'15''W, 750 m altitude), the 2.3 ha sampling area was also located within an Experimental Station. Like the other forest fragments, this Experimental Station is surrounded by Pinus, Eucalyptus and sugarcane plantations. The study site in Pedregulho (Furnas do Bom Jesus State Park, 20°13'40''S, 47°26'18''W, 1042 m altitude) covers 5 hectares and is the only fully protected Conservation Unit. However, it is especially vulnerable to wildfires, the last of which (in 2011, unknown cause) burned 500 ha of native vegetation.
All study sites can be classified as Cerrado sensu stricto and Cerradão, according to the categories proposed by Ribeiro and Walter (1998). According to Köppen's classification (ROLIM et al., 2007), all sampled areas are located in Cfa climate zones (humid subtropical climate, no dry season), except for Pedregulho, which has an Aw climate (tropical climate with dry winters). The distances between all fragments are listed in Table 3.

Sampling and DNA Extraction
Leaf samples from 100 adult specimens of C. langsdorffii were collected in each area except for Pedregulho, where only 53 specimens were located. All individuals were georeferenced. Distances between individuals ranged from 5 to 47 m in Assis, 5 to 48 m in Brotas, 5 to 58 m in Itirapina and 7 to 68 m in Pedregulho. Total genomic DNA extraction was carried out according to Doyle and Doyle (1990) with modifications described by Ferreira and Grattapaglia (1998). Loci were amplified using eight microsatellite primer pairs designed for C. langsdorffii by Ciampi et al. (2000).
DNA fragments were amplified and separated using denaturing polyacrylamide gel electrophoresis in 1X TBE buffer for one hour and thirty minutes. A silver nitrate stain (CRESTE et al., 2001) was used to observe banding patterns. Allele size was calculated using a molecular weight standard (10 bp ladder, Invitrogen®). Fragments of different sizes were considered different alleles.

Data Analysis
The SPAGeDi software, version 1.3 (HARDY; VEKEMANS, 2002), was used to estimate genetic diversity parameters and to analyze spatial genetic structure at the population level. The following parameters were considered: number of alleles per locus (A), effective number of alleles per locus (Ae), presence of exclusive alleles (Aex), mean observed (Ho) and expected (He) heterozygosity, and fixation indices (f) according to Weir (1996). The same software was used to analyze spatial genetic structure within populations, based on coancestry (kinship) coefficients estimated for each allele in each pair of individuals, x and y, within the defined distance classes, using the method described by Loiselle et al. (1995). Statistical significance was assessed by 10,000 permutations of genes at the Bonferroni-corrected 5% significance level. Means for each locus were compared between populations using 95% confidence intervals obtained by jackknifing over loci.
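As an illustration of how the diversity parameters listed above are defined, the following minimal Python sketch computes A, Ae, Ho, He and f for a single locus and identifies exclusive alleles across populations. It is not the SPAGeDi implementation: the function names, the genotype encoding (a pair of allele sizes per individual) and the use of the uncorrected expected heterozygosity (He = 1 minus the sum of squared allele frequencies) are assumptions made for this example, and the genotypes shown are hypothetical.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Diversity statistics for one locus.

    genotypes: list of (allele_1, allele_2) tuples, one per diploid individual
    (hypothetical encoding, e.g. allele sizes in base pairs).
    """
    n = len(genotypes)
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    sum_p2 = sum(p * p for p in freqs)

    a = len(counts)                                    # A: number of alleles
    ae = 1.0 / sum_p2                                  # Ae: effective number of alleles
    ho = sum(1 for x, y in genotypes if x != y) / n    # Ho: observed heterozygosity
    he = 1.0 - sum_p2                                  # He: no small-sample correction (assumption)
    f = 1.0 - ho / he if he > 0 else 0.0               # f: fixation index
    return {"A": a, "Ae": ae, "Ho": ho, "He": he, "f": f}


def exclusive_alleles(alleles_by_population):
    """Return, for each population, the alleles found in no other population."""
    result = {}
    for pop, alleles in alleles_by_population.items():
        others = set()
        for other_pop, other_alleles in alleles_by_population.items():
            if other_pop != pop:
                others |= set(other_alleles)
        result[pop] = set(alleles) - others
    return result


if __name__ == "__main__":
    # Hypothetical genotypes at one locus for four individuals.
    sample = [(150, 150), (150, 152), (152, 152), (150, 152)]
    print(locus_diversity(sample))
    print(exclusive_alleles({"pop_1": {150, 152, 154}, "pop_2": {152, 154, 156}}))
```

A positive f, as reported for all four populations, corresponds to Ho falling below He, that is, to an excess of homozygotes.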
Adherence to Hardy-Weinberg equilibrium was tested with Fisher's exact test in the TFPGA software (MILLER, 1997), which was also used to calculate the genetic distance between populations according to Nei (1987). The presence of clones was assessed using the CERVUS 3.0 software (KALINOWSKI et al., 2007). The Bottleneck software (PIRY et al., 1999) was used to identify recent reductions in effective population size based on the excess of heterozygosity. Analyses were based on the Stepwise Mutation (OHTA; KIMURA, 1973), Infinite Allele (KIMURA; CROW, 1964) and Two Phase (DI RIENZO et al., 1994) models, using the Wilcoxon test with 1,000 iterations (LUIKART; CORNUET, 1998), as recommended for analyses with fewer than 20 SSR loci (PIRY et al., 1999).
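The rank-based step of such a bottleneck test can be sketched as follows. This is an illustrative exact one-tailed Wilcoxon signed-rank test for heterozygosity excess across loci, not the code of the Bottleneck software. It assumes that the heterozygosity expected at mutation-drift equilibrium (called heq here) has already been obtained for each locus by simulation under a chosen mutation model, which is not shown. The function name and the example values are hypothetical.

```python
from itertools import product


def wilcoxon_excess_p(he, heq):
    """Exact one-tailed p-value for the hypothesis that He exceeds Heq across loci.

    he:  observed expected heterozygosity per locus
    heq: equilibrium heterozygosity per locus (assumed to come from simulations)
    """
    diffs = [a - b for a, b in zip(he, heq) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Observed statistic: sum of ranks of loci showing heterozygosity excess.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Null distribution: every sign pattern is equally likely (2**n patterns).
    as_extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if w >= w_plus:
            as_extreme += 1
    return as_extreme / 2 ** n


if __name__ == "__main__":
    # Hypothetical values for eight loci.
    he = [0.80, 0.82, 0.84, 0.86, 0.88, 0.90, 0.92, 0.94]
    heq = [0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86]
    print(wilcoxon_excess_p(he, heq))
```

With only eight loci the full enumeration covers 256 sign patterns, so an exact p-value is cheap to compute and no normal approximation is needed.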
Effective population size was calculated as in Cockerham (1969), and the minimum viable area (MVA) for in situ genetic conservation was estimated as a function of the effective population sizes suggested by Lynch (1996), as recommended by Whittaker and Fernández-Palacios (2007). It is considered that MVA = Ne(ref) / [d × (Ne/n)], where: Ne(ref) = reference effective population size; Ne/n = ratio of effective population size to sample size; d = density of individuals per hectare. Genetic divergence (Gst') was calculated using the FSTAT software (GOUDET, 2001) by applying the Hedrick (2005) correction, which is the most suitable parameter for microsatellites. Apparent gene flow between populations was estimated according to Crow and Aoki (1984).
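For readers unfamiliar with the divergence and gene flow estimators named above, a minimal sketch is given below. It assumes the commonly cited forms of these estimators: Gst = (Ht - Hs) / Ht, the Hedrick (2005) standardization Gst' = Gst (k - 1 + Hs) / [(k - 1)(1 - Hs)] for k populations, and the Crow and Aoki (1984) relation Nm = (1/Gst - 1) / (4α) with α = [n / (n - 1)]² for n populations. The function names and input values are hypothetical, and the text does not state which divergence estimate was used in the gene flow calculation; the uncorrected Gst is used here as an assumption.

```python
def gst(hs, ht):
    """Proportion of total gene diversity (ht) found among populations."""
    return (ht - hs) / ht


def hedrick_gst(gst_value, hs, k):
    """Gst standardized by its maximum for k populations (Hedrick-type correction)."""
    return gst_value * (k - 1 + hs) / ((k - 1) * (1 - hs))


def crow_aoki_nm(gst_value, n_pops):
    """Apparent number of migrants per generation for n_pops populations."""
    alpha = (n_pops / (n_pops - 1)) ** 2
    return (1.0 / gst_value - 1.0) / (4.0 * alpha)


if __name__ == "__main__":
    # Hypothetical within-population (hs) and total (ht) gene diversities.
    hs, ht, k = 0.85, 0.92, 4
    g = gst(hs, ht)
    print("Gst  =", round(g, 4))
    print("Gst' =", round(hedrick_gst(g, hs, k), 4))
    print("Nm   =", round(crow_aoki_nm(g, k), 4))
```

The standardization matters for microsatellites because high within-population diversity (Hs close to 1) caps the uncorrected Gst at a low value even when populations share few alleles.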
The cluster (K) structure was analyzed using the Structure software (PRITCHARD et al., 2000), which uses individual genotype information to identify genetic clusters, by calculating the ΔK values described by Evanno et al. (2005).
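The ΔK criterion can be illustrated with a short sketch. It assumes that several replicate Structure runs are available for each K, each giving a log-likelihood value, and it computes ΔK as the absolute second-order difference of the mean log-likelihoods divided by the standard deviation across replicates at K. The function name and the log-likelihood values are hypothetical and are not results from this study.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Compute a ΔK value for every K that has neighbours K - 1 and K + 1.

    log_likelihoods: dict mapping K to a list of log-likelihoods from replicate runs.
    """
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 in log_likelihoods and k + 1 in log_likelihoods:
            second_difference = abs(
                mean(log_likelihoods[k + 1])
                - 2 * mean(log_likelihoods[k])
                + mean(log_likelihoods[k - 1])
            )
            spread = stdev(log_likelihoods[k])
            result[k] = second_difference / spread if spread > 0 else float("inf")
    return result


if __name__ == "__main__":
    # Hypothetical replicate log-likelihoods for K = 1 to 6.
    runs = {
        1: [-10500.0, -10500.5, -10499.5],
        2: [-9800.0, -9820.0, -9780.0],
        3: [-9000.0, -9010.0, -8990.0],
        4: [-8000.0, -8002.0, -7998.0],
        5: [-7950.0, -7960.0, -7940.0],
        6: [-7920.0, -7940.0, -7900.0],
    }
    values = delta_k(runs)
    print(values)
    print("K with the highest ΔK:", max(values, key=values.get))
```

The K with the highest ΔK is taken as the most likely number of clusters; ΔK cannot be computed for the smallest and largest K tested because each lacks one neighbour.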

RESULTS
The number of alleles per locus ranged from 13 to 15 in all populations, but the mean effective number of alleles was approximately half this value (7.2 to 9.1). Exclusive alleles were also detected. The Assis population had the lowest number of exclusive alleles (1 allele, locus CL27) while Pedregulho had the highest (17 alleles, loci CL1, CL6, CL20, CL32, CL34 and CL39). Observed heterozygosity (Ho) was significantly lower than expected (He) in all populations. Consequently, all populations deviated from Hardy-Weinberg Equilibrium (HWE), except at the CL1, CL6 and CL20 loci in the Assis population and the CL32 locus in the Pedregulho population. Fixation index values were significant for all populations, with the Pedregulho population having the lowest value (0.189) and Itirapina having the highest (0.283), as displayed in Table 1.
The analysis of spatial genetic structure (Figure 1) detected significant family structures at distance classes of 20 m in Assis, 25 m in Brotas, 65 m in Itirapina and 45 m in Pedregulho, with a tendency toward spatial structure also observed in distance classes of 80 to 90 m. Pedigree formation was observed at these distances. Beyond the distance classes described above, genotypes were randomly distributed.
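How a correlogram of this kind is assembled can be sketched in a few lines. The example below is a simplified single-locus illustration and not the SPAGeDi implementation. It assumes the commonly cited form of the Loiselle et al. (1995) coefficient for a pair of individuals, the sum over alleles of (p_i - p)(p_j - p) divided by the sum over alleles of p(1 - p), plus a sampling correction of 1/(2N - 1) for N sampled individuals. The locus is assumed to be polymorphic. The function names, coordinates, genotypes and distance class limits are hypothetical, and the permutation step used to obtain confidence intervals is not shown.

```python
from collections import Counter
from itertools import combinations
from math import hypot


def kinship_one_locus(geno_i, geno_j, freqs, n_genes):
    """Pairwise coancestry at one locus (simplified, single-locus form)."""
    numerator = 0.0
    denominator = 0.0
    for allele, p in freqs.items():
        p_i = geno_i.count(allele) / 2   # allele frequency within individual i
        p_j = geno_j.count(allele) / 2   # allele frequency within individual j
        numerator += (p_i - p) * (p_j - p)
        denominator += p * (1 - p)
    return numerator / denominator + 1.0 / (n_genes - 1)


def correlogram(coords, genotypes, upper_bounds):
    """Mean pairwise coancestry per distance class.

    coords:       list of (x, y) positions in metres
    genotypes:    list of (allele_1, allele_2) tuples, same order as coords
    upper_bounds: increasing upper limits of the distance classes, in metres
    """
    n_genes = 2 * len(genotypes)
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = {allele: c / n_genes for allele, c in counts.items()}

    sums = [0.0] * len(upper_bounds)
    pairs = [0] * len(upper_bounds)
    for i, j in combinations(range(len(genotypes)), 2):
        distance = hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
        for c, bound in enumerate(upper_bounds):
            if distance <= bound:
                sums[c] += kinship_one_locus(genotypes[i], genotypes[j], freqs, n_genes)
                pairs[c] += 1
                break
    return [s / k if k else None for s, k in zip(sums, pairs)]


if __name__ == "__main__":
    # Hypothetical data: two pairs of identical neighbours, 100 m apart.
    coords = [(0, 0), (0, 5), (100, 0), (100, 5)]
    genotypes = [(150, 150), (150, 150), (152, 152), (152, 152)]
    print(correlogram(coords, genotypes, [10, 200]))
```

Positive mean coancestry in the shortest distance classes, declining with distance, is the pattern read as family structure.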

Clones were not detected in any population.
Estimates of effective population size (Ne) were low in all populations, and ranged from 14 to 21 individuals. The estimated minimum viable area (MVA) for in situ genetic conservation, based on the estimated effective population size, sample size and density of individuals per hectare, ranged from 3.8 to 397 ha for a short-term effective population size of 50 (MVA50). Medium (MVA500) and long term (MVA1000) estimates were between 38 and 3,970 ha, and between 76 and 7,936 ha, respectively (Table 2).
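The MVA relation given in the Data Analysis section can be worked through with a short sketch, reading it as the reference effective size divided by the product of tree density and the Ne/n ratio. The function name and the input values are hypothetical and are not the values of Table 2; only the reference sizes of 50, 500 and 1000 come from the text.

```python
def minimum_viable_area(ne_ref, density, ne, n):
    """Area in hectares needed to retain a reference effective population size.

    ne_ref:  reference effective population size (50, 500 or 1000)
    density: individuals per hectare
    ne:      estimated effective population size of the sample
    n:       number of individuals sampled
    """
    return ne_ref / (density * (ne / n))


if __name__ == "__main__":
    # Hypothetical population: 2 trees per hectare, Ne = 20 from n = 100 samples.
    for ne_ref in (50, 500, 1000):
        area = minimum_viable_area(ne_ref, density=2.0, ne=20, n=100)
        print(f"MVA{ne_ref} = {area:.1f} ha")
```

Because the relation is linear in the reference size, medium-term and long-term estimates are 10 and 20 times the short-term estimate, which matches the proportions among the reported ranges.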
The analyses of reductions in population size using the Stepwise Mutation Model (SMM) did not detect bottlenecks in the examined populations. However, bottleneck effects were observed in all four populations when the Infinite Allele Model (IAM) and the Two Phase Model (TPM) were used (p < 0.05, Table 2).

ANTIQUEIRA, L.M.O.R. et al.
Genetic divergence among populations (HEDRICK, 2005) was high (0.82 to 1.11), with the lowest divergence observed between Brotas and Pedregulho, and the highest between Assis and Pedregulho (Table 3). Cluster analyses identified K = 4 clusters with distinct allele frequencies (Figure 2a), although some of the genotypes observed in Brotas and Itirapina were similar to each other (Figure 2b). However, in spite of the smaller genetic distance between these two populations, they are different enough to be considered distinct from each other.

DISCUSSION
The analyses of genetic diversity revealed an average of over ten alleles per locus, although the effective number of alleles was low. These low values may be attributable to the presence of private and rare alleles, which may indicate restricted gene flow and the beginning of population genetic differentiation. The highest number of private and rare alleles was found in the Pedregulho population, followed by the populations of Itirapina and Brotas.
Observed heterozygosity values were lower than expected in all populations, indicating an excess of homozygotes. All populations deviated from Hardy-Weinberg Equilibrium (HWE). The deviations may have been caused by selfing and crossing among relatives. Fixation index values were high in Brotas and Pedregulho (0.198 and 0.189, respectively), and even higher in Assis and Itirapina (over 0.230).
The analysis of spatial genetic structure (SGS) in populations of C. langsdorffii suggested isolation by distance. Sebbenn et al. (2011) also report the occurrence of SGS in populations of C. langsdorffii studied in São Paulo State. According to Loiselle et al. (1995), the formation of genetic structure could be attributable to limited seed dispersal or vegetative reproduction. The latter possibility should, however, be discarded, as clones were not detected in the populations studied.
The effective population size (the number of unrelated, non-inbred individuals) was low. The analysis of effective population size changes using the IAM and the TPM identified genetic bottlenecks in all four populations; that is, heterozygote deficits in these populations were observed in a significant number of loci. These findings were not detected by analyses using the SMM. According to Kimura and Crow (1964), the excess or deficit of heterozygosity has only been demonstrated for loci evolving under the IAM. In loci evolving under the SMM (OHTA; KIMURA, 1973), deviations in heterozygosity may not always be detected (CORNUET; LUIKART, 1997). Di Rienzo et al. (1994) suggested that the TPM is more suitable for the analysis of microsatellite loci, as it incorporates the SMM but also accounts for wide variation in the magnitude of mutations by considering that some mutations involve more than one repeat unit. According to Lee et al. (2002), it is important to identify populations experiencing recent bottlenecks, as they may be at risk of extinction. Luikart et al. (1998) also suggest that when bottlenecks are detected early, their deleterious effects are more likely to be successfully reduced or avoided altogether.
Genetic divergence between populations was high, indicating strong genetic isolation between populations. Sebbenn et al. (2011) and Tarazi et al. (2013) found similar results, suggesting that the spatial isolation of populations by habitat fragmentation may reduce genetic diversity and effective population size, restrict pollen and seed gene flow and increase the SGS of new generations. This result was complemented by the genetic structure analysis, which defined each population as a discrete cluster, although Brotas shared some similarities with Itirapina. However, it is important to note that these were the two geographically closest populations, and they probably experienced a high level of gene flow. The genetic distance dendrogram also suggested that these populations were the least genetically distant.

CONCLUSIONS
The high genetic differentiation observed among populations is consistent with the hypothesis of genetic and geographic isolation. This isolation, coupled with the anthropic pressure that causes fragmentation of the remaining Cerrado, could be causing the decrease in genetic diversity. It is important to repeat this study in other populations of Copaifera langsdorffii Desf. to check whether the results hold. Likewise, it is essential to adopt conservation strategies that increase gene flow between fragments.

Figure 1 - Correlogram of coancestry coefficients in nine distance classes in the four populations of C. langsdorffii. Dashed lines indicate the lower and upper limits of the 95% confidence interval, while solid lines indicate coancestry coefficients estimated according to Loiselle et al. (1995).

Table 1 - Description of SSR loci in the populations of Assis, Brotas, Itirapina and Pedregulho, SP. A = number of alleles; Ae = effective number of alleles; Aex = number of exclusive alleles; Ho = observed heterozygosity; H

Table 2 - Estimates of effective population size and minimum viable area (MVA) for in situ genetic conservation of C. langsdorffii Desf., and Wilcoxon test for bottlenecks in the populations of Assis, Brotas, Itirapina and Pedregulho.