FOREST MANAGEMENT ECOLOGY Genetic characterization of remaining populations of Paratecoma peroba, an endangered and endemic species of the Atlantic Forest

Background: The objective of this study was the genetic characterization of remaining populations of Paratecoma peroba occurring in fragments of the Atlantic Forest, by estimating parameters of genetic diversity and structure. The study was carried out in two forest fragments, the Atlantic Forest Environmental Education Pole (area 1) and the Pacotuba National Forest (area 2), where 93 adult trees were identified. Results: Ten Inter Simple Sequence Repeat (ISSR) primers were used in genotyping, yielding 112 amplified bands with 87.5% polymorphism. The genetic diversity estimated for the populations from the Nei (H*) and Shannon (I*) indices was higher for area 1. For the joint data, moderate genetic diversity was observed, with average values of 0.26 and 0.40 for the H* and I* indices, respectively. Analysis of molecular variance indicated moderate differentiation between populations (ΦST = 0.143), while gene flow analysis (Nm = 6.69) indicated that the populations share alleles. However, the predominance of a single genetic group in area 2, revealed by the Bayesian approach, indicates genetic structuring, possibly generated by the current fragmentation of the Atlantic Forest and the distancing of populations, which affect contemporary gene flow. Conclusion: Despite the moderate genetic diversity of the species, for the area 2 population, actions such as the introduction of seedlings obtained from seeds from neighboring fragments, including area 1, and increased connectivity of forest fragments through ecological corridors could help to augment its genetic variation.


INTRODUCTION
The fragmentation of tropical forests caused by the advance of anthropic activities has threatened the conservation of biodiversity, modifying the composition and structure of the landscape. The causes include fires, agriculture, population density, selective logging and climate change (Pimm et al., 2014).
In general, plants are known for their ability to survive and reproduce under a range of environmental disturbances; however, this persistence is closely related to the diversity and genetic structure of their populations (Basey et al., 2015). Knowledge of these parameters has been highlighted in decision making, as they reflect the presence of different alleles in the gene pool distributed in time and space and, therefore, of different genotypes within or among populations (Torres-Florez et al., 2018).
The Atlantic Forest is characterized by the intense fragmentation of its vegetation cover, which has the potential to reduce the genetic diversity of species and increase their population structure (Almeida, 2016). For tree species, which in general are predominantly allogamous, forest fragmentation tends to result in spatial distancing and, consequently, reproductive isolation (Worsham et al., 2017). In addition, evolutionary factors such as genetic drift tend to be more pronounced in small populations, increasing species vulnerability (Vencovsky et al., 2003; Basey et al., 2015).
Among the tree species native to Brazil and endemic to the Atlantic Forest, Paratecoma peroba (Record) Kuhlm. stands out. Popularly known as peroba do campo, it belongs to the Bignoniaceae family and is distributed in the states of Espírito Santo, Minas Gerais and Rio de Janeiro, Brazil (Lohmann, 2020). It has ecological potential because it is an early or late secondary species and an emergent tree, and it can be used in forest recovery and restoration projects. It also has economic potential, recognized in the wood sector, mainly for the manufacture of fine furniture (CNCFlora, 2022).
Due to its timber potential, P. peroba was intensively exploited and is classified as an endangered species in the red book of flora in Brazil (Martinelli and Moraes, 2013) and in the book of fauna and flora threatened with extinction in the state of Espírito Santo (Fraga et al., 2019). Currently, it is considered extinct in the state of Rio de Janeiro, and it is estimated that the effective number of individuals of the species is approximately 8,500 trees (CNCFlora, 2022).
Despite the risk of extinction of the species, no studies of genetic characterization involving its remaining populations have been found so far. For this purpose, molecular markers are efficient tools, as they capture variation at the DNA level and can therefore inform the maintenance and conservation of species (Filippos, 2016).
Among the molecular markers, inter simple sequence repeats (ISSR) stand out, being widely used in genetic studies of forest species (Rajasekharan et al., 2017; Silva Júnior et al., 2020). For this first molecular and population genetic characterization of P. peroba, ISSR markers are useful because no knowledge of the genome sequence of the species is required. In addition, they are dominant, multilocus markers with high rates of polymorphism (Ng and Tan, 2015).
Therefore, the objective of this research was the genetic characterization of populations of P. peroba occurring in Atlantic Forest remnants, by estimating parameters of genetic diversity and structure. This information will allow the establishment of conservation strategies, such as the identification of potential seed trees (matrices) for seed collection and the production of genetically divergent seedlings.

Study area and population sampling
The study was carried out in two Atlantic Forest fragments located in the south of Espírito Santo state: the Atlantic Forest Environmental Education Pole (20°45' S and 41°27' W), called fragment 1 (area 1), and the Pacotuba National Forest (20°45' S and 41°17' W), called fragment 2 (area 2). Forest fragment 1 has an area of 109.6 hectares (ha) and forest fragment 2 has 449.44 ha; they are separated by a distance of approximately 17 km. Both areas have Seasonal Semideciduous Forest vegetation and are subject to different anthropic disturbances. Area 1 is monitored by the Instituto Federal de Educação, Ciência e Tecnologia do Espírito Santo (IFES - Campus Alegre) and is impacted by selective logging and agricultural activities (Paschoa, 2016), while area 2 is a Conservation Unit (UC) where sustainable exploitation practices are allowed, with pasture areas in the surroundings and occasional entry of cattle at the edges (Icmbio, 2022) (Figure 1).
Ninety-three adult trees of P. peroba were sampled, 62 individuals in area 1 and 31 in area 2, from which leaf samples were collected. Sampled individuals were at least 50 m apart, in order to avoid sampling related trees.

Genotyping with ISSR markers
Genotyping was performed with ten ISSR primers produced by the University of British Columbia, Vancouver, Canada. The extracted genomic DNA was partitioned into aliquots at a concentration of 10 ng/µL and later used in polymerase chain reactions (PCR). The total reaction volume was 20 µL, containing 1X buffer (10 mM Tris-HCl (pH 8.5) and 50 mM KCl), MgCl2 (2.5 mM), dNTP (1 mM), primer (0.2 µM), 1 unit of Taq DNA polymerase and 50 ng of genomic DNA.
Amplifications were performed in a thermocycler (Applied Biosystems, model Veriti), involving initial denaturation (5 min at 94 °C), followed by 35 cycles of denaturation (94 °C for 45 s), annealing (52 °C for 45 s) and extension (72 °C for 90 s). At the end of the cycles, there was a final extension at 72 °C for 7 min.
The amplification products were separated by electrophoresis on a 2% agarose gel with 1X TBE buffer (10.8 g/L Tris-base; 5.5 g/L boric acid; 0.83 g/L EDTA) at 100 V for 4 hours. The gels were submerged in ethidium bromide solution (0.50 µg/mL) for 30 min and photographed under UV light in a photodocumentation system (ChemiDoc MP Imaging System - Bio Rad), and band sizes were determined by comparison with a 100 bp ladder marker.
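As a quick arithmetic check on the protocol described above, the following sketch derives the template volume implied by the stated stock concentration and DNA mass, together with the nominal cycling time. Ignoring the ramping time between temperatures is an assumption, and the variable names are illustrative only.

```python
# Template volume implied by 50 ng of DNA drawn from a 10 ng/µL stock.
dna_mass_ng = 50
stock_conc_ng_per_ul = 10
template_volume_ul = dna_mass_ng / stock_conc_ng_per_ul

# Nominal cycling time: hold times only, temperature ramping ignored (assumption).
initial_denaturation_s = 5 * 60
cycle_s = 45 + 45 + 90  # denaturation + annealing + extension
n_cycles = 35
final_extension_s = 7 * 60
total_min = (initial_denaturation_s + n_cycles * cycle_s + final_extension_s) / 60

print(f"template volume: {template_volume_ul} µL of the 20 µL reaction")
print(f"nominal run time: {total_min} min")
```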

Statistical analysis
Visual analysis of the gels was used to generate a binary matrix, coded 1 for the presence of a band and 0 for its absence. Descriptive statistics were then calculated: number of total bands (NTB), number of polymorphic bands (NPB), percentage of polymorphic bands (PPB), private markers per primer and per population (PM Area1 and PM Area2), and size variation of the fragments generated in base pairs (SVF).
The ISSR markers were evaluated by analyzing the optimal number of fragments necessary for this study. In addition, the informativeness of the markers was estimated through the polymorphic information content (PIC), calculated for each primer in the respective populations and also for the pooled data. To perform these analyses, the Genes program was used (Cruz, 2016).
Genetic diversity parameters were estimated for individual populations and for pooled data using the Popgene program (Yeh and Boyle, 1997). The number of observed alleles (AO), number of effective alleles (AE), Nei's genetic diversity index (H*) (Nei, 1978) and Shannon's index (I*) (Shannon and Weaver, 1949) were calculated.
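The descriptive statistics listed above can be illustrated with a minimal sketch. The function name `band_summary`, the toy matrix and the population labels are hypothetical, and a private marker is taken here to be a band present in one population and absent from all others, which is an assumption about the definition used.

```python
def band_summary(matrix, populations):
    """Summarise a binary band matrix (rows = individuals, columns = bands)."""
    n_bands = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_bands)]

    # A band counts toward NTB if it was amplified in at least one individual.
    ntb = sum(1 for col in columns if any(col))
    # A band is polymorphic if it is present in some individuals and absent in others.
    npb = sum(1 for col in columns if any(col) and not all(col))
    ppb = 100.0 * npb / ntb

    # Private markers: bands present in exactly one population.
    private = {pop: 0 for pop in sorted(set(populations))}
    for col in columns:
        present_in = {pop for value, pop in zip(col, populations) if value == 1}
        if len(present_in) == 1:
            private[present_in.pop()] += 1
    return {"NTB": ntb, "NPB": npb, "PPB": ppb, "PM": private}


# Hypothetical toy data: four individuals scored for four bands.
toy_matrix = [
    [1, 1, 0, 1],  # area 1
    [1, 0, 0, 1],  # area 1
    [1, 1, 1, 0],  # area 2
    [1, 0, 1, 0],  # area 2
]
toy_pops = ["area1", "area1", "area2", "area2"]
summary = band_summary(toy_matrix, toy_pops)
print(summary)
```

With the toy data, the first band is monomorphic, the second is polymorphic but shared, and the last two are private to one area each.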
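For illustration, a commonly used expression for the PIC of a dominant marker is PIC = 2f(1 - f), where f is the frequency of band presence; it reaches its maximum of 0.5 at f = 0.5. Whether the Genes program uses exactly this expression is an assumption here, and the function names below are hypothetical.

```python
def pic_dominant(band_column):
    """PIC of one dominant locus from its 0/1 scores, assuming PIC = 2f(1 - f)."""
    f = sum(band_column) / len(band_column)  # frequency of band presence
    return 2 * f * (1 - f)


def mean_pic(matrix):
    """Average PIC over all loci of a binary matrix (rows = individuals)."""
    n_loci = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_loci)]
    return sum(pic_dominant(col) for col in columns) / n_loci


# Hypothetical scores: a band present in one of four individuals.
print(pic_dominant([1, 0, 0, 0]))
```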
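The following is a minimal sketch of how these parameters can be derived for a single dominant locus. It assumes Hardy-Weinberg equilibrium to estimate the null-allele frequency from band absence, uses the natural logarithm for Shannon's index, and applies no small-sample correction, so it is not necessarily what Popgene computes. All function names are hypothetical.

```python
import math


def null_allele_frequency(band_column):
    """Estimate q (null allele) as the square root of the band-absent fraction."""
    absent = band_column.count(0)
    return math.sqrt(absent / len(band_column))


def effective_alleles(q):
    """Effective number of alleles, AE = 1 / (p^2 + q^2)."""
    p = 1 - q
    return 1 / (p ** 2 + q ** 2)


def nei_diversity(q):
    """Nei's gene diversity, H = 1 - (p^2 + q^2)."""
    p = 1 - q
    return 1 - (p ** 2 + q ** 2)


def shannon_index(q):
    """Shannon's index, I = -sum(p_i * ln p_i), skipping zero frequencies."""
    p = 1 - q
    return -sum(x * math.log(x) for x in (p, q) if x > 0)


# Hypothetical locus: band absent in one of four individuals, so q = 0.5.
q = null_allele_frequency([1, 1, 1, 0])
print(q, effective_alleles(q), nei_diversity(q), shannon_index(q))
```

Population-level values would then be averages of these per-locus estimates over all loci.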
The genetic dissimilarity, considering the total sample, was estimated between pairs of individuals using the arithmetic complement of the Jaccard coefficient. The dissimilarity values were converted into a numerical matrix and, later, a dendrogram was built using the Unweighted Pair-Group Method using Arithmetic Averages (UPGMA), with a cut-off point estimated according to Mojena (1977), with the coefficient k = 1.25. To determine the consistency between the dissimilarity values and their representation in the dendrogram, the cophenetic correlation coefficient (CCC) was calculated. These analyses were performed using the Genes program (Cruz, 2016); however, to obtain the circular dendrogram, the dissimilarity matrix was exported to the R software (R Core Team, 2020), applying the packages vegan (Oksanen et al., 2018), cluster (Maechler et al., 2019), dendextend
Genetic structure was evaluated using the Genes program (Cruz, 2016) to perform the analysis of molecular variance (Amova) with two hierarchical levels, that is, between and within populations. Gene flow between populations was also calculated, assuming Nm = [(1/FST) - 1]/4, using Arlequin 3.5 software (Excoffier and Lischer, 2010).
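A two-level Amova can be sketched from pairwise squared distances as follows. This is a minimal illustration of the standard sums-of-squared-deviations partition, not the implementation in Genes or Arlequin; the function names and toy profiles are hypothetical, and squared Euclidean distance between binary profiles is an assumed choice of metric.

```python
from itertools import combinations


def squared_distance(a, b):
    """Squared Euclidean distance between two binary band profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def amova_phi_st(profiles, populations):
    """Return (variance among, variance within, PhiST) for a two-level design."""
    n_total = len(profiles)
    groups = {}
    for idx, pop in enumerate(populations):
        groups.setdefault(pop, []).append(idx)
    n_pops = len(groups)

    def ssd(indices):
        # Sum of squared deviations: pairwise squared distances / group size.
        pairs = combinations(indices, 2)
        total = sum(squared_distance(profiles[i], profiles[j]) for i, j in pairs)
        return total / len(indices)

    ssd_total = ssd(range(n_total))
    ssd_within = sum(ssd(members) for members in groups.values())
    ssd_among = ssd_total - ssd_within

    var_within = ssd_within / (n_total - n_pops)
    # n0 corrects for unequal population sizes.
    n0 = (n_total - sum(len(m) ** 2 for m in groups.values()) / n_total) / (n_pops - 1)
    var_among = (ssd_among / (n_pops - 1) - var_within) / n0
    phi_st = var_among / (var_among + var_within)
    return var_among, var_within, phi_st


# Hypothetical toy data: three individuals per population, three bands.
toy_profiles = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1], [0, 1, 1]]
toy_pops = ["area1"] * 3 + ["area2"] * 3
print(amova_phi_st(toy_profiles, toy_pops))
```

The percentages of variation among and within populations are 100 × ΦST and 100 × (1 - ΦST), respectively.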
The Bayesian approach implemented in Structure 2.3 software (Falush et al., 2007) was used to determine the number of genetic groups (K). Twenty runs were performed for each K value, with the number of groups varying from K = 1 to K = 5, using a burn-in of 2,500 iterations followed by 7,500 iterations, totaling 10,000 Markov chain Monte Carlo (MCMC) iterations. The results were exported to the Structure Harvester software (Earl and Vonholdt, 2012), and the number of genetic groups was determined by the ad hoc ∆K method (Evanno et al., 2005).
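The ∆K criterion can be sketched as follows, using one common formulation: the absolute second-order difference of the mean log-likelihood L(K) across replicate runs, divided by the standard deviation of L(K). The log-likelihood values below are hypothetical (three runs per K instead of twenty) and are not results from this study; the function name is also hypothetical, and consecutive integer K values are assumed.

```python
from statistics import mean, stdev


def evanno_delta_k(ln_prob):
    """ln_prob maps K to a list of ln P(D) values, one per replicate run."""
    ks = sorted(ln_prob)
    mean_l = {k: mean(ln_prob[k]) for k in ks}
    delta = {}
    # Delta K is undefined for the smallest and largest K tested.
    for k in ks[1:-1]:
        second_diff = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
        delta[k] = second_diff / stdev(ln_prob[k])
    return delta


# Hypothetical ln P(D) values for K = 1..5.
toy_runs = {
    1: [-5000, -5002, -4998],
    2: [-4800, -4803, -4797],
    3: [-4500, -4501, -4499],
    4: [-4490, -4495, -4485],
    5: [-4485, -4493, -4477],
}
delta_k = evanno_delta_k(toy_runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

The K with the largest ∆K is taken as the most probable number of groups; note that the method cannot evaluate K = 1.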

Descriptive analysis
A total of 112 bands were scored, 98 of which were polymorphic, corresponding to 87.5% polymorphism. The primer UBC 815 had the highest number of polymorphic bands, while the primer UBC 845 had the highest number of total bands. The primers UBC 827 and UBC 810 showed the lowest values for the number of polymorphic bands and number of total bands, respectively. Of the bands evaluated, twelve occurred only in area 1 and four occurred only in area 2. The primers UBC 811, 845 and 868 did not identify private markers in any of the populations, whereas the primers UBC 880 and UBC 807 yielded the greatest number of private markers, identifying five and four bands, respectively (Table 1).

ISSR marker informativeness and genetic diversity
From the marker sufficiency analysis, the estimates of correlations between genetic dissimilarities resulted in a correlation (r) of 0.97 and stress (E) of 0.03. These parameters revealed that 92 polymorphic fragments would be sufficient to assess the genetic variation of P. peroba populations (Figure 2).
The polymorphic information content (PIC) was the highest for primer UBC 812 (PIC = 0.29) in area 1. Primers UBC 807 and 839 had the highest values in area 2 (PIC = 0.29). For the populations evaluated individually, the mean PIC across all loci was similar between area 1 (PIC = 0.23) and area 2 (PIC = 0.24). For the joint data, the mean PIC was the highest, at 0.26 (Table 2).
The individual evaluation showed that the area 1 population had a higher number of observed alleles (AO) but a lower number of effective alleles (AE) than area 2. The joint data showed a higher AO value than either population evaluated individually, whereas the AE value was the same as that found in area 2.
The parameters that describe the degree of genetic diversity (H* and I*) indicated greater diversity within the area 1 population. For the joint data, the H* and I* indices showed a small increase in genetic diversity (H* = 0.26; I* = 0.40).
The cluster analysis performed by the UPGMA method separated the individuals into seven groups (Figure 3). The largest group, called G3, was composed of 28 individuals, all sampled in the area 1 population. Two other large groups were formed, both with 27 individuals, the G1 group being formed only by area 2 individuals and the G2 group only by area 1 individuals. The smaller clusters contain, respectively, four (G5 and G7), two (G6) and one individual (G4), with G7 composed of individuals from area 2 and the others of individuals from area 1. The cophenetic correlation coefficient (CCC) was 73%, indicating consistency between the dissimilarity matrix and the groups formed.
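The clustering steps can be illustrated with a small, hand-rolled sketch: Jaccard dissimilarity between binary profiles, average-linkage (UPGMA) fusion heights, and a Mojena-style cut-off computed as the mean fusion height plus k standard deviations. This is not the implementation in Genes or in the R packages; the function names and toy profiles are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def jaccard_dissimilarity(a, b):
    """Arithmetic complement of the Jaccard coefficient for 0/1 profiles."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return 0.0 if either == 0 else 1.0 - shared / either


def upgma_heights(profiles):
    """Return the fusion heights of an average-linkage (UPGMA) clustering."""
    n = len(profiles)
    dist = {
        (i, j): jaccard_dissimilarity(profiles[i], profiles[j])
        for i, j in combinations(range(n), 2)
    }
    clusters = {i: [i] for i in range(n)}
    next_id = n
    heights = []

    def average_distance(c1, c2):
        # Unweighted mean of all original distances between members.
        return mean(
            dist[min(i, j), max(i, j)] for i in clusters[c1] for j in clusters[c2]
        )

    while len(clusters) > 1:
        a, b = min(
            combinations(sorted(clusters), 2), key=lambda pair: average_distance(*pair)
        )
        heights.append(average_distance(a, b))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return heights


def mojena_cutoff(heights, k=1.25):
    """Cut-off height: mean fusion height plus k standard deviations."""
    return mean(heights) + k * stdev(heights)


# Hypothetical toy profiles for four individuals.
toy_profiles = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 1, 1, 1]]
heights = upgma_heights(toy_profiles)
print(heights, mojena_cutoff(heights))
```

Fusions above the cut-off height are treated as separating distinct groups.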

Table 2. Parameters of marker efficiency and genetic diversity estimated with ISSR markers in populations of Paratecoma peroba.

Genetic structure
The evaluation of the population genetic structure by Amova resulted in a global estimate of ΦST = 0.143, which means that only 14.3% of the total genetic variation was between populations, while most of the variation (85.7%) was found within populations (Table 3). The estimated Nm for the populations was 6.69.
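The Bayesian approach was performed for all genotypes sampled, in order to verify how P. peroba populations are structured. The most probable K value, estimated by the ΔK method proposed by Evanno, Regnaut and Goudet (2005), resulted in K = 3 (Figure 4.a). In Figure 4.b, the different colors assigned to the groups indicate the proportion of ancestry of the genotypes.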

DISCUSSION
Descriptive analysis, optimal number of fragments and informativeness of ISSR markers
Compared with P. peroba, another endangered species endemic to the Atlantic Forest, Dalbergia nigra (Silva Júnior et al., 2020), evaluated in forest fragments close to or similar to those in this study, showed a lower number of total bands and a lower percentage of polymorphism. The values found here are also similar to those of other forest species that do not qualify as vulnerable (Vieira et al., 2018; Vieira et al., 2022), showing that the individuals evaluated in the populations of P. peroba maintain high genetic variability. The occurrence of private markers in the populations (Table 1) is associated with increased genetic variability among individuals; however, it may also imply greater genetic differentiation between populations.
The estimates of the correlations between the genetic dissimilarities obtained by the bootstrap analysis and the optimal number of fragments obtained for the populations of P. peroba (Figure 2) are in agreement with Kruskal (1964), who established that stress values (E) smaller than 0.05 and correlation (r) close to 1 indicate precision. In addition, the analysis of the optimal number of fragments required for this study resulted in 92 fragments, which is lower than the number obtained and used in this study (98 polymorphic markers). Therefore, the ISSR primers used were sufficient to identify polymorphic loci among individuals.
Regarding the informativeness of ISSR markers, the analysis of polymorphic information content (PIC) has been performed in several studies with forest species (Silva Júnior et al., 2020; Santos et al., 2021), as it represents the capacity of a marker to detect genetic variability among individuals (Preczenhak, 2013). According to Tatikonda et al. (2009), the maximum PIC value for dominant markers is 0.5, and values from 0 to 0.25 indicate low informativeness, values above 0.25 up to 0.45 indicate moderate informativeness, and values above 0.45 indicate high informativeness. By this classification, the ISSR markers used in this study are moderately informative for the joint data (mean PIC = 0.26), while the means for the individual populations (0.23 and 0.24) fall just below the 0.25 threshold (Table 2).
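The classification described above can be written as a small helper. The function name is hypothetical, and the class boundaries are those stated in the text for dominant markers.

```python
def classify_pic(pic):
    """Classify a dominant-marker PIC value (valid range 0 to 0.5)."""
    if not 0 <= pic <= 0.5:
        raise ValueError("PIC of a dominant marker must lie between 0 and 0.5")
    if pic <= 0.25:
        return "low"
    if pic <= 0.45:
        return "moderate"
    return "high"


# Joint-data mean PIC and the highest per-primer PIC reported in the text.
for value in (0.26, 0.29):
    print(value, classify_pic(value))
```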

Genetic diversity
The number of alleles (AO and AE) and genetic diversity (H* and I*) estimated for the individual populations were slightly higher for area 1 than for area 2 (Table 2). These results indicate that the area 1 population has a good distribution of alleles and greater genetic diversity for P. peroba, highlighting the importance of maintaining this fragment, considering the occurrence of pastures in the surroundings. Furthermore, when compared to area 2, a Conservation Unit with a larger area, it is possible to affirm that smaller fragments such as area 1 may have satisfactory levels of genetic diversity for some species. According to Martins et al. (2016), high levels of genetic diversity can be found in small forest fragments, which also contribute to reducing spatial isolation and are relevant sources for seed collection for restoration projects, especially in severely fragmented landscapes.
However, maintaining the P. peroba population in area 2 is also important, mainly due to the occurrence of private markers (Table 1), which may be associated with adaptive characteristics of the species. Furthermore, the increase in the values of H* and I* when calculated for the joint data (Table 2) reveals higher levels of genetic diversity for the species, characterized as moderate. Shannon index values (I*) can vary from 0 to 1, where 0 represents absence of genetic diversity and 1 represents high diversity (Lewontin, 1972).
Genetic diversity is also characterized by comparison with similar studies of the same species, genus or family or, at a broader level, of populations of forest species that share characteristics such as reproductive system, pollination, dispersal and zone of occurrence, among others. For P. peroba, or even for the genus Paratecoma, there are no studies to date involving the genetic characterization of remaining populations. However, within the Bignoniaceae family, studies with the species Oroxylum indicum (Rajasekharan et al., 2017) and Handroanthus impetiginosus (Pimenta et al., 2022) resulted in H* values close to that found in this study, classified as moderate (H* = 0.25) and high (H* = 0.35) genetic diversity, respectively.
The vulnerability status of the species should also be considered, as it is classified as endangered in the red book of flora in Brazil (Martinelli and Moraes, 2013) and in the book of fauna and flora threatened with extinction in the state of Espírito Santo (Fraga et al., 2019). This classification makes evident the importance of conserving the remaining populations of the species, even if they contain reduced levels of genetic diversity.
Despite the vulnerability of the species, the cluster analysis performed by the UPGMA method (Figure 3), which resulted in seven clusters, is consistent with the moderate genetic diversity present in the total sample. Thus, the individuals studied have the potential to be used as matrices for seed collection and production of genetically divergent seedlings, which may increase the adaptive potential of future established populations. The adoption of strategies that increase genetic diversity is essential, as the presence of different alleles well distributed among individuals can determine the rates of adaptation and microevolution within populations of a species (Kulevicz et al., 2020).
Still in relation to the dendrogram, five of the groups are formed only by area 1 individuals and two are formed only by area 2 individuals (Figure 3), which is consistent with the greater genetic diversity present in area 1. Furthermore, no groups were formed by representatives of different populations, indicating at least moderate genetic differentiation between them.

Genetic structure
Analysis of molecular variance (Amova) revealed greater intrapopulation variation; however, ΦST (Table 3) confirms the occurrence of moderate genetic differentiation between populations. Values of ΦST between 0 and 0.05 indicate low genetic differentiation; values greater than 0.05 up to 0.15, moderate differentiation; values greater than 0.15 up to 0.25, high differentiation; and values above 0.25, very high differentiation (Wright, 1978).
Gene flow analysis (Nm = 6.69) revealed the occurrence of genetic sharing between populations. According to Wright (1951), Nm values greater than 1 indicate the occurrence of gene flow. However, it is noteworthy that the Nm analysis assesses historical gene flow (Wright, 1951), so populations may have shared alleles when they were still connected, and the current scenario of Atlantic Forest fragmentation and the distancing of populations may be affecting contemporary gene flow. Considering also the ecological traits of the species, such as pollination by insects, that is, short-flight pollinators, and anemochoric seed dispersal (Lins and Nascimento, 2010), which tends to promote colonization close to the parent tree (Warneke et al., 2021), it is possible to infer that the populations are diverging genetically. Finally, it should be noted that the markers used in this study are dominant and therefore capture less allelic variation and, consequently, may detect lower differentiation between populations.
The structure analysis with the Bayesian approach confirms the occurrence of genetically structured populations. Three genetic groups were detected (Figure 4.a), all shared between the two populations. However, area 1 shows a more homogeneous distribution of the genetic groups, while in area 2 one group prevails (Figure 4.b).
The mix of genetic groups in the two populations can again be associated with the occurrence of gene flow in the past, when they were still a single population with continuous distribution. However, forest fragmentation and the spatial distance between individuals tend to generate isolated populations, each subjected to different evolutionary factors (Zambrano et al., 2020).
In the area 1 population, the occurrence of three well-distributed genetic groups reveals that the population has persisted in the face of anthropic and evolutionary factors, being an important source of propagules for future projects of recovery and restoration of degraded areas and for conservation of the species. On the other hand, in the area 2 population, one genetic group predominates, shown in red (Figure 4.b), indicating a structuring process.
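The differentiation classes can be expressed as a short helper, applied here to the ΦST estimate reported in this study. The function name is hypothetical, and the label for the last class follows Wright (1978), who distinguishes it from the 0.15 to 0.25 class.

```python
def classify_differentiation(phi_st):
    """Classify a PhiST (or FST) value using the Wright (1978) ranges."""
    if phi_st < 0:
        raise ValueError("differentiation estimate is expected to be non-negative")
    if phi_st <= 0.05:
        return "low"
    if phi_st <= 0.15:
        return "moderate"
    if phi_st <= 0.25:
        return "high"
    return "very high"


print(classify_differentiation(0.143))
```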
Genetically structured populations are formed by the fixation and loss of alleles, generated mainly by evolutionary factors such as genetic drift and inbreeding (Ralls et al., 2018). Genetic drift is a stochastic process that changes the frequency of alleles in a population with each generation, with more drastic effects in reduced and fragmented populations, due to a greater probability that alleles are lost (Basey et al., 2015). Inbreeding, on the other hand, is related to the crossing of related individuals, that is, it is the condition in which the alleles present at a locus are identical by descent (Vega-Trejo et al., 2022).
In order to reverse this situation and expand the genetic base in the area 2 population, a first approach must be to raise the awareness of the local community about sustainable practices, mainly because it is a Conservation Unit categorized as a 'National Forest', where methods of sustainable exploitation are allowed (Brasil, 2000). Finally, this information will support the management of the UC in relation to P. peroba, through measures such as the introduction of seedlings obtained from seeds from neighboring fragments, including area 1, where information on the genetic basis of the species is already available, and the increase in the connectivity of forest fragments through ecological corridors.

CONCLUSION
The ISSR markers identified more fragments than the estimated optimal number and resulted in a PIC value classified as moderate, characterizing them as effective in genetic studies with P. peroba. The moderate genetic diversity found for the species indicates that it is being maintained; however, fragmentation associated with evolutionary and anthropic factors has led to changes in the diversity and genetic structure of its populations. The estimated genetic diversity for the individual populations was slightly higher for area 1 than for area 2; nevertheless, both were classified as moderate and are important for the maintenance of the species. For both populations, private markers were observed, which imply greater genetic variability between individuals. However, greater genetic variation is observed for the population in area 1, which may be related to its greater number of private markers and is reflected in the greater number of genetic groups identified in the dendrogram and in the Bayesian analysis. As for the population located in area 2, the occurrence of only two groups in the dendrogram and the predominance of one genetic group in the Bayesian analysis indicate that this population is undergoing genetic structuring. Therefore, measures such as raising the awareness of the local community, the introduction of seedlings obtained from seeds from neighboring fragments, including area 1, and the creation of ecological corridors could increase the genetic base of this population.
NTB: Number of total bands; NPB: Number of polymorphic bands; PPB: Percentage of polymorphic bands; PM: Private markers; SVF: Size variation of the fragments determined by a 100 bp molecular weight marker. H = (A, T or C); R = (A or G); V = (A, C or G) and Y = (C or T).
CERNE (2022) 28: e-103055 França et al.

Figure 2. Estimates of the correlations between genetic dissimilarities obtained by bootstrap analysis and the optimal number of fragments obtained for populations of the species Paratecoma peroba.
resulted in K = 3 (Figure 4.a). In Figure 4.b, the different colors assigned to the groups indicate the proportion of ancestry of the genotypes.

Figure 4. Bayesian approach performed for 93 individuals of the species Paratecoma peroba. a) Plot of ΔK values for each K value, which resulted in K = 3. b) Bar graph that identifies the number of clusters and shows the genetic structuring of the total sample through colors.

Table 1. Descriptive analysis of ISSR primers selected for evaluation in Paratecoma peroba.

Table 3. Analysis of molecular variance between and within populations of the species Paratecoma peroba.