Genetic Diversity of Bertholletia excelsa Bonpl.: A Native Species of the Amazon Rainforest

1 Universidade do Estado de Mato Grosso (UNEMAT), Faculdade de Ciências Biológicas e Agrárias, Alta Floresta, MT, Brasil
2 Universidade do Estado de Mato Grosso, Ciências Biológicas, Alta Floresta, MT, Brasil

ABSTRACT
In view of the economic and environmental importance of Bertholletia excelsa Bonpl. for the Amazon and the shrinking natural habitat of this species, the present study examines the genetic variability among accessions of Brazil nut from the Amazon rainforest. Seventeen native Brazil nut genotypes were sampled in Sinop, MT, and their genomic DNA was amplified using 15 inter-simple sequence repeat (ISSR) markers. The percent polymorphism and the polymorphism information content (PIC) of each primer were determined. Of the amplified bands, 84.62% were polymorphic and 15.38% were monomorphic. The PIC of the primers ranged from 0 to 0.68. The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) was the most efficient in group representation, as it showed the highest cophenetic correlation coefficient (CCC = 0.92), the lowest stress (12.31%) and the lowest distortion (1.51%). ISSR markers were an efficient tool for studying genetic diversity among Brazil nut genotypes, and the genetic diversity found can be used in conservation and pre-breeding programs for this species.


INTRODUCTION
In the forest fragmentation process, mosaics of remnant vegetation are formed (Estopa et al., 2006) as a result of the replacement of part of the native forest with pastures and agricultural activities (Carvalho et al., 2005). As a consequence, natural vegetation is reduced to areas too small for proper species propagation, which affects ecological and genetic processes (Estopa et al., 2006). According to the National Institute of Space Research, in 2017, the total deforested area in the Brazilian Amazon had reached approximately 6,600 km², which represents around 13% of its geographic area (INPE, 2017). The northern region of Mato Grosso State is a clear example of this situation, in this case due to the great agricultural expansion for soybean cultivation (Fearnside, 2017).
In addition to having a direct impact on the number of individuals and reducing intra- and interspecific genetic diversity, deforestation is a constant threat to the genetic resources of the local biome. Genetic resources are crucial to pre-breeding programs of several endemic plant species of the Amazon, such as the Brazil nut (Bertholletia excelsa Bonpl.). This species belongs to the family Lecythidaceae, which comprises ten genera and approximately 150 species, most of which are found in the northern region of the country (Barroso et al., 2002).
Brazil nut is an important non-timber forest product. It is considered a highly valued Amazonian genetic resource, in addition to being the target of ongoing projects for the sustainable development and conservation of the Amazon (Clay, 1997). At present, Brazil nut is used in agroforestry systems (Lobo et al., 2013; Costa et al., 2009) in intercrops with wild crops like cocoa (Theobroma cacao), species of the family Meliaceae (Khaya ivorensis, Swietenia macrophylla, Carapa guianensis and Cedrela odorata) and several fruit and forest species such as açaí (Euterpe oleracea), cupuaçu (Theobroma grandiflorum), bacuri (Platonia insignis), black pepper (Piper nigrum) and rubber tree (Hevea brasiliensis).

Carpejani AA, Pena GF, Vieira FS, Tiago PV, Rossi AAB

Bertholletia excelsa is widely accepted in both the national and international markets. Its main marketed product, the nut, is not only consumed "as is", but also used in the preparation of other food products such as sweet snacks made with honey and chocolate, among other ingredients (Ribeiro et al., 1999).
Because of its considerable economic and social importance, different lines of research have been used for the conservation and pre-breeding of Brazil nut; e.g., molecular biology (Silva et al., 2004).
Different types of molecular markers, which detect polymorphism at the DNA level, have been successfully employed in studies of plant population genetics. Among them, ISSR markers are widely used because they do not require prior knowledge of the DNA sequence (universal primers), have a low development cost, show high interspecific transferability and, most importantly, yield a high level of polymorphic information (Barth et al., 2002; Brandão, 2011). A number of studies with ISSR have demonstrated their efficiency (Costa et al., 2018; Oliveira et al., 2017; Chagas et al., 2015; Dias et al., 2015; Rossi et al., 2014; Moulin et al., 2013; Rossi et al., 2009; Brandão, 2011; Cidade et al., 2008; Almeida et al., 2012).
Brazil has an enormous plant diversity, especially in terms of native species, which are threatened by the increasing replacement of conserved areas with large tracts of agricultural land and farming practices (Tabarelli et al., 2012).
Information about the genetic structure of populations of native species may help to explain the evolutionary factors and degree of diversity found in tropical environments. These, in turn, indicate how primary species behave and respond to the pressures caused by human activity (Eguiarte et al., 1992).
The evaluation of genetic diversity is an important instrument currently used in population studies (Büttow et al., 2010). Examining and quantifying genetic diversity in accessions of native populations may be an efficient strategy for in situ conservation, allowing the conservation of accessions at their collection sites, or for ex situ conservation, through the creation of germplasm banks from seeds of those genetic materials.
In view of the economic and environmental importance of B. excelsa for the Amazonian peoples, the present study investigates the existing genetic diversity among B. excelsa individuals in a native population of the Amazon forest.

MATERIAL AND METHODS
Samples of the B. excelsa genotypes were collected in 2018 from a natural population in the municipality of Sinop, MT, Brazil, located in a fragmented forest area alongside highway BR-163 (11°51'0,8"S and 55°30'56"W), at an altitude of 367 m. The climate in the region is a humid continental equatorial type with a well-defined dry season, average annual temperature between 22 and 27.6 °C and average annual precipitation between 2000 and 2200 mm (Ramos et al., 2017). The region is part of the soil-climatic transition zone between cerrado and Amazonian forest; its vegetation is classified as a semi-deciduous, sub-mountainous emergent canopy forest (Ubialli et al., 2009). According to the Köppen-Geiger climate classification, the climate is of the Aw type (equatorial savannah with dry winters) (Kottek et al., 2006).
The Brazil nut population evaluated in this study was composed of adult individuals present in a fragmented matrix located in a forest-to-pasture transition region. For the molecular analyses, leaves were collected from all 17 B. excelsa individuals present in the fragmented area. The leaf material was dried in silica gel and packed in ziplock plastic bags while still in the field. Next, the leaf material was transported to the Laboratory of Plant Genetics and Molecular Biology at the Alta Floresta University Campus, where it was stored in a freezer at -20 °C.
Total DNA was extracted from 100 mg of leaf tissue macerated in liquid nitrogen. The extraction followed the method described by Doyle & Doyle (1990) with modifications for the species: the CTAB concentration was increased to 3%, polyvinylpyrrolidone (PVP) from 1% to 2%, and β-mercaptoethanol from 0.2% to 3%. The incubation time was reduced from 60 to 5 min (Giustina et al., 2014).
The quality and quantity of the extracted DNA were assessed by 0.8% agarose gel electrophoresis with ethidium bromide staining (0.6 ng mL⁻¹) and visualization on a UVB transilluminator. Lambda DNA (Invitrogen) at 10, 20, 50 and 100 ng µL⁻¹ served as the concentration reference.
Fifteen ISSR primers developed by the University of British Columbia were used (Table 1).
The final volume of each reaction was 20 µL. Amplification reactions were carried out in a thermocycler under the following settings: 94 °C for 5 min (initial denaturation), followed by 35 cycles of 94 °C for 45 s, 45-53 °C (according to the Tm of the primer used) for 45 s and 72 °C for 90 s, and 1 cycle at 72 °C for 7 min (final extension).
The amplification products were separated by 1.5% (w/v) agarose gel electrophoresis, using the 100-bp marker (lambda DNA, Invitrogen), ethidium bromide staining (0.6 ng mL⁻¹) and visualization under ultraviolet light on a UVB LTB transilluminator. A genetic dissimilarity matrix was constructed from the polymorphisms obtained with the ISSR primers, based on binary characters of presence (1) or absence (0) of bands. Dissimilarity is a measure that mathematically establishes the genetic distance between accessions (Bussab et al., 1990); higher values between genotypes indicate greater dissimilarity.
Botstein et al. (1980) proposed classifying molecular markers into three classes according to their PIC values: very informative when higher than 0.50; moderately informative when between 0.25 and 0.50; and little informative when lower than 0.25. The PIC value is used as an indicator of the markers' ability to characterize polymorphisms in studies of genetic variability (Tatikonda et al., 2009). The PIC, which expresses the genetic diversity and discriminatory power of a locus, was calculated using Excel software. The complement of the Jaccard coefficient was used to obtain the dissimilarity matrix.
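The two rules described above can be illustrated with a minimal Python sketch. It is not the Excel or GENES workflow used in the study; the function names `jaccard_dissimilarity` and `pic_class` are hypothetical, and assigning a PIC of exactly 0.50 to the "moderately informative" class is an assumption about how the class boundaries are read.

```python
def jaccard_dissimilarity(x, y):
    """Complement of the Jaccard coefficient for two 0/1 band profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must score the same bands")
    # a: bands present in both genotypes
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    # mismatches: bands present in only one of the two genotypes
    mismatches = sum(1 for i, j in zip(x, y) if i != j)
    if a + mismatches == 0:
        return 0.0  # assumption: two empty profiles are treated as identical
    # shared absences (0, 0) are ignored, as in the Jaccard coefficient
    return 1.0 - a / (a + mismatches)


def pic_class(pic):
    """Classes of Botstein et al. (1980) as quoted in the text."""
    if pic > 0.50:
        return "very informative"
    if pic >= 0.25:
        return "moderately informative"
    return "little informative"


# Two illustrative (made-up) band profiles scored for five bands
g1 = [1, 1, 0, 1, 0]
g2 = [1, 0, 0, 1, 1]
print(jaccard_dissimilarity(g1, g2))  # 2 shared bands, 2 mismatches
print(pic_class(0.68), pic_class(0.35), pic_class(0.0))
```

Because shared absences do not count toward similarity, this coefficient is a common choice for dominant markers such as ISSR, where the absence of a band in two genotypes may have different causes.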
Eight hierarchical methods were tested to obtain an adequate genotype clustering from the generated dissimilarity matrix, including the Weighted Pair-Group Method with Arithmetic Mean (WPGMA) and the Unweighted Pair Group Method with Arithmetic Mean (UPGMA). To evaluate the consistency of clustering, we calculated the cophenetic correlation coefficient (CCC) of the distance matrices for each method, in addition to the stress and distortion levels. The criterion of Mojena (1977) was adopted to determine the ideal number of groups formed after clustering. The best result was then compared with Tocher's method to infer the formation of divergent groups.
Applying a hierarchical classification method to a multivariate dataset aims to arrange the elements into partitions, or hierarchies of partitions, that are optimal with respect to a previously established criterion. By this approach, the behavior of the individuals is inferred graphically. No single method is therefore deemed "optimum", though some are better suited to certain situations than others (Kaufman & Rousseeuw, 1990).
The statistical representation of stress is a parameter used to determine the precision level of the adjustment of the graphic projection (Kruskal, 1964). Accordingly, stress is divided into five classes, as follows: when over 40%, it is considered unsatisfactory; from 20 to 40%, regular; from 10 to 20%, good; from 5 to 10%, excellent; and when lower than 5%, perfect.
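The five stress classes listed above can be written as a small lookup, shown here as a hedged Python sketch. The function name `stress_class` is hypothetical, and the assignment of the exact boundary values (5, 10, 20 and 40%) to a class is an assumption, since the text does not state which side they fall on.

```python
def stress_class(stress_pct):
    """Map a stress value (in %) to the classes quoted in the text."""
    if stress_pct > 40:
        return "unsatisfactory"
    if stress_pct > 20:
        return "regular"
    if stress_pct > 10:
        return "good"
    if stress_pct >= 5:
        return "excellent"
    return "perfect"


# Stress values reported in this study for UPGMA and WPGMA
print(stress_class(12.31), stress_class(15.57))
```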
As stated by Valentin (2012), a method is considered more adequate than another when the dendrogram provides a less distorted image of reality. The degree of deformity caused by dendrogram construction can be evaluated by calculating the CCC. Thus, a lower degree of distortion will be reflected by a higher CCC.
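To make the relation between a dendrogram and its CCC concrete, the following pure-Python sketch runs UPGMA on a small made-up distance matrix and correlates the original distances with the cophenetic distances (the height at which two genotypes first fall into the same cluster). It is an illustration only, not the GENES implementation; the names `upgma_cophenetic` and `cophenetic_correlation` and the toy matrix `D` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def upgma_cophenetic(dist):
    """UPGMA on a symmetric distance matrix (list of lists).

    Returns {(i, j): height} with i < j, the cophenetic distance of each pair.
    """
    n = len(dist)
    clusters = {i: [i] for i in range(n)}  # cluster id -> member leaves
    d = {(i, j): dist[i][j] for i, j in combinations(range(n), 2)}
    coph = {}
    next_id = n
    while len(clusters) > 1:
        # merge the two closest clusters
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        for i in clusters[a]:
            for j in clusters[b]:
                coph[(min(i, j), max(i, j))] = height
        na, nb = len(clusters[a]), len(clusters[b])
        new_d = {}
        for c in clusters:
            if c in (a, b):
                continue
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            # size-weighted mean = unweighted mean over all leaf pairs
            new_d[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        merged = clusters.pop(a) + clusters.pop(b)
        clusters[next_id] = merged
        next_id += 1
    return coph


def cophenetic_correlation(dist, coph):
    """Pearson correlation between original and cophenetic distances."""
    pairs = sorted(coph)
    x = [dist[i][j] for i, j in pairs]
    y = [coph[p] for p in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / sqrt(sxx * syy)


# Made-up dissimilarities among four genotypes
D = [
    [0.0, 0.2, 0.6, 0.7],
    [0.2, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.3],
    [0.7, 0.8, 0.3, 0.0],
]
coph = upgma_cophenetic(D)
print(round(cophenetic_correlation(D, coph), 2))
```

A CCC close to 1 means the dendrogram heights reproduce the original dissimilarities with little distortion, which is the sense in which the methods are compared in this study.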
All analyses were performed using GENES software (Cruz, 2016).

RESULTS AND DISCUSSION
The 15 ISSR primers amplified 156 DNA fragments in the 17 analyzed genotypes of Brazil nut, consisting of 132 polymorphic (84.62%) and 24 monomorphic (15.38%) bands. The average number of amplified bands per primer was 10.4 (Table 2).
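The percentages and the per-primer average follow directly from the band counts reported above. A short Python check, with variable names chosen here for illustration:

```python
total_bands = 156   # fragments amplified by the 15 primers
polymorphic = 132   # polymorphic bands
n_primers = 15

monomorphic = total_bands - polymorphic
pct_polymorphic = round(100 * polymorphic / total_bands, 2)
pct_monomorphic = round(100 * monomorphic / total_bands, 2)
bands_per_primer = total_bands / n_primers

print(monomorphic, pct_polymorphic, pct_monomorphic, bands_per_primer)
# → 24 84.62 15.38 10.4
```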
A high percentage of polymorphic loci was observed in the studied B. excelsa population, even though it was under great environmental pressure. However, this result reflects the polymorphism existing in this population prior to the deforestation of the area, since only adult individuals (in the reproductive phase) were sampled. The primers that revealed the highest polymorphism percentage were UBC 818 and UBC 842, with 100% polymorphism. Primer UBC 861, in turn, did not show polymorphism, which indicates its inefficiency for the species. The total of 132 polymorphic fragments obtained in this study may be considered sufficient for the analysis of the species in question.
The polymorphism information content (PIC) of each marker ranged from 0.00 (UBC 861) to 0.68 (UBC 842), averaging 0.35 (Table 2). Primers UBC 808, 822 and 861 can be considered little informative, whereas UBC 818 and 842 were highly informative and, therefore, the most recommended for genetic diversity studies with B. excelsa genotypes (Table 2).
The greatest dissimilarity was observed between genotypes 1 and 9, the latter of which presented high dissimilarity values with all genotypes. Table 4 contains the CCC, stress and distortion values, which are parameters used in the choice of the method for the graphic representation of the clustering of B. excelsa genotypes evaluated in this study.
The CCC values ranged from 0.73 to 0.92 (Table 4). In practice, phenograms with CCC lower than 0.7 indicate the clustering method is inadequate to summarize the information of the dataset (Rohlf, 1970). In the current study, this criterion was adopted together with the dendrogram representation for the choice of the best among the tested methods.
Based on the stress level, the WPGMA (stress = 15.57%) and UPGMA (stress = 12.31%) clustering methods were the most efficient for grouping the genotypes. However, this parameter alone should not be the basis for choosing the type of graphic representation. In this study, the highest CCC was obtained with the UPGMA method (0.92), which was thus chosen to represent the clustering of B. excelsa genotypes.
The dendrogram (Figure 1) shows five main groups formed (G-I to G-V), adopting the mean distance of 0.60 between groups as cutoff point, in accordance with the criterion of Mojena (1977).
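The Mojena (1977) criterion derives the cutoff from the fusion heights of the dendrogram, as the mean height plus k standard deviations. The sketch below shows the idea in Python under loudly stated assumptions: the value k = 1.25 is a commonly used setting and is not stated in this paper, the sample standard deviation is assumed, the fusion heights are made up, and the function names are hypothetical.

```python
from statistics import mean, stdev


def mojena_cutoff(fusion_heights, k=1.25):
    """Cutoff = mean fusion height + k * sample standard deviation.

    k = 1.25 is an assumption; the study does not report the value used.
    """
    return mean(fusion_heights) + k * stdev(fusion_heights)


def n_groups(fusion_heights, k=1.25):
    """Groups obtained by cutting a monotonic dendrogram at the cutoff."""
    cutoff = mojena_cutoff(fusion_heights, k)
    # each fusion above the cutoff is left undone, adding one group
    return 1 + sum(1 for h in fusion_heights if h > cutoff)


# Made-up fusion heights for six items (five fusions)
heights = [0.10, 0.12, 0.15, 0.20, 0.90]
print(round(mojena_cutoff(heights), 2), n_groups(heights))
```

With 17 genotypes a dendrogram has 16 fusion heights; the same rule applied to them yields a cutoff on the distance axis, such as the 0.60 adopted in Figure 1.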
Group I contained the highest number of genotypes (11, in total). Great diversity was observed within this group, since the two most similar genotypes in it (6 and 16) were grouped at a distance of approximately 0.25. Groups G-II and G-III were constituted by a pair of genotypes each (12 and 13; and 4 and 7, respectively), whereas groups G-IV and G-V were formed only by one individual each (17 and 9, respectively). These results reflect the great genetic variability of the studied genotypes, both within G-I and among the other groups.
Tocher's clustering method generated two groups (Table 5), the first of which was formed by 16 of the 17 evaluated genotypes. Group II contained only genotype 9. Tocher's method partially agreed with the UPGMA method (Figure 1), since group II generated by the former had the same composition as group V obtained by UPGMA. Tocher's group I comprised the genotypes of all other groups formed by the UPGMA method (groups I, II, III and IV); however, if the average distance of 0.78 had been used as the cutoff point, there would have been 100% agreement between the two clustering methods. Both methods have been widely used in population studies, and agreement between them has been reported by several authors (Büttow et al., 2010; Campos et al., 2010; Bertan et al., 2014).
Many specimens in the sampled area are drying out and dying as a consequence of the drastic change in the natural environment of the species, which is unable to adapt to the new microclimate conditions created after the disturbance.
Forest fragmentation, which is a direct and inevitable consequence of deforestation, can affect the Amazon biome in many ways, altering the diversity and communities of the formed fragments, as well as changing ecological processes such as pollination, nutrient cycling and carbon storage (Laurance & Vasconcelos, 2009).
According to Tabarelli et al. (2012), the growth of human populations and the expansion of their activities alter natural landscapes all across the globe. If conservation studies and projects are not developed in the near future, currently remote and undisturbed tropical-forest areas may be converted into "archipelagos of forest islands" (Wright, 2005; Peres et al., 2006), which may directly influence the behavior and genetic structure of the species present in the local biome.
Preserving areas that safeguard the minimum necessary for gene flow to occur is extremely important to prevent the population's genetic base from narrowing, which may result in inbreeding depression in subsequent generations (Kageyama & Castro, 1989).
Tropical forest species, such as climax species and those with recalcitrant seeds, typically require in situ conservation because of their interspecific relationships with neighboring species. It is thus essential that research be undertaken on the influence of environmental disturbances on the genetic structure and distribution of species such as the Brazil nut.
Brazil nut is classified as vulnerable in the Red List of Threatened Species (IUCN, 2015), which attributes its main threat to loss of habitat due to deforestation. Plant conservation programs aim at maintaining the existing levels of genetic variability, which warrants research on population genetics for their conservation (James & Ashburner, 1997). Therefore, the success of any conservation program depends on information about the existing genetic variability of the species. In this respect, molecular markers have proven to be an efficient and increasingly accessible tool to quantify the existing genetic variability of plant populations (Giustina et al., 2014; Rivas et al., 2013; Moulin et al., 2013; Da Silva et al., 2011).

CONCLUSIONS
Inter-simple sequence repeat (ISSR) markers are an efficient tool to determine genetic diversity among the sampled accessions of B. excelsa Bonpl.
The UPGMA clustering method proved to be the most efficient for differentiating the evaluated B. excelsa genotypes when compared with the other clustering approaches used in this study.
The genetic diversity found can be used as a source of genetic resources for conservation and pre-breeding programs for this species.