DNA fingerprinting based on SSR amplification profiles for Piper species identification (Piperaceae)

DNA fingerprinting based on SSR amplification profiles was applied to native species of Piper from the Atlantic Forest to compare the utility of this type of molecular marker with the morphological characters traditionally applied in Piper taxonomy and identification. Fifty-one SSR markers developed for four species of Piper native to Asia and Mesoamerica were applied to 16 species, together with 63 morphological characters, for species characterization. Molecular and morphological data were analysed by cluster analysis, followed by a cluster sharpness test and the construction of a heat map to visualize the association of characters with species groups. A multivariate regression tree determined the number of loci needed for species identification. Forty-five primers were transferable to at least four species. Molecular data were more efficient in detecting sharp groups than morphological data. Species groups delimited by a set of shared morphological characters were differentiated based on molecular data. The sixteen studied species could be separated by nine primers, demonstrating the cross-species transferability of SSR markers and the usefulness of DNA fingerprinting for both the delimitation and the identification of species of Piper.


Introduction
Plant species identification can be challenging in highly species-rich Neotropical families and genera such as Piper, the most representative genus of the angiosperm family Piperaceae. Worldwide, Piper is among the 20 most species-rich genera of angiosperms, with about 2000 species (Frodin 2004; Stevens 2001). In Brazil, Piper is among the 30 most species-rich genera, comprising 272 species (BFG 2015). The morphological characters traditionally used for taxonomic purposes within the genus Piper are observable in the flowers, which are very small, as well as in the leaves, which are variable in shape and size (Jaramillo & Manos 2001). Other informative morphological characters for species identification, such as fruit shape, depend on samples being obtained during the corresponding phenological period, or are questionable because of their variability, as observed in leaf venation and floral bracts (see Yuncker 1972; Carvalho-Silva et al. 2015).
SSR (Simple Sequence Repeats, or microsatellites) markers are available for some species of Piper of economic interest and importance for the food industry, such as P. nigrum, or of ecological interest, such as P. polysyphonum, P. cordulatum and P. solmsianum (Liao et al. 2009; Menezes et al. 2009; Andree et al. 2010; Yoshida et al. 2014). Due to their polymorphism, SSR markers have been used as an auxiliary tool to discriminate taxa by DNA fingerprinting (Duminil & Michele 2009; Nybom et al. 2014) and are used in plant breeding for variety and cultivar selection (Arruda et al. 2003; López-Olmos et al. 2005). However, the availability of microsatellite markers relies on the construction of genomic libraries (Jewell et al. 2006) or on their transferability between related species, given that the occurrence of many microsatellite loci is conserved in plants and animals. The transferability of SSR markers originally developed for cultivated species to related indigenous species (Gupta & Varshney 2000; Barbará et al. 2007) is an alternative that reduces costs by facilitating the use of the same primers for different species.
DNA fingerprinting for species identification using SSR marker amplification profiles is a method based on putative microsatellite loci in species of interest (e.g. indigenous species), inferred from species for which such markers are available (generally cultivated species). This is possible because the regions that flank repeating hypervariable motifs are conserved (Gupta & Varshney 2000), and primers are designed for microsatellite amplification in these regions. However, accumulated mutations in the flanking regions of microsatellites can result in the failure of primers to amplify SSR loci across species of a genus. The intention is to determine the presence/absence pattern of markers unique to a particular species, disregarding potential allelic variation. This kind of DNA fingerprinting is an alternative application of SSR markers for plant species identification (see Tuler et al. 2015). Considering that this tool makes it possible to differentiate species, it can also be applied to clarify taxonomic issues.
The comprehensive taxonomic treatment of Yuncker (1972) improved the knowledge of Piper species in Brazil. However, the known species richness of the genus in this country has increased substantially since the 1970s through the description of species new to science (e.g., Carvalho-Silva et al. 2015; Sarnaglia-Junior & Guimarães 2015; Chao-Yun et al. 2017). As no further extensive taxonomic treatment has been undertaken recently, the boundaries between some Piper species remain dubious. This is especially limiting when it involves species whose bioactive compounds are known for their economic importance. For example, Piper aduncum is known to have fungicidal and insecticidal potential (Silva et al. 2007; Rocha et al. 2013), and may be confused with P. macedoi Yunck due to similarities in external morphology (Yuncker 1972; Christ et al. 2016). The DNA fingerprinting method based on SSR amplification profiles can be applied to confirm the identity of a given species of interest, differentiating it from other closely related species. In this study, we compared information from SSR amplification profiles with the morphological characters traditionally used in Piper taxonomy to identify the most efficient data set for species distinction.

Materials and methods
Initially, it was necessary to test the transferability of primers developed for Central American and Asian species of Piper to the indigenous South American species included in this study, for which specific primers are unavailable. Morphological and molecular data were submitted to cluster analysis followed by a test of group sharpness (Pillar 1999). The data set resulting in the greatest number of sharp groups was considered the most efficient for species identification. Heat maps applied to the clusters show the morphological characters and markers on which the groups are based. Finally, the minimum number of SSR amplification profiles and morphological characters required to identify these species was quantified by constructing regression trees.
Field trips for the collection of samples were carried out between August and December 2015 with weekly collection periods. Young and healthy leaves used in the molecular analyses were obtained from 3-9 individuals per species randomly distributed in a fragment of Atlantic Forest located south of Castelo Municipality, state of Espírito Santo (20°35'54"S 41°10'53"W), where the taxonomy of Piper species had been previously studied (Christ et al. 2016). The leaves were stored on silica gel and lyophilized. A sample of one individual per species collected was dried and housed in the herbarium VIES (List S1 in supplementary material).

Morphological analyses
Morphological characters included in the cluster analysis are those traditionally used for the taxonomy of Piper species and available in the literature (Yuncker 1972; Christ et al. 2016). A matrix of species (16) per qualitative morphological characters (31) was constructed and later encoded as a binary matrix, based on absence (0) or presence (1), resulting in 63 character states (Tab. S2 in supplementary material). Thus, some characters were coded as dummy variables (see Legendre & Legendre 2012), since some species have two different states for the same character (i.e., multi-state qualitative descriptor). This matrix was used to calculate the similarity between species using the Jaccard coefficient (Legendre & Legendre 2012), which does not consider shared absences as evidence of similarity between taxa. This coefficient was used for the morphological data because the common absence of a given morphological character is biologically uninformative in the context of this study.
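The dummy coding and the Jaccard coefficient described above can be illustrated with a minimal Python sketch. It is not the authors' R workflow: the species names, characters and states below are hypothetical placeholders, and the functions `dummy_code` and `jaccard` are introduced here only for illustration.

```python
def dummy_code(records):
    """Expand multi-state characters into binary presence/absence columns.

    records: {species: {character: set of observed states}} (hypothetical data)
    Returns the ordered (character, state) columns and a 0/1 row per species.
    """
    columns = sorted({(char, state)
                      for chars in records.values()
                      for char, states in chars.items()
                      for state in states})
    matrix = {sp: [1 if state in chars.get(char, set()) else 0
                   for char, state in columns]
              for sp, chars in records.items()}
    return columns, matrix


def jaccard(x, y):
    """Jaccard similarity: shared presences / positions present in either row."""
    shared = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    mismatched = sum(1 for i, j in zip(x, y) if i != j)
    total = shared + mismatched  # double absences are deliberately left out
    return shared / total if total else 0.0


# Hypothetical species; sp_A shows two states for the same character.
records = {
    "sp_A": {"spike": {"erect"}, "lamina": {"ovate", "elliptic"}},
    "sp_B": {"spike": {"erect"}, "lamina": {"ovate"}},
    "sp_C": {"spike": {"pendulous"}, "lamina": {"elliptic"}},
}
columns, matrix = dummy_code(records)
sim_ab = jaccard(matrix["sp_A"], matrix["sp_B"])
sim_ac = jaccard(matrix["sp_A"], matrix["sp_C"])
```

Because a species with two states for one character receives a 1 in both dummy columns, the number of binary columns can exceed the number of original characters, as with the 31 characters and 63 character states reported here.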
The cluster analysis was based on UPGMA (unweighted pair-group method with arithmetic means) (Legendre & Legendre 2012). A test of group sharpness was applied to the cluster to determine an optimal partition level (Pillar 1999). The stability of the partition level was tested by resampling (10,000 iterations). This test shows whether the groups reappear in the resampled data more often than expected at random (Pillar 2006). A significance level of 10 % (α = 0.1) was used. This test was applied sequentially to the groups of species formed by the first cluster analysis. The heat map (Borcard et al. 2011) was applied to the cluster using the distance matrix and the presence/absence matrix of the morphological data. This analysis makes it possible to visualize which variables are associated with the groups of species formed in the clusters. A regression tree was built to determine which morphological characters are most informative and best distinguish the species in this study. The analyses were performed in the R environment (R Core Team 2016), using the packages vegan (Oksanen et al. 2017), cluster (Maechler et al. 2016), tree (Ripley 2016), ade4 (Dray & Dufour 2007), stats (R Core Team 2016), proxy (Meyer & Buchta 2017) and ggplot2 (Wickham 2009), and the cluster sharpness test was performed in the Multiv program (Pillar 1999).
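As an illustration of the UPGMA step only, the following pure-Python sketch merges clusters by average linkage. The analyses in this study were run in R; the function `upgma`, the item labels and the toy distances (for example, 1 minus a similarity coefficient) are assumptions made for this example, and the group sharpness test is not reproduced.

```python
from itertools import combinations


def upgma(labels, dist):
    """Average-linkage (UPGMA) agglomeration.

    labels: list of item names
    dist: {frozenset({a, b}): distance} for every pair of items
    Returns the merge history as (cluster_1, cluster_2, height) tuples.
    """
    clusters = [(name,) for name in labels]
    merges = []

    def avg_dist(c1, c2):
        # Mean of all pairwise item distances between the two clusters
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average distance
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg_dist(*p))
        merges.append((c1, c2, avg_dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Toy distances between four hypothetical species
labels = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 0.2, frozenset(("C", "D")): 0.4,
    frozenset(("A", "C")): 0.7, frozenset(("A", "D")): 0.9,
    frozenset(("B", "C")): 0.8, frozenset(("B", "D")): 1.0,
}
merges = upgma(labels, dist)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

Computing the average directly from the original item distances keeps the sketch short; production code would normally rely on an established implementation such as `hclust` in R.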

Molecular analysis
The genomic DNA of species representatives was purified using the Doyle & Doyle (1990) (Liao et al. 2009). One DNA sample per species was used for the transferability analysis. Polymerase chain reactions (PCR) were prepared with a final volume of 15 μL containing 50 ng of DNA, 0.3 μM primer, 1.5 units of Taq DNA polymerase, 1X buffer containing 10 mM MgCl2 and 0.2 μM dNTPs. The reactions were subjected to an initial denaturation step of 1 min at 94 °C, followed by 35 cycles of 1 min at 94 °C, 1 min at 52 °C (in the first 10 cycles the temperature was decreased by 1 °C per cycle, and in the remaining cycles it was maintained at 42 °C) and 1 min at 72 °C, with a final extension of 5 min at 72 °C. PCR products were analyzed by electrophoresis in 1.2 % agarose gel stained with GelRed in 1X TBE buffer (Tris base 0.89 mol/L, boric acid 0.80 mol/L and EDTA 0.02 mol/L) at 100 volts for about 2 hours. The size of fragments was estimated using a 100 bp ladder. The fragments were photographed under ultraviolet light using a photodocumentation system (ChemiDoc XRS+ System, Bio-Rad).
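The touchdown annealing profile described above can be written out cycle by cycle. The sketch below assumes that the first cycle anneals at 52 °C and that the 1 °C decrease is applied once per cycle during the first 10 cycles; the function name and its parameters are illustrative and not part of the original protocol.

```python
def touchdown_annealing(start_c=52.0, step_c=1.0, touchdown_cycles=10,
                        total_cycles=35, final_c=42.0):
    """Return the annealing temperature (°C) used in each PCR cycle.

    Assumption: cycle 1 anneals at start_c and each touchdown cycle is
    step_c lower than the previous one; later cycles stay at final_c.
    """
    temps = []
    for cycle in range(total_cycles):
        if cycle < touchdown_cycles:
            temps.append(start_c - step_c * cycle)
        else:
            temps.append(final_c)
    return temps


schedule = touchdown_annealing()
for number, temp in enumerate(schedule, start=1):
    print(f"cycle {number:2d}: annealing at {temp:.0f} °C")
```

Under these assumptions the tenth cycle anneals at 43 °C and the remaining 25 cycles at 42 °C.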
A matrix of species (16) per microsatellite profile (51 loci) was constructed and encoded as a binary matrix based on the absence (0) or presence (1) of SSR marker amplification. This matrix was used to calculate a resemblance matrix using the simple matching coefficient (Sokal & Michener 1958), which considers the common absence of bands as evidence of similarity between taxa. Unlike in the analysis of morphological data, the absence of primer transferability between species (common zeros) is informative for species identification in the context of DNA fingerprinting based on SSR amplification profiles (see Tuler et al. 2015). Nevertheless, considering the possibility that the use of different resemblance measures for morphological and molecular data could lead to a bias in the results, we also ran the analysis using the Jaccard index and found the same group structure. Therefore, the simple matching coefficient was maintained in the molecular data analysis.
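The difference between the two resemblance measures comes down to how shared absences are counted. The following Python sketch, using two hypothetical amplification profiles rather than data from this study, computes both coefficients from the same contingency counts.

```python
def match_counts(x, y):
    """Count a (1,1), b (1,0), c (0,1) and d (0,0) positions of two binary rows."""
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)
    return a, b, c, d


def simple_matching(x, y):
    """Simple matching: shared presences and shared absences both count."""
    a, b, c, d = match_counts(x, y)
    return (a + d) / (a + b + c + d)


def jaccard(x, y):
    """Jaccard: shared absences (d) are ignored."""
    a, b, c, _ = match_counts(x, y)
    return a / (a + b + c) if (a + b + c) else 0.0


# Hypothetical profiles over six primers (1 = amplified, 0 = not amplified)
sp_x = [1, 0, 0, 0, 0, 1]
sp_y = [1, 0, 0, 0, 1, 0]
sm = simple_matching(sp_x, sp_y)
jc = jaccard(sp_x, sp_y)
print(f"simple matching = {sm:.3f}, Jaccard = {jc:.3f}")
```

In this toy case the three primers that fail in both species raise the simple matching value well above the Jaccard value, which is the behaviour that makes common zeros informative for the molecular data.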
A test of group sharpness (Pillar 1999) and a heat map (Borcard et al. 2011) were applied to the cluster, as was done for the morphological data. Additionally, a multivariate regression tree (Borcard et al. 2011) was constructed to determine the number of loci needed to identify the species in question. All branches on the left in the dendrogram indicate absence of amplification of the marker, while the branches on the right indicate that the marker was amplified for the species. The SSR markers that did or did not amplify for each pair of species are indicated at each bifurcation (e.g. p26 means primer 26, Tab. S3 in supplementary material). The regression tree used for validating primer transferability for each species included samples of at least three individuals per species. The analyses were performed in the same programs and packages cited above.
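The question of how many loci are needed to tell all species apart can also be explored with a simpler heuristic than the multivariate regression tree used in this study. The sketch below is a greedy selection written for illustration, not the authors' method: it repeatedly picks the marker that separates the most still-confused species pairs, and it does not guarantee the smallest possible panel. The profiles and the function name are hypothetical.

```python
from itertools import combinations


def greedy_marker_panel(profiles):
    """Pick markers until every pair of species differs on at least one.

    profiles: {species: list of 0/1 amplification results, one per marker}
    Returns the chosen marker indices in the order they were selected.
    """
    n_markers = len(next(iter(profiles.values())))
    unresolved = set(combinations(profiles, 2))
    panel = []
    while unresolved:
        def resolved_by(marker):
            # Pairs whose two species differ at this marker
            return {(s, t) for s, t in unresolved
                    if profiles[s][marker] != profiles[t][marker]}

        best = max(range(n_markers), key=lambda m: len(resolved_by(m)))
        gained = resolved_by(best)
        if not gained:
            raise ValueError("some species have identical profiles")
        panel.append(best)
        unresolved -= gained
    return panel


# Hypothetical presence/absence profiles for four species and four markers
profiles = {
    "sp_A": [1, 1, 0, 1],
    "sp_B": [1, 0, 0, 1],
    "sp_C": [0, 1, 0, 1],
    "sp_D": [0, 0, 1, 1],
}
panel = greedy_marker_panel(profiles)
reduced = {sp: tuple(row[m] for m in panel) for sp, row in profiles.items()}
print(panel, reduced)
```

A marker that amplifies in every species, like the last one in this toy matrix, separates no pair and is never selected.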

Results
The heat map (Fig. 2) showed that all species share 20 to 23 morphological character states of the 63 included in the analyses. Among the species that were revealed as monospecific groups, only Piper amalago, P. dilatatum and P. umbellatum had exclusive morphological character states. Piper amalago (G1) shared 22 morphological character states with other species, and was differentiated by the exclusive character states of deltoid lamina (16) and acrodromous venation (31). Piper dilatatum (G3) shared 23 morphological characters with the other species, but differed by the exclusive presence of branches (3), pubescent lamina (13) and pubescent peduncles (46). Piper umbellatum (G1') is the only species with a cordiform leaf blade (17) and acromete veins (33). Piper miquelianum (G2) formed a monospecific group in spite of sharing 22 characters with the other species, differing by brochidodromous leaf venation (34) and pedicellate flowers (49); however, these characters are shared with P. anisum. Piper bowiei (G2') is part of a large group, but has a long style as an exclusive character. Nevertheless, this species was revealed as a monospecific group after a sequential group partition analysis. The species of G2' and G3' share the most common characters of the genus Piper, for example margin entire (20), spike solitary (42) and styles inconspicuous (53). The regression tree applied to the morphological data (Fig. 3) revealed 10 character states (1, 2, 3, 5, 7, 9, 11, 17, 43 and 44) able to distinguish the Piper species in this study.
Nine microsatellites (primers 1, 2, 3, 4, 5, 17, 26, 28 and 37) were sufficient to identify all of the 16 studied species (Fig. 6).The amplification of these primers was confirmed by the analysis of transferability with a larger number of individuals.

Discussion
The amplification profiles of SSR primers were more efficient in distinguishing the studied species of Piper than morphological characters. The group sharpness test applied to the cluster analyses based on molecular data resulted in 10 groups, in contrast to the seven sharp groups revealed by the morphological data. Furthermore, the size of the groups seems to be more balanced in the cluster analysis based on molecular data. Each set of data (morphological and molecular) resulted in four distinct monospecific groups, considering the group-partitioning test applied sequentially. In this case, the test resulted in two groups with four species each. On the other hand, morphological data resulted in four groups in the first group sharpness test, which included almost all studied species. In this case, even after applying a second partition test, a large group of eight species (amongst 13 species) remained in one group.
The relationship between the species for which SSR markers were originally developed and the species included in this study can explain differences in their rates of transferability to each other. The markers from Piper solmsianum, the only species indigenous to Brazil used for primer transferability (BFG 2015), had the highest transferability rate across the 16 species studied. Markers from Piper cordulatum and P. polysyphonum, which are restricted to Central America (Tropics 2016) and China (Yung-Chien et al. 1982), respectively, had the lowest transferability rates. The absence of transferability may be due to genetic differences in the regions that flank the microsatellites and may reveal the evolutionary distance between the species (Gupta & Varshney 2000; Barbará et al. 2007). It is widely accepted that taxa geographically closer to each other are more similar to one another (Tobler 1970; Pennington et al. 2004), which would explain the high transferability rates of the P. solmsianum markers.
Morphology is one of the main criteria for species delimitation (Sattler & Rutishauser 1997). Ideally, morphological characters for plant identification should be stable regarding the environment and easy to observe (and to interpret). However, fine characters delimit many species from each other, especially in species-rich genera. With regard to the genus Piper, Yuncker (1972) pointed out the general similarities between P. anisum and P. miquelianum as well as between P. dilatatum and P. gaudichaudianum in the main taxonomic study of Brazilian species. According to this author, these species are distinguished mainly by branch pilosity, which can vary depending on the environment (Pérez-Estrada 2000). The heat map approach made it possible to visualize several other morphological characters these species have in common. In addition to the general similarity observed between P. anisum and P. miquelianum, such as pedicellate fruits and brochidodromous leaf venation, this analysis revealed another 15 characters shared between these species (e.g., round bracteola, peduncle pilose and petiole striated). The molecular data efficiently distinguish these species, as nine primers that amplified for P. anisum did not amplify for P. miquelianum. In addition, the molecular data also showed evidence of the close relationship between these species in the cluster analysis. This is possible because microsatellite loci retain the phylogenetic signal (Zhao & Kochert 1993; Ochieng et al. 2007).
Morphological data grouped species that share several overlapping morphological characters, for example, Piper caldense and P. cernuum. Even though they share an important morphological character that is exclusive to both (pendulous spike), these species did not form a sharp group. This was probably the case because both share most of the characters that circumscribe the other studied species (e.g., Piper tuberculatum and P. arboreum). Consequently, two large groups (G2' and G3') were formed based on the sharing of many characters common to the genus Piper. Morphology can be efficient for species identification in some cases. Some species of Piper are recognizable by easy-to-observe characters, e.g. Piper amalago and P. umbellatum (Yuncker 1972), whose differences could be captured by the morphological cluster analysis applied here. The first species has acrodromous leaf venation and the second shows a unique inflorescence arranged in umbels. These conspicuous morphological differences make them "easily recognizable species", but they also make it difficult to hypothesize possible relationships between them and other species of the genus. Based on microsatellite amplification profiles, both species remained grouped with other species, even after the second group sharpness test. In general, the heat map showed two large groups of species separated by the position of the spike (43). The branch indument (1-4), the petiole shape (5-6) and the leaf blade shape (15-19) make it possible to identify Piper species, as revealed by the regression tree, especially when added to the unique characters. The position of the spike, erect (43) or pendulous (44), separated large groups of species twice in the analysis, which highlights the importance of this character. These characters, however, are observable only in fertile samples. Likewise, the striated petiole (5), short sheath (9) and symmetric leaf blade (10) were used to discriminate small groups of species. Among the 10 character states revealed by the analysis, four refer to branch or leaf blade pilosity. This character, however, may vary according to the environment.
DNA fingerprinting based on the transferability of SSR markers proved more efficient than morphological data for identifying the sixteen species of Piper studied here. The molecular markers also dispense with the need for fertile samples, which are essential for identifying some of the species studied here on the basis of morphological data. However, relying exclusively on the molecular tool to identify plant samples may lead to misidentification, for example, if an amplification profile has been misinterpreted because of a laboratory artefact. In this scenario, the use of morphological data is highly recommended. Besides allowing the identification of plant samples, DNA fingerprinting based on SSR amplification profiles can also provide an additional source of information for investigating the boundaries between species, which is useful for taxonomic purposes.

Figure 5. Groups (indicated in brackets) of cluster analysis (UPGMA) based on the amplification of the 45 SSR primers transferable