INTRODUCTION
Baccharis dracunculifolia DC. (Asteraceae) is a widely distributed shrub species occurring in Argentina, Bolivia, Paraguay, Uruguay, and southern and southeastern Brazil. It produces a diverse array of secondary metabolites with anti-inflammatory [1], antimicrobial [2], and antioxidant [3] properties. Furthermore, the chemical compounds of B. dracunculifolia are the main components of Brazilian green propolis, which is known for its diverse medicinal properties [4,5]. B. dracunculifolia is also associated with many herbivores, pollinators, and endophytic fungi, and is thus an important species for ecological community structure and functioning [7-9]. Despite these studies, nothing is known about the bacteria associated with B. dracunculifolia.
Endophytic bacteria have an intimate interaction with their hosts. They colonize intercellular spaces and the vascular system of their host plants, generally without harming them [10-12]. Increasing evidence has shown the importance of endophytic organisms in the success of host plants [13-15]. The endophytic bacteria isolated so far have mostly belonged to the phyla Proteobacteria (in particular the class Alphaproteobacteria), Firmicutes, Actinobacteria, and Bacteroidetes [12]. Nevertheless, a growing number of culture-independent studies have revealed a broader diversity than culture-based studies [15,16].
In spite of the growing number of studies on the microbiota of some Neotropical plants [17,18], there remains a need for knowledge regarding the bacterial communities inhabiting medicinal plant species. In the present study, a next-generation sequencing approach was employed to determine the composition of the endophytic microbiota of B. dracunculifolia, both with and without its major galling insect, Baccharopelma dracunculifoliae.
MATERIAL AND METHODS
Study area and sampling
Samples of B. dracunculifolia were aseptically collected in the Reserva Vellozia, in the state of Minas Gerais, Brazil (19°16'49"S/43°34'56.97"W and 19°16'57.56"S/43°35'20.49"W) in June 2013. Plants were located in dry areas of rupestrian grassland habitat (savanna), were between five and six years old, and were of similar height. Leaf samples were taken at random from quaternary branches [19] of six healthy plants (no galls) and six plants with galls. Root-tip fragments 5 cm long were also collected at random from six other galled B. dracunculifolia plants. All plant material was stored in bags containing silica crystals during transport to the laboratory, where it was stored at -20°C until DNA extraction.
DNA extraction
Leaf samples were rinsed with sterile water, surface-sterilized by immersion in 70% ethanol for 3 min, soaked for 5 min in 2% sodium hypochlorite, immersed in 70% ethanol for 30 s, and finally rinsed five times in sterile distilled water. To verify leaf-surface sterility, water from the final rinse was plated onto nutrient agar medium and incubated at 37°C for 48 h. Root samples were subjected to the same aseptic procedure, with the additional step of first removing soil from the outer surface of the root with a fine brush. The six samples of each type (healthy leaves, galled leaves, and galled plant roots) were pooled, giving a single DNA extraction per sample type. DNA from leaves (50 mg) was extracted according to Souza et al. [20], whereas total DNA from the root samples (5 g) was extracted using the PowerMax soil DNA isolation kit (MoBio Laboratories) following the manufacturer's instructions. The quantity and quality of total DNA were determined using a NanoDrop spectrophotometer (NanoDrop Technologies).
16S rRNA gene amplification and sequencing
For leaf samples, partial 16S rRNA gene amplicons were produced using the primer set 985F (5'-CAACGCGAAGAACCTTACC-3') and 1046R (5'-CGACAGCCATGCANCACCT-3') [21], corresponding to the V6 hypervariable region. 16S rRNA gene amplification and sequencing were performed at the Beijing Genomics Institute using an Illumina HiSeq 2000 platform (paired-end sequencing). Galled plant root DNA was amplified with a set of primers targeting the V4 hypervariable region of the 16S rRNA gene. The forward primer was 515f (5'-AATGATACGGCGACCACCGAGATCTACACTATGGTAATTGTGTGCCAGCMGCCGCGGTAA-3'), and the reverse primer was 806r (5'-CAAGCAGAAGACGGCATACGAGATXXXXXXXXXXXXAGTCAGTCAGCCGGACTACHVGGGTWTCTAAT-3'). Amplification conditions were described previously by Caporaso et al. [22].
The resulting 16S V4 amplicon (500 ng) was fragmented to roughly 150-200 bp using the Covaris S system (Covaris, Woburn, MA, USA), and a library was constructed using the Ion Plus Fragment Library Kit following the manufacturer's instructions. Sequencing was conducted on an Ion Torrent PGM platform using the Ion Xpress Template Kit and the Ion 316 chip (Life Technologies, USA) following the manufacturer's protocols, and produced an average read length of about 170 bp.
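The primers above contain IUPAC degenerate bases (N, M, H, V, W), so each one stands for a small pool of exact sequences. The following minimal sketch shows how such a primer can be expanded into a pattern and how its degeneracy can be counted. The function names are illustrative, and treating the last 20 bases of the 806r construct as its gene-specific part is an assumption made only for this example.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_to_regex(primer):
    """Turn a degenerate primer into a compiled regular expression."""
    parts = []
    for base in primer.upper():
        bases = IUPAC[base]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))


def degeneracy(primer):
    """Number of distinct exact sequences the primer represents."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


# Primer sequences copied from the text.
PRIMER_1046R = "CGACAGCCATGCANCACCT"
# Assumption: the final 20 bases of the 806r construct are the
# gene-specific portion (the rest being adapter, barcode and linker).
CORE_806R = "GGACTACHVGGGTWTCTAAT"

if __name__ == "__main__":
    print(degeneracy(PRIMER_1046R), degeneracy(CORE_806R))
    print(bool(primer_to_regex(PRIMER_1046R).fullmatch("CGACAGCCATGCAGCACCT")))
```

Searching a read with the compiled pattern (for example with `pattern.search(read)`) finds any of the exact sequences the degenerate primer covers.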
Bioinformatics and statistical analyses
Reads were trimmed for quality with the MOTHUR software [23] using the parameters qwindowaverage=30 and qwindowsize=50. After removal of low-quality reads, operational taxonomic units (OTUs) were assigned using a closed-reference OTU picking protocol (QIIME suite 1.8) [24] against the Greengenes gg_13_8_99 reference database [25] pre-clustered at 97% identity. The closed-reference OTU picking protocol has been shown to be reliable in comparing data generated on different sequencing platforms and from different primer pairs [22]. Reads classified as mitochondria or chloroplasts were filtered out of the dataset by applying the "filter_taxa_from_otu_table.py" command. To further account for potential chloroplast DNA contamination, any cyanobacteria-related reads that were not classified to the class level using the Greengenes gg_13_8_99 reference database had their phylogenetic relationships inferred with the neighbor-joining algorithm [26] using the ARB (version 5.3) software package and the SSU Ref 115 SILVA sequence database [27,28], and those placed in the chloroplast clade were excluded. A biom-formatted OTU table was built and imported into the Phyloseq [29] package of the R software [30] for downstream analysis. All samples were normalized to the lowest number of reads using the command "rarefy_even_depth", and the normalized data were used to measure alpha diversity. The sequence data generated in this study were deposited in the NCBI Sequence Read Archive under accession numbers SRR2104507 for healthy leaves, SRR2104500 for galled leaves, and SRR2103679 for galled plant roots.
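Two of the steps described above, removing organelle-derived OTUs and normalizing every sample to the same read depth, can be sketched in a few lines. This is an illustration only, not the QIIME or Phyloseq implementation: the function names and the toy data are hypothetical, and subsampling without replacement is an assumption, since the text does not state which sampling mode was used.

```python
import random


def drop_organelle_otus(otu_counts, taxonomy):
    """Remove OTUs whose lineage string mentions mitochondria or chloroplast."""
    unwanted = ("mitochondria", "chloroplast")
    return {
        otu: n
        for otu, n in otu_counts.items()
        if not any(term in taxonomy.get(otu, "").lower() for term in unwanted)
    }


def rarefy(otu_counts, depth, seed=0):
    """Randomly subsample reads, without replacement, down to `depth` reads."""
    # Expand the table into one entry per read, then draw `depth` of them.
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    picked = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for otu in picked:
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


if __name__ == "__main__":
    # Hypothetical toy sample; real tables come from the OTU-picking step.
    counts = {"otu1": 12, "otu2": 7, "otu3": 4, "otu4": 2}
    lineages = {
        "otu1": "k__Bacteria; p__Proteobacteria",
        "otu2": "k__Bacteria; p__Cyanobacteria; c__Chloroplast",
        "otu3": "k__Bacteria; p__Actinobacteria",
        "otu4": "k__Bacteria; p__Proteobacteria; f__mitochondria",
    }
    filtered = drop_organelle_otus(counts, lineages)
    print(filtered)
    print(rarefy(filtered, depth=10))
```

Because rare OTUs can be lost when a deep sample is cut down to the depth of a shallow one, OTU counts before and after normalization are expected to differ.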
RESULTS AND DISCUSSION
We report here, for the first time, the relative abundance, diversity, and composition of endophytic bacterial communities in B. dracunculifolia, using a massive sequencing approach with primers targeting hypervariable regions of the 16S rRNA gene.
Sequencing and diversity overview
After further quality control and the removal of plastidial, mitochondrial, and unclassified bacterial reads, we obtained a total of 7,239 reads for the root samples and 277 reads for each of the leaf samples. At a threshold of 97% sequence identity, the reads were grouped into 2,156 OTUs for galled plant roots, 105 OTUs for galled leaves, and 103 OTUs for healthy leaves.
The normalized dataset yielded 164 OTUs from leaf samples (79 from galled leaves and 85 from healthy leaves), and 206 OTUs from the galled plant root samples. Rarefaction analysis, based on OTUs at 97% identity, revealed that the libraries were partially representative of the bacterial communities of B. dracunculifolia (Fig. 1). Good's coverage [31] also indicated sufficient depth of sequencing, with coverage values of 81% for galled plant roots, 78% for galled leaves, and 78% for healthy leaves. Rarefaction analysis suggested that the galled and healthy leaves were less diverse than the galled plant root sample. These data were supported by the Shannon and Simpson indices (Table 1). Both the Chao1 and ACE estimators indicated higher taxonomic richness in the roots than in the leaves, which may be because the surface and the interior of leaves are subject to water and resource limitations, high UV exposure, and wide temperature shifts [16,32]. In contrast, the rhizosphere harbors a high density of soil bacteria that compete for plant-derived nutrients [33]. Some of these soil bacteria are capable of penetrating the root, which is the main point of entry of bacterial endophytes [12]. Ma et al. [34] studied the diversity of endophytic bacteria in roots and leaves of the reed Phragmites australis under stressful saline conditions, and observed that the bacterial diversity in roots was significantly higher than that in the leaves.
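For readers unfamiliar with the estimators named above, the following sketch computes Good's coverage, the Shannon index, a Simpson index, and Chao1 from a vector of per-OTU read counts. The exact variants used in the study are not stated, so the choices here are assumptions: Shannon with the natural logarithm, Simpson as 1 minus the sum of squared proportions, and the bias-corrected form of Chao1. The counts are hypothetical toy values.

```python
import math


def goods_coverage(counts):
    """Good's coverage: 1 - (number of singleton OTUs / total reads)."""
    total = sum(counts)
    singletons = sum(1 for c in counts if c == 1)
    return 1.0 - singletons / total


def shannon(counts):
    """Shannon index H' = -sum(p * ln p) over observed OTUs."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson index in its 1 - sum(p^2) form (an assumed variant)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)  # OTUs seen exactly once
    f2 = sum(1 for c in counts if c == 2)  # OTUs seen exactly twice
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


if __name__ == "__main__":
    toy_counts = [10, 5, 2, 2, 1, 1, 1]  # hypothetical reads per OTU
    print(goods_coverage(toy_counts), shannon(toy_counts),
          simpson(toy_counts), chao1(toy_counts))
```

A sample with many singletons relative to its read total has low coverage and a Chao1 estimate well above the observed OTU count, which is the pattern expected when sequencing depth is limited.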

Figure 1 Rarefaction curves of the dataset of the samples from healthy leaves (HL), galled leaves (GL) and galled plant roots (GR).
Overview of taxonomic representation of bacteria
Overall, 26 phyla were found as endophytes of B. dracunculifolia: 24 in the galled plant root samples, 11 in the galled leaf samples, and 13 in the healthy leaf samples (Table S1). Moreover, a number of phyla were found in only one set of samples, including 12 phyla found only in the galled plant root samples and 2 found only in the healthy leaf samples. The relative frequency of taxa is shown in Figure 2. The most abundant phyla were Proteobacteria (galled plant roots = 53.8%, galled leaves = 54.2%, and healthy leaves = 63.5%), Actinobacteria (galled plant roots = 28.1%, galled leaves = 17%, and healthy leaves = 6.1%), Firmicutes (galled plant roots = 4.5%, galled leaves = 11.2%, and healthy leaves = 10.1%), Acidobacteria (5.3%, exclusively from galled plant roots), and Nitrospirae (galled leaves = 10.8% and healthy leaves = 11.2%). The two phyla Proteobacteria and Actinobacteria are consistently found in different plant species [35-37], although their frequency and composition vary enormously among the studied species.
Other phyla were found in the studied tissues of B. dracunculifolia at lower proportions (with abundances ranging from 1.1 to 4.5%): Bacteroidetes, Chloroflexi, and Verrucomicrobia (Fig. 2, Table S1). A few other taxa were recorded at much lower abundances (≤0.8%): Chlorobi (galled plant roots, galled leaves, and healthy leaves), Caldithrix (healthy leaves), AD3 (galled plant roots), Armatimonadetes (galled plant roots), Chlamydiae (galled plant roots), Cyanobacteria (galled plant roots and healthy leaves), Elusimicrobia (galled plant roots), Fusobacteria (galled plant roots), Gemmatimonadetes (galled leaves and healthy leaves), OC31 (healthy leaves), OD1 (galled plant roots), Planctomycetes (galled plant roots), Spirochaetes (galled plant roots), Synergistetes (galled plant roots), Tenericutes (galled plant roots), TM6 (galled plant roots), TM7 (galled plant roots), and WPS-2 (galled plant roots). These findings are consistent with other cultivation-independent studies, which revealed that a few bacterial phyla predominate in the phyllosphere and roots [38,39]. The 2,364 OTUs (galled plant roots = 2,156, galled leaves = 105, healthy leaves = 103) were classified into 78 known classes (galled plant roots = 72, galled leaves = 20, and healthy leaves = 23), 125 orders (galled plant roots = 108, galled leaves = 43, and healthy leaves = 41), and 176 families (galled plant roots = 159, galled leaves = 44, and healthy leaves = 46). It should be noted that in spite of the large variation in the number of reads (galled plant roots = 7,239, galled leaves = 277, and healthy leaves = 277), several taxa that were present in the galled leaf and healthy leaf samples (26-fold fewer reads) were absent from the galled plant root samples. For example, the genera Candidatus Accumulibacter (11.2%), Nitrospira (11%), and Dechloromonas (8.1%) were all well represented in, and exclusive to, the leaf samples, indicating specificity for the leaf environment, as has been previously reported [16,36].
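The percentages reported above are relative abundances, that is, reads assigned to a taxon divided by all reads in the sample. A minimal sketch of that calculation follows, together with the read-depth ratio between root and leaf samples computed from the totals given in the text. The function name, the lineage layout, and the toy OTU table are hypothetical.

```python
from collections import defaultdict


def relative_abundance(read_counts, lineage, rank_index=0):
    """Percentage of a sample's reads assigned to each taxon at one rank.

    `lineage[otu]` is a tuple such as (phylum, class, ...); `rank_index`
    selects which level to collapse on (0 = phylum in this toy layout).
    """
    totals = defaultdict(int)
    for otu, n in read_counts.items():
        totals[lineage[otu][rank_index]] += n
    grand_total = sum(totals.values())
    return {taxon: 100.0 * n / grand_total for taxon, n in totals.items()}


# Read totals stated in the text for the root and leaf libraries.
ROOT_READS = 7239
LEAF_READS = 277
FOLD_DIFFERENCE = ROOT_READS / LEAF_READS  # about 26

if __name__ == "__main__":
    # Hypothetical toy OTU table for a single sample.
    counts = {"otu1": 30, "otu2": 10, "otu3": 10}
    lineages = {
        "otu1": ("Proteobacteria", "Alphaproteobacteria"),
        "otu2": ("Proteobacteria", "Betaproteobacteria"),
        "otu3": ("Actinobacteria", "Actinobacteria"),
    }
    print(relative_abundance(counts, lineages))
    print(relative_abundance(counts, lineages, rank_index=1))
    print(round(FOLD_DIFFERENCE, 1))
```

Collapsing at a different `rank_index` gives class-level rather than phylum-level percentages from the same table.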

Figure 2 Relative frequency of taxa observed in samples of healthy leaves (HL), galled leaves (GL), and galled plant roots (GR), based on massively parallel sequencing. Each phylum or class bar is broken down when a particular taxonomic group dominated the phylum or class. White bars represent other taxa within a particular taxonomic group.
Taxonomic assignment
Proteobacteria was the most abundant and diverse phylum recovered from the samples of B. dracunculifolia. OTUs belonging to the class Alphaproteobacteria had a higher relative abundance in the galled plant root samples (28.9%), whereas Betaproteobacteria-associated OTUs predominated in both the galled leaf and healthy leaf samples (34.7% and 35%, respectively). The dominance of members of Alphaproteobacteria in the rhizosphere could be due to their preference for nutrient-rich environments [40-42]. Nevertheless, this finding contrasts with those of Bulgarelli et al. [38] and Li et al. [43], who found Betaproteobacteria to be predominant in the roots of Arabidopsis thaliana and Typha angustifolia, respectively.
Alphaproteobacteria was represented by seven orders, but was dominated by Rhizobiales and Sphingomonadales, with 51.5% and 27.3% of all Alphaproteobacteria reads from the galled plant root microbiota, respectively. The genus Methylobacterium constituted a considerable fraction (21.3%) of the order Rhizobiales. The presence of Methylobacterium in roots has been consistently detected in several host plant species [44,45]. Interestingly, this genus was not found in our leaf samples although previous studies have reported its occurrence in the phyllosphere of many different plant species [36,46,47]. Members of the genus Methylobacterium are capable of fixing nitrogen and producing auxin and cytokinin regulators, which improve agronomic characteristics of the plant, such as durability and performance [48]. Moreover, these compounds can protect the plant in different situations, such as pathogen infection, by inducing systemic resistance [49,50] and by reducing toxic compounds in impacted environments [51]. In addition to Methylobacterium, Rhodoplanes (8.6%) was frequently recovered from galled plant root samples. Representatives of Rhodoplanes have been isolated from activated sludge [52], and are characterized by their capacity for complete denitrification.
The order Sphingomonadales was represented by two families in the galled plant root samples: Erythrobacteraceae (1.6%) and Sphingomonadaceae (97.9%). Members of the genus Sphingomonas constituted 50% of the Sphingomonadaceae community, with the following species being identified: S. wittichii, S. echinoides, S. suberifaciens, S. changbaiensis, S. azotifigens, S. yabuuchiae, and S. mali. Some species of this genus are also known for their ability to degrade aromatic compounds, which makes it of particular interest for environmental remediation. Sphingomonas (1.1% in galled leaf) and Methylobacterium (0.4% in healthy leaf) were rarely observed in leaf samples. Another genus of the family Sphingomonadaceae that was well represented in galled plant roots was Kaistobacter (15.7%). This genus is rarely reported in the literature, and no information is available regarding its possible functional role in plants.
Representation of betaproteobacterial taxa was higher in the leaf samples than in the galled plant root samples (35% in healthy leaf, 34.7% in galled leaf, and 12.2% in galled plant root). Burkholderiales- and Rhodocyclales-associated OTUs were recovered from root and leaf samples, respectively (Fig. 2). Within the order Burkholderiales, reads were assigned to 29 genera, with the most common (over 2%) being Burkholderia (17.1%), Schlegelella (7.3%), Rhodoferax (3%), Comamonas (2.5%), and Methylibium (2.2%). The type strain of Schlegelella, isolated from activated sludge under aerobic and thermophilic conditions, is capable of degrading poly-3-hydroxybutyrate, as well as copolymers containing 3-hydroxybutyrate and 3-mercaptopropionate linked by thioester bonds [53]. Candidatus Accumulibacter of the order Rhodocyclales was abundant in both leaf samples (43.7% in galled leaves, and 38.3% in healthy leaves) and absent from the root samples. Members of this genus are widely known to accumulate polyphosphate and to contribute to enhanced biological phosphorus removal in the activated sludge of wastewater treatment plants. Candidatus Accumulibacter has also been reported as an endophyte in roots of Typha angustifolia [54,55].
The class Gammaproteobacteria predominated in the healthy leaf sample, especially the genus Providencia. Among Deltaproteobacteria, the order Myxococcales predominated in the galled plant root samples (93.4%), with the family Haliangiaceae exhibiting the largest proportion (43.4%). The representatives of this family are aerobic, mesophilic, and chemoorganotrophic. In contrast, Deltaproteobacteria was detected at low abundance in both leaf samples (<2%). The class Epsilonproteobacteria, which was found only in the galled plant root samples, was the least prevalent class (0.04%), being represented only by the genus Arcobacter.
Actinobacteria was the second most abundant phylum in the galled plant root and galled leaf samples. Actinomycetales, Solirubrobacterales, Gaiellales, and Rubrobacterales were the dominant orders of Actinobacteria. Actinomycetales comprised more than half of all Actinobacteria-associated reads. At the family level, Pseudonocardiaceae, Streptomycetaceae, and Micromonosporaceae accounted for 18.2% of Actinobacteria. Together these families represent a group of microorganisms known to be valuable producers of antibiotics [56,57].
The order Solirubrobacterales encompasses three families (Patulibacteraceae, Conexibacteraceae, and Solirubrobacteraceae) whose members are strictly aerobic and chemoorganotrophic [58]. Although members of these families are poorly described, they accounted for 12.6% of all Actinobacteria-associated reads. Previous studies have documented the occurrence of Solirubrobacterales in soil [59,60]. Recently, two other studies have revealed species of Solirubrobacter as endophytes: Solirubrobacter phytolaccae, isolated from roots [61], and Solirubrobacter taibaiensis, isolated from stems [62], of Phytolacca acinosa Roxb.
The roots contained 0.8% of Intrasporangiaceae and 0.0005% of Propionibacteriaceae (Actinomycetales), while galled leaves contained 4.2% and 8.5%, respectively, and healthy leaves 29% and 17.6%, respectively. Propionibacterium, the single genus detected, is also resident and abundant on healthy human skin [63], and some species are found in dairy products [61]. Interestingly, the recently proposed order Gaiellales [64], which was recovered from a deep mineral water aquifer in Portugal, has recently been reported from the roots of rice [65]. In our study, the order Gaiellales was found only in galled plant roots (0.09%). In contrast to the root samples, the galled leaf samples had a large proportion of Rubrobacterales belonging to the genus Rubrobacter, whose members are common in arid soils and on rock surfaces worldwide and are extremely resistant to desiccation and UV stress [67,68]. This finding is notable as B. dracunculifolia is a pioneer species capable of colonizing the extremely harsh habitats of the mountaintop rupestrian grasslands. This ecosystem is characterized by wide variation in temperature, low humidity, shallow and nutrient-poor soils, and high solar irradiation [7,69].
Firmicutes was more abundant in the leaf than in the root samples. Bacillus (30.7%) and Geobacillus (21.6%) were the principal genera in the galled plant root samples and were exclusive to them. Other important genera included Clostridium, Staphylococcus, and Streptococcus (% reads in galled plant roots: 0, 4.9, and 7.3; galled leaves: 25.8, 9.7, and 25.8; and healthy leaves: 14.3, 21.4, and 10.7, respectively).
A few OTUs of Nitrospirae, classified as belonging to the genus Nitrospira (3 in galled leaves and 2 in healthy leaves), noticeably dominated the community. These OTUs comprised about 11% of all the reads in the leaf samples. Nitrospira are the most widespread and diverse known nitrite-oxidizing bacteria and key nitrifiers in natural ecosystems [70]. Thus, our findings suggest that the endophytic communities of leaves of B. dracunculifolia, in contrast with those of the root, may be involved in the process of nitrification.
In Acidobacteria (which was almost absent from the leaf samples, with only three reads), two classes predominated in the galled plant root samples: Solibacteres (48.3%) and Acidobacteriia (29.9%). Acidobacteria is one of the most abundant bacterial phyla of terrestrial ecosystems [71], and its members play an important role in the carbon cycle due to their ability to degrade complex plant-derived polysaccharides, such as cellulose and lignin [72]. However, their specific role in the soil and rhizosphere ecosystems is relatively unknown [48]. Additional phyla were detected at much lower abundances (Table S1).
CONCLUSIONS
Our study revealed an abundance of Alphaproteobacteria-related taxa in the root environment, and a predominance of representatives of Betaproteobacteria and Nitrospirae in the leaf environment. Moreover, our findings suggest that taxon-specific ecological niches in the leaf and root environments may select specific bacteria, and likely reflect the different physicochemical characteristics of these structures. Altogether, our findings provide a baseline for further research and add significant new information to the current knowledge of the endophytic bacterial composition in B. dracunculifolia.