Analysis of microbial community biodiversity in activated sludge from a petrochemical plant

The activated sludge process is one of the most widely used techniques for the biodegradation of organic compounds present in effluents from a variety of wastewaters. This study investigated the bacterial community structure of activated sludge from a petrochemical plant, together with its physical and chemical parameters, using high-throughput sequencing. Samples were collected over one year: autumn 2015 (C1), winter 2015 (C2), spring 2015 (C3), and summer 2016 (C4). Total DNA was extracted, and primers targeting the V4 region of the 16S rRNA gene were used for amplicon sequencing. The majority of the detected microorganisms were considered rare microbiota, presenting a relative abundance below 1% of the total sequences. All of the sequences were classified at the phylum level, and up to 55% of the ASVs (Amplicon Sequence Variants) were associated with known bacterial genera. Proteobacteria was the most abundant phylum in three seasons, while the phylum Armatimonadota dominated in one season. The genus Hyphomicrobium was the most abundant in autumn, winter and summer, and an ASV belonging to the family Fimbriimonadaceae was the most abundant in the spring. Canonical Correspondence Analysis showed that the physicochemical parameters SS, SD and TSS were correlated with one another and with ammoniacal nitrogen (AN). Sample C3 presented the highest values of COD, AN and solids (SS, SD and TSS). These high COD, AN, and solids values were correlated with the high frequency of the phylum Armatimonadota in C3.


INTRODUCTION
Biological and industrial wastewater treatment plants (WWTPs) are prominent biotechnological processes in operation worldwide (Figuerola and Erijman, 2007), and their significance is increasing as human society continues to develop. Most wastewater treatment processes exploit the natural self-purification capacity of aquatic environments, which results from microbial activity (Heidenwag et al., 2001). Because bacterial metabolism is essential for effective biological treatment of wastewater, it is crucial to understand the relationship between microbial communities and their performance in full-scale installations (Kwiatkowska and Zielinska, 2016).
The activated sludge process is a well-known biological treatment and the most widely used technique for the biodegradation of organic compounds in effluents from a variety of wastewaters. Its microbial community has been studied in urban, industrial, and petrochemical wastewaters (Zhang et al., 2011; Sánchez et al., 2013; Ye and Zhang, 2013). These studies have demonstrated that the most prevalent microorganisms in these samples are Betaproteobacteria, Alphaproteobacteria, Nitrobacteria, Bacteroidetes, Firmicutes, and Actinobacteria. High-throughput sequencing technologies provide deep insights into bacterial populations (Ibarbalz et al., 2013) and have been used to reveal the bacterial diversity of some complex environments, including activated sludge samples (Claesson et al., 2010; Zhang et al., 2011; Yang et al., 2014; Gwin et al., 2018). Some microorganisms have not been completely identified (Krishnan et al., 2016; Abe et al., 2017), showing that there is much more to discover about the biodiversity of activated sludge. In this study, we assessed the microbial community diversity present in activated sludge from the petrochemical industry using amplicon sequencing based on the 16S rRNA gene.

Activated sludge sample collection
Activated sludge samples were collected from a wastewater treatment plant (WWTP) located in Triunfo, Rio Grande do Sul, Brazil (29°51'01.1" S 51°22'50.9" W), previously described by Antunes et al. (2018). The WWTP handles 450 m³ h⁻¹ of wastewater and is operated as a conventional activated sludge treatment process, mechanically aerated by blades. One liter of sludge was collected directly from the input aeration tank (Figure 1). The following parameters were determined by a certified laboratory, according to the American Public Health Association (APHA et al., 2012): total organic carbon (TOC), chemical oxygen demand (COD), dissolved oxygen (DO), total suspended solids (TSS), suspended solids (SS), dissolved solids (SD), and total Kjeldahl nitrogen (TKN). The chemical results are listed in Table 1.

DNA isolation and 16S rRNA gene fragment sequencing
Total DNA was extracted from 0.25 g of activated sludge using the DNeasy PowerSoil Kit (Qiagen) following the manufacturer's standard protocol. The concentration and purity of the isolated DNA were determined using an ND-100 Nanodrop spectrophotometer (Thermo Fisher). Partial 16S rRNA gene sequences were amplified using the universal primers 515F and 806R, previously identified as suitable for bacteria and archaea (Bates et al., 2011). Amplification was performed in a 25 μL mixture consisting of 1 μL of genomic DNA, 2 mM MgCl2, 0.2 μM of each primer, 200 μM of each dNTP, 1 U Taq DNA polymerase, and 1X reaction buffer. These primers amplify 291 bp from the V4 hypervariable region of the prokaryotic 16S rRNA gene. Amplification was carried out in a Mastercycler Personal 5332 Thermocycler (Eppendorf) according to the following program: initial denaturation at 94 °C for 2 min, followed by 25 cycles of 45 s at 94 °C, 45 s at 55 °C, and 1 min at 72 °C, and a final extension at 72 °C for 6 min. For library construction, 100 ng of DNA was used as described in the Ion Plus Fragment Library Kit manual. Barcode sequences were added to identify each sample in the total sequencing output, since all samples were sequenced in a multiplexed run. Amplicon sequencing was conducted on the Ion PGM System (Thermo Fisher) using an Ion 316 chip, following the manufacturer's instructions.
Sequences from 16S rRNA amplicon sequencing were processed using DADA2 (Divisive Amplicon Denoising Algorithm) (Callahan et al., 2016) in R (R Core Team, 2019). Filtering, dereplication, sample inference, and chimera identification were performed, and the generated amplicon sequence variants (ASVs) were taxonomically assigned based on the SILVA database v. 138 (Quast et al., 2013). The ASV data were imported into R using phyloseq (McMurdie and Holmes, 2013). Unassigned taxa and any residual ASVs identified as chloroplast, mitochondria, or eukaryote were excluded from the analysis. The remaining sequences were analyzed as described by Heinz et al. (2017). Sequencing results were deposited in the National Center for Biotechnology Information (NCBI) under BioProject ID PRJNA471748.
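The exclusion step described above can be sketched as follows. This is a minimal Python illustration, not the R/phyloseq code used in the study. The record layout, the rank names, the set of excluded labels, and the reading of "unassigned" as "no kingdom-level assignment" are all assumptions made for the example, and the four ASV records are hypothetical.

```python
# Labels that mark an ASV for exclusion (assumed spellings, compared case-insensitively).
EXCLUDED_LABELS = {"chloroplast", "mitochondria", "eukaryota", "eukaryote"}
RANKS = ("kingdom", "phylum", "class", "order", "family", "genus")


def keep_asv(taxonomy):
    """Return True if the ASV has a kingdom assignment and no excluded label at any rank."""
    if not taxonomy.get("kingdom"):
        return False  # assumption: "unassigned" means no kingdom-level assignment
    labels = {str(taxonomy.get(rank) or "").lower() for rank in RANKS}
    return labels.isdisjoint(EXCLUDED_LABELS)


def filter_asvs(asvs):
    """Keep only the ASVs that pass keep_asv; asvs maps ASV id -> taxonomy dict."""
    return {asv_id: tax for asv_id, tax in asvs.items() if keep_asv(tax)}


# Hypothetical records, for illustration only.
example = {
    "ASV1": {"kingdom": "Bacteria", "phylum": "Proteobacteria", "genus": "Hyphomicrobium"},
    "ASV2": {"kingdom": "Bacteria", "phylum": "Cyanobacteria", "order": "Chloroplast"},
    "ASV3": {"kingdom": None},
    "ASV4": {"kingdom": "Eukaryota"},
}
kept = filter_asvs(example)
```

Checking every rank, rather than one fixed rank per label, keeps the sketch independent of where a given reference database places the chloroplast or mitochondria labels.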

RESULTS AND DISCUSSION
After removal of the low-quality sequences, amplicon sequencing of the four samples collected seasonally from the petrochemical industry activated sludge yielded a total of 241,859 16S rRNA gene sequences, an average of 60,465 sequences per sample. The average sequence length was 273 bp.
The microbiota was classified within 31 phyla, 65 classes, 146 orders, 167 families and 185 genera or respective taxa. The domain Bacteria had the highest number of classified microorganisms (94.9% of the total sequences). Four archaeal phyla were observed: Crenarchaeota, Halobacterota, Nanoarchaeota, and Aenigmarchaeota. The phylum Aenigmarchaeota was present only in sample C3, comprising 0.10% of the total sequences in that sample.
The classified bacterial community was composed of thirteen phyla with an abundance higher than 1% of the total sequences (Figure 2). Proteobacteria was the most abundant phylum in samples C1, C2, and C4, representing up to 37% of the total sequences in C2, followed by the phylum Bacteroidota in samples C1, C2 and C4 (16.22%, 15.36% and 17.59% of the total sequences, respectively). In sample C3, the most abundant phyla were Armatimonadota and Proteobacteria, which represented 49.16% and 21.09% of the total sequences, respectively (Figure 2). Armatimonadota was the second-most abundant phylum in C1, after Proteobacteria, accounting for 11.74% of the total sequences. Sequences unclassified at the phylum level averaged 0.01% of the total sequences in the samples.
Rev. Ambient. Água vol. 16 n. 3, e2655 - Taubaté 2021
Figure 2. Classification of the most abundant phyla (≥ 1% of the total sequences in at least one sample) of microorganisms present in activated sludge samples over a year (samples C1 to C4). "Others" represents the phyla whose abundances are lower than 1% of the total sequences. * Archaea phyla.
Of the 336 detected taxa, 33 presented a relative abundance higher than 1% in at least one sample (Table 2) and were considered the predominant microbiota. Of these, seventeen were classified at the genus level. Hyphomicrobium was the most abundant genus in samples C1, C2, and C4, accounting for 13.98%, 12.72% and 13.07% of the total sequences, respectively. The most abundant microorganism in sample C3 was a taxon belonging to the family Fimbriimonadaceae (phylum Armatimonadota), representing 48.96% of the total sequences in that sample. The majority of the 336 detected taxa were considered rare microbiota because their relative abundance was below 1% of the total sequences. Of the 336 taxa, 185 were classified at the genus level (Supplementary Table 1).
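The split between predominant and rare microbiota described above rests on a simple rule: a taxon is predominant if its relative abundance exceeds 1% in at least one sample. The following is a minimal Python sketch of that rule. The function names, the data layout, and the read counts are hypothetical and are not taken from the study's data.

```python
def relative_abundance(counts):
    """Convert read counts {sample: {taxon: reads}} into percentages per sample."""
    result = {}
    for sample, taxa in counts.items():
        total = sum(taxa.values())
        result[sample] = {taxon: 100.0 * reads / total for taxon, reads in taxa.items()}
    return result


def split_by_threshold(rel, threshold=1.0):
    """Return (predominant, rare): predominant taxa exceed the threshold in any sample."""
    all_taxa = {taxon for sample in rel.values() for taxon in sample}
    predominant = {
        taxon
        for taxon in all_taxa
        if any(sample.get(taxon, 0.0) > threshold for sample in rel.values())
    }
    return predominant, all_taxa - predominant


# Hypothetical read counts for two samples, for illustration only.
counts = {
    "S1": {"taxon_a": 500, "taxon_b": 495, "taxon_c": 5},
    "S2": {"taxon_a": 900, "taxon_b": 95, "taxon_c": 5},
}
predominant, rare = split_by_threshold(relative_abundance(counts))
```

Here taxon_c reaches only 0.5% in both samples and is therefore rare, while taxon_b is predominant because it exceeds 1% in at least one sample even though its share differs widely between samples.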
Canonical Correspondence Analysis (CCA) showed that the physicochemical parameters SS, SD and TSS were correlated with one another and with ammoniacal nitrogen (Figure 3). According to the chemical parameters analyzed (Table 1), C3 presented the highest COD, AN and solids (SS, SD and TSS) of all the samples. These microbiological and chemical characteristics distinguish sample C3 from C1, C2, and C4 (Figure 4). The highest COD, AN, and solids values were correlated with the high frequency of the phylum Armatimonadota.
Our study provided 16S rRNA gene sequence analyses of the microbial community present in activated sludge from the petrochemical industry. Our findings are in accordance with previous studies of activated sludge, with the predominance of Proteobacteria (Xia et al., 2010). Sidhu et al. (2017) characterized and dissected the phylogenetic and functional structures of the sludge community at the phylum level and found that Proteobacteria dominated raw and dried sludge samples, representing 97.9% and 92.6%, respectively.
Analysis of the microbial community revealed key groups for the degradation of recalcitrant compounds present in the industrial effluent. Proteobacteria prevail in WWTPs treating pharmaceutical and oil refinery effluents and in biological reactors (Xia et al., 2010; Ibarbalz et al., 2013; Kwiatkowska and Zielinska, 2016). Alphaproteobacteria and Gammaproteobacteria were the most dominant classes within Proteobacteria. The filamentous Alphaproteobacteria are versatile consumers of various organic substrates (Kragelund et al., 2006). Most species are aerobic or facultatively anaerobic; many are oligotrophic, preferring to grow in environments with low nutrient concentrations (Madigan et al., 2016).
Activated sludge has a very diverse microbial community structure that depends on both wastewater composition and operational conditions in the treatment plant. However, several studies of microbial community structure have found that the composition of activated sludge from different plants is quite similar in terms of the overall dominating bacterial phylogenetic groups. In nutrient-removal activated sludge, the dominating groups frequently found are Alphaproteobacteria, Gammaproteobacteria and Betaproteobacteria (Klausen et al., 2004; Lee et al., 2002; Schmid et al., 2003; Wagner and Loy, 2002). Studies in WWTPs suggested a higher diversity of active denitrifiers, including uncharacterized Alphaproteobacteria, Gammaproteobacteria and Actinobacteria (Osaka et al., 2006; Hagman et al., 2008; Morgan-Sagastume et al., 2008). Filamentous Alphaproteobacteria have been shown to be essential microorganisms in industrial WWTPs, often related to bulking incidents or deteriorating sludge settling properties (Levantesi et al., 2004).
At the order level, the dominant populations in the activated sludge samples were Burkholderiales and Rhizobiales, which represented 8.03% and 7.44% of those populations, respectively. These low percentages indicate a great diversity of the bacterial populations present in the activated sludge.
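The reasoning above, that a small share held by the most abundant orders points to a diverse community, can be illustrated with a diversity index. The source does not report one; the Shannon index and Pielou evenness below are standard measures chosen here purely as an illustration, and both example communities are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def pielou_evenness(counts):
    """Pielou evenness J' = H' / ln(S), where S is the number of observed taxa."""
    observed = sum(1 for n in counts if n > 0)
    if observed < 2:
        return 0.0  # evenness is not informative for fewer than two taxa
    return shannon_index(counts) / math.log(observed)


# Hypothetical communities with four taxa each, for illustration only.
even_community = [25, 25, 25, 25]      # no taxon dominates
dominated_community = [97, 1, 1, 1]    # one taxon holds almost all reads
```

With the same number of taxa, the evenly distributed community reaches the maximum H' of ln 4 and an evenness of 1, while the dominated community scores far lower on both measures.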
Sample C3 presented the most distinct microbial composition of the four samples, mainly because of the dominance of members of the phylum Armatimonadota (Lee et al., 2013). This phylum is found in a diverse array of environments, such as geothermal soils (Stott et al., 2008), freshwater lakes and rivers (Crump and Hobbie, 2005), the water discharged from manures (Simpson et al., 2004), and activated sludge (Dalevi et al., 2001). Portillo et al. (2009) pointed out that this bacterial phylum could constitute an average of 5% of the total bacterial sequences recovered from hypersaline soils, geothermal springs, lakes and rivers, bioreactors, and endolithic environments. Within the phylum Armatimonadetes, a more extensive geographical distribution was found in anaerobic niches (Harris et al., 2004; Stott et al., 2008). Chemical parameters influenced the bacterial community of C3. The canonical correspondence analysis (CCA) shows that the phylum Armatimonadota was positively correlated with the increasing COD, TOC and total dissolved and suspended solids of the C3 sample. This sample showed the highest COD and the second-highest TOC and solids (TSS, SS, and SD) values; these parameters contribute to the formation of an environment with low oxygen concentrations, which may have favored the occurrence of the phylum Armatimonadota. Sample C3 also differed from the other activated sludge samples in its diversity: the phyla Aenigmarchaeota, Caldisericota, Cloacimonadota, MBNT15 and Sva0485 were detected only in C3 (Supplementary Table 1).
CCA also showed a correlation between Actinobacteriota and the presence of dissolved oxygen (DO). Most genera from this phylum are aerobic (Goodfellow and Williams, 1983), and this phylum presented significant quantification in sample C2 (2 mg per liter).
Nitrospirae shows a correlation with the presence of TKN. The ability to perform nitrite reduction was a physiological characteristic observed in Nitrospirae (Sidhu et al., 2017). According to Ward et al. (2009), genomic evidence, in the form of assimilatory nitrate reductase gene sequences, suggested that the role of Acidobacteria in nitrogen cycling in soils and sediments is the reduction of nitrate, nitrite, and possibly nitric oxide. The presence of nif genes related to conventional nitrogenase was found in a study by Inoue et al. (2015), suggesting nitrogen fixation ability in some Bacteroidetes species.
Acidobacteriota shows a correlation with the presence of AN, SS, SD and TSS. Bacteria belonging to the phylum Acidobacteria have also been observed in a wide variety of environments, including extreme (Hobel et al., 2005), polluted (Bobbink et al., 2010), and effluent wastewater environments (LaPara et al., 2000). Ward et al. (2009) found that Acidobacteria were involved in nitrogen cycling, promoting the conversion of nitrate and nitrite.
All the sequences were classified at the phylum level, and up to 55% were associated with a bacterial genus. Among the most abundant microorganisms, Hyphomicrobium and Fimbriimonadaceae have been described in the literature as potential denitrifiers and degraders. The genus Hyphomicrobium is a denitrifier and can degrade C-1 compounds such as methanol (Rissanen et al., 2017). Sequences representing the phylum Armatimonadetes have been recovered by culture-independent methods from various environments, including aerobic and anaerobic wastewater treatment processes, the rhizosphere, hypersaline microbial mats and subsurface geothermal water streams (Portillo and Gonzalez, 2009; Lee et al., 2013; Tamaki et al., 2011). Fimbriimonadaceae, belonging to Armatimonadetes, was detected in an anammox consortium in which ammonium was removed without nitrite and oxygen (Liang et al., 2014).

CONCLUSION
Even with the advances brought about by next-generation sequencing, there are still challenges regarding the classification of microorganisms in environmental samples. The classification of sequences at a lower taxonomic level, such as family or genus, is essential to understanding a WWTP as a whole and the actual participation of each microorganism in the different stages of treatment. The present study contributed to the characterization of the microbial communities involved in the wastewater treatment of the petrochemical industry. Identifying these microorganisms has the broader impact of contributing to the knowledge of biological wastewater treatment.

ACKNOWLEDGMENTS
We would like to thank Sistema Integrado de Tratamento de Efluentes Líquidos do Polo Petroquímico (SITEL-CORSAN) for authorizing the sample collection. We thank the High Performance Computing Lab (LAD/PUCRS) for providing access to run the high-throughput sequence analyses. Luiz Gustavo A. Borges thanks PEGA/PUCRS. We also thank CNPq and CAPES for their financial support.