In-depth transcriptome reveals the potential biotechnological application of Bothrops jararaca venom gland

Abstract Background: Lack of complete genomic data of Bothrops jararaca impedes molecular biology research focusing on biotechnological applications of venom gland components. Identification of full-length coding regions of genes is crucial for the correct molecular cloning design. Methods: RNA was extracted from the venom gland of one adult female specimen of Bothrops jararaca. Deep sequencing of the mRNA library was performed using Illumina NextSeq 500 platform. De novo assembly of B. jararaca transcriptome was done using Trinity. Annotation was performed using Blast2GO. All predicted proteins after clustering step were blasted against non-redundant protein database of NCBI using BLASTP. Metabolic pathways present in the transcriptome were annotated using the KAAS-KEGG Automatic Annotation Server. Toxins were identified in the B. jararaca predicted proteome using BLASTP against all protein sequences obtained from Animal Toxin Annotation Project from Uniprot KB/Swiss-Pro database. Figures and data visualization were performed using ggplot2 package in R language environment. Results: We described the in-depth transcriptome analysis of B. jararaca venom gland, in which 76,765 de novo assembled isoforms, 96,044 transcribed genes and 41,196 unique proteins were identified. The most abundant transcript was the zinc metalloproteinase-disintegrin-like jararhagin. Moreover, we identified 78 distinct functional classes of proteins, including toxins, inhibitors and tumor suppressors. Other venom proteins identified were the hemolytic lethal factors stonustoxin and verrucotoxin. Conclusion: It is believed that the application of deep sequencing to the analysis of snake venom transcriptomes may represent invaluable insight on their biotechnological potential focusing on candidate molecules.


Background
Animal venom is composed of a complex and potent mixture of molecules with different physiological activities, ranging from moderate effects, such as allergic reactions and dermatitis [1,2], to more severe effects like hemorrhage, intravascular coagulation, necrosis, respiratory arrest and death [3][4][5].These bioactive compounds are low explored bioresources for the development of new therapeutic drugs for different type of diseases and conditions [6][7][8].
Venomous snakes Snake venom is a promising source of therapeutic proteins, as these venoms comprise more than 95% of the dry weight of a snake's venom is composed of peptides/ proteins [9].These venoms comprise wide variety of enzymes such as phospholipases A 2 , proteases (metal and serine), L-amino acid oxidases, and esterases, as well as many other non-enzymatic proteins and peptides, which have several biochemical and pharmaceutical properties [10][11][12][13].
One venomous snake of particular interest is Bothrops jararaca, which is endemic to the tropical/semitropical forest habitats of southeastern Brazil, northeastern Paraguay, and northern Argentina [14].In recent years, high throughput technology has been implemented more often in snake venom analyses, allowing better understanding of proteomics and transcriptomics of venom, which has exposed its complexity [15,16].High throughput transcriptome analysis allows the identification of complete transcripts expressed in the snake venom gland [17,18].Moreover, molecular cloning of biotechnological proteins of B. jararaca requires the complete characterization of the full-length coding regions of interesting transcripts.The venom produced by B. jararaca has previously used to isolate bradykinin-potentiating factor (BPF), which was the basis for the creation of the antihypertensive agent captopril, and the oral anticoagulant Exanta, also known as ximelagatran [19,20].
In this study, we generated an RNA-Seq transcriptome, performed de novo assembly and annotated the sequences from the venom gland of Bothrops jararaca.This work expands the current knowledge of the biotechnological potential of the venom gland of Bothrops jararaca.Furthermore, results may also be used for the discovery of novel candidates for cancer treatment, hypertension, inflammatory response, virus infection, and other human diseases, as well as the development of effective treatments of poisoning from Bothrops jararaca bite.

RNA extraction from venom gland
All experiments were performed in accordance with Brazil's National Council for the Control of Animal Experimentation (CONCEA) guidelines and were authorized by the Ethic Committee on Animal Use of the Butantan Institute, under protocol n. 4390280116.Venom glands were removed 3 days after venom milking, when RNA transcription is at its highest level [21], from a chemically euthanized female adult B. jararaca under captivity (Herpetology Laboratory of the Butantan Institute) and immediately stored at -80 °C.Venom glands were then transported in dry ice to the Barretos Cancer Hospital for molecular biology analysis.Tissue samples from the venom glands (40 mg) were then disrupted and homogenized in a Precellys 24 homogenizer (Bertin Technologies) at 4,500 rpm, while being left to cool down on ice between the 3 repeated 20 second-cycles.Total RNA was isolated using RNeasy Mini kit (Qiagen) and quantified by spectrophotometry.In order to avoid cross-contamination during RNA extraction, we performed the RNA extraction in the Molecular Oncology Research Center of the Barretos Cancer Hospital, where no other source of nucleic acid of reptilian origin was ever extracted.

Transcriptome sequencing and quality analysis
Deep sequencing of the mRNA library from B. jararaca was done using the Illumina NextSeq 500 platform (76 bp singleend) and the NextSeq 500/550 Mid Output v2 kit (150 cycles).Libraries were constructed using TruSeq Stranded mRNA LT Sample Prep Kits (Illumina), following the TruSeq Stranded mRNA Sample Prep HS protocol (Illumina).In order to avoid cross-contamination with transcripts of other related species, the venom gland sequenced for this work was the only reptilian sample in the Illumina chip.The quality analysis of the sequenced data was evaluated with the FASTQC software [22].Reads were filtered with Trimmomatic v0.36 [23] using a 4-base sliding window.Leading or trailing bases with average Phred quality score lower than 20 were removed, along with adapters and reads with a length less than 50 base pairs (bp).

De novo assembly and quality analysis
De novo assembly of the B. jararaca transcriptome was performed using Trinity [18] with default k-mer size of 25.The statistics of the Trinity Assembly like the Nx statistics (eg. the contig N50 value), total trinity genes, and total of trinity transcripts were obtained with perl script 'TrinityStats.pl' of the Trinity toolkit [24].Gene open reading frames (ORFs) or protein coding regions within the transcripts were predicted with the TransDecoder program [24].CD-HIT-EST version 4.6.1 [25] was subsequently used for clustering predicted proteins with 100% of sequence identity and 100% alignment coverage.For analysis of transcript abundance, the sequenced paired-end reads were realigned with the assembled transcripts for quantification by the RSEM (RNA-Seq by Expectation Maximization) software [26] using the Trinity script 'align_and_estimate_abundance.pl'present in the Trinity toolkit [18].The transcriptome completeness and contigruity of B. jararaca was assessed by comparing the assembly transcripts to benchmarking sets of the universal single-copy (BUSCO) of Eukaryota, Metazoa, and Vertebrata using BUSCO v3, based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs selected with the database OrthoDB v9 [27,28] .Full-length transcript or near full length transcript of B. jararaca was identified using BLASTX (BLAST+ v2.2) by alignment against the predicted proteome of Anolis carolinensis (GCA_000090745.1,GenBank ID), Ophiophagus Hannah (GCA_000516915.1,GenBank ID), and Python bivittatus (GCA_000186305.2,GenBank ID), which were obtained from GENOME (https://www.ncbi.nlm.nih.gov/genome/) and UniProtKB/Swiss-Prot release 2019_04 May-08, 2019 using an E-value cut-off set to 1 x 10 −20 .
Full-length transcript or near full-length transcript are defined as transcripts (query) similar to proteins already annotated in reference genomes with high quality standards of genome completeness and functional annotation.The alignment between the transcripts (query) and the reference sequence protein obtained in the the BLASTX covered across more than 80-90% of the transcript length.The resulting table with full-length transcript or near full-length transcript from BLASTX was obtained with the perl script 'analyze_BLASTPlus_topHit_ coverage.pl'from Trinity toolkit [24].For each BLAST hit in the target database of the protein, the best matching Trinity transcript was selected, and the percent of the BLAST hit's length covered by the Trinity transcript was identified.

Functional annotation of transcriptome
Annotation was performed using Blast2GO version 3.2 [29].All predicted proteins after clustering step were blasted against the non-redundant protein database (NR) of NCBI (ftp://ftp.ncbi.nih.gov/blast/db/;29-02-2015) using BLASTP with an E-value cut-off set to 1 x 10 −6 .The metabolic pathways present in the transcriptome were annotated using the KAAS -KEGG Automatic Annotation Server [30], which provides functional annotation of genes via BLAST using the method BBH (bidirectional best hit), against the manually curated KEGG GENES database.With the KEGG Orthology groups (KOs) identified via KAAS [30], the complete functional modules of the metabolic pathways present in the B. jararaca transcriptome were reconstructed using the tool KEGG Mapper tools [31].

Toxin identification
The toxins were identified in the B. jararaca predicted proteome using BLASTP with an E-value cut-off set to 1 x 10 −6 and using the protein sequences obtained from the Animal Toxin Annotation Project (version 31/10/2018) from Uniprot KB/Swiss-Pro database as a reference [32].The BLASTP annotation results obtained of alignment against the Animal Toxin Annotation Project database and with same annotation description obtained from the BLASTP against the NR database were considered toxins.In addition, all inhibitors or tumor suppressor proteins were identified in the annotation from the BLASTP results against the NR database using the search terms: inhibitors AND tumor suppressor and later these results were manually checked.For identification of the full length transcript of B. jararaca that aligns to the Animal Toxin Annotation Project from Uniprot KB/Swiss-Pro database [32], we processed the BLASTx hits (E-value cut-off set to 1 x 10 −20 ) using the 'analyze_BLASTPlus_ topHit_coverage.pl'script from the Trinity package (http:// trinityrnaseq.sourceforge.net/),as described above.

Phylogenetics analysis of stonustoxin and verrucotoxin
The sequences of stonustoxin and verrucotoxin of B. jararaca identified were aligned with BLASTP program (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins) and homologous sequences recovered with an E-value cut-off above of 1 x 10 −20 were used for Phylogenetic analyses reconstruction.Phylogenetic analyses were performed in the platform NGPhylogeny.fr[33] using the method Advanced PhyML + SMS.In this mode, the program of alignment is the MAFFT [34].The parameters of matrix of distance chosen in the program MA FFT was BLOSUM62.After the alignment, the sequences were curated with BMGE [35] for gaps remotion and to keep the informative sites.Then, the Smart Model Selection in PhyML program assesses and chooses the best evolutionary model for phylogenetic analyzes.For phylogenetic analysis was used the program PhyML [36] with maximum likelihood method and the inference of Branch support in the tree as done with Bootstrap (FBP + TBE) using 1000 bootstrap replicates.All other parameters for program BMGE and Newick tree format Display were keept as default.All steps were done in the program NGPhylogeny.fr(https://ngphylogeny.fr/).

Data visualization
Figures and data visualization were performed using the ggplot2 package in R [37], through the R software [38].

Transcriptome sequencing, de novo assembly and full-length transcript analysis
Deep sequencing of the venom gland transcriptome of B. jararaca was performed using a NextSeq 500 (Illumina), generating 67,551,639 unpaired reads.Transcriptome raw data quality showed a Phred quality score (per base sequence quality average) higher than 30 (Additional file 1).De novo assembly of all reads resulted in a total assembly of 64,853,458 bp representing 76,765 genes (N50 length of 1104 bp) and 96,044 transcripts (N50 length of 1,104 bp), with a mean contig length of 675.25 bp and a GC percentage of 43.56% (Table 1).
The completeness of transcriptome assessment performed by the Benchmarking Universal Single-Copy Orthologs (BUSCO; version 3.0) showed that 85.8% of the 303 core eukaryotic genes and 88.7% of the 978 core metazoan genes were found in our B. jararaca transcriptome assembly (Table 2).Furthermore, the BUSCO analysis with 2,586 core vertebrata genes showed that 1,550 (59.9%) and 543 (21%) of the 2,586 expected vertebrata genes were identified as complete or fragmented, respectively, while 493 (19.1%) genes were considered missing.
The quantity of full-length transcripts, or near full-length transcripts in B. jararaca, was determined by the number of predicted sequences of Reptiles and UniprotKB/Swiss-Prot databases (Table 3).Among the 96,044 transcripts of B. jararaca, 2,076 (2.16%) matched near full length with coding sequences of Anolis carolinensis, 4,555 (4.74%) with Ophiophagus hannah, 6,707 (6.98%) with Python bivittatus, and 5,472 (5.69%) with UniprotKB/Swiss-Prot (Table 3 and Additional file 2).Thus, a total of 6,835 full length or near full-length transcripts were identified using the concatenated predicted proteome of all three reptiles cited above.

Functional annotation
From the 96,044 transcripts identified, 49,345 open reading frames (ORF) were predicted using TransDecoder [10] with ab initio model.In this analysis, only predicted ORFs that were at least 60 amino acids long were retained.The clustering steps of predicted proteins in the CD-HIT program (100% amino acid identity) resulted in a set of 41,916 non-redundant sequences.Functional annotations of these coding regions were inferred using BLASTP against the NR database in NCBI.Additional annotation was performed using InterProScan [39] and Gene Ontology [40] using the program Blast2GO [29].Thus, 78.81% of unique predicted proteins (33,034) were annotated, while 8,162 (19.47%) transcripts did not have similarity against any proteins in the NR database (Additional file 3).The E-value distribution of hits obtained against NR, the average number of hits per sequence and the High Scoring Segment Pair/Coverage distribution are shown in Additional files 4, Additional file 5 and Additional file 6, respectively.Among the 33,034 annotated sequences, 22,933 (54,7%) best hits were matched in BLAST Tophits against predicted proteins of the species Thamnophis sirtalis, Python bivittatus, Ophiophagus hannah, Anolis carolinensis, and Protobothrops flavoviridis (Figure 1).The other 1,858 sequences   had best hits against other organisms, such as reptiles, humans, mammals, fishes, bacteria, virus, fungi, and others (Additional file 7).

Gene ontology mapping with Blast2GO
The predicted proteins identified in B. jararaca venom gland were mapped into putative functional group-based Gene Ontology (GO) terms assigned using NR derived BLAST hits and InterProScan using Blast2GO, categorized into three ontologies: biological processes (BP), cellular components (CC) and molecular functions (MF) (Figure 2).

Functional domains identified
All predicted proteins obtained from the transcript ORFs were searched for functional domains signatures present in the Smart [42], PFAM [43], and Superfamily [44] thought of the profile hidden Markov models with InterProScan [39].Overall, a total of 20,605 predicted proteins were categorized into 3,992 domain/ family signatures with PFAM (Figure 3).
In addition, the 107 full-length transcripts or nearly full-length transcripts of B. jararaca encoding toxins were identified by BLASTX recovery against the Animal Toxin Annotation Project (version 31/10/2018) from Uniprot KB/Swiss-Pro database and with the same annotation obtained when NCBI non-redundant database was used as reference.Among these, we annotated transcripts for the vascular endothelial growth factor, acidic phospholipase A 2 , phospholipase B, zinc metalloproteinasedisintegrin-like, venom nerve growth factor, among others (Additional file 15).

Discussion
More than 90% of the dry weight of snake venom is represented by peptides and proteins, potentially representing a natural arsenal of biotechnological relevant proteins [13,46].Traditional snake venom protein purification involves the extraction of crude venom, followed by purification via physical or chemical methods, such as high-performance liquid chromatography purification, which produces small amounts of purified products, but not all the products are purified.While this procedure is well standardized, the amount of peptides/proteins purified is not enough for pre-clinical or clinical studies.Less abundant snake venom protein classes, well reviewed by Boldrini-França et al. [47].With the intent to further increase the amount of known proteins, we described here the transcriptome from the B. jararaca venom gland and its annotations.Data presented in this study can be used to produce proteins in heterologous expression systems, such Eukaryotic, bacteria or yeast cell cultures.Recent advances in molecular biology and genomics has made possible the analysis of whole transcriptomes from different tissue sources, being found five publications describing transcriptome analyses of B. jararaca [14,[46][47][48][49][50][51].
An initial study of the transcriptome of an adult specimen of B. pauloensis caught in São Paulo State (Brazil) was described by Rodrigues and collaborators in 2012 [52], using an Expressed Sequence Tag (EST) singleton library of 668 EST sequences.Moreover, a transcriptome analysis using deep sequencing approach was recently published by Gonçalves-Machado et al. [14], in which they compared the transcriptomes of two B. jararaca populations from the Brazilian Southern (S) and Southeastern (SE) Atlantic rainforest.This transcriptome analysis was performed with 205,449 (SE)/281,569 (S) reads and generated 14,246 (SE)/12,240 (S) contigs, describing 15 (SE)/16 (S) different family proteins [14].Similarly, Junqueira-de-Azevedo et al. [49] described the transcriptome of the venom gland of an adult B. jararaca, generating 116,236 reads.In the present work, we describe the transcriptome analysis of B. jararaca, with 67,551,639 unpaired reads, which was assembled using Trinity.This analysis generated 76,765 genes with an N50 length of 828 bp along with 96,044 transcripts with an N50 length of 1,104 bp and a total assembly of 64,853,458 bp with 41,916 unique predicted proteins.To date, an approximate N50 value of 1,431 bp and average contig length of 894 bp was reported for the transcriptome of the snake Bothrops moojeni [53], but other works [8,10,[41][42][43][44] did not described this value.
Our transcriptome of the vertebrata core genes was more complete than any previously sequenced transcriptome of other snakes [54].However, it is expected that not all genes are expressed in the venom gland.BUSCO recovery tends to be highest when the entire organism and/or multiple developmental stages or the same organism is used to generate the assemblies, compared to those assembled from a select number of tissues [55].The BUSCO results of our assembly generated here was comparable to the other transcriptomes reported, where recovery varied between 68 and 95% [56][57][58][59].
Our predicted proteome had a greater number of predicted proteins with high identity or best hit obtained from BLASTP against the proteomes available for Thamnophis sirtalis, Ophiophagus hannah, Python bivittatus, Anolis carolinensis, and Protobothrops flavoviridis (Figure 1) (Additional file 7).We also identified and annotated a higher number of predicted proteins than these previously deposited genomes, which could make the current transcriptome analysis a reference or complement for the functional knowledge of the coding proteins present in other snakes or reptiles.
We identified 107 full-length or near full-length transcripts (80 -100% coverage lenght of query in relation to reference toxins) which were defined as toxins or inhibitors (Additional file 13).These transcripts included hemolytic lethal factor stonustoxin [60,61] and verrucotoxin [62], which were identified for the first time in snakes.We identified MASP1 and MASP2 that are of alternative pathways of complement activation, involved in the activation of factor D [63], complement factor I, complement C1r subcomponent and complement component C7, which possibly participate in the tissue damage in a host bitten by B. jararaca and may aid in the venom poisoning.The excessive complement activation and immune activation could exacerbate the severity of the tissue injury [64].
Several predicted toxins were identified for the first time in this transcriptome, such as veficolin-1, ryncolin-1-like, pancreatic alpha-amylase, venom allergen 3-like, stonustoxin subunit alpha and verrucotoxin (Additional file 14).Moreover, different inhibitors and potential tumor supressors were also found, not yet reported in the venom gland of B. jararaca, such as the relAassociated inhibitor, ribonuclease inhibitor, reversion-inducing cysteine-rich protein with Kazal motifs and Insulin-like growth factor-binding protein 6 (Additional file 14).Moreover, up to now, the cobra venom factor and the three-finger like transcripts were never described in venom glands of B. jararaca [61,65,66].
We also identified two genes coding toxins similar to stonustoxin subunit beta (DN34719_i1, TPM:105, DN56183, TPM: 17.00) and to verrucotoxin (DN37544_i1, TPM: 16, DN1850, TMP: 34), which has been studied in venom of fishes of the species Synanceia verrucose [67].The phylogenetic analysis indicated that B. jararaca stonustoxin (DN34719_c0_g1) and rerrucotoxin (DN1850_c0_g1) are related to stonustoxin and verrucotoxin of Synanceia verrucose.Results for stonustoxin indicated that this protein diverged from an ancestral node, with a bootstrap value of 100% significance (Figure 5 and Additional file 15).However, these proteins have different functional domains predicted.Hence, it is not possible to affirm that these homologous sequences have the same function, because neofunctionalization is a common evolutionary process found after speciation in homologous or paralogous sequences.
The Synanceia verrucose inhabits shallow waters of the tropical or subtropical Indo-Pacific regions and are among the most venomous and dangerous fishes in the world.The purified stonefish toxins commonly present potent hemolytic activities due to its ability to form pores in the cell membrane.Also, these toxins elicit potent hypotension, inhibit neuromuscular function, and induce cardiovascular collapse in humans and native predators [54,68].The stonustoxin-like were also found in the transcriptome of the venom gland of annelids [69] and are found in a variety of vertebrates, including the common ostrich, platypus, tasmanian devil, and coelacanth [70].
We identified with KEGG mapper [71] one of the largest mapped/annotated pathways of the components of the human retrovirus response.Subsequently, one of the identified components of the Gene Ontology term [34] identified by Blast2Go in our analysis is the virion part, consisting of sequences coding for several endogenous retrovirus described in snakes such as Python curtus endogenous retrovirus, Python molurus endogenous retrovirus, Endogenous retrovirus group PABLB member 1, Endogenous retrovirus group K members of families 1, 8, 9, 10, 11, 18, 19 and 25, and human endogenous retroviruses (HERVs) such as HCML-ARV already isolated from blood cells of patient with chronic myeloid leukemia [72].
One important finding was the identification of Syncytin and L1-retrotransposons genes in the venom gland of B. jararaca.Syncytin genes are associated with placental evolution in mammals and viviparous anim als [73].Up to now, these coding sequences were never described in transcriptomes of viviparous snakes, mainly B. jararaca.Only recently it was found in the Mabuya lizards [74].
In addition, we also identified the metabolic pathways essential for survival such as thermogenesis, catabolism and anabolism of carbons, amino acids metabolism, fatty acid metabolism, pathways involved with splicing of RNA, and processing and degradation of proteins (Additional file 11).Among the most frequent signatures of functional domains identified in our work was the kinase domain and serine-threonine/tyrosine-protein kinase catalytic domains, which are known to regulate or activate most cellular pathways.Similarly, the RNA recognition motif was also abundant and acts in the recognition of RNA and proteins known to bind single-stranded RNAs [75].Other prominent functional domains found within the B. jararaca transcriptome were the zinc finger domain, which is a DNA, RNA, protein or lipid binding domain [76]; as well as metalloproteases; WD40 repeats, involved in several functions such regulation to cell cycle control and apoptosis [77]; a reverse transcriptase domain, characteristic of retroviruses involved in replication in the host; and a pleckstrin homology domain, with a role in recruiting proteins to different membranes [78].These signatures reflected the biologic role of the proteins identified.
The additional analysis of RSEM showed that the most abundant transcripts genes belong to zinc metalloproteinasedisintegrin-like jararhagin, natriuretic peptide metalloprotease, C-type lectin 8a and L-amino acid oxidase.The distribution of these protein types is characteristic of the Bothrops genus, whose species produce venoms most notable for local tissue damage such as edema, hemorrhage, and necrosis, which has already been described by other researchers [60].
So far, our work represents the largest transcriptome analysis of B. jararaca venom gland.All raw and analyzed data are available for further studies, making possible further exploitation for biotechnological uses.

Conclusion
Our transcriptome analysis strategy yielded unique insights into the diversity of B. jararaca venom gland transcriptome toxins.The present results bring an important contribution to the development of snake venom-derived proteins as potential biotechnological sources, especially for the search and development of candidate molecules.4390280116.All experiments were performed in accordance with Brazil's National Council for the Control of Animal Experimentation (CONCEA) guidelines.
Cumulative number of protein of the A. carolinensis, O. hannah, Python bivittatus and UniprotKB-Swiss-Prot databases recovery by BLASTX that aligned by at least one transcript in the assembly B. jararaca transcriptome across at 80-100 percentage (%) of coverage.The transcripts identified in B. jararaca were annotated as full-length transcripts if they match a protein in the reference proteome database at E-value threshold of 1e -20 .Pct_cov_Hit: percentage of coverage of top matching hits of reference proteome that align across more than X% (80-100) with a transcript of B. jararaca.PC: protein counts of target reference proteome that aligned by at least one transcript of B. jararaca.PCS: protein count sum of reference proteins of reference proteome that aligned at X% (80-100) coverage by at least one transcript of B. jararaca.

Figure 1 .
Figure 1.Top BLAST hit distribution of predicted proteins from of Bothrops jararaca venom gland transcriptome.Recovery by Blast2GO with similarity filter parameter of 55%, E-Value-Hit-Filter: 10 -6 .

Figure 2 .
Figure 2. Gene Ontology category classification at level 2 and functional distribution of the transcriptome of B. jararaca performed by Blast2GO.The predicted proteins were functionally mapped according to the three major classifications of Gene Ontology: (A) biological process (BP), (B) molecular function (MF) and (C) cellular component (CC).They wereannotated by setting the following parameters -E-Value-Hit-Filter: 10 -6 and others parameters default.

Figure 4 .
Figure 4. Functional class annotation of toxins and accessory family proteins identified in B. jararaca venom using Animal Toxin Annotation Project as reference.

Table 1 .
Statistics of Trinity de novo assembly.

Table 2 .
Summary of transcriptome completeness assessment by BUSCO notation.

Table 3 .
Full-length transcript reconstruction analysis of B. jararaca venom gland transcriptome in relation to Anolis carolinensis, Ophiophagus hannah, Python bivittatus and UniprotKB/Swiss-Prot.