Molecular characterisation of dengue virus type 1 reveals lineage replacement during circulation in Brazilian territory

Dengue fever is the most important arbovirus infection found in tropical regions around the world. Dispersal of the vector and an increase in migratory flow between countries have led to large epidemics and severe clinical outcomes, such as dengue haemorrhagic fever and dengue shock syndrome. This study analysed the genetic variability of the dengue virus serotype 1 (DENV-1) in Brazil with regard to the full-length structural genes C/prM/M/E among 34 strains isolated during epidemics that occurred in the country between 1994-2011. Virus phylogeny and time of divergence were also evaluated with only the E gene of the strains isolated from 1994-2008. An analysis of amino acid differences between these strains and the French Guiana strain (FGA/89) revealed the presence of important nonsynonymous substitutions in the amino acid sequences, including residues E297 (Met→Thr) and E338 (Ser→Leu). A phylogenetic analysis of E proteins comparing the studied isolates and other strains selected from the GenBank database showed that the Brazilian DENV-1 strains since 1982 belonged to genotype V.
This analysis also showed that different introductions of strains from the 1990s represented lineage replacement, with the identification of three lineages that cluster all isolates from the Americas. An analysis of the divergence time of DENV-1 indicated that the lineage circulating in Brazil emerged from an ancestral lineage that originated approximately 44.35 years ago.

During the 20th century, the world faced the re-emergence of many infectious diseases, including dengue, which is the most important arbovirus in the world in terms of morbidity and mortality and represents a serious public health threat in most tropical countries (Guzman & Kouri 2002). An estimated 2.5 billion people are at risk of infection worldwide and the most affected areas are tropical and sub-tropical countries in Southeast Asia, the Pacific region and the Americas (Guzman et al. 2010).
In Brazil, 12,363,690 cases of dengue virus (DENV) disease were identified from 1990-2011, with 764,032 cases in 2011 alone. After the first cases of dengue were reported in 1982 in Boa Vista (state of Roraima) (Osanai et al. 1983) and DENV-1 was introduced in 1986 (Schatzmayr et al. 1986), this serotype circulated only sporadically. However, epidemiological surveillance data showed high circulation of DENV-1 in the country between 2009-2011, mainly in states where the population had not previously been in contact with the virus (MS 2010, 2011).
Differences in the severity of dengue associated with certain serotypes or specific genotypes have been widely discussed (Pandey et al. 2000, Rico-Hesse 2003, Raekiansyah et al. 2005, Kyle & Harris 2008, Weaver & Vasilakis 2009). However, several studies comparing the sequences of DENV isolates from patients with dengue haemorrhagic fever (DHF) and dengue shock syndrome (DSS) have identified genetic variations between isolates, but have found no consistent differences that could be correlated with disease severity (Blok et al. 1991, Sistayanarain et al. 1996, Mangada & Igarashi 1998, Uzcategui et al. 2001, dos Santos et al. 2002, Raekiansyah et al. 2005).
Genetic variants of the four DENV serotypes were identified by partial sequencing of the structural region of the viral genome, which has contributed to the description of intra-serotype genotypes (Rico-Hesse 1990, Lanciotti et al. 1994, 1997, Zanotto et al. 1996, Rico-Hesse et al. 1997, Holmes & Burch 2000, Gonçalvez et al. 2002). An analysis of the genetic relationship between DENV isolates from different geographic regions in various endemic areas permitted the establishment of genotype groups showing over 6% nucleotide divergence. For DENV-1, five genotypes have been identified worldwide thus far (Rico-Hesse 1990, Gonçalvez et al. 2002).
DENV-1 was introduced to the Americas in 1977 and at least one or two genotypes apparently circulate in the American continent (Gonçalvez et al. 2002, Rico-Hesse 2003). According to Santos et al. (2004) and Pires Neto et al. (2005), studies of the E/NS1 junction region of Brazilian strains suggest that genotype V (America/Africa/Asia) is the only genotype circulating in Brazil to date. Phylogenetic studies performed by dos Santos et al. (2011) on DENV-1 strains recovered in the state of Rio de Janeiro (RJ) showed that multiple lineages of this serotype circulate in Brazil.
The aim of the present study was to analyse the genetic variability at the level of the structural C/prM/M/E genes of the DENV-1 isolates obtained over the course of epidemics that occurred in several Brazilian states between 1994-2011. Furthermore, we performed phylogenetic and time of divergence analyses of the DENV-1 strains to provide a more thorough understanding of the epidemiological profile of the strains circulating in Brazil and their sources.

MATERIALS AND METHODS
Viral strains and RNA extraction - Twenty-nine viral isolates (Supplementary data) were obtained from the virus collection at the National Reference Laboratory at the Department of Arbovirology and Haemorrhagic Fever, Evandro Chagas Institute. The isolates were obtained from different Brazilian states in the North [Acre, Amapá, Amazonas, Pará (PA), Roraima (RR), Tocantins], Northeast (Ceará, Maranhão, Piauí, Rio Grande do Norte), Central-West [Mato Grosso (MT)] and Southeast (Minas Gerais) Regions. These viruses were isolated from pools of adult female Aedes aegypti mosquitoes and human samples (blood and viscera fragments) between 1994-2008. Twenty-seven isolates recovered from humans were selected according to data collected in the Information System for Notifiable Diseases dengue epidemiological form. Two strains were isolated from mosquitoes collected in Belém, PA. The strains were obtained from cell cultures of Aedes albopictus clone C6/36 during the second passage (Igarashi 1978). Viral RNA was recovered from C6/36 culture supernatants when a 75% cytopathic effect in the cell monolayer was observed. RNA was extracted by the Trizol™ LS method (Gibco Life Sciences, San Diego, USA) according to the manufacturer's instructions. A mouse-adapted DENV-1 isolate (Mochizuki strain) was used as a positive control.

Reverse transcription-polymerase chain reaction (RT-PCR) and nucleotide sequencing - The C/prM/M/E genes of DENV-1 were fully amplified by RT-PCR according to a method adapted from Lanciotti et al. (1992) using four primer pairs specific for each gene (Supplementary data). The amplicons were purified using the PureLink Quick Gel extraction kit (Invitrogen, California, USA) according to the manufacturer's instructions. The purified cDNA was directly sequenced in an automated capillary ABI 3130 sequencer (Applied Biosystems, USA) using the ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction v3.1 kit (Applied Biosystems, USA) by the dideoxyribonucleotide chain termination method (Sanger et al. 1977). The amplicons were sequenced in both directions with the same primers used for RT-PCR.
Nucleotide and amino acid sequence analysis - The nucleotide sequences were assembled, aligned and analysed with the SeqMan, Editseq and Megalign v.5.0 programs (LaserGene DNA Star® Package, USA). To analyse the diversity of the genes, C/prM/E residues were considered variable sites when at least one virus showed a nucleotide substitution at that site leading to a synonymous or nonsynonymous substitution. Cleavage sites were identified according to the proteolytic processing scheme for open reading frames of flaviviruses developed by Rice and Strauss (1990). The SignalP 3.0 program (cbs.dtu.dk/services/SignalP/) was used to identify the host signal peptidase cleavage sites at the C/prM, prM/E and E/NS1 protein junctions (Chang et al. 2000). Glycosylation sites and cysteine residues were predicted using the NetNGlyc 1.0 (cbs.dtu.dk/services/NetNGlyc/) and Protean (included in the LaserGene DNA Star® Package) programs. The amino acid sequence of the fusion peptide was identified using the Megalign program.
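The definition of a variable site given above can be illustrated with a minimal Python sketch. It is not the LaserGene workflow used in the study: the function name `classify_variable_sites` and the two nine-nucleotide sequences are hypothetical, chosen only so that the toy example reproduces a Met→Thr and a Ser→Leu change alongside a silent third-position change. The sketch assumes in-frame, gap-free, equal-length coding sequences and the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify_variable_sites(reference, query):
    """Compare two in-frame coding sequences codon by codon.

    Returns a list of (codon_number, reference_aa, query_aa, kind) tuples,
    one per codon that differs, where kind is "synonymous" or
    "nonsynonymous". Assumes no gaps and no ambiguous bases.
    """
    results = []
    usable = min(len(reference), len(query))
    for start in range(0, usable - 2, 3):
        ref_codon = reference[start:start + 3].upper()
        qry_codon = query[start:start + 3].upper()
        if ref_codon == qry_codon:
            continue  # conserved codon, not a variable site
        ref_aa = CODON_TABLE[ref_codon]
        qry_aa = CODON_TABLE[qry_codon]
        kind = "synonymous" if ref_aa == qry_aa else "nonsynonymous"
        results.append((start // 3 + 1, ref_aa, qry_aa, kind))
    return results


# Hypothetical toy sequences (not DENV-1 data): Met-Ser-Ala vs Thr-Leu-Ala.
toy_reference = "ATGTCAGCT"
toy_query = "ACGTTAGCC"
for site in classify_variable_sites(toy_reference, toy_query):
    print(site)
```

In the toy pair, the two second-position changes alter the amino acid, whereas the third-position change (GCT to GCC) is silent.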
Phylogenetic analysis and time of divergence - The E gene, comprising 1,485 bp, was chosen for this analysis. The nucleotide sequences obtained in this study were compared with each other and with 48 other sequences of strains isolated from different parts of the world and deposited in GenBank (ncbi.nlm.nih.gov) that correspond to the main genotypes of DENV-1 (Supplementary data). The phylogenetic tree was constructed by the maximum likelihood (ML) method (Felsenstein 1981) using the PHYML program (Guindon & Gascuel 2003).
The Modeltest program v.3.7 was used to select the best model of nucleotide substitution for the sequences based on Akaike's information criterion (Posada & Crandall 1998). Bootstrap analyses with 1,000 replicates were performed to assess the reliability of the groups obtained (Felsenstein 1985). Two old DENV-1 strains (GenBank accessions EU848545 and AB074760) were used as outgroups to permit tree rooting. The obtained tree was visualised using the TreeView program (Page 1996).
Inferences regarding the time of divergence were made based on the dates of isolation of the DENV-1 strains and on the time of isolation of the GenBank strains obtained from previously published data. Analyses were carried out using the maximum posterior probability tree generated with the Bayesian Evolutionary Analysis by Sampling Trees program v.1.5.2 (beast.bio.ed.ac.uk/) (Drummond & Rambaut 2007), which uses Bayesian Markov chain Monte Carlo (MCMC) algorithms combined with the chosen model and prior knowledge of sequence data to infer the posterior probability distribution of phylogenies. A chain of 20 million trees was used, with samplings fixed at every 1,000 trees generated. The time of viral divergence was estimated based on the date of strain isolation by calculating the rate of nucleotide substitution for DENV-1 using a lognormal relaxed molecular clock model. All estimates were obtained using the general time reversible nucleotide substitution model (Rodriguez et al. 1990) (base frequencies = 0.3230 0.2070 0.2621 0.2079, gamma distribution shape parameter = 2.0200 and proportion of invariable sites = 0.5360). The generated trees were visualised with the FigTree 1.2.2 program (tree.bio.ed.ac.uk/software/figtree/) and statistical analysis was carried out in the Tracer package (tree.bio.ed.ac.uk/software/tracer/).
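As a quick check on the sampling scheme described above, the short Python sketch below counts how many trees a chain of 20 million states sampled every 1,000 states would yield, and verifies that the four reported base frequencies sum to one. The 10% burn-in is an assumption made here for illustration only, since no burn-in is stated in the text, and the helper name `retained_samples` is hypothetical.

```python
import math


def retained_samples(chain_length, sample_every, burnin_percent=10):
    """Return (total sampled states, states kept after burn-in).

    burnin_percent is an assumed value; the text does not report one.
    """
    total = chain_length // sample_every
    discarded = total * burnin_percent // 100
    return total, total - discarded


# Chain length and sampling interval as given in the text.
total, kept = retained_samples(20_000_000, 1_000)
print(f"sampled trees: {total}, kept after assumed burn-in: {kept}")

# Base frequencies as reported for the substitution model; a valid set
# of frequencies must sum to 1.
base_frequencies = [0.3230, 0.2070, 0.2621, 0.2079]
frequencies_ok = math.isclose(sum(base_frequencies), 1.0, abs_tol=1e-6)
print(f"base frequencies sum to 1: {frequencies_ok}")
```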

RESULTS
Analysis of multiple alignments of nucleotide and amino acid sequences of the structural protein genes of DENV-1 - Sequences of 2,325 nucleotides were obtained from the 29 DENV-1 isolates used in this study, corresponding to 775 amino acids of the C, prM/M and E proteins. After assembly, these sequences were aligned with five other sequences available in the GenBank database (Supplementary data) for the Brazilian strains BR-90 (RJ, GenBank accession AF226685), BR97-111 (Pernambuco, GenBank accession AF311956), BR/01-MR (Paraná, GenBank accession AF513110), 15/BR/MS/2010 (MT, GenBank accession HQ696612) and 0122/BR/RJ/2011 (RJ, GenBank accession JN122281) to evaluate the genetic identity and divergence among the DENV-1 strains circulating in Brazil.
The thirty-four strains compared in this study presented a nucleotide identity ranging from 96-100% and a divergence of up to 4.3% when compared with each other and with the other five Brazilian strains selected (Supplementary data). Identity at the amino acid level ranged from 98-100% and the divergence did not exceed 1.7% (Supplementary data).
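To make the identity and divergence figures concrete, the following Python sketch computes a simple pairwise percent identity over aligned positions. It is an illustration under stated assumptions rather than the calculation performed by Megalign, whose divergence measure may be defined differently; the function name `percent_identity` and the synthetic sequences are hypothetical. With 100 mismatches over an alignment of 2,325 nucleotides, the uncorrected divergence is about 4.3%.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, ungapped positions at which two sequences agree.

    Both sequences must come from the same alignment (equal length).
    Positions where either sequence has a gap ("-") are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


# Synthetic example (not DENV-1 data): 2,325 sites, 100 of which differ.
synthetic_a = "A" * 2325
synthetic_b = "A" * 2225 + "C" * 100
identity = percent_identity(synthetic_a, synthetic_b)
divergence = 100.0 - identity  # uncorrected p-distance, in percent
print(f"identity = {identity:.2f}%, divergence = {divergence:.2f}%")
```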
Genetic variability was analysed by comparing 34 amino acid sequences among themselves and with the strain AF226687/FGA/89 (French Guiana). This analysis showed a large number of synonymous substitutions, most of which were located in the third position of the codon; all substitutions that occurred in the second position of the codon resulted in amino acid changes.
Considering the analysis of the Brazilian strains alone, it was observed that various amino acids were conserved in all strains studied: cysteine residues involved in the formation of disulfide bridges in prM/M (4 residues) and protein E (12 residues), the fusion peptide sequences located between residues 98-113 of E protein and predicted glycosylation sites located at positions 67 and 153 of E protein. The nonsynonymous substitutions that resulted in a change in the biochemical character of the amino acid were point substitutions among the Brazilian isolates, i.e., they occurred in only one or two of the strains studied. However, for two different positions located in the E protein, we found nonsynonymous substitutions that changed the biochemical character in several strains: Met→Thr and Ser→Leu at positions E297 and E338, respectively (Fig. 1).
The genome regions corresponding to the proteins C and prM/M were found to be conserved. Nonsynonymous substitutions resulting in amino acids with different biochemical properties were only found in strains H721251 [C10 (Arg→Gln)], H748499 [C75 (Asn→Glu)], H739688 [M133 (Phe→Ser)] and H693852 [M143 (Ala→Thr)]. Some identified nonsynonymous substitutions resulted in amino acids with the same biochemical properties. For example, the change in residue prM29 (Ala→Val) was identified in 14 (48.3%) of the strains studied.
The largest number of substitutions was observed in the region of E protein, with changes in residues E180 (Thr→Ala) and E473 (Ala→Thr) being detected in all strains studied (Fig. 1). Residue E180 is located in domain I of the protein and residue E473 is found in the transmembrane region. Substitutions at residues E297 (Met→Thr) and E338 (Ser→Leu) were observed in 12 strains, with the former being located in domain I and the latter in domain III of protein E.
In addition, substitutions at residues E96 and E379 (located in domains II and III, respectively) were observed in all strains studied. However, these substitutions did not alter the biochemical character of the amino acid, except in strain H628435 at position E96 (Val→Tyr).
When compared to the clinical presentation of the patients, the amino acid changes identified were not related to the severity of the disease. Nonsynonymous substitutions were identified in strains isolated from patients with dengue fever and from those with DHF. However, except for strain H527543, all strains isolated from patients with DHF presented an elevated number of substitutions in protein E and four of the six strains isolated from patients with DHF showed changes in residue E338.

Phylogenetic analysis and time of divergence - The phylogenetic analysis generated by the ML tree identified five genotypes (I, II, III, IV and V) (Fig. 2). The main nodes corresponding to each genotype presented bootstrap values of 100%. All Brazilian isolates from this study belonged to genotype V, which was further divided into four lineages (A, B, C and D). The clusters were grouped into distinct clades with strains from the Americas belonging to lineages A, B and C and strains from West Africa belonging to lineage D.
In lineages A, B and C, we observed the presence of multiple strains circulating in Brazil. Lineage A comprised the isolates from this study and other Brazilian strains isolated since 2000, except BR/BeH733587/2007, which was isolated in RR and was more closely related to the DENV-1/AR/AY277664/1999 strain from Argentina. Lineage B grouped strains that were isolated from 1994-2002, including one strain isolated from Argentina. Lineage C included strains BR/BeH739688/2007, BR/BeH748499/2008 and isolates from Paraguay, Colombia, Venezuela and Mexico.
The time of divergence between strains was estimated and the evolutionary rate of the E gene was 5.795 × 10⁻⁴ substitutions per site per year. We used a Bayesian inference based on MCMC to reconstruct Brazil's DENV-1 coalescent history. Estimates of the divergence time in the phylogenetic tree (Fig. 3) showed that DENV-1 originated from an ancestor approximately 121.5 years ago (1890). This ancestor diverged, giving rise to the ancestral lineages that are responsible for the emergence of the five DENV-1 genotypes identified thus far. The ancestor that gave rise to genotype V, found in the Americas and West Africa, diverged approximately 70.48 years ago (1940-1945).
The Brazilian isolates diverged 44.35 years ago (1965-1970), lineages B and C originated from a common ancestral lineage approximately 42 years ago (1970) and lineage A originated 22.84 years ago (1985-1990). The estimation of divergence time for all genotypes suggests that the DENV-1 found in Africa and the Americas (including Brazil) originated from a common ancestral lineage. From this common ancestor emerged a genetic lineage that gave rise to two other lineages: one directly entered Brazil and formed a group that includes only the Brazilian isolates of this study obtained after 2001 and the other ancestral lineage first entered America and then spread throughout the continent, giving rise to a group that includes the strains isolated between 1994-2000 and strains BR/BeH693852/2001, BR/BeH650975/2002, BR/BeH739688/2007 and BR/BeH748499/2008, in addition to other American strains.
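The relationship between node ages and calendar years reported above can be sketched in a few lines of Python. Two assumptions are made here and are not stated explicitly in the text: that ages are counted back from 2011, the year of the most recent strain in the data set, and that the reported rate reads as 5.795 × 10⁻⁴ substitutions per site per year. The function names and the `node_ages` dictionary labels are hypothetical.

```python
REFERENCE_YEAR = 2011  # assumption: ages are measured back from this year
RATE = 5.795e-4        # assumption: reported rate, substitutions/site/year


def calendar_year(age_years, reference_year=REFERENCE_YEAR):
    """Convert a node age (years before the reference year) to a date."""
    return reference_year - age_years


def expected_substitutions_per_site(age_years, rate=RATE):
    """Expected substitutions per site along one lineage of this age,
    assuming the mean rate applies uniformly (a simplification of the
    relaxed clock used in the study)."""
    return rate * age_years


# Node ages as reported in the text (years before the reference year).
node_ages = {
    "DENV-1 ancestor": 121.5,
    "genotype V ancestor": 70.48,
    "Brazilian isolates": 44.35,
    "lineage A": 22.84,
}

for label, age in node_ages.items():
    year = calendar_year(age)
    subs = expected_substitutions_per_site(age)
    print(f"{label}: ~{year:.1f}, {subs:.4f} substitutions/site")
```

Under these assumptions the DENV-1 ancestor falls at about 1889.5, in line with the 1890 estimate given in the text.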

DISCUSSION
The results of the present study suggest that the high rate of synonymous substitutions identified in the strains analysed might be related to the long period of DENV-1 circulation in Brazil, even at low rates. Analyses of the amino acid sequences revealed a substitution at residue prM29 (Ala→Val) in various strains of the study, suggesting that this position may be a hot spot for mutations. dos Santos et al. (2002) also identified changes at this position.
Furthermore, Catteau et al. (2003) reported that the M ectodomain is involved in the induction of apoptosis by the four DENV serotypes, suggesting that exporting the M ectodomain from the Golgi apparatus to the plasma membrane is essential for the initiation of apoptosis by the mitochondrial apoptotic pathway. The analysis of amino acid substitutions that occurred in this protein is essential to our understanding of how the M protein might play a key role in the fate of cells infected with DENV.
Fig. 1: differences of amino acids among the gene E sequences of Brazilian dengue virus serotype 1 (DENV-1) compared among themselves and with the FGA/89 DENV-1 strain (French Guiana). Amino acids in red: nonsynonymous substitution with biochemical change of character; amino acids in black: nonsynonymous substitution without change in biochemical character. Asterisk means that there was no replacement.
Amino acid substitutions in structural proteins were principally observed in the region of the E protein. This protein is located on the surface of DENV and is the dominant viral antigen, exhibiting haemagglutinin properties. The protein binds to the host cell membrane and induces a protective immune response in humans (Chambers et al. 1990). A significant substitution was observed at position E338, with a change from Ser to Leu. This residue is located in domain III of the protein, a region comprising residues 299-397, which form the C-terminal end of solubilised E protein. This domain is principally involved in receptor binding, with the residues exposed on the viral surface being responsible for the determination of receptor specificity, type of vector and host and cell tropism (Chen et al. 1997, Mandl et al. 2000, Crill & Roehrig 2001, Nayak et al. 2009). This substitution was identified in various strains in this study and has been previously described elsewhere (dos Santos et al. 2002, Barrero & Mistchenko 2004).
Residue E297, where a change from Met to Thr occurred, is located at the interface between domains I-II. Rey et al. (1995) suggest that mutations in the domain interfaces of the E protein may influence the pathogenicity of flaviviruses. Amino acid substitutions at this residue were observed in various isolates of this study and have also been described by other investigators (Després et al. 1993, Gonçalvez et al. 2002, Barrero & Mistchenko 2004).
Fig. 2: phylogenetic analysis of gene E of dengue virus serotype 1 (DENV-1). Seventy-seven sequences corresponding to genotypes I, II, III, IV and V were obtained from GenBank. The bootstrap values (calculated after 1,000 replicates) for the maximum likelihood method are shown above the node of each main group. The EU848545 and AB074760 sequences of DENV-1 were used as outgroups for analysis. The Brazilian isolates analysed in this study are highlighted by black triangles. The scale bar corresponds to the nucleotide substitution rate. The vertical distance is given for illustrative purposes only.
Furthermore, amino acid changes were detected at residue E473. This residue is located in the signal peptide region of NS1, which is involved in processing of the viral polyprotein (Barrero & Mistchenko 2004). This change, identified in all isolates of the present study, was characterised by a change from Ala to Thr. These results agree with previously published information and no correlation was described between the amino acid substitution at this residue and possible changes in the structure of the E protein (Després et al. 1993, Barrero & Mistchenko 2004).
Analyses of the amino acid sequences showed no relationship between the genetic variation in DENV-1 and the clinical presentation of patients. This finding might be explained by the existence of other factors involved in the development of DHF and DSS, such as differences in host susceptibility. Determinants of susceptibility that lead to a predisposition or resistance to DHF/DSS include human leukocyte antigens, age, gender, pre-existing chronic disease and antibody facilitation, which consists of the formation of immune complexes in a secondary infection caused by a different DENV serotype mediated by antibodies derived from a primary infection (Holmes & Twiddy 2003, Coffey et al. 2009).
The existence of lineages with distinctive geographical and temporal relationships was suggested by work conducted on DENV-1 in Colombia (Mendez et al. 2010). We attempted to reconstruct the phylogenetic history of the virus in Brazil. The phylogenetic analysis is in accordance with the classification proposed by Rico-Hesse (1990) and Gonçalvez et al. (2002), who assigned the Brazilian strains to genotype V.
dos Santos et al. (2011) analysed strains isolated in RJ from 2009-2011 and identified multiple lineages. Despite the multiple lineages circulating in Brazil over the years, we observed an independent segregation between lineages A-B, demonstrating the lineage replacement that has occurred since the introduction of DENV-1 in Brazil. Strains BR/BeH748499/2008 and BR/BeH739688/2007 were found to be more related to strains originating from other Latin American countries. This finding suggests a Latin American origin for those strains that may have entered the country due to the geographic proximity.
Therefore, outbreaks associated with the re-emergence of DENV-1 in Brazil that have been occurring since 2009 may be due to lineage replacements. However, other factors such as a renewed cohort of susceptible people should be considered because DENV-1 was the first serotype to be introduced into the country since 1982. A massive circulation in the first endemic years (1980s) was followed by a low circulation.
Estimations of the time of divergence for DENV-1 showed that this virus originated from an ancestral lineage approximately 121.5 years ago, in agreement with Twiddy et al. (2003) and Gonçalvez et al. (2002), who estimated a divergence time of 125 and 100 years, respectively. The present results regarding the divergence time permit us to consider the epidemic transmission cycle of dengue in humans and to infer that the DENV-1 circulating in Brazil originated from a common ancestor shared by strains from Africa at the end of the 19th century and the beginning of the 20th century.
The chronological data reported here suggest that the ancestral lineage that gave rise to the genotype V found circulating in the Americas emerged between 1940-1945. This period was preceded by dengue outbreaks in some American countries, including the United States (Texas, 1922) and some Caribbean countries (1934). In the following decades, the disease was disseminated to other countries, such as Panama, Puerto Rico, Venezuela and Cuba (Guzman & Kouri 2002).
The common ancestor that gave rise to the Brazilian isolates emerged between 1965-1970 and the strains studied emerged at the end of the 1960s and during the 1970s. During this period, the Ae. aegypti eradication program was interrupted in South and Central America, an event that resulted in the occurrence of numerous dengue outbreaks and dispersal of the virus throughout Brazil over subsequent decades (Gubler 1998, Guzman & Kouri 2002).
Strains BR/BeH748499/2008 and BR/BeH739688/2007 were more related to the strains isolated in other American countries, which suggests that the dynamics of population flow worldwide contribute to the rapid spread and introduction of new lineages of this virus (Raghwani et al. 2011). Phylogeographic studies of DENV-1 strains isolated in Brazil are necessary to define possible routes of entry into Brazil, similar to what was done for DENV-1 in Vietnam and DENV-3 in Brazil (Araújo et al. 2009, Raghwani et al. 2011).
Comparisons of the amino acid changes found between the Brazilian DENV-1 strains studied and the strain from French Guiana (FGA/89) showed important nonsynonymous substitutions that altered the biochemical character of the amino acids. However, further analyses of the 3D structure of the proteins and molecular dynamics and mutagenesis analyses are necessary to investigate whether some of these substitutions may be related to DENV-1 virulence, particularly those identified at positions E297 and E338. In addition, the phylogeny and time of divergence estimated for DENV-1 demonstrated the importance of monitoring the introduction and spread of this virus in the country to control outbreaks of dengue.

Fig. 3: molecular clock of dengue virus serotype 1 (DENV-1). DENV-1 divergence time was estimated using year of isolation as calibration points under the lognormal relaxed molecular clock model. The scale bar is given in years; divergence times and posterior probability values are indicated at the main nodes. The 95% highest posterior density intervals for each divergence time are represented by the blue bars. Strains of the study are highlighted in red, the roman numerals represent the genotypes of DENV-1 and the letters represent the lineages.