Epidemiological and functional implications of molecular variants of human papillomavirus

Human papillomavirus genomes are classified into molecular variants when they share more than 98% similarity with the prototype sequence within the L1 gene. Comparative nucleotide sequence analyses of these viruses have elucidated some features of their phylogenetic relationship. In addition, human papillomavirus intratype variability has been used as an important tool in epidemiological studies of viral transmission, persistence and progression to clinically relevant cervical lesions. To date, little has been published concerning the functional significance of molecular variants. It has been shown that nucleotide variability within the long control region leads to differences in the binding affinity of some cellular transcription factors and to enhanced expression of the E6 and E7 oncogenes. Furthermore, in vivo and in vitro studies revealed differences in E6 and E7 biochemical and biological properties among molecular variants. Nevertheless, further correlation with additional functional information is needed to evaluate the significance of genome intratypic variability. These results are also important for the development of vaccines and for determining the extent to which immunization with L1 virus-like particles of one variant could induce antibodies that cross-neutralize other variants.


Papillomavirus heterogeneity
The first evidence for the existence of different human papillomavirus (HPV) types was obtained in the early 1970s, when it was observed that mRNA purified from plantar warts hybridized with DNA purified from other plantar warts but not with DNA from warts at other sites, including anogenital ones. In the following years, the genetic heterogeneity of HPV was confirmed by restriction fragment length polymorphism analysis, and this was followed by an extensive and clear definition of HPV types based on DNA sequencing of different genes and the long control region (LCR) (1).
An HPV type is defined as a viral genome with an L1 late gene sequence that is at least 10% dissimilar from that of any other type. To date, more than 120 different HPV genotypes associated with infections in humans have been described. In addition, the Papillomavirus Nomenclature Committee has established that HPV genomes be classified as molecular variants when they share more than 98% similarity with the prototype sequence in the L1 gene (1,2). Nevertheless, a recent study based on the comparison of the complete nucleotide sequence of 12 HPV-16 isolates from different phylogenetic branches showed that 4% of the full genome is variable within the 8 genes, and that 9.9% of amino acid positions are variable (3). This study also revealed that the E4 and E5 genes are more variable than the LCR, which was previously reported to show as much as 5% dissimilarity among molecular variants (4). Furthermore, it was observed that the E5 and E2 proteins had the highest proportion of nonsynonymous to synonymous variations (3). Nevertheless, a strong inter-gene sequence co-variation has been observed for HPV (5).
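The two thresholds quoted above can be expressed as a small calculation. The following Python sketch is illustrative only: it assumes that two L1 sequences have already been aligned to equal length, it ignores gap columns, and the label "intermediate" for identities between the two thresholds is a placeholder introduced here, since the text defines only the type and variant cut-offs. The function names are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity of two pre-aligned sequences, ignoring gap columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def classify_l1(identity_pct):
    """Apply the L1 thresholds quoted in the text.

    > 98% identity to the prototype  -> molecular variant
    <= 90% identity (>= 10% dissimilar) -> distinct type
    anything in between is labelled 'intermediate' (placeholder label,
    an assumption of this sketch).
    """
    if identity_pct > 98.0:
        return "variant"
    if identity_pct <= 90.0:
        return "distinct type"
    return "intermediate"
```

In practice the comparison would be made on the full L1 open reading frame after alignment with a dedicated tool; the sketch only shows how the cut-offs partition the identity scale.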
In the early 1990s, Ho and collaborators (4) observed that HPV-16 isolates could be distinguished by nucleotide heterogeneity within their genomes and suggested that molecular variants are generated over a large time span. It was then proposed that it would be possible to reconstruct viral phylogeny if all the intermediate genotypes were found. For this purpose, a fragment of the LCR was sequenced in 301 HPV-16-positive cervical samples collected on the 5 continents, and the data generated were used for the construction of a phylogenetic tree (4). In this analysis, 48 HPV-16 molecular variants characterized by 51 point mutations in 48 nucleotide positions were identified. The sequences generated were compared to the prototype genome, which was isolated from a cervical cancer patient from Germany and was the first HPV-16 genome to be completely sequenced (1). It is important to highlight that the term prototype cannot be interpreted as indicative of an ancestral sequence. In fact, the ancestor of any HPV type is still unknown.
The phylogenetic tree constructed is composed of 5 branches. The nomenclature of these branches reflects the geographical origin of most of the isolates within them. Variants from two of these branches were almost restricted to the African continent, which generated the denomination African 1 and 2 variants (4). Cervical samples containing African variants were also observed in other geographical regions at lower frequencies (0 to 9%) (5,14). Variants from a third branch were the only ones detected in different parts of Europe and therefore were included in the European branch. Isolates from this branch were also found in the majority of all other ethnic groups at frequencies ranging from 60% in Southeast Asia to 93% in North America (4). Interestingly, a high prevalence (>80%) of European variants was also observed in indigenous groups in Brazil and Argentina (4,15). A fourth branch was solely composed of Chinese and Japanese isolates, and was therefore denominated Asian. Isolates from this branch were observed to be rare or even absent in other continents. A fifth branch, denominated Asian-American, was composed of a small fraction of all Asiatic and indigenous samples, and also of isolates from immigrant populations in America (4,5). Variants from this branch were detected only in Central and South America, and in Europe they were restricted to Spain. More recently, the highest prevalence of Asian-American variants ever reported was described in a Mexican cohort (88%) (12). These studies also suggested that the colonization of the American continent is reflected in its composition of HPV-16 variants. In fact, in samples from the city of São Paulo, southeast Brazil, we detected variants from all branches of geographical and phylogenetic relatedness, in agreement with the high level of miscegenation observed in this population (14). This was also observed in other studies conducted on multi-ethnic populations (16-18). Subsequent studies based on nucleotide sequence analysis of other viral genome regions (E6, L2, L1) expanded and complemented this phylogeny (5). Some variants are spread worldwide, reflecting migration, whereas other variants appear to segregate according to ethnic group, suggesting evolution at these locations.
A similar study was conducted on a worldwide collection of HPV-18-positive samples in which 37 molecular variants were described (6). These variants are characterized by point mutations in 25 nucleotide positions in comparison to the prototype sequence of HPV-18. The latter was isolated in Germany from a cervical cancer patient from Northeast Brazil and was fully sequenced (1). The nucleotide data generated with this collection led to the construction of a phylogenetic tree with a topology similar to that of HPV-16 (6).
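The way molecular variants are counted against a prototype can be illustrated with a minimal Python sketch. It assumes that each isolate's LCR fragment is already aligned to the prototype at equal length; the toy sequences and isolate names below are invented for illustration and are not real HPV data. Each isolate is reduced to its set of substitutions relative to the prototype, and isolates sharing the same set are treated as one variant. The example also shows why the number of distinct point mutations can exceed the number of variable positions: a single position may carry more than one alternative base.

```python
def call_substitutions(prototype, isolate, first_position=1):
    """List (position, prototype_base, isolate_base) for every mismatch."""
    if len(prototype) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")
    return tuple(
        (first_position + i, ref, alt)
        for i, (ref, alt) in enumerate(zip(prototype.upper(), isolate.upper()))
        if ref != alt and "-" not in (ref, alt)  # ignore gap columns
    )


def summarize(prototype, isolates):
    """Group isolates by substitution pattern and collect mutations/positions."""
    variants = {}      # substitution pattern -> list of isolate names
    mutations = set()  # distinct (position, ref, alt) events
    positions = set()  # distinct variable positions
    for name, seq in isolates.items():
        pattern = call_substitutions(prototype, seq)
        variants.setdefault(pattern, []).append(name)
        mutations.update(pattern)
        positions.update(pos for pos, _, _ in pattern)
    return variants, mutations, positions


# Toy data (hypothetical, not real HPV sequences).
TOY_PROTOTYPE = "ACGTACGT"
TOY_ISOLATES = {
    "iso1": "ACGTACGT",  # identical to the prototype
    "iso2": "ACTTACGT",  # G>T at position 3
    "iso3": "ACATACGT",  # G>A at position 3 (same position, different base)
    "iso4": "ACTTACGA",  # G>T at position 3 and T>A at position 8
}

if __name__ == "__main__":
    variants, mutations, positions = summarize(TOY_PROTOTYPE, TOY_ISOLATES)
    print(len(variants), "variants,", len(mutations), "mutations,",
          len(positions), "positions")
```

Building a phylogenetic tree from such patterns requires dedicated software; the sketch covers only the preceding step of reducing isolates to substitution patterns.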
The distribution and the slow evolution of HPVs 16 and 18, and probably of other HPV types, suggest that these viruses co-evolved with their natural hosts over a period of a few million years. It was estimated that their diversity reflects an evolution of over 200,000 years from a precursor genome that may have originated in Africa (6,12).

Nucleotide variability and risk of cervical neoplasia
As described above, nucleotide sequence comparison has made possible the reconstruction of the origin and spread of HPVs in human populations. In addition, HPV intratype variability has also been used as an important tool in epidemiological studies of viral transmission, persistence and progression to clinically relevant cervical lesions. It has been observed that the same variants can be detected in cervical and anal smears of heterosexual women and homosexual men. Furthermore, a study conducted in Singapore detected the same variant in only 50% of sexual partners, suggesting that sexual transmission occurs with low infectivity (18).
Analysis of HPV isolates at the nucleotide level has also been suggested as a means to distinguish between persistent infection and acquisition of infection by different molecular variants. In samples with multiple infections, one major variant seems to persist over time while less frequent ones are transiently detected (19). In addition, data from the Canadian women's human immunodeficiency virus study group revealed that 8% of the women persistently infected by HPV-16 had transient infection by different molecular variants over time (20). However, other prospective studies found the same molecular variant of HPV-16 in 100% of the cases over follow-up time (14). The same was observed in infections by HPV types 33, 35, and 52 (11,13).
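The distinction between a persisting variant and transiently detected ones can be sketched as a simple bookkeeping step over follow-up visits. The Python fragment below is a hypothetical illustration: the variant labels are invented, and the rule used here (a variant seen at two or more visits is called persistent, a variant seen at exactly one visit is called transient) is an assumption of the sketch, not a criterion taken from the studies cited.

```python
from collections import Counter


def split_persistent_transient(visits):
    """Separate variants by how many follow-up visits they were detected at.

    visits: chronological list of collections, each holding the variant
    labels detected at one visit (an empty collection = HPV-negative visit).
    Returns (persistent, transient) as sorted lists.
    Assumed rule: >= 2 visits -> persistent; exactly 1 visit -> transient.
    """
    counts = Counter(label for visit in visits for label in set(visit))
    persistent = sorted(v for v, n in counts.items() if n >= 2)
    transient = sorted(v for v, n in counts.items() if n == 1)
    return persistent, transient


# Hypothetical follow-up of one participant over three visits.
EXAMPLE_VISITS = [
    {"E-prototype", "AA-1"},
    {"E-prototype"},
    {"E-prototype", "Af1-a"},
]

if __name__ == "__main__":
    print(split_persistent_transient(EXAMPLE_VISITS))
```

A real analysis would also have to account for visit spacing and assay sensitivity, which this sketch deliberately leaves out.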
Different prospective and case-control studies are being conducted worldwide in order to analyze any possible association of increased risk of persistent infection and/or development of cervical lesions with specific variants of HPVs 16 and 18. However, because of limited sample size, individual variants are generally grouped into two categories based on nucleotide sequence variability, since it would be very difficult to detect any differences in risk based on individual variants within the group. Studies conducted in North America have revealed that non-European variants of HPV-16 are associated with increased risk for cervical intraepithelial lesions (18) and anal carcinoma in situ (21) in relation to the risk attributed to European variants. In a cohort study that is being conducted by us in São Paulo, Brazil (Ludwig/McGill study), non-European variants of HPVs 16 and 18 were also epidemiologically associated with an increased risk of persistent infection and development of cervical lesions (14). Furthermore, in case-control studies conducted on people from Costa Rica and Mexico, a higher prevalence of Asian-American variants was observed in samples isolated from cervical cancer as compared to normal ones (16,17). It is important to consider the multi-ethnic composition of the populations in which non-European variants are epidemiologically associated with an increased oncogenic potential (Table 1).
In contrast, in populations from different parts of Europe the majority of HPV-16 molecular variants detected are from the European branch. A subset of the variants from this phylogenetic branch has a substitution at nucleotide position 350 (T→G) of the E6 gene that changes amino acid 83 from leucine to valine (L83V). This mutation can also be detected in Asian-American but not in African variants (5). In a population from the United Kingdom, the 350G variant was suggested to be associated with an increased risk of persistent infection and cytological progression to cervical intra-epithelial neoplasia grade 2/3 (22). However, this epidemiologic finding was not observed in any of the other European studies or in studies conducted on other continents, and the data provided are still conflicting (5,23-27). It was also observed in the Japanese population that the E6 D25E variant, which is rarely found in Western countries, is more prevalent in invasive cancer samples. In addition, pre-cancerous lesions harboring this variant were less likely to regress than those with the HPV-16 prototype (28). The D25E E6 variant was the most frequent variation detected in Korean women. However, in this population, it was not associated with a significantly increased risk for cervical cancer (24). In addition, the analysis of HPV-16 samples from Thai women revealed that this E6 mutation coincided with a specific E7 mutation at residue 29 leading to a substitution from asparagine to serine (26). This E7 variant was also more frequent in cervical cancer samples compared to precursor lesions (29).
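How a nucleotide position in the E6 gene relates to an amino acid residue number, and how a single-base change alters the encoded residue, can be sketched in Python. The sketch rests on assumptions that are not stated in the text: the E6 reading frame is taken to start at nucleotide 104 of the HPV-16 genome, and the codon GAT used for residue 25 is an illustrative choice that encodes aspartate, consistent with the D25E designation. The function names are hypothetical; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

E6_ORF_START = 104  # assumed first nucleotide of the E6 start codon


def codon_of(nt_position, orf_start=E6_ORF_START):
    """Return (residue number, position within codon), both 1-based."""
    offset = nt_position - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the reading frame")
    return offset // 3 + 1, offset % 3 + 1


def amino_acid_change(codon, pos_in_codon, new_base):
    """Return (original residue, new residue) for a single-base substitution."""
    codon = codon.upper()
    mutated = codon[:pos_in_codon - 1] + new_base.upper() + codon[pos_in_codon:]
    return CODON_TABLE[codon], CODON_TABLE[mutated]


if __name__ == "__main__":
    # Nucleotide 350 falls in the first position of codon 83 under the
    # assumed frame, matching the residue number given in the text.
    print(codon_of(350))
    # Illustrative codon for residue 25: a change in the third position
    # of GAT converts aspartate (D) to glutamate (E).
    print(amino_acid_change("GAT", 3, "G"))
```

Under the assumed frame the residue numbering agrees with the text for position 350; the actual codons should be read from the reference sequence rather than from this sketch.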
The nucleotide variability of the HPV-18 E2 gene suggested the existence of a subtype with decreased oncogenic potential (30). However, other studies analyzing nucleotide variability in the LCR, E2, or E6/E7 genes found no correlation between HPV-18 genotypes and lesion grade (31,32). Intratypic variability among HPV-18-positive specimens from North American and Mexican women suggested an association between specific variants and the histopathology findings. Non-European variants of HPVs 16 and 18 were commonly detected in adenocarcinomas (16,33). A possible hormonal mechanism was suggested to explain this finding (2).
The association of specific variants of HPVs 33, 35, 52, and 58 with the persistence of the infection and with the severity of the neoplasia has also been observed (9,11,13). Taken together, these results indicate that, epidemiologically, the oncogenicity of molecular variants of HPVs 16, 18, and possibly other types, appears to vary not only geographically, but also with the ethnic origin of the populations under study. Hildesheim and collaborators (17) did not observe any ethnic differences between individuals infected with European or non-European HPV-16 variants, supporting the association between non-European variants and increased risk of cervical disease. On the other hand, an increased risk of acquiring non-prototype-like variants of HPV-16 was reported for non-white compared to white individuals and also for individuals with non-white versus white sex partners (18). Analysis of HPV-52 specimens showed that white women were more frequently infected with the prototype than with non-prototype variants. In contrast, women of African descent were more likely to be infected with non-prototype variants (11). The prototype of HPV-52 was detected in a biopsy sample from a woman living in the US. Other associations between race and infection by different molecular variants of HPVs 31, 33, and 35 have also been described (13). However, excluding the study by Hildesheim and collaborators (17), in all other studies race classification was based on skin color, which is not an ideal method to determine the degree of genetic relatedness among individuals. Further studies should include more women to confirm or rule out the association between ethnicity and viral polymorphism. Nevertheless, such ethnic contributions to the oncogenicity of HPV variants could be explained by different distributions of human leukocyte antigen (HLA) types across populations. HLA alleles are well known to have a population-related distribution and could preferentially predispose women to establish and/or retain infection with particular HPV molecular variants.

Molecular variants and immune response
The most efficient immune response against disease caused by HPV is cell mediated. T lymphocyte recognition of HPV peptides is important for the control of infections and the development of cervical lesions. Thus, the presentation of viral peptides to T cells may be influenced by genetic variability of both HPV and HLA. An increased risk for cervical cancer associated with HPV-16 infection was attributed to carriers of HLA DRB1*1501/DQB1*0602 haplotypes among women from New Mexico. However, in Norway, HLA DRB1*0301-positive women had an increased risk of developing cervical intra-epithelial neoplasia grade 3 after infection by HPV-16 (34).
HPV-16 and HPV-18 E2 variants were more frequently detected in women with HLA DR/DQ haplotypes 0401/0301 and 0101/0301 (32). Furthermore, it was observed in the Japanese population that DRB1*1501 and DQB1*0602 frequencies were significantly increased among patients infected with the HPV-16 prototype E6, and DRB1*1502 was associated with infection by the D25E variant (35). In addition, it was suggested that the G131 E6 molecular variant was associated with a worse prognosis in HLA-B7 women with cervical cancer. Computer models have suggested that the change within this E6 sequence alters an HLA-B7-binding peptide, which could affect the cytotoxic T lymphocyte response. Nevertheless, the association between this variant, the HLA-B7 allele and prognosis was not further confirmed (36). More recently, Zehbe and collaborators (37) described the correlation between specific HLA I and II haplotypes and cervical cancer positive for the 350G E6 HPV-16 variant. These data are still very conflicting since other studies did not observe any association between HLA haplotypes and specific HPV-16 molecular variants (23). This suggests that other factors in addition to HLA polymorphism may be associated with acquisition of a variant in the development of cervical carcinoma. Large-scale epidemiologic studies are required to finally demonstrate the role of HLA in HPV-induced cervical disease.
It is also of interest to define alterations that may interfere with antigenic properties of the main capsid protein L1. For instance, variant 114K of HPV-16 assembles into virus-like particles (VLPs) in a heterologous expression system, whereas the prototype isolate is incapable of forming VLPs. This difference is attributed to a single amino acid change at residue 202 of the L1 protein (38). It was also observed that the yield of VLPs produced varied within a range from 1 to 79 depending on the HPV-16 L1 variant used, and that mutations of residues 83 to 97 seemed to affect the level of L1 expression (39). In this study, variants which differed by up to 15 amino acids from the L1 prototype were used.
One or more amino acid alterations within the L1 protein of HPV-16 could represent a conformational change in the capsid protein and thus could also affect the conformation of epitopes relevant for viral neutralization. However, several studies have indicated that molecular variants of HPV-16 are cross-reactive, since it was observed that sera from patients infected with different HPV-16 genotypes were capable of neutralizing VLPs of different HPV-16 variants (39,40). In addition, no differences in seroconversion rates between women infected with prototype and non-prototype variants were observed (18). Furthermore, the absence of HPV-16 E6 and E7 antibodies was independent of the variant detected in cervical cancer patients (41). Even more important, it was reported that, from a vaccination perspective, molecular variants of HPV-16 belong to only one serotype (42). On the other hand, it was observed that L1 amino acid substitutions detected in cervical cancer patients and not normally found in natural variants could prevent VLP assembly. These substitutions were also unable to induce innate and adaptive immune responses in mice (43). This was suggested to be a mechanism for evasion from the immune system during carcinogenesis. Although lymphoproliferative responses to virions have been extensively described, very few studies have been conducted in order to map L1 epitopes. Strang and collaborators (44) defined the interaction between HPV-16 L1 peptides and the HLA class II DR4Dw4 allele. However, only one of the peptides described overlaps a substitution detected in molecular variants of the African 2 branch. Recent data from our laboratory also indicate that some alterations in E6/E7 proteins of natural HPV-16 variants seem to affect the proliferation of peripheral blood mononuclear cells (Souza PSA, personal communication). The relevance of these results for vaccine development remains to be shown. This same question also applies to the cross-protection against different HPV types that these vaccines might confer. Some immune responses against HPV in both non-vaccinated and vaccinated people have been shown to be exquisitely type-specific; thus, cross-protection is unlikely to occur. This is the basis for the second-generation vaccines currently being developed, which encompass up to six different high-risk HPV VLPs.

Nucleotide variability and oncogenic potential
Differences in transformation activity of HPVs 16 and 18 have been ascribed to the LCR and E6/E7 early genes (45). Thus, it is reasonable to suppose that mutations in these regions may affect the clinical outcome of the infection. To date, little has been published concerning the functional significance of molecular variants (Table 2).
The early proteins E6 and E7 bind to and functionally inactivate the tumor suppressor proteins p53 and pRB, respectively, thus leading to the disruption of the repair process and the cell-cycle machinery. Only E6 and E7 from high-risk HPV types are capable of immortalizing primary human foreskin keratinocytes and of altering the functions of these tumor suppressor proteins. These observations indicated the importance of nucleotide variability in the oncogenic potential of different HPV types. Furthermore, although both HPVs 16 and 18 can abrogate terminal differentiation of keratinocytes induced by calcium and serum, this occurs with different efficiencies (45).
The LCR comprises about 10% of the viral genome and contains sequences important for the regulation of viral replication and transcription of early genes. The central segment of the LCR of high-risk genital HPV types encloses an epithelial cell-specific enhancer that contains several binding sites for viral (E2 and E1) and cellular transcription factors (AP-1, Sp-1, NF-1, Oct-1, TEF-1, YY-1, KRF-1, Skn-1a, and TFIID) (2). The activation of the main early promoter of HPV involves synergism between these proteins, which vary in affinity for the different recognition sites throughout the LCR. Most of these factors stimulate transcription, although YY-1 can either repress or stimulate. In addition, the LCR contains glucocorticoid-responsive elements that were shown to mediate transcriptional activation by hormones in transient transfection experiments (46). The importance of chromatin organization in the transcriptional process of HPVs 16 and 18 has also been recognized (47).
Nucleotide changes within the LCR that overlap binding sites for cellular transcription factors have been extensively described (4,5,14). This variability could influence the binding affinity of the different transcription factors and thus affect viral transcription and replication. Not only are mutations overlapping binding sites for the negative regulator YY-1 in the HPV-16 LCR more commonly detected in cancer samples, but the promoter activity of these genomes is also enhanced as compared to the prototype sequence (57). Luciferase activity assays showed that although HPV-16 European isolates had similar transcriptional activity, the Asian-American and the North-American 1 variants presented an enhanced activity as compared to the European HPV-16 prototype sequence. On the other hand, one African variant tested exhibited promoter activity similar to that of the prototype (48,49). The LCR of the European variants analyzed in these studies had only a few nucleotide changes as compared to the prototype. However, this was different for the non-European variants tested (4). Site-directed mutagenesis assays revealed that the enhancement observed in the promoter activity could be attributed to mutations in the 3' segment of the LCR (from nucleotide 7622 up) (48,49). The Asian-American and the North-American 1 variants contain a mutation at nucleotide position 7729 that is not present in the African or the other European genotypes. Nevertheless, none of the nucleotide changes tested seemed to be responsible in itself for the enhanced transcriptional activity. In fact, the combination of nucleotide changes in the corresponding LCR could be responsible for the observed functional differences.
Concerning HPV-18 LCR intratypic nucleotide variability, we observed that the Asian-Amerindian B18-3 isolate was the most transcriptionally active relative to the activity of some European isolates tested (50). Furthermore, we observed that the HPV-18 prototype sequence was more active than the HPV-16 prototype. It has also been reported that a commonly detected substitution overlapping the Sp-1 binding site at the 3' end of the LCR enhances the transcriptional activity from the P105 HPV-18 main early promoter, since the binding of the Sp-1 protein to this sequence is also increased (31).
An extensive analysis of the E6 gene of different molecular variants of HPV-16 revealed that the common amino acid changes identified overlap positions crucial for p53 interaction or for host immune surveillance (5). Stöppler and collaborators (51) analyzed three E6 molecular variants of HPV-16 and observed that the Asian-American protein surpassed the E6 European prototype sequence in its ability to stimulate the induction of differentiation-resistant colonies of human foreskin keratinocytes in cooperation with the E7 reference protein, and also to induce p53 degradation in vitro. On the other hand, the African E6 variant analyzed showed reduced activity in both functions, suggesting that molecular variants differ in their biological and biochemical properties. More recently, it was observed that the 350G variant also surpassed the E6 prototype in enhancing mitogen-activated protein kinase signaling and in the cooperative transformation with the deregulated Notch 1 pathway (52). Furthermore, it was observed that, in contrast to the E6 prototype sequence, the 350G variant inhibits oncogenic ras-mediated transformation, suggesting that the quantitative differences in activation of mitogen-activated protein kinase signaling by the E6 prototype protein and the 350G variant correlate with differences in the cooperative transformation by other signaling pathways.
Although it has been observed that the E7 protein is highly conserved among different molecular variants, variability within this region could affect some of its known properties. In fact, in vitro mutational analysis has shown that the substitution of a single nucleotide in the E7 gene of HPVs 6 and 16 could affect malignant transformation (58). It was also observed that E7 HPV-16 variants differ in their binding affinity to the pRB protein in a yeast two-hybrid system (53).
The association between specific variants of HPV-16 and the increased risk for cervical neoplasia could not be attributed to differences in the amino acid sequence of the E5 protein. However, a significantly greater usage of a common mammalian codon was observed among these variants (59). This event could confer a selective growth advantage on such variants, since it has been suggested that codon usage may be important in the regulation of early HPV expression. However, similar in vitro transforming activities for HPV-16 E5 variant proteins were also described (54).
Another important issue that could influence the oncogenicity of the molecular variants is virus-driven regulation of transcription and replication. It was initially reported that the transcriptional transactivation potential of the E2 proteins from different HPV-16 variants differed only slightly among variants (49). More recently, it was suggested that in European variants the early E6/E7 gene transcription is repressed by the E2 protein and is frequently up-regulated by the interruption of the E2 gene during viral integration. In contrast, in tumor samples harboring Asian-American variants this gene was observed to be frequently retained (55). The association of Asian-American variants with retention of E1/E2 genes suggests that E2 nucleotide sequence variability could be an alternative mechanism up-regulating the expression of viral oncogenes. The HPV-16 Asian-American E2 protein has 16 non-synonymous mutations distributed along the gene as compared to the prototype sequence. However, none of these substitutions overlaps sequences of known consensus splicing sites located in the E2 gene. Casas and collaborators (60) also observed that Asian-American variants were associated with E1/E2-positive carcinomas with more than 50 viral copies/cell. This copy number was higher than that observed for European variants, suggesting that Asian-American variants may replicate better than European variants. In fact, European and African HPV-16 variants exhibit lower replication efficiency compared to the Asian-American isolates, which could, in turn, be attributable to the efficient expression of the replication factors E1 and E2 by the latter (56).
Most epidemiological and functional studies suggest that Asian-American variants of HPVs 16 and 18 are more oncogenic than European variants. Thus, these genotypes could constitute a marker of cervical cancer onset and progression and could partially explain why some lesions progress to cancer while others do not. It has been shown that nucleotide variability within the LCR could lead to an enhanced expression of the E6 and E7 oncogenes. Furthermore, the variability within these proteins themselves may also have functional significance, as suggested by some in vivo and in vitro studies. Efficient E1/E2 expression and elevated viral replication levels during persistent infection may also represent a risk factor in HPV-16-mediated oncogenesis. Further correlation with additional functional information is needed to evaluate the significance of intratypic genome variability.
Amino acid changes within some variants are located in epitopes critical for the immune response. Thus, vaccines developed against one variant may have a reduced efficacy in countries where that variant is less prevalent. To date, few studies have been conducted to address this issue among HPV-16 isolates. All of these studies have indicated that molecular variants of HPV-16 are serologically cross-reactive. Nevertheless, additional studies are necessary to evaluate epitopes that could be affected by mutations elsewhere in the genomes of these variants for the rational design of anti-HPV vaccines.

Table 1. Epidemiologic associations between non-European variants of human papillomavirus-16 and increased risk for cervical neoplasia.

Table 2. Functional implications of human papillomavirus (HPV) gene variability.
E7
Possible implications: change in binding to cellular proteins; change in activation of cellular pathways; generation and maintenance of the transformed phenotype.
Observations: different binding affinity to the pRB protein among HPV-16 molecular variants (53).
E5
Possible implications: change in transforming activity.
Observations: HPV-16 E5 variant proteins have similar in vitro transforming activities (54).
L1
Possible implications: affects L1 conformation-dependent epitopes that are relevant for virus neutralization.
Observations: single amino acid changes can affect the efficiency of HPV-16 L1 proteins to self-assemble into VLPs (38,39); variants of HPV-16 are serologically cross-reactive (39-41).
E2
Possible implications: change in transcriptional activity; increased tendency to integrate; increased replication efficiency.
Observations: no significant differences concerning the transcriptional transactivation potential of HPV-16 E2 molecular variants were observed (49); non-integrated DNA is frequently detected in tumor samples harboring HPV-16 Asian-American variants (55); HPV-16 Asian-American variants have enhanced replication efficiency (56).
LCR = long control region; HFK = human foreskin keratinocytes; VLPs = virus-like particles.