Prediction of exposed domains of envelope glycoprotein in Indian HIV-1 isolates and experimental confirmation of their immunogenicity in humans

We describe the impact of subtype differences on the seroreactivity of linear antigenic epitopes in envelope glycoprotein of HIV-1 isolates from different geographical locations. By computer analysis, we predicted potential antigenic sites of envelope glycoprotein (gp120 and gp41) of this virus. For this purpose, after fetching sequences of proteins of interest from data banks, values of hydrophilicity, flexibility, accessibility, inverted hydrophobicity, and secondary structure were considered. We identified several potential antigenic epitopes in a B subtype strain of envelope glycoprotein of HIV-1 (IIIB). Solid-phase peptide synthesis methods of Merrifield and Fmoc chemistry were used for synthesizing peptides. These synthetic peptides corresponded mainly to the C2, V3 and CD4 binding sites of gp120 and some parts of the ectodomain of gp41. The reactivity of these peptides was tested by ELISA against different HIV-1-positive sera from different locations in India. For two of these predicted epitopes, the corresponding Indian consensus sequences (LAIERYLKQQLLGWG and DIIGDIRQAHCNISEDKWNET) (subtype C) were also synthesized and their reactivity was tested by ELISA. These peptides also distinguished HIV-1-positive sera of Indians with C subtype infections from sera from HIV-negative subjects.


Introduction
The most effective means of solving the structure of a protein is by using biophysical methods such as X-ray crystallography or nuclear magnetic resonance spectroscopy (1). However, both methods require considerable amounts of purified protein and the use of sophisticated instruments and methodologies. With the development of DNA sequencing methodology, a large number of protein sequences have become available (2). This sequence information, in combination with an ever-increasing database of protein structures solved via the biophysical methods mentioned above, has led to the popularization of predictive computer algorithms (3).
Many attempts have been made to predict the position of antigenic sites in proteins from certain features of their primary structure. Parameters such as the hydrophilicity, static accessibility, and mobility of short segments of polypeptide chains have been correlated with the location of continuous epitopes in proteins (4,5). Substantial experimental and theoretical efforts have been directed at understanding the relation between the structure of a protein and its immunogenic properties.
The first requirement for protein prediction is the availability of the amino acid sequence. Protein sequences are available from two different sources, i.e., the Protein Information Resource (http://pir.georgetown.edu) and DNA data banks (http://www.ncbi.nlm.nih.gov), whose nucleotide entries can be easily converted into primary amino acid structures (6).
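The conversion of a data bank nucleotide entry into a primary amino acid sequence can be sketched as follows. This is only a minimal illustration, assuming the standard genetic code and a coding strand that is already in the correct reading frame; the function name `translate` and the short example sequences are hypothetical and are not taken from this study.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence up to the first stop codon.

    Codons containing ambiguous bases are reported as 'X'.
    """
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[pos:pos + 3], "X")
        if residue == "*":  # stop codon ends the protein
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical nine-base example: Met, Ala, stop.
print(translate("ATGGCCTAA"))
```

In practice an established library would normally be preferred over a hand-written table; the sketch only makes the conversion step explicit.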
The HIV envelope gene codes for a precursor polyprotein p88, which is glycosylated to form the envelope precursor protein gp160. gp160 is then cleaved by a cellular protease to form the surface extracellular envelope glycoprotein gp120 and the transmembrane glycoprotein gp41 (7). When the gp120 glycoproteins derived from different HIV-1 isolates were compared, five conserved regions (C1 to C5) and five variable regions (V1 to V5) were identified (8,9). Most of the HIV-neutralizing antibodies fall into two categories: those that recognize a determinant in the V3 loop of gp120 (10,11) and those that block the gp120-CD4 interaction (12) by binding to regions in gp120 conserved between different HIV strains (4,13,14).
The transmembrane glycoprotein gp41 contains hydrophobic domains which can be arranged into three transmembrane regions and several important epitopes within its ectodomain (15,16).
The HIV strain that predominates in India is of subtype C (17-19). The first cases of HIV-1 infection in India were reported in Tamil Nadu in 1986, and the first AIDS patient was reported in Bombay in May 1987. The HIV-1 epidemic in India is principally heterosexual (20), with the highest HIV infection rates occurring in metropolitan areas in the western part of India (Mumbai) and in the south (Chennai) (21).
In the present study, we predicted exposed domains of gp120 and gp41 of Indian isolates of HIV-1 and then tested synthetic peptides, mainly corresponding to the C2, V3 and CD4 binding sites of gp120 and some parts of the ectodomain of gp41, against different types of sera. Finally, we compared this reactivity with theoretical methods of prediction of the immunogenic sites of proteins.

Peptides
The peptides corresponding to the exposed domains of HIV-1 envelope glycoproteins were synthesized by the solid-phase peptide synthesis method of Merrifield (22) and manually by Fmoc chemistry (21,23) or were supplied by the National Institute for Biological Standards and Control, London, UK.

Sera
Venous blood was collected from individuals from different parts of India, kept at room temperature for 2 h, and then left to stand overnight at 4ºC. Blood was then centrifuged at 3000 g for 10 min at 4ºC and serum was separated and stored at -20ºC.

Theoretical analysis
The sequence of HIV-1 IIIB (subtype B) was available in the literature (24). The Preditop program, provided by Dr. J. Pellequer on a 3½-inch floppy disk, was used to study the exposed domains of the proteins. The scales used in the program to study the structure of a protein are grouped as follows: 9 scales of inverted hydrophobicity (Doolittle, Heijne, Manavala, Prils, Rose, Sweet, Totls, Ges, Zimmermann), 2 scales of hydrophilicity (Hopp, Parker), 4 scales of accessibility (Janin, Chothia, Chothia8, Acrophil), 1 scale of antigenicity (Welling), and 3 scales of secondary structure (Chouf3, grenier3, Levitt). The amino acid sequence of the HIV glycoprotein was read with a moving window of seven residues: for each scale, the values of the residues in the window were averaged and the mean value was plotted against the fourth (central) residue of the window. In order to compare the profiles obtained by the different methods, the scales were normalized, with the original values of each scale set between +3 and -3 (25).
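The moving-window calculation described above can be illustrated with a short sketch. This is not the Preditop code: it uses a single scale (Hopp and Woods hydrophilicity values, transcribed here and to be checked against the original publication before any real use), and it assumes that normalization is a linear rescaling of the scale so that its minimum maps to -3 and its maximum to +3. The function names `normalize_scale` and `window_profile` are hypothetical.

```python
# Hopp-Woods hydrophilicity values (transcribed; verify before real use).
HOPP_WOODS = {
    "A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "C": -1.0,
    "Q": 0.2, "E": 3.0, "G": 0.0, "H": -0.5, "I": -1.8,
    "L": -1.8, "K": 3.0, "M": -1.3, "F": -2.5, "P": 0.0,
    "S": 0.3, "T": -0.4, "W": -3.4, "Y": -2.3, "V": -1.5,
}


def normalize_scale(scale, low=-3.0, high=3.0):
    """Linearly rescale a residue scale to [low, high].

    Assumption: this is how 'set between +3 and -3' is interpreted here.
    """
    smin, smax = min(scale.values()), max(scale.values())
    factor = (high - low) / (smax - smin)
    return {aa: low + (value - smin) * factor for aa, value in scale.items()}


def window_profile(sequence, scale, window=7):
    """Return (position, mean) pairs for every full window.

    The mean is assigned to the central residue of the window, which is
    the fourth residue for a seven-residue window (positions are 1-based).
    """
    half = window // 2
    profile = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        mean = sum(scale[aa] for aa in segment) / window
        profile.append((start + half + 1, mean))
    return profile


# One of the Indian consensus peptides quoted in the abstract.
scale = normalize_scale(HOPP_WOODS)
for position, value in window_profile("LAIERYLKQQLLGWG", scale):
    print(position, round(value, 2))
```

A 15-residue peptide yields nine windows, centred on residues 4 to 12; residues closer to either end have no full window and receive no value in this sketch.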

Enzyme-linked immunosorbent assay
The peptides were tested for immunogenicity using the conventional ELISA described by Engvall and Perlmann (26). The wells were coated with peptide by incubating 100 µl of peptide solution diluted in carbonate bicarbonate buffer, pH 9.6, at a final amount of 1.2 µl/well in each well of an ELISA plate in a humid chamber overnight at 37ºC. Coated wells were then blocked by incubation with 300 µl of blocking buffer (1 mg/ml bovine serum albumin (BSA) in carbonate bicarbonate buffer) in each well for at least 6 h in a humid chamber at 37ºC. One hundred microliters of diluted test serum was added to each well and incubated overnight in a humid chamber at 37ºC. The plates were washed four times with phosphate-buffered saline (PBS)-Tween and 100 µl horseradish peroxidase-conjugated goat anti-human antibody was added at 1/1000 dilution in PBS-Tween containing 0.1% BSA.
The plates were washed four times with PBS-Tween and 100 µl of a 2,2'-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) solution (0.5 mg/ml) in citrate phosphate buffer, pH 5.0, containing 1 µl H2O2 per ml was added to each well, and plates were incubated for 10 min in a humid chamber at 37ºC. After the development of color, absorbance in each well was measured at 405 nm with a microplate reader.

Statistical analysis
P values were determined by the rank sum test, with the level of significance set at 0.05.
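As an illustration of how a rank sum comparison between two groups of absorbance readings can be carried out, the following sketch implements the two-sided Wilcoxon rank sum (Mann-Whitney) test with a normal approximation. It is not the software used in this study: it applies no continuity or tie correction, an exact test would be preferable for very small groups, and the absorbance values are invented solely to show the call. The function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Return (U statistic of group_a, two-sided P) by normal approximation."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area
    return u_a, p_value


# Invented absorbance readings at 405 nm, for illustration only.
positive_sera = [0.45, 0.60, 0.52, 0.71, 0.48]
negative_sera = [0.10, 0.12, 0.09, 0.11, 0.13]
u, p = rank_sum_test(positive_sera, negative_sera)
print("U =", u, "significant at 0.05:", p < 0.05)
```

With the 0.05 significance level stated above, a peptide would be regarded as discriminating between the two groups of sera when the returned P value falls below 0.05.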

Prediction of the exposed domains of the HIV-1 IIIB envelope glycoprotein
The amino acid sequence of the HIV-1 IIIB envelope glycoprotein was analyzed with the Preditop program to predict the antigenic determinants of this molecule. The results are shown in Table 1.

Comparison of antigenic domains of HIV IIIB envelope glycoproteins by computational and experimental analysis
The antigenic domains of HIV IIIB envelope glycoproteins identified by reaction with HIV-positive sera were compared with those predicted by computational analysis; the results are shown in Table 3. From these results, it is clear that peptides Nos. 2, 6, 8, 14 and 15 can be considered to be immunogenic by both experimental and computational analysis.

Reactivity of synthetic peptides corresponding to the different parts of the envelope glycoprotein of Indian HIV-1 isolates
Two peptides, one corresponding to gp120 (peptide No. 20) and the other corresponding to gp41 of Indian HIV-1 isolates (peptide No. 21) (27,28), were synthesized (Table 4). These peptides were homologous to peptides Nos. 9 and 16, respectively. Both peptides Nos. 9 and 16 had shown negative results by computational analysis and positive results by ELISA with HIV-positive sera from Indian individuals. Both peptides Nos. 20 and 21 showed positive results by ELISA with HIV-positive sera from the same individuals.

NA indicates that the sequence of Indian isolates was not available at the time of this study.

Antibody-binding properties of synthetic peptides of the different domains of the HIV-1 IIIB envelope glycoproteins
Sera collected from different individuals were confirmed to be HIV positive or HIV negative using the kits which have already been listed in Methods. The list of the synthetic peptides corresponding to the different parts of the HIV IIIB envelope glycoprotein used in this study is shown in Table 2. Confirmed HIV-positive sera recognized some of the peptides better than HIV-negative sera. Some of the peptides reacted with antibodies present in the sera of HIV-positive patients. Some peptides, however, did not show significant differences in reactivity with HIV-positive and HIV-negative sera and were not antigenic in the random set of HIV-infected individuals tested.

Discussion
Prediction of secondary structures does not permit definitive conclusions but facilitates the interpretation of other results and the design of new experiments. The antigenic domains of a protein molecule can be predicted by analyzing the protein sequence using a suitable computer program (29). Many such computer programs are available which take into consideration several parameters of adjacent small continuous stretches of amino acids such as inverted hydrophobicity, hydrophilicity, accessibility, antigenicity, and a tendency to form secondary structures (25). In the present study we used Preditop, one of the more recent programs (version 3.1), which takes into account 9 scales of inverted hydrophobicity, 2 scales of hydrophilicity, 4 scales of accessibility and 1 scale of antigenicity for computing the probability of a given domain of a protein to be exposed on the surface of the molecule.
This program predicted the exposure of several domains of this molecule. According to the prediction, gp120 (a surface glycoprotein) of the IIIB strain of HIV-1 is supposed to have nine exposed domains, while gp41 (a transmembrane glycoprotein) of this virus is expected to have four exposed domains. The sequences of the predicted domains are shown in Table 1.
Subtype C is the most prevalent subtype of HIV-1 in India. We wanted to know whether predicted domains of the gp120 protein of HIV-1 subtype B react with Indian sera, which were tested with the above kits.
Only those sera which were positive in all of the three ELISA tests were selected, and further subjected to the Inno-LIA test. Any serum found to be negative by the Inno-LIA HIV-1/HIV-2 test was discarded. The sera were then classified into the following three groups depending on the reaction pattern found with Inno-LIA HIV-1/HIV-2: a) sera from HIV-1-infected individuals, b) sera from HIV-2-infected individuals, and c) sera from HIV-1 + 2-infected individuals. The sera which were recognized as HIV-1 positive were used in this study, and are referred to as confirmed HIV-positive sera.
On the basis of these results, peptides homologous to peptides 9 and 16 (both showing negative results in computational analysis and positive results in experimental analysis) and corresponding to the Indian consensus sequence (27) were synthesized (peptides 20 and 21). The reactivity of these peptides with HIV-positive and normal sera was studied by ELISA and both peptides were shown to distinguish normal sera from the HIV-positive ones.
With respect to the difference between the computational and experimental results, it should be mentioned that no method is 100% accurate in predicting the conformation of a protein from its amino acid sequence (6). In addition, our prediction analysis was performed using the envelope glycoprotein of HIV-1 IIIB, which belongs to subtype B, whereas the peptides corresponding to these parts were tested against the sera of Indian individuals who were infected with HIV-1 subtype C.
Furthermore, it should be noted that antibodies directed at gp120 might also bind to conformational, discontinuous epitope(s) quite distinct from the linear sequences defined in this study (30,31). An additional factor that should be considered with caution in the interpretation of these data is the impact of glycosylation on the antibody response in vivo. Glycosylation may result in the blocking of antibody accessibility to some of the predicted residues (32-34). Finally, it has been proposed that linear epitopes may be more exposed in the monomeric rather than the oligomeric form of the envelope glycoprotein (35).

Table 1. Amino acid sequences of the envelope glycoprotein of HIV-1 IIIB predicted to be exposed.

Table 2. Sequences and regions of the peptides used in this study.

Table 4. Sequence and region of the peptides corresponding to the Indian consensus sequence of the envelope glycoprotein of HIV-1.

Table 3. Comparison of the antigenicity of peptides of the envelope glycoproteins of the HIV-1 IIIB strain determined by reactivity with sera (ELISA) and by computational analysis.