Genetic diversity and prevalence of CCR2-CCR5 gene polymorphisms in the Omani population.

Polymorphisms in the regulatory region of the CCR5 gene affect protein expression and modulate the progress of HIV-1 disease. Because of this prominent role, variations in this gene have been under differential pressure and their frequencies vary among human populations. The CCR2V64I mutation is tightly linked to certain polymorphisms in the CCR5 gene. The current Omani population is genetically diverse, a reflection of their history as traders who ruled extensive regions around the Indian Ocean. In this study, we examined the CCR2-CCR5 haplotypes in Omanis and compared the patterns of genetic diversity with those of other populations. Blood samples were collected from 115 Omani adults and genomic DNA was screened to identify the polymorphic sites in the CCR5 gene and the CCR2V64I mutation. Four minor alleles were common: CCR5-2554T and CCR5-2086G showed frequencies of 49% and 46%, respectively, whereas CCR5-2459A and CCR5-2135C both had a frequency of 36%. These alleles showed moderate levels of heterozygosity, indicating that they were under balancing selection. However, the well-known allele CCR5Δ32 was relatively rare. Eleven haplotypes were identified, four of which were common: HHC (46%), HHE (20%), HHA (14%) and HHF*2 (12%).


Introduction
The C-C motif chemokine receptor 5 (CCR5) is expressed in memory/effector T cells, monocytes/macrophages and immature dendritic cells (Oppermann, 2004). CCR5 plays a critical role in physiological and pathological conditions through its ability to bind chemokines and regulate the migration of leukocytes throughout the body (Guergnon and Combadière, 2012). CCR5 is also a co-receptor for human immunodeficiency virus-1 (HIV-1) that facilitates virus entry into cells and mediates infection (Littman, 1998;Mummidi et al., 1998).
The CCR5 gene is located on the short arm of chromosome 3 and consists of four exons and two introns (Mummidi et al., 1997). Two functional promoters have been identified for the gene: an upstream promoter known as P1 and a weak downstream promoter known as P2. The P1 promoter gives rise to two full-length transcript variants, whereas the P2 promoter yields several truncated transcripts that lack exon 1, although all produce the same CCR5 protein (Mummidi et al., 2000; Wierda and van den Elsen, 2012).
A 32-bp deletion in the open reading frame (ORF) of the CCR5 gene (CCR5Δ32) causes premature termination of translation that results in a non-functional receptor (Dean et al., 1996; Liu et al., 1996). CCR5Δ32 has a variable distribution among populations of different ethnic backgrounds, e.g., it is common in Caucasians but rare among Asian and African populations (Martinson et al., 1997; Novembre et al., 2005). The CCR5Δ32 allele has a frequency of 10-20% among European populations, with the highest frequency found in northern Europe, and decreases to 2-5% in the Middle East and the Indian subcontinent (Su et al., 2000). Polymorphisms in the CCR5 promoter region can influence expression of the corresponding protein in Asian and African populations, where carriers of the CCR5Δ32 mutation are rare (McDermott et al., 1998; Mummidi et al., 2000; Picton et al., 2012). In addition, allele frequencies differ markedly among ethnic groups (Gonzalez et al., 1999; Catano et al., 2011; Malhotra et al., 2011).
Historically, Oman played a central role as a gateway for the spice and frankincense trade that linked Yemen and India to Africa and Eurasian regions. Consequently, the human population of this region of the Middle East is expected to display a high degree of diversity that reflects its cosmopolitan past (Semino et al., 2004; Al-Abri et al., 2012). This suggests that a high degree of diversity is likely to be found in the CCR2-CCR5 genes in the Omani population. Although a few studies have examined the frequency of CCR5Δ32 in the Arabian Peninsula (Salem et al., 2009; Voevodin et al., 1999), no studies have investigated the allele frequencies of other polymorphisms or the gene diversity of the CCR2-CCR5 complex in this region. In this study, we examined the frequency of the variable sites (in the cis-regulatory and coding regions) of the CCR5 gene and estimated the allele frequency of the V64I mutation in the CCR2 gene in the Omani population. We also explored the genetic diversity of the CCR2-CCR5 gene locus in the Omani population and compared it with that of other populations.

Study population
Samples were collected from 115 Omani adults (60 males, 55 females) between April 2010 and April 2012. The mean age of the subjects was 36.2 years (range: 19-72). Forty-four of the subjects were healthy blood donors and 71 were patients attending Sultan Qaboos University Hospital. The study population consisted predominantly of Arabs from various regions of the country: Muscat (the capital), north, south, west and the remote region of Musandam. The aim of the study was explained individually and informed written consent was obtained from all participants. This study was approved by the Medical Research and Ethics Committee of the College of Medicine and Health Sciences, Sultan Qaboos University.

Genotyping and nucleotide sequence analysis
Genomic DNA from all participants was extracted from peripheral blood samples anti-coagulated with ethylenediaminetetraacetic acid (EDTA) using QIAamp DNA mini kits (Qiagen, Germany), according to the manufacturer's protocol. Polymerase chain reaction (PCR) amplification of genomic DNA followed by DNA sequencing was used to cover the CCR5 polymorphic sites -2733, -2554, -2459, -2135, -2132, -2086 and -1835. In this numbering system, the first nucleotide of the CCR5 translational start site is defined as 1 and the nucleotide immediately upstream of it as -1, as originally described by Mummidi et al. (2000). All samples were also tested for the presence of the CCR5Δ32 and CCR2V64I variants in the coding regions of these two genes. The sequences of the primers used are listed in Table 1. PCR was done using AmpliTaq Gold enzyme and PCR buffer (Applied Biosystems, Foster City, CA, USA). Amplified fragments were purified using ExoSAP (USB, Affymetrix, Inc. USA), sequenced bidirectionally using BigDye terminator v.3.1 chemistry (Applied Biosystems) and then run on a 3130 XL Genetic Analyzer (Applied Biosystems). All sequences were aligned with the reference sequence (GenBank accession number: NT_22517.18) and analyzed for the presence of SNPs using the Lasergene sequence analysis software package v.7.1 (DNAStar, Madison, WI, USA).

Statistical analysis and population divergence
GenAlEx6.41 software (Peakall and Smouse, 2006) was used to estimate the allele frequencies of each SNP in the study sample followed by testing for significant deviation from Hardy-Weinberg equilibrium (HWE). Heterozygosity (as a measure of genetic diversity in the sample group) was calculated. Arlequin v.3.5 package (Excoffier and Schneider, 2005) was used to estimate the pairwise linkage disequilibrium (LD) between all pairs of alleles at different loci based on the expectation-maximization (EM) algorithm. The statistical significance of the LD between pairs of SNPs was assessed by the chi-square test.
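The per-SNP quantities described above (allele frequency, expected heterozygosity and the test for deviation from HWE) can be illustrated with a minimal Python sketch. It is not the GenAlEx implementation: the function name `hwe_summary` and the genotype counts in the usage line are hypothetical, the SNP is assumed to be biallelic, and the HWE test is assumed to be a one-degree-of-freedom chi-square goodness-of-fit test.

```python
import math


def hwe_summary(n_major_hom, n_het, n_minor_hom):
    """Summarize one biallelic SNP from its three genotype counts.

    Hypothetical helper for illustration only; it does not reproduce
    the GenAlEx code used in the study.
    """
    n = n_major_hom + n_het + n_minor_hom          # number of individuals
    q = (2 * n_minor_hom + n_het) / (2 * n)        # minor allele frequency
    p = 1.0 - q                                    # major allele frequency
    he = 1.0 - (p * p + q * q)                     # expected heterozygosity

    observed = (n_major_hom, n_het, n_minor_hom)
    expected = (p * p * n, 2 * p * q * n, q * q * n)  # Hardy-Weinberg proportions

    # Chi-square goodness-of-fit statistic; classes with zero expectation
    # (monomorphic SNPs) are skipped to avoid division by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # For 1 degree of freedom the chi-square survival function is
    # erfc(sqrt(x / 2)), so no external statistics library is needed.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return {"minor_allele_freq": q, "he": he, "chi2": chi2, "p_value": p_value}


# Made-up genotype counts for 115 individuals (NOT data from this study).
example = hwe_summary(40, 50, 25)
print(example)
```

A p-value above 0.05 in such a test would be read as no significant deviation from Hardy-Weinberg proportions.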
Haplotypes were constructed using DnaSP v.5 (Stephens and Donnelly, 2003). The phylogenetic relationships between the CCR2-CCR5 haplotypes were reconstructed using the median joining (MJ) algorithm contained in Network 4.6 software (Bandelt et al., 1999). POPTREE software was used to draw the unrooted neighbor-joining (NJ) tree (Saitou and Nei, 1987) from the corrected pairwise F_ST distances calculated from the CCR2-CCR5 haplotype frequencies, with 1000 bootstrap replicates used to obtain the bootstrap values.
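To make the idea of a pairwise distance based on haplotype frequencies concrete, the following Python sketch computes a simple, uncorrected F_ST between two populations as (H_T - H_S) / H_T, where H_S is the mean within-population expected heterozygosity and H_T is the expected heterozygosity of the pooled (averaged) frequencies. This is an assumption chosen for illustration: it is not the sample-size-corrected estimator implemented in POPTREE, and the function names and toy frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 minus the sum of squared frequencies."""
    return 1.0 - sum(f * f for f in freqs)


def pairwise_fst(pop1, pop2):
    """Uncorrected pairwise F_ST from two haplotype-frequency dictionaries.

    Each dictionary maps a haplotype label to its frequency and is assumed
    to sum to 1. The two populations are given equal weight. This is a
    simplified stand-in, not the corrected F_ST used by POPTREE.
    """
    labels = set(pop1) | set(pop2)
    h_s = (expected_heterozygosity(pop1.values())
           + expected_heterozygosity(pop2.values())) / 2.0
    pooled = [(pop1.get(h, 0.0) + pop2.get(h, 0.0)) / 2.0 for h in labels]
    h_t = expected_heterozygosity(pooled)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


# Toy frequencies for two imaginary populations (NOT data from Table 3).
pop_x = {"hapA": 0.5, "hapB": 0.5}
pop_y = {"hapA": 1.0}
print(pairwise_fst(pop_x, pop_y))
```

A matrix of such pairwise values for all population pairs is the kind of input from which a neighbor-joining tree is built; identical populations give 0 and populations fixed for different haplotypes give 1.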

Results
Allele frequencies of the CCR2-CCR5 gene locus
One hundred and fifteen samples from Omani adults were successfully genotyped at nine variable sites: one in CCR2 and eight in CCR5. The allele frequencies and genotypes are shown in Table 2. The overall genotype distribution did not deviate significantly from Hardy-Weinberg proportions (p > 0.05). The commonly known CCR5Δ32 was found in only one individual. The minor allele frequency (MAF) of the CCR264I allele was 13% and most of the carriers of this allele were heterozygous (20.9%). The most common minor alleles were CCR5-2554T and CCR5-2086G, which occurred at frequencies of 49.1% and 46.5%, respectively. Two alleles, CCR5-2135C and CCR5-2459A, were present at lower frequencies of 36.5% and 36.1%, respectively. To further examine the degree of diversity of the CCR2-CCR5 gene locus in the study sample, we estimated the expected heterozygosity of each SNP (He). The loci -2554, -2459, -2135 and -2086 had a high level of heterozygosity (0.500, 0.461, 0.464 and 0.498, respectively) (Table 2). Figure 1 summarizes the results of the LD tests done for all pairwise combinations of SNPs. There was significant LD between the distal CCR2V64I and the majority of CCR5 SNPs in the promoter region. The internal SNPs at positions -2554, -2459 and -2135 were in strong LD with each other (p < 0.0001), whereas the SNP at position -2132 was not linked to any of the tested positions.
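As a consistency check, the expected heterozygosity values reported above can be recomputed from the reported minor allele frequencies, assuming each SNP is biallelic so that He = 1 - (p^2 + q^2). The Python sketch below uses only the figures quoted in this section; the function name is hypothetical and the tolerance of 0.001 is an assumption that allows for rounding of the published frequencies.

```python
def expected_heterozygosity_biallelic(maf):
    """He for a biallelic SNP with minor allele frequency `maf`."""
    return 1.0 - (maf ** 2 + (1.0 - maf) ** 2)


# (reported MAF, reported He) for each promoter position, as quoted in the text.
reported = {
    "-2554": (0.491, 0.500),
    "-2459": (0.361, 0.461),
    "-2135": (0.365, 0.464),
    "-2086": (0.465, 0.498),
}

for position, (maf, he_reported) in reported.items():
    he_computed = expected_heterozygosity_biallelic(maf)
    # Agreement within 0.001 is treated as consistent, given rounding.
    consistent = abs(he_computed - he_reported) < 1e-3
    print(position, f"computed={he_computed:.4f}", f"reported={he_reported:.3f}", consistent)
```

Because He for a biallelic site cannot exceed 0.5, values between 0.46 and 0.50 indicate that both alleles are maintained at intermediate frequencies.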

Haplotype frequencies and distribution
The network analysis divided all haplotypes into two main clusters. One cluster consisted of the human haplogroups (HH) HHA, HHC and HHD, while the second cluster was composed of HHE, (HHG*1 & HHG*2) and (HHF*1 & HHF*2) (Figure 2). Three newly identified haplotypes, found in three individuals in the heterozygous state, could not be grouped in the previously published CCR5 evolutionary-based classification (Mummidi et al., 2000). This difficulty reflected the presence of a different combination of nucleotides at position -2459 that was not used for the classification (Figure 2). For example, haplotypes HHF*1 and HHF*2 had adenine at position -2459 whereas the newly identified haplotypes HHF*1A and HHF*2A had guanine instead. The new haplotypes were designated based on their position in relation to their descendant haplotype (Figure 2). The most common haplotype in the studied population was HHC (46.1%) followed by HHE (20.0%). Other haplotypes such as HHA (14.3%) and HHF*2 (12.2%) were less common. Two haplotypes (HHD and HHG*1) were both present at 2.6%. Minor haplotypes were HHF*1 and HHG*2 (the CCR5Δ32-bearing haplotype), both present at 0.4%. The HHB haplotype was not detected in any of the subjects (Table 3).

Population divergence
The frequency of the HHA haplotype varies among populations, with the highest frequencies in African populations (Pygmy, non-Pygmy and African-American). Two different haplotypes, HHC and HHE, are common in Asians and Europeans. The HHC haplotype is most common among Asian populations, including Thais (62%) (Nguyen et al., 2004), Omanis (46%) and Indians (36%), followed by Europeans (35%). The HHE haplotype had the highest frequency among Europeans (32%) compared to other populations. The HHD haplotype was highly prevalent among African-Americans (20%) but only present at 2.6% in Omanis (Table 3). The HHG*2 haplotype was less common as it was mostly limited to Europeans (8%) and relatively rare in Asians and Africans. HHB was a rare haplotype found mainly in Africans and was completely absent in other populations.
Corrected pairwise F_ST distances were calculated using the CCR2-CCR5 haplotype frequency estimates (Table 3) and an unrooted NJ tree was constructed between Omanis and eight other populations. Figure 3 shows that the African populations were well-defined in one cluster that consisted of Pygmy and non-Pygmy populations while the second cluster (with bootstrap support of 82) involved the African-American population, which has a component of the African population mixed with Caucasian and Eurasian populations. These populations were further divided into two groups: the first group represented Indians (bootstrap support 88) and consisted of an isolated population composed mainly of southern Indians; the second group consisted of Ethiopian Jews, Europeans and Asians. Interestingly, the Omani population was closely grouped with Southeast Asian populations such as Thais.
Table 2 note: He, expected heterozygosity, $H_e = 1 - \sum_i p_i^2$, where $\sum_i p_i^2$ is the sum of the squared population allele frequencies.

Discussion
The aim of this study was to obtain baseline data on CCR2-CCR5 polymorphisms and to determine the haplotype profile of the Omani population. Our results confirmed previous findings that the frequency of the CCR5Δ32 allele is very low in Asian populations (Martinson et al., 1997; Su et al., 2000). Six major allele positions were identified (MAF > %): CCR264I, CCR5-2554, CCR5-2459, CCR5-2135, CCR5-2086 and CCR5-1835. There was high LD between the SNPs in the CCR2-CCR5 gene locus, which allowed the diversity found in these two regions to be summarized in 11 haplotypes. The large number of haplotypes identified in this study, including three unique ones, indicated that the Omani population was genetically quite diverse when compared to other Asian populations such as Thais (Clark and Dean, 2004; Nguyen et al., 2004).
Table note (data sources): 1. Gonzalez et al. (2001). 2. Korostishevsky et al. (2011). 3. Nguyen et al. (2004). 4. This study.
Korostishevsky et al. (2011) investigated the LD between four SNP positions in the promoter region of CCR5 and CCR2V64I among Ethiopian Jews. They observed significant LD between the three internal SNP positions (G2554T, T2135C and A2086G), while C1835T was found to be tightly linked only to CCR2V64I. In this study, we examined the LD at eight SNP positions and found that four internal SNP positions were tightly linked (p < 0.0001). CCR2V64I was significantly linked to four promoter SNPs in the CCR5 gene. These results demonstrate that the LD pattern between CCR5 promoter SNPs and CCR2V64I in the Omani population differed from that of Ethiopian Jews.
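The pairwise LD comparisons discussed above can be made concrete with a small Python sketch that computes D, D', r^2 and a chi-square test for two biallelic SNPs. It assumes the two-locus haplotype frequency is already known, whereas the study estimated it from unphased genotypes with the EM algorithm in Arlequin. The function name and all input frequencies are hypothetical, and the test statistic is assumed to be r^2 multiplied by the number of chromosomes, with one degree of freedom.

```python
import math


def pairwise_ld(p_ab, p_a, p_b, n_chromosomes):
    """LD statistics for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of alleles A and B
    n_chromosomes: number of sampled chromosomes (2 x individuals)
    Hypothetical helper; haplotype frequencies are taken as given rather
    than estimated by EM.
    """
    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0  # D normalized by its maximum

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0   # squared allelic correlation

    chi2 = r2 * n_chromosomes                  # 1-df chi-square statistic
    p_value = math.erfc(math.sqrt(chi2 / 2.0)) # survival function for 1 df
    return {"D": d, "D_prime": d_prime, "r2": r2, "chi2": chi2, "p_value": p_value}


# Made-up frequencies for 230 chromosomes (NOT estimates from this study):
# two minor alleles at frequency 0.36 that always occur together.
print(pairwise_ld(p_ab=0.36, p_a=0.36, p_b=0.36, n_chromosomes=230))
```

In this toy case the two alleles are in complete LD (D' and r^2 close to 1); if the haplotype frequency equals the product of the allele frequencies, D is 0 and the test is not significant.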
The analysis of the CCR2-CCR5 gene locus described here suggests that the Omani population is diverse and resembles the patterns reported in previous studies of Asian subpopulations. Four allele positions (-2554, -2459, -2135 and -2086) showed moderately high levels of heterozygosity. This heterozygosity reflects selection that favors the presence of these alleles to produce an excess of intermediate frequencies (Bamshad et al., 2002; Ramalho et al., 2010) that would be expected to reduce population differentiation. However, the other two tightly linked loci (CCR264 and -1835) showed low heterozygosity, suggesting that there may be selection at these two positions. Moreover, these two linked loci have a greater global distribution, especially in Asia and Africa, and are therefore presumed to be older mutations compared to CCR5Δ32, which is restricted to Europe (Martinson et al., 1997).
The diversity of the Omani population was confirmed by the identification of 11 haplotypes in the CCR2-CCR5 gene locus. This finding agrees with results obtained from mitochondrial DNA (Luis et al., 2004; Rowold et al., 2007) that showed high intra-population diversity in Oman. The authors of these studies suggested that this diversity was a consequence of a series of demographic events that occurred in this region, partly because of Oman's strategic location as a passage between Africa and Eurasia. These conclusions agree with our phylogenetic analysis of the CCR2-CCR5 haplotypes based on corrected F_ST distances, which clustered the Omani population with other Asian populations but not far from the Europeans. In addition, the recent discovery of African Nubian fossils in the Dhofar region, the southern part of Oman, suggests the presence of a migratory route of early Homo sapiens from the Horn of Africa across the southern Red Sea into Asia (Rose et al., 2011). This conclusion also agrees with previous genetic reports on Y-chromosome diversity (Semino et al., 2004; Cadenas et al., 2008), which showed that the diversity in Oman was partially explained by its position along the coastal pathway associated with movements between large human conglomerates. Surprisingly, the legacy of the African haplotypes HHB and HHD is very low in the present Omani population, despite the fact that Oman ruled a significant part of the East African coast for centuries. This may suggest a low degree of inter-marriage between the Omanis and coastal East African populations or that the genetic influxes from Asian populations replaced those from Africa. Further investigations on a larger sample size would be necessary to confirm these findings.
This study is the first to describe the common CCR2-CCR5 haplotypes in Oman. HHC and HHE were the most common haplotypes in the population studied. Further investigations are needed to delineate the functional relevance of the three newly identified haplotypes, particularly with regard to their role in the pathophysiology of HIV-1 infection and disease progression among HIV-1 infected Omani patients. Together, the results of this study show that the Omani population is genetically quite diverse. Future investigations should examine the pattern of genetic differences in the CCR2-CCR5 gene loci between the northern and southern parts of the country.