Population analysis of vitamin D receptor polymorphisms and the role of genetic ancestry in an admixed population

The vitamin D receptor (VDR) is an essential protein related to bone metabolism. Some VDR alleles are differentially distributed among ethnic populations and display variable patterns of linkage disequilibrium (LD). In this study, 200 unrelated Brazilians were genotyped using 21 VDR single nucleotide polymorphisms (SNPs) and 28 ancestry informative markers. The patterns of LD and haplotype distribution were compared among Brazilian and the HapMap populations of African (YRI), European (CEU) and Asian (JPT+CHB) origins. Conditional regression and haplotype-specific analysis were performed using estimates of individual genetic ancestry in Brazilians as a quantitative trait. Similar patterns of LD were observed in the 5′ and 3′ gene regions. However, the frequency distribution of haplotype blocks varied among populations. Conditional regression analysis identified haplotypes associated with European and Amerindian ancestry, but not with the proportion of African ancestry. Individual ancestry estimates were associated with VDR haplotypes. These findings reinforce the need to correct for population stratification when performing genetic association studies in admixed populations.


Introduction
The vitamin D receptor (VDR) is a member of the superfamily of nuclear receptors for steroid hormones that functions as a ligand-activated transcription factor (Dusso et al., 2005). The VDR associated with the secosteroid hormone 1,25-dihydroxyvitamin D3 (1,25(OH)2 Vitamin D3) and heterodimerized with the retinoid X receptor (RXR) binds to vitamin D3 response elements in the promoter region of responsive genes (Dusso et al., 2005). The genes that are up- or down-regulated by the complex of vitamin D3, VDR, RXR and other recruited proteins are associated with calcium homeostasis, bone metabolism, cell cycle, immunomodulation and other hormonal systems (Dusso et al., 2005; Lips, 2007). The broad range of vitamin D functions has focused attention on the VDR gene as an important candidate gene that could explain variations in specific phenotypes possibly connected with vitamin D metabolism (Valdivielso and Fernandez, 2006).
Much information has been generated since the first description of VDR polymorphism (Morrison et al., 1994) and has led to intense investigation of the allelic variation in the VDR gene in different ethnic populations (Nejentsev et al., 2004; Thakkinstian et al., 2004; Fang et al., 2005). However, the data on the association of VDR polymorphisms with specific phenotypes are often conflicting. The reasons commonly given to explain the difficulty in reproducing many results include uncontrolled environmental factors, population stratification, locus heterogeneity and different linkage disequilibrium (LD) patterns (Nejentsev et al., 2004; Thakkinstian et al., 2004; Fang et al., 2005). The VDR region consists essentially of three haplotype blocks located in the intergenic region of the VDR and COL1A1 gene, the 5' promoter region and the 3' region encompassing the untranslated region, with the frequency distributions of LD and haplotypes varying among European, African and Asian populations (Nejentsev et al., 2004; Fang et al., 2005).
Specific variations in the allelic frequencies of VDR polymorphisms among Europeans, Africans, Amerindians and Asians could increase the risk of spurious associations in studies of recently admixed populations such as Brazilians (Rosenberg and Nordborg, 2006). The current Brazilian population is one of the most heterogeneous in the world, descending from an admixture of Europeans, Amerindians and Africans during the last five centuries. The use of ancestry informative markers (AIMs) has revealed ample genetic heterogeneity in the Brazilian population (Callegari-Jacques et al., 2003; Parra et al., 2003; Marrero et al., 2005; Lins et al., 2010) and this characteristic may be used to control population stratification in association studies (Suarez-Kurtz et al., 2007).
Association studies in admixed populations that rely only on self-reported ancestry or physical features to arrange volunteers into homogeneous groups may produce spurious associations because of stratification generated by admixture (Cardon and Palmer, 2003; Ziv and Burchard, 2003; Suarez-Kurtz et al., 2007). This is particularly important when the DNA markers used to conduct association studies and the phenotype investigated display different frequency distributions among the reference groups that gave rise to the admixed population (Pritchard and Donnelly, 2001; Rosenberg and Nordborg, 2006). Bone phenotypes differ between Africans and Europeans (Gilsanz et al., 1998; Jones et al., 2004) and several polymorphisms, including those of the VDR gene, have been investigated as candidates to explain their quantitative variation. The purpose of this study was to perform a more in-depth analysis of variability in the VDR gene in admixed Brazilians and to correlate this variability with individual genetic ancestry estimates in order to identify possible pitfalls when performing association studies in an admixed population.

Population sample
The Brazilian population sample (BRZ) consisted of 200 unrelated healthy subjects randomly chosen from individuals involved in no-cost paternity investigations from 2003 to 2005. All subjects signed an informed consent form that allowed the use of their DNA samples for paternity testing and further population genetics research. To avoid bias during analysis no attempt was made to classify the subjects according to morphological or social traits. The subjects were allocated to one of five groups (n = 40 each) based on their birthplace in one of the five geopolitical regions (Midwest, Northeast, North, Southeast and South) of Brazil. The research protocol was approved by the university Ethics Committee.

HapMap data and genotyping
The VDR genotypes of the HapMap population samples were retrieved from an online database (Data Rel 21a/phaseII Jan07, on NCBI B35 assembly, dbSNP b125). The total sample consisted of 89 unrelated East Asian individuals (ASN) comprising 45 Han Chinese from Beijing (CHB) and 44 Japanese from Tokyo (JPT), 60 unrelated individuals of northern and western European origin (CEU) and 60 unrelated Yoruba individuals (YRI) from Ibadan, Nigeria.
The choice of VDR SNPs was based on markers of HapMap phase I and phase II data that were polymorphic in at least one population and dispersed with average intervening distances of 5 kb; Haploview software (Barrett et al., 2005) was used to establish the LD patterns. Subsequently, a minimum set of SNPs representing the original LD blocks was selected with a 90% prediction coverage. Sixteen SNPs were selected in addition to the five most studied VDR polymorphisms (rs11568820-Cdx2, rs10735810-FokI, rs1544410-BsmI, rs7975232-ApaI and rs731236-TaqI) used in our previous studies (Gentil et al., 2007, 2009; Lins et al., 2007, 2009; Moreno Lima et al., 2007).
Estimates of genetic admixture in the Brazilian samples were calculated using a set of 28 autosomal ancestry informative markers (AIMs) selected from previous studies that reported large differences in allele frequency among European, African and Native American populations. Detailed procedures for calculating the ancestry estimates are described by Lins et al. (2010) who used the same population as the present study.
PCR primers and single base extension primers were designed using Primer3 based on recommendations of the SNaPshot Multiplex Kit protocol (Applied Biosystems). The SNPs were assembled into three multiplex panels and then genotyped by a modified single base extension methodology described elsewhere.

Statistical analysis
Estimates of allele frequency, deviations from Hardy-Weinberg equilibrium and pairwise genetic distance estimates based on Wright's Fst statistics were calculated using Arlequin v. 3.01 (Excoffier et al., 2005).
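As an illustration of the Hardy-Weinberg check mentioned above, the following is a minimal sketch using a chi-square goodness-of-fit approximation with one degree of freedom. It is not the procedure implemented in Arlequin, which may rely on a different (exact) test; the function name `hwe_chi_square` and the genotype counts in the usage note are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions at a biallelic SNP.

    Illustrative approximation only; the genotype counts passed in are
    supplied by the caller and are not taken from the study.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = erfc(sqrt(chi2 / 2.0))
    return p, chi2, p_value
```

For example, `hwe_chi_square(60, 80, 60)` describes a hypothetical locus with a deficit of heterozygotes relative to the 50/100/50 counts expected at an allele frequency of 0.5.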
The linkage disequilibrium analyses were done by estimating the parameters D' and r². The structure of haplotype blocks in each population was defined by the solid spine of LD algorithm in Haploview version 3.32 (Barrett et al., 2005). This criterion defines a block when the first and last markers are in strong LD with all intermediate markers, thereby providing more robust block boundaries. After block definition, the haplotypes of the HapMap population samples were estimated for the blocks established in BRZ in order to compare the haplotype distributions. Haplotype estimates were then calculated using Whap software (Purcell et al., 2007), which is based on the expectation-maximization algorithm and uses the estimates of posterior probabilities to account for the ambiguity of haplotype phase in subsequent association tests. This package was developed to handle quantitative traits and covariates for regression analysis. In this case, the individual ancestry estimated for each ethnic group was set as a quantitative trait in BRZ. Initially, a likelihood ratio test (LRT) on the omnibus conditional regression test indicated whether there was a significant influence of haplotypes on the trait. Then, a haplotype-specific regression-based test comparing the haplotype effect against all other haplotypes indicated whether the effect observed on a specific haplotype was significant (Purcell et al., 2007). For quantitative traits, the conditional analysis approach is more robust and was chosen to model genotype conditional on trait, instead of trait conditional on genotype (the usual approach in such analyses) (Purcell et al., 2007). In addition, a standard approach run in Phase version 2.0.2 software (Stephens and Donnelly, 2003) was used to estimate recombination hotspots by comparing the median of recombination parameters among SNP pairs with the background rate assumed for the general human population (Crawford et al., 2004).
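The two LD parameters named above can be illustrated with their standard textbook definitions for a pair of biallelic loci. This is a minimal sketch, assuming that the haplotype frequency and both allele frequencies are already known; Haploview estimates these from unphased genotypes, which is not reproduced here. The function name `ld_measures` and the frequencies in the usage note are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2
```

With the hypothetical values `ld_measures(0.1, 0.5, 0.1)`, allele B occurs only on haplotypes carrying A, so D' is 1 while r² is only about 0.11. This shows how a block can combine a high D' with a low r² when the allele frequencies of the two loci differ markedly.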

Results
The allele frequency distribution for the VDR gene was similar in the five geopolitical regions (Table 1) and a pairwise Fst test identified no significant difference among them (all p-values were > 0.050). However, a significant difference was found when the Brazilian subgroups were combined to form one group (BRZ) and compared with the HapMap populations in a pairwise Fst test. In this case, the Brazilian population was genetically more distant from the HapMap African derived population (BRZ-YRI Fst = 0.154; p < 0.001) than from the HapMap population with European background (BRZ-CEU Fst = 0.012; p = 0.009). The analysis of individual loci showed that only four out of 19 loci were significantly different (p < 0.05) in the pair BRZ-CEU (Table 2). In contrast, in the other population pairs, only a few loci did not differ significantly in their allele frequencies (Table 2).
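To make the pairwise Fst comparison more concrete, the following is a minimal sketch of Wright's Fst at a single biallelic locus for two populations of equal weight, computed from expected heterozygosities. It is a simplification and not the estimator used by Arlequin, which works across loci and accounts for sample sizes; the function name `wright_fst` and the allele frequencies in the usage note are hypothetical.

```python
def wright_fst(p1, p2):
    """Wright's Fst at one biallelic locus for two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    Fst = (H_T - H_S) / H_T, where H_S is the mean within-population expected
    heterozygosity and H_T is the expected heterozygosity of the pooled
    population.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_t == 0.0:
        return 0.0  # locus is monomorphic in both populations
    return (h_t - h_s) / h_t
```

For instance, `wright_fst(0.2, 0.8)` gives a value of 0.36 for two hypothetical populations with strongly divergent allele frequencies, whereas identical frequencies give 0.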
Information on the VDR haplotype structure in the HapMap data showed that haplotype extension was greater in CEU, followed by ASN and YRI, which had more blocks of lower extension, compared to the others (Figure 1). The CEU and ASN populations had similar LD patterns, with two blocks in the 5' region, one of them identical, and one block in the 3' region. A difference was observed only in the length of the first 5' and 3' haplotype blocks.
For a comparative inter-population analysis of haplotype block diversity, four SNPs were excluded from the BRZ dataset either because they lacked genotypes in HapMap (rs4077869 and rs2239185 missing in the CEU population and rs7302235 missing in the ASN population) or deviated from Hardy-Weinberg expectations in BRZ (rs4516035, p = 0.001). Overall, two blocks were observed in the Brazilian population, one in the 5' gene region and another at the end of the transcription region and the 3' UTR region of the VDR gene (Figure 1). The 5' haplotype block contained the Cdx2, rs10783219 and rs3890734 SNPs and extended 13 kb, with mean linkage disequilibrium measures of D' = 0.924 and r² = 0.175. The 3' haplotype block consisted of rs2248098, BsmI, ApaI, TaqI, [...] (Table 3) and 19 in the 3' region of the gene (Table 4) when all populations were considered.
Regions with no haplotype blocks and a low LD were found between SNPs rs3890734 and rs2853559 and between SNPs rs2254210, FokI and rs886441, for which no block structure was found in any of the populations studied (Figure 1). The test to identify possible recombination hotspots showed two regions of greater intensity relative to the background recombination rate. In Brazilians, the region at FokI and rs886441 had a recombination rate 60 times higher than the background rate, whereas between rs3890734 and rs2853559 the rate was 18 times higher (Figure 1).
The test for population structure in the Brazilian population using autosomal AIMs identified a higher probability for a three-hybrid population, and assigned estimates of contributions as 0.771 ± 0.044 for European, 0.143 ± 0.019 for African and 0.085 ± 0.015 for Amerindian. The individual estimates of ancestry proportion showed that most individuals had a widely distributed three-hybrid pattern of variation with a trend towards a higher contribution by European ancestry. The individual estimates were later used as a quantitative trait in conditional regression tests.
Table footnote: HapMap populations: CEU = European derived ancestry, YRI = African derived ancestry, ASN = Asian derived ancestry; BRZ = total Brazilian population. N.A. = Data not available (includes monomorphic loci and missing data). p < 0.05 indicates a significant difference.
Figure 1 (caption fragment): Values on the y-axis are the median of the factor by which the recombination rate between loci (x-axis) exceeds the background recombination rate. The SNPs rs4077869, rs2239185, rs7302235 and rs4516035 are shown to facilitate location, although they were excluded from population analysis.
The conditional regression omnibus test indicated a significant correlation with the 5' haplotypes only when estimates of European ancestry were used (p = 0.027). For the 3' haplotypes, there was a significant correlation when using estimates of Amerindian and European ancestry (p = 0.018 and 0.041, respectively); there was no significance when using African ancestry proportions (p > 0.111).
[...] 3'H03) and two haplotypes (3'H02 and 3'H04) were significant for European ancestry (Table 4). When the 3' haplotype block was reduced to only the Bsm-Apa-Taq haplotype, a similar effect of ancestry was observed, but in this case the haplotype with a positive regression in Amerindian ancestry had a negative regression in European ancestry and vice-versa (Table 5).
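The sign of a haplotype-specific regression coefficient, as discussed above, can be illustrated with an ordinary least-squares fit of an ancestry proportion on the number of copies of a haplotype. This is a simplified sketch and not the Whap procedure, which weights haplotypes by their posterior phase probabilities and models genotype conditional on trait; here phase is assumed known and the trait is regressed on haplotype dosage. The function name `ols_fit` and all values in the usage note are hypothetical and are not data from this study.

```python
def ols_fit(copies, ancestry):
    """Least-squares slope and intercept of ancestry on haplotype copy number.

    copies:   number of copies (0, 1 or 2) of one haplotype per individual
    ancestry: individual ancestry proportion (0 to 1) for one parental group
    """
    n = len(copies)
    mean_x = sum(copies) / n
    mean_y = sum(ancestry) / n
    sxx = sum((x - mean_x) ** 2 for x in copies)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(copies, ancestry))
    slope = sxy / sxx  # change in ancestry proportion per haplotype copy
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

With the hypothetical inputs `ols_fit([0, 0, 1, 1, 2, 2], [0.60, 0.64, 0.74, 0.78, 0.88, 0.92])`, the slope is positive, which corresponds to a haplotype whose carriers tend to have a higher proportion of that ancestry; a negative slope would indicate the opposite relationship.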

Discussion
In this work, a panel of allelic diversity at the VDR gene locus was generated and the effect of genetic ancestry on haplotype distribution in an admixed population was evaluated. The results described here extend the information about VDR genotypes and provide the first map of VDR haplotypes for a Brazilian population.
The measures of population admixture evaluated with AIMs revealed that the biogeographical ancestral structure of the Brazilian population was European, African and Amerindian (in this order), as previously described (Lins et al., 2010). However, Fst analysis of VDR SNPs revealed a greater distinction between the Brazilian population and the HapMap population of African origin than between the Brazilian population and the HapMap population of European origin. These results indicate that, in an admixed population, recent admixture cannot always eliminate ancestral LD block structures along chromosomes (Sawyer et al., 2005; Tang et al., 2006). Consequently, complex levels of population admixture can create analytical risks in research involving loci associated with susceptibility to disease and in populations with different profiles, especially in those undergoing admixture. This is likely the case for the VDR gene and bone metabolism phenotypes, such as bone mineral density and osteoporosis (Thakkinstian et al., 2004; Uitterlinden et al., 2004).
Some studies have clearly demonstrated significant associations between European ancestry and body composition traits in admixed populations (Bonilla et al., 2004; Shaffer et al., 2007), while others have shown the relationship between VDR haplotypes and fracture risk in Whites or osteoporosis in European and Asian populations (Thakkinstian et al., 2004; Fang et al., 2005). The present work improves our understanding of the effect of genetic ancestry on the distribution of VDR gene haplotypes in admixed populations and reinforces the importance of correcting for population admixture in genetic association studies that encompass bone-related phenotypes.
The comparative analysis of VDR revealed genetic heterogeneity involving different haplotype blocks in the four populations studied. Considering the recent admixture of Latin American people, especially Brazilians, the variation in the patterns of LD seen here is not surprising in view of demographic events and genetic factors such as drift and recombination during the process of admixture (Gabriel et al., 2002; Liu et al., 2004; Sawyer et al., 2005). Haplotype structure analysis in other admixed populations has also revealed the importance of genetic heterogeneity since linkage disequilibrium increases or breaks down differently in different populations (Moraes et al., 2003; Boldt et al., 2006; Lohmueller et al., 2006; Nakamoto et al., 2006).
Random genetic drift generates large diversity among populations with the same continental origin, e.g., African, European, Amerindian or Asian (Rosenberg et al., 2002). In the present case, the HapMap populations used for comparison were not the most representative sources for the Brazilian parental population studied here and differed from those used to estimate individual autosomal ancestry (which also do not represent the Brazilian parental populations). This limitation is extremely important since the groups used here represent more general continental populations and their characteristics should therefore not be extrapolated to specific populations such as the many that constituted the admixture in current Brazilians, particularly when studying disease-related polymorphisms. As with considerations about genetic ancestry and disease-associated SNPs, the patterns of LD and identification of genetic ancestry blocks should also be addressed prior to the selection of tagSNPs for association studies in admixed populations.
The distribution of the 5' haplotype for the VDR gene in BRZ was similar to that of the CEU population. The regression coefficient confirmed this relationship. In this case, the haplotype that showed a positive association with European ancestry in Brazilians (5'H01) was absent in the ASN population and had a low frequency in YRI (Table 3). In contrast, the haplotype that correlated negatively with European ancestry (5'H02) had a higher frequency in YRI and ASN than in CEU. These results indicate that both European and non-European ancestries contributed to haplotype variation in our admixed population. Notably, this same phenomenon occurred in the 3' region haplotypes, but in this case there was also a significant association with the Amerindian contribution (Table 4).
African ancestry showed no correlation with any of the haplotypes examined. Indeed, variation in the extent and amount of LD in the VDR gene is lower in African populations than in European or East Asian populations (Nejentsev et al., 2004; Fang et al., 2005), suggesting that extensive recombination precluded the extension of LD in African populations. In agreement with this, the 3' haplotypes defined by the BRZ population exhibited high diversity and low frequencies in YRI (Table 4) when compared to Bsm-Apa-Taq haplotypes, which had a higher LD (Table 5). Moreover, the FokI SNP was not in linkage disequilibrium with any of the SNPs in the populations tested. This observation corroborates previous findings (Nejentsev et al., 2004; Fang et al., 2005) and suggests a major site of haplotype breakage and recombination in the VDR gene that is independent of ethnicity.
Previous studies of the VDR gene in Brazilians sampled only a few loci, although they examined several complex phenotypes (Lazaretti-Castro et al., 1997; Hauache et al., 1998; de Brito Junior et al., 2004; Maistro et al., 2004; Goulart et al., 2006; Gentil et al., 2007, 2009; Moreno Lima et al., 2007; Rezende et al., 2007). Some studies have tried to correct for the genetic heterogeneity of the Brazilian population by using self-reported ancestry or physical traits as a proxy for different ethnic groups. For instance, Rezende et al. (2007) reported no difference in the distribution of VDR haplotypes in Brazilian self-reported Blacks and Whites. This may be explained by the fact that dissociation between physical appearance and genetic ancestry in Brazilians (Parra et al., 2003; Marrero et al., 2005) may have created spurious similarity in genotype and haplotype frequencies among these groups because of the population substructure. As shown here, admixture in the Brazilian population provides the opportunity to segregate the contribution of individual ancestry from genetic ancestry blocks at the VDR gene locus.
In conclusion, the results of this investigation provide a large map of haplotypes for the entire VDR gene and intragenic regions in a carefully sampled Brazilian population. Comparison with the HapMap data was essential for understanding the patterns of LD and haplotype variation among populations and for elucidating the effects of admixture on this diversity. Our findings also support studies that involve tagSNP selection between different ethnic populations (Nejentsev et al., 2004; Lins et al., 2009). Genetic polymorphisms such as those observed here may partly explain ethnic differences in vitamin D3 status and their relationship to bone phenotype and hormonal homeostasis, particularly in elderly women, as well as the role of environmental factors such as diet, lifestyle and sun exposure (Uitterlinden et al., 2004; Dusso et al., 2005; Valdivielso and Fernandez, 2006).