Allelic frequencies of six polymorphic markers for risk of prostate cancer

The aim of the present study was to evaluate the distribution of polymorphisms for the androgen receptor (AR) (CAG, StuI, GGN), SRD5A2 (Ala49Thr, Val89Leu) and CYP17 (MspA1) genes that are considered to be relevant for risk of prostate cancer. We studied 200 individuals from two cities in the State of São Paulo by PCR, PCR-RFLP and allele-specific oligonucleotide hybridization (ASOH) techniques. The allelic frequencies of the autosomal markers and the StuI polymorphism of the AR gene were very similar to those described in most North American and European populations. In relation to the CAG and GGN number of repeats, the study subjects had smaller repeat lengths (mean of 20.65 and 22.38, respectively) than those described in North American, European and Chinese populations. In the present study, 30.5% of the individuals had fewer than 20 CAG repeats and 45.5% had fewer than 23 GGN repeats. When both repeat lengths are considered jointly, this Brazilian population is remarkably different from the others. Further studies on prostate cancer patients need to be conducted to assess the significance of these markers in the Brazilian population.


Introduction
In Brazil some 20,820 men will be diagnosed with prostate cancer this year alone and 7,320 men will die of the disease (1). In fact, prostate cancer is the third most common cancer in incidence among men in this country. Studies of risk factors such as occupation, diet, smoking, alcohol and sexual activity are still inconclusive (2-5). However, age, ethnicity and family history clearly affect the risk of prostate cancer (6).
There is evidence to support the hypothesis of hormonal etiology of prostate cancer involving androgen action (7,8). Androgen is required for differentiation and growth of the prostate in utero and at puberty (8). Testosterone is synthesized from cholesterol by a series of reactions involving cytochrome P450 enzymes. Testosterone is converted to dihydrotestosterone (DHT), a more potent androgen, by 5α-reductase type 2 with NADPH as the cofactor in many androgen-dependent target tissues. DHT binds to the androgen receptor (AR), and the DHT-AR complex transactivates a number of genes with AR-responsive elements (9). These events ultimately result in cell division in the prostate.
In the steroid metabolic pathway, the product of the cytochrome P450c17α (CYP17) gene catalyzes the rate-limiting step in androgen biosynthesis. The CYP17 gene maps to chromosome 10q24.3 (10) and encodes the cytochrome P450c17α enzyme, which catalyzes steroid 17α-hydroxylase/17,20 lyase activities. A T→C transition polymorphism creates an additional Sp1-type (CCACC box) promoter site, suggesting that this variant (A2 allele) may have an increased rate of transcription (11).
The AR gene, located on chromosome Xq11-12, encodes the androgen receptor, a ligand-activated transcription factor that mediates the androgenic response and stimulates the expression of genes associated with the differentiated phenotype of the prostate, such as prostate-specific antigen (12). The large exon 1, which encodes the transactivation domain, contains a highly polymorphic CAG microsatellite, a StuI single nucleotide polymorphism at codon 211 (G1733A), and a less polymorphic GGN repeat (13,14).
In transfection experiments, the CAG repeat length is inversely and linearly correlated with transcriptional transactivation activity, suggesting that a long polyglutamine tract interferes with AR function (15). It has been shown that shorter AR gene repeat lengths are related to a higher risk of prostate cancer development or progression (16,17). The effect of the length of the second AR gene microsatellite, consisting of GGN repeats encoding polyglycine, is unknown. However, Platz et al. (18) speculated that 23 repeats might represent the coding sequence for optimal AR protein conformation and activity. A deviation in repeat number in either direction away from the mean may represent slightly diminished AR activity. The StuI polymorphism allows the identification of two alleles (S1/S2), for which an association with increased risk of prostate cancer has been investigated (7).
The cloning and characterization of the 5α-reductase type 2 gene (SRD5A2, mapped at 2p23) has allowed the description of extensive genetic polymorphisms in this gene (19). A valine to leucine polymorphism at codon 89 (Val89Leu) has been reported, with the Leu allele associated with lower steroid 5α-reductase activity. Since the Leu allele would lead to lower levels of intraprostatic DHT, a protective effect towards prostate cancer has been suggested (20). Another variant that changes an alanine to a threonine at amino acid 49 (Ala49Thr) was correlated with an increase in 5α-reductase activity, and an association of the Thr allele with prostate cancer risk has been reported (21).
In the present study, we determined the frequencies of the aforementioned polymorphisms among 200 individuals from two cities in the State of São Paulo in order to evaluate the genotypic distribution of these prostate cancer markers.

Material and Methods
Polymorphisms of the AR, SRD5A2 and CYP17 genes were analyzed in DNA samples from 200 blood donors from the Hematology-Hemotherapy Center of the State University of Campinas (N = 118; mean age = 36.67 ± 9.76 years) and the Hematology-Hemotherapy Center of the State University of São José do Rio Preto (N = 82; mean age = 39 ± 9.10 years). The protocol was approved by the Ethics Committee of the Faculdade de Ciências Médicas, Universidade Estadual de Campinas. All individuals were included in this study after giving informed written consent.
Information about the ancestors of all individuals investigated was obtained in order to characterize the ethnic composition of both samples. Among the blood donors from Campinas, 61% (N = 72) considered themselves to be white and 39% (N = 46) to be black or reported parents or grandparents of black ancestry. In the São José do Rio Preto sample, 68.3% (N = 56) considered themselves to be white and 31.7% (N = 26) to be black or reported black ancestry. Neither the São José do Rio Preto individuals nor those from Campinas reported Amerindian or Asian ancestry in the last two generations.
Genomic DNA from leukocytes was used as a template in the polymerase chain reaction (PCR). The fragment of the AR gene comprising the polymorphic CAG repeat was amplified with sense and antisense primers according to Yee et al. (22). The reaction was performed in 25 µl total volume containing 50 ng DNA, 20 mM Tris-HCl, pH 8.4, 50 mM KCl, 1.5 mM MgCl2, 10 mM dNTP mixture (0.2 mM dATP + 0.2 mM dTTP + 0.2 mM dGTP + 0.1 mM dCTP), 20 pmol of each primer, 2.5 units Taq DNA polymerase (Gibco-BRL, Gaithersburg, MD, USA), and 1 µCi [α-33P]-dCTP (10 mCi/ml). The amplification consisted of an initial 5-min denaturation step followed by 10 cycles of 1 min at 94°C, 1 min at 55°C and 1 min at 72°C, and by 25 cycles of 1 min at 94°C, 1 min at 60°C and 1 min at 72°C. The PCR product was electrophoresed on a 6% denaturing polyacrylamide DNA sequencing gel, and the length was determined by comparison with standards derived from previously sequenced CAG repeats. Gels were transferred to filter paper (Whatman 3MM), dried and exposed to X-ray film for 16-20 h.
GGN amplification was performed with the sense and antisense primers described by Sleddens et al. (14). Reaction conditions were as described above, with the following modifications: 10 mM dNTP mixture (0.2 mM dATP + 0.2 mM dTTP + 0.1 mM dCTP + 1 mM total with 3:1 mixture of 7-deaza-dGTP:dGTP) and 5% DMSO. Thermocycling consisted of 12 min at 95°C, 21 cycles of 1 min at 96°C, 1 min at 64°C and 1 min at 72°C, followed by a touchdown reduction of annealing temperature of 0.5°C per cycle until 54°C and 25 cycles with annealing temperature at 54°C. After electrophoresis and exposure to X-ray film, the length was determined by comparison with previously sequenced GGN repeats.
PCR-restriction fragment length polymorphism (PCR-RFLP) analysis was used to identify the S1/S2 alleles of the StuI polymorphism. A 416-bp DNA fragment was amplified with the primers reported by Lu and Danielsen (23). The PCR consisted of an initial 5-min denaturation step followed by 35 cycles of 1 min at 94°C, 1 min at 62°C and 1 min at 72°C. The presence of an A at nucleotide 1733 (S2 allele) results in a StuI site, which is abolished on the S1 allele. The PCR fragments were digested with StuI for 2 h at 37°C and the digestion products were separated by 1.5% agarose gel electrophoresis. Specific digestion products consisted of 416 and 329/87 bp for the S1 and S2 genotypes, respectively.
The CYP17 A1/A2 alleles were detected by RFLP analysis of 629-bp DNA fragments amplified by PCR (35 cycles, annealing temperature of 58°C) with the primers described by Lunn et al. (24). The A2 allele can be identified by the presence of a second MspA1 site created by the T→C substitution. The PCR fragments were digested with MspA1 for 2 h at 37°C and the digestion products were separated by 3% agarose 1000 (Gibco-BRL) gel electrophoresis. Specific digestion products consisted of a 577-bp fragment for the A1/A1 genotype, 577-, 305- and 272-bp fragments for the A1/A2 genotype, and 305- and 272-bp fragments for the A2/A2 genotype. A 52-bp fragment was present in all samples due to an invariant MspA1 site that served as an internal control for complete digestion.
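The band patterns listed above map directly onto genotypes, so genotype calling can be sketched as a lookup. This is only an illustration: the function name `call_cyp17_genotype` and the input format (a list of band sizes in bp, read exactly) are assumptions, not part of the original protocol.

```python
# Expected MspA1 band patterns (bp) for the 629-bp CYP17 amplicon, as listed in the text.
CONTROL_BAND = 52  # produced by the invariant MspA1 site; expected in every digested sample

EXPECTED_BANDS = {
    "A1/A1": {577},
    "A1/A2": {577, 305, 272},
    "A2/A2": {305, 272},
}


def call_cyp17_genotype(observed_bands):
    """Return the genotype matching the observed band sizes, or None.

    observed_bands: iterable of band sizes in bp (hypothetical input format;
    sizes are assumed to be read exactly from the gel).
    """
    bands = set(observed_bands)
    if CONTROL_BAND not in bands:
        return None  # control band missing: digestion cannot be verified
    bands.discard(CONTROL_BAND)
    for genotype, pattern in EXPECTED_BANDS.items():
        if bands == pattern:
            return genotype
    return None  # unexpected pattern


print(call_cyp17_genotype([577, 305, 272, 52]))
```

Requiring the 52-bp band before making a call mirrors its role as an internal digestion control; an undigested 629-bp product is therefore never scored as a genotype.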
PCR-RFLP was also used to identify the Val89Leu alleles of the SRD5A2 gene. A pair of primers was used to produce a 369-bp DNA fragment (25) after 35 cycles of PCR with an annealing temperature of 65°C. The presence of a G at nucleotide 296 (Val allele) creates an RsaI site, which is absent in the Leu allele. The PCR fragments were digested with RsaI for 2 h at 37°C and the digestion products were separated by 3% agarose 1000 (Gibco-BRL) gel electrophoresis. Specific digestion products consisted of 169-, 106-, 73- and 21-bp fragments for the Val/Val genotype, 169-, 106-, 94-, 73- and 21-bp fragments for the Val/Leu genotype, and 169-, 106- and 94-bp fragments for the Leu/Leu genotype.
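The fragment sizes reported for the RsaI digest can be checked for internal consistency: the fragments from each allele should add up to the 369-bp amplicon, and a heterozygote should show the union of both allele patterns. A minimal sketch follows; the helper name `expected_bands` is hypothetical.

```python
AMPLICON_BP = 369

# RsaI fragments (bp) produced from each allele, taken from the text.
ALLELE_FRAGMENTS = {
    "Val": [169, 106, 73, 21],  # polymorphic RsaI site present
    "Leu": [169, 106, 94],      # site absent, so 73 + 21 remain joined as 94
}


def expected_bands(allele_a, allele_b):
    """Band sizes expected on the gel for a diploid genotype, largest first."""
    bands = set(ALLELE_FRAGMENTS[allele_a]) | set(ALLELE_FRAGMENTS[allele_b])
    return sorted(bands, reverse=True)


# Each allele's fragments must add up to the full amplicon length.
for allele, fragments in ALLELE_FRAGMENTS.items():
    assert sum(fragments) == AMPLICON_BP, allele

for genotype in [("Val", "Val"), ("Val", "Leu"), ("Leu", "Leu")]:
    print("/".join(genotype), expected_bands(*genotype))
```

The 94-bp band is the diagnostic fragment for the Leu allele, while the 73- and 21-bp bands are diagnostic for Val.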
The SRD5A2 Ala49Thr alleles were analyzed by PCR-allele-specific oligonucleotide hybridization. A 309-bp DNA fragment was amplified with a pair of primers described by Makridakis et al. (21) after 35 cycles of PCR with an annealing temperature of 55°C. The selected probes (ALA49-CTACCCGCCTGCCAGCCC and THR49-CTACCCGCCTACCAGCCC) were labeled with [γ-32P]-dATP and used to identify the Ala49Thr alleles. The THR49 oligonucleotide was used as a sense primer to produce PCR fragments containing the G→A substitution in order to provide a positive control for this rare allele.
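The two probes given above should differ only at the base carrying the G→A substitution. A short comparison makes this explicit; the function name `mismatches` is an illustrative choice, and positions are 0-based within the probe rather than gene coordinates.

```python
ALA49 = "CTACCCGCCTGCCAGCCC"
THR49 = "CTACCCGCCTACCAGCCC"


def mismatches(probe_a, probe_b):
    """Return (0-based position, base_a, base_b) for each difference."""
    if len(probe_a) != len(probe_b):
        raise ValueError("probes must have equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(probe_a, probe_b)) if a != b]


diffs = mismatches(ALA49, THR49)
print(diffs)  # a single G/A difference is expected
```

The triplet starting at the changed base reads GCC in the ALA49 probe and ACC in the THR49 probe, which corresponds to alanine and threonine in the standard genetic code if that triplet is in frame (an assumption made here, not stated in the text).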

Statistical analysis
The allelic frequencies of polymorphisms involving the autosomal genes were calculated by gene counting. The goodness of fit to Hardy-Weinberg equilibrium as well as statistical analysis of differences between population samples were performed using the χ² test. For the AR gene, the distribution of the number of CAG and GGN repeats in both samples was compared using the independent t-test. The haplotype frequencies and the likelihood ratio test of linkage disequilibrium for genotypic data of unknown gametic phase were estimated using the Arlequin program (26).
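Gene counting and the Hardy-Weinberg goodness-of-fit test for a biallelic autosomal marker can be sketched as follows. The genotype counts are hypothetical and are not data from this study; the use of 1 degree of freedom (three genotype classes minus one estimated allele frequency minus one) is the usual convention and is assumed here.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Gene counting plus a 1-df chi-square goodness-of-fit test to HWE."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A by gene counting
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square statistic with df = 1.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return p, chi2, p_value


# Hypothetical genotype counts (AA, AB, BB); NOT data from the study.
freq_a, chi2, p_value = hwe_chi_square(50, 40, 10)
print(round(freq_a, 3), round(chi2, 3), round(p_value, 3))
```

A P value above 0.05 in this test is what the text refers to as a genotype distribution in Hardy-Weinberg equilibrium.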
Preliminary comparative analysis showed no difference concerning ancestral origin in either sample.Since the Brazilian population is known to be highly miscegenated, further comparisons were not adjusted by ethnic background.

Results
The number of AR gene CAG repeats varied from 11 to 27 among blood donors from Campinas and from 13 to 28 in the individuals from São José do Rio Preto (Figure 1), with average numbers of 20.85 (SD ± 2.61) and 20.35 (SD ± 2.85), respectively. The distribution of CAG repeats was not significantly different (t = 1.28; P = 0.20). The GGN repeat length ranged from 16 to 27 (mean ± SD, 22.55 ± 1.33) in the Campinas sample and from 10 to 27 (22.15 ± 2.47) among blood donors from São José do Rio Preto (Figure 2), with no difference between these two samples (t = 1.48; P = 0.14). The results concerning StuI polymorphism analyses are shown in Table 1. No significant differences in S1/S2 allelic frequencies were found between samples (χ² = 3.10; P = 0.08). Since there were no differences for the AR gene markers between the two samples, these data were pooled for haplotype analysis. Seventy-six haplotypes were estimated and the most frequent ones (≥4%) are shown in Table 2. When a possible association (linkage disequilibrium) was tested, the results were negative for CAG and GGN repeats (P = 0.14) as well as for CAG repeats and StuI polymorphism (P = 0.27). Linkage disequilibrium was found between GGN repeats and StuI polymorphism (P = 0.01).
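The reported t values can be recomputed from the summary statistics alone. The sketch below assumes the pooled-variance (equal variances) form of the independent t-test, which the text does not specify, and uses the sample sizes given in Material and Methods (N = 118 and N = 82).

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t (equal variances assumed) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Campinas versus São José do Rio Preto, values as reported in the text.
t_cag, df = pooled_t(20.85, 2.61, 118, 20.35, 2.85, 82)
t_ggn, _ = pooled_t(22.55, 1.33, 118, 22.15, 2.47, 82)
print(f"CAG: t = {t_cag:.2f}, GGN: t = {t_ggn:.2f}, df = {df}")
```

Under this assumption the statistics round to 1.28 for CAG and 1.48 for GGN with 198 degrees of freedom, in agreement with the values reported above.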
With respect to the SRD5A2 gene, the G→A mutation that leads to the Ala49Thr substitution was not detected in the present study. The genotype frequencies of SRD5A2 Val89Leu and CYP17 A1/A2 polymorphisms were estimated in both population samples (Table 1). The distributions of SRD5A2 and CYP17 genotypes were found to be in Hardy-Weinberg equilibrium for each sample. However, the SRD5A2 Val89Leu genotypic frequencies were different between samples (χ² = 8.41; P = 0.015), with a lower than expected frequency of the Val/Leu genotype in the Campinas sample. Although no differences were detected when the allelic frequencies were compared (χ² = 1.84; P = 0.17), the Val allele was more frequent in the Campinas sample (0.68) than in the São José do Rio Preto sample (0.60).
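The P values quoted for the Val89Leu comparisons follow from the χ² statistics once the degrees of freedom are fixed. The sketch below assumes df = 2 for the genotype comparison (three genotypes in two samples) and df = 1 for the allele comparison (two alleles in two samples); these values are inferred, not stated in the text. Closed forms exist for both cases, so no statistics library is needed.

```python
import math


def chi2_p_value(chi2, df):
    """Upper-tail probability of a chi-square statistic for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2))
    if df == 2:
        return math.exp(-chi2 / 2)
    raise ValueError("only df = 1 or 2 are handled in this sketch")


p_genotypes = chi2_p_value(8.41, 2)  # Val89Leu genotype frequencies
p_alleles = chi2_p_value(1.84, 1)    # Val89Leu allele frequencies
print(round(p_genotypes, 3), round(p_alleles, 2))
```

With these assumed degrees of freedom the results round to 0.015 and 0.17, matching the reported values.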

Discussion
The "ethnic/racial" classification has been used by many investigators to denote origin by birth or descent rather than nationality. This is a difficult task, since the Brazilian population in general, and especially that of the southeastern region, has a high degree of admixture, even in the so-called "white" individuals, for whom a significant contribution of Amerindian and African matrilineages has been demonstrated (27).
According to Templeton (28), human races do not exist under the traditional concept of a subspecies as a geographically circumscribed population showing sharp genetic differentiation. However, many traits and their underlying polymorphic genes show independent patterns of geographical variation. As a result, some combination of characters will distinguish virtually each population from all others. Thus, it is important to define the polymorphic spectrum of genes that may be involved in cancer in each population in order to evaluate its usefulness in the prediction of risk.
The six polymorphic markers evaluated in the present study showed a closely similar distribution in these two samples of the Brazilian population. Hence, for further comparisons these data were pooled.
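The pooled mean repeat numbers follow from the two sample means reported in the Results, weighted by sample size (N = 118 for Campinas and N = 82 for São José do Rio Preto). A minimal sketch, with `pooled_mean` as an illustrative helper name:

```python
def pooled_mean(groups):
    """Sample-size-weighted mean of (mean, n) pairs."""
    total_n = sum(n for _, n in groups)
    return sum(mean * n for mean, n in groups) / total_n


# (mean, N) for Campinas and São José do Rio Preto, as reported in the Results.
cag = pooled_mean([(20.85, 118), (20.35, 82)])
ggn = pooled_mean([(22.55, 118), (22.15, 82)])
print(f"CAG {cag:.3f}, GGN {ggn:.3f}")
```

The weighted means come to roughly 20.6 CAG repeats and 22.4 GGN repeats for the pooled sample of 200 individuals.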
With respect to AR markers, the mean number of CAG repeats (mean = 20.6) observed in the present study was the lowest among the populations compared, with a narrow range of distribution (11 to 28 repeats), while the highest value (mean = 23) was reported in a Chinese population sample (Table 3) (29-33). The mean GGN number of 22.4 repeats in the Brazilian population was also lower than those found in North American and Chinese populations (Table 3) (18,32). The allelic frequencies of StuI polymorphism (S1 = 0.2 and S2 = 0.8) were close to those reported by Lu and Danielsen (23) (S1 = 0.13 and S2 = 0.87) for white North Americans.
It has been suggested that part of the large "ethnic/racial" difference in prostate cancer risk may be explained by the observed variations in the CAG repeat length between populations. Previous studies have shown that the CAG repeat length is shortest in African Americans, intermediate in whites and longest in Asians (13,34), a fact probably related to higher, intermediate and lower prostate cancer risks. Indeed, 55% of African North Americans are reported to have a CAG repeat length shorter than 20 (13,34) versus 22% of the white men from two North American studies (16,17) and only 10% of Chinese men (32). In the present study, 30.5% of the individuals had fewer than 20 CAG repeats, suggesting an intermediate-to-high risk condition in our highly miscegenated population.
According to Stanford et al. (17), men with a CAG repeat number lower than 22 have a 23% increase in risk of prostate cancer when compared to men with ≥22 repeats. Findings related to GGN number revealed that men who had ≤16 repeats experienced a 60% higher risk than men with longer repeat lengths (17). In the present study, 64.5% of the individuals had fewer than 22 CAG repeats, but only 1.5% had ≤16 GGN repeats. When both repeat lengths are considered jointly, 63.5% of the individuals in the Brazilian sample showed the combined <22 CAG >16 GGN repeat numbers and 35% had ≥22 CAG >16 GGN repeats. Only 1 and 0.5% of the individuals exhibited the <22 CAG ≤16 GGN and ≥22 CAG ≤16 GGN lengths, respectively (Table 4). This distribution is strikingly different from that for the North American population described by Stanford et al. (17), where the same stratification was done (χ² = 125.72; P < 0.001). In addition, when the present data were classified according to the categories proposed by Platz et al. (18) or by Hsing et al. (32) (Table 4), the combined repeat lengths showed remarkably significant differences in their distributions among these populations (χ² = 169.49; P < 0.001 and χ² = 72.44; P < 0.001, respectively). These statistical differences are obviously a consequence of the different mean CAG and GGN repeat values in each population and also of the different criteria used for the classification of combined CAG and GGN repeats, which are based either on the mean (18) or on the median lengths within the populations (17,32). In our population, median lengths (CAG = 21 repeats and GGN = 23 repeats) would allow us to classify our data according to Platz et al. (18). In this situation, the differences between these two populations can be shown by the fact that only 8.9% of the North Americans (Table 4) (18) had fewer than 23 GGN repeats versus 45.5% of the Brazilian individuals. These data may be of concern, since 23 GGN repeats are considered to represent the coding sequence for optimal AR protein activity and shorter GGN repeat lengths appear to be associated with a moderate increase in the risk of prostate cancer.
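The joint stratification by CAG and GGN repeat number used for the comparison with Stanford et al. (17) can be sketched as a simple classification with cut-offs of 22 CAG repeats and 16 GGN repeats. The function names and the example repeat counts are hypothetical; only the cut-offs come from the text.

```python
from collections import Counter


def joint_category(cag, ggn, cag_cut=22, ggn_cut=16):
    """Label one individual by CAG (< cut vs >= cut) and GGN (<= cut vs > cut)."""
    cag_label = f"<{cag_cut} CAG" if cag < cag_cut else f">={cag_cut} CAG"
    ggn_label = f"<={ggn_cut} GGN" if ggn <= ggn_cut else f">{ggn_cut} GGN"
    return f"{cag_label}, {ggn_label}"


def category_percentages(individuals):
    """Percentage of individuals in each joint CAG/GGN category."""
    counts = Counter(joint_category(cag, ggn) for cag, ggn in individuals)
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


# Hypothetical (CAG, GGN) repeat numbers; NOT data from the study.
sample = [(20, 23), (21, 22), (24, 23), (19, 16)]
print(category_percentages(sample))
```

Changing `cag_cut` and `ggn_cut` gives the alternative classifications based on other mean or median repeat lengths, which is why the category frequencies depend so strongly on the criteria chosen.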
Linkage disequilibrium between CAG and GGN repeats has been reported by Platz et al. (18), but has not been detected by other authors (34,35) in male base populations or in the present data. The relevance of linkage disequilibrium between GGN repeats and StuI polymorphism observed in the present study has to be further evaluated in prostate cancer patients.
Concerning the SRD5A2 Val89Leu polymorphism, the observed allelic frequencies do not differ from North American data (Table 5) (20,24,36,37). The absence of the Thr49 allele of the same gene may be attributed to its low frequency in our population. Indeed, allelic frequencies lower than 2.5% have been previously reported for this allele among the so-called African and Latino North Americans (21). It is noteworthy that a frequency of 3.5% for the Thr49 allele has been reported only in a sample of white North American prostate cancer patients (37), suggesting that this allele may be of relevance for this disease.
The CYP17 A1/A2 allelic frequencies (Table 5) are similar to those described for a mixed North American population (24) and do not differ from a European sample (38). On the other hand, significant differences are observed concerning the reported frequencies from Swedish (χ² = 7.73; P = 0.02) and Japanese populations (χ² = 18.93; P = 0.0001).
To our knowledge, this is the first report concerning the frequencies of these six polymorphic markers in South America. The allelic frequencies of the autosomal markers are closely similar to those described in most North American and European populations. In relation to these polymorphisms, further studies should be conducted on prostate cancer patients to verify whether the alleles that are considered of relevance in most populations can also be taken into account in Brazilian urban populations, where significant contributions from European and African gene pools can be found. In contrast, remarkable differences from reported literature data were observed for the CAG and GGN repeat number combinations. Because of the importance of AR in prostate cancer etiology, further investigations are needed to evaluate the combined effect of CAG and GGN repeats and their use as molecular markers for the identification of men at higher risk of developing prostate cancer in the Brazilian population.

Figure 1. Distribution of CAG repeats among individuals from Campinas and São José do Rio Preto.

Figure 2. Distribution of GGN repeats among individuals from Campinas and São José do Rio Preto.

Table 2. Estimated haplotype frequencies for the AR gene.

Table 3. CAG and GGN repeats in different populations.

Table 4. Prevalent combined numbers of CAG and GGN repeats of the AR gene in different populations.

Table 5. Distribution of SRD5A2 and CYP17 genotypes in different populations.