ANRIL rs1333049 C/G polymorphism and coronary artery disease in a North Indian population - Gender and age specific associations

Abstract Many studies conducted worldwide substantiate a role for genetic polymorphisms in non-coding regions in coronary artery disease (CAD). One such single nucleotide polymorphism (SNP), rs1333049 C/G in the antisense non-coding RNA in the INK4 locus (ANRIL), lies in the vicinity of cell cycle regulating genes and is documented to have a role in CAD risk. In this study we aimed to determine the association of ANRIL rs1333049 C/G with CAD in a North Indian population. Five hundred disease-free controls and 500 CAD patients were genotyped using the allele-specific ARMS-PCR method. High risk association of rs1333049 was seen in both the heterozygous and mutant genotypes (OR=2.883, 95% CI=1.475-5.638, p=0.002 and OR=6.717, 95% CI=3.444-13.102, p < 0.001, respectively). Gender-stratified analysis revealed risk association in both the heterozygous and mutant genotypes in males, whereas in females risk association was documented in the mutant genotype. Risk association was also seen in the heterozygous and mutant genotypes in subjects above 40 years of age and in subjects with obesity, a sedentary lifestyle, a positive family history or smoking, and in the mutant GG genotype in subjects with diabetes. The study revealed high risk association of ANRIL rs1333049 with CAD and other risk factors.


Introduction
Coronary artery disease (CAD) has become an epidemic worldwide and a major barrier to sustainable human development. Around 16.5 million people above 20 years of age in the United States of America (U.S.) are reported to suffer from CAD, and the prevalence increases with age in both genders (Sanchis-Gomar et al., 2016). The incidence in developing countries like India is also alarming, and studies have reported a rise in CAD prevalence over the past half century. A number of established modifiable and non-modifiable factors, such as age, gender, genetics, smoking, dyslipidemia, hypertension, diabetes, obesity, high-fat diet, physical inactivity, drug abuse, alcohol consumption and mental stress, contribute significant risk towards the disease.
An individual's risk of developing CAD is influenced by the interplay between genetic and lifestyle factors, reflecting the multifactorial nature of CAD. A genetic component in CAD is supported by the increased risk in first degree relatives of affected individuals, the high lifetime risk in offspring of affected parents and the higher concordance in monozygotic than in dizygotic twins. The first Genome Wide Association (GWA) studies for CAD were published in 2007 and, since then, a number of genetic variants at various chromosomal loci specific to CAD in various populations have been identified (Scheffold et al., 2011).
GWA studies document a locus on chromosome 9p21 linked to CAD (Jarinova et al., 2009; Cunnington et al., 2010; Ahmed et al., 2013). Though this 58 kb locus lacks genes associated with atherosclerosis, an antisense non-coding RNA in the INK4 locus (ANRIL) gene lies in the vicinity of cell cycle regulating genes in this region. It is reported to be in strong linkage disequilibrium with cell cycle regulating genes such as cyclin-dependent kinase inhibitors 2A and 2B (CDKN2A and CDKN2B) (Cunnington et al., 2010). CDKN2A is a tumor suppressor gene that encodes two proteins, p14ARF and p16. p16 controls the G1 to S transition in the cell cycle and p14ARF stimulates cell cycle arrest in G2 phase, subsequently leading to cell death. CDKN2B lies adjacent to CDKN2A and encodes proteins that inhibit cell cycle G1 progression (Cunnington et al., 2010).
CDKN2B anti-sense RNA (CDKN2B-AS1) spans about 126.3 kb, overlaps with CDKN2B (p15) at the 5' end and comprises 20 exons that are prone to alternative splicing (Jarinova et al., 2009). It is reported to be linked to CAD risk (Matsuoka et al., 2015; Dehghan et al., 2016), hypertension (Bayoglu et al., 2016) and stroke (Bai et al., 2014). CDKN2B-AS, CDKN2B-AS1 and INK4 are used as synonyms for ANRIL. The ANRIL locus is reported to alter the expression of neighbouring genes, apparently acting through chromatin remodeling, DNA methylation, gene silencing or RNA interference (Jarinova et al., 2009).
The SNP rs1333049 C/G is positioned in the 3'UTR (untranslated region) of CDKN2B-AS1 and is considered to have a crucial role in the advancement of cardiovascular and cerebrovascular disease by modifying the dynamics of vascular smooth muscle cell proliferation (Cunnington et al., 2010). The Wellcome Trust Case Control Consortium study documented rs1333049 as displaying a powerful association with CAD (Consortium, 2007). The association of rs1333049 has been studied in CAD (Dechamethakun et al., 2014; Haslacher et al., 2016), atherosclerosis (Bochenek et al., 2013) and Alzheimer's disease (Popov et al., 2010).
The present study was conducted with the aim of determining allelic and genotypic frequencies of ANRIL rs1333049 and risk association with CAD and other selected parameters in a North Indian population.

Study population
One thousand individuals aged 25-70 years of both sexes were enrolled to evaluate the role of LOX1 rs11053646 G/C and rs1050283 C/T polymorphisms in CAD. Five hundred patients belonging to North Indian states (Jammu and Kashmir, Haryana, Chandigarh, Punjab, Himachal Pradesh, New Delhi, Uttaranchal, Uttar Pradesh, Uttarakhand and Rajasthan) who visited the Department of Cardiology at the Postgraduate Institute of Medical Education and Research, Chandigarh, and had documented CAD on coronary angiogram (more than 50% stenosis in at least one epicardial coronary artery) were registered as cases. Subjects with acute/chronic infection, hepatic dysfunction, renal dysfunction, severe heart failure, hypo- or hyperthyroidism, pregnancy and malignancy were excluded. Five hundred healthy individuals satisfying the inclusion criteria (absence of any cardiac disorder, chronic diseases such as diabetes, hypertension, hypo- or hyperthyroidism, tuberculosis, hepatitis, AIDS, malignancy and pregnancy) were enrolled as controls. Subjects with a history of smoking, alcohol consumption and tobacco chewing were also excluded. The majority of the controls were donors at blood donation camps. Written informed consent was given by all participants prior to enrollment. The study was approved by the Institutional Ethics Committee, Panjab University, Chandigarh, India and performed according to the "Ethical Guidelines for Biomedical Research on Human Participants, 2006" as proposed by the Indian Council of Medical Research and Ministry of Health, Govt. of India.

Biometric and biochemical measurements
Anthropometric parameters like height, weight, waist to hip ratio, BMI and blood pressure were noted. Risk factors for CAD like diabetes, hypertension, dyslipidemia, family history, smoking and drinking habits were recorded. Lipid profile, fasting blood glucose, hsCRP, uric acid, Apolipoprotein A1 and Apolipoprotein B were determined by standard biochemical methods.

DNA isolation, SNP selection and genotyping
Five milliliters of venous blood sample was collected in EDTA-coated vials and DNA was isolated by the sodium saline citrate buffer method (Roe et al., 1996). Genotyping of the ANRIL rs1333049 C/G polymorphism was done by allele-specific ARMS-PCR using sequence-specific primers (forward primer for the C allele: TCC TCA TAC TAA CCA TAT GAT CAA CAG TTC; forward primer for the G allele: TCC TCA TAC TAA CCA TAT GAT CAA CAG TTG; internal control primer: GAA GAT CAT ACC CGA AGT AGA GCT GC). A common reverse primer, ATA CCA CAG TGA ACA TAA TTG TGC ATA CAT, was used with all forward primers. The PCR was carried out in a thermal cycler in a total volume of 25 µL containing 10X PCR Buffer, 3 mM MgCl2, 1 mg/mL nuclease-free BSA, 50 pmol each of allele-specific forward primer, reverse primer and internal control primer, 10 mM of each dNTP, 0.125 U Taq polymerase and 2 µL genomic DNA. The PCR program included an initial denaturation step of 5 min at 94°C, followed by 35 cycles of denaturation for 30 s at 94°C, annealing for 30 s at 59°C and elongation for 30 s at 72°C, and a final elongation of 10 min at 72°C.
Separate PCR reactions were performed for the two alleles of the SNP. The DNA fragments obtained were separated on a 3% agarose gel stained with EtBr, followed by visualization with a UV transilluminator. A 280 bp fragment in only one of the two reactions signified the CC or GG genotype, and a 500 bp fragment signified the internal control. If bands were seen for both alleles, it was interpreted as the heterozygous genotype (Figure 1). Correctness of genotyping was checked by re-genotyping 10% of the samples. Results from the repeated samples were 100% consistent with our primary results.
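ARMS-PCR discriminates alleles through the base at the 3' end of the allele-specific primer. As a quick sanity check, the short Python sketch below compares the two forward primers listed above and reports where they differ. The function name `mismatch_positions` is a hypothetical helper introduced only for this illustration and is not part of the study's workflow.

```python
from typing import List


def mismatch_positions(primer_a: str, primer_b: str) -> List[int]:
    """Return 0-based positions at which two equal-length primers differ."""
    a = primer_a.replace(" ", "").upper()
    b = primer_b.replace(" ", "").upper()
    if len(a) != len(b):
        raise ValueError("primers must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Forward primer sequences as given in the text (5' to 3').
FORWARD_C = "TCC TCA TAC TAA CCA TAT GAT CAA CAG TTC"
FORWARD_G = "TCC TCA TAC TAA CCA TAT GAT CAA CAG TTG"

positions = mismatch_positions(FORWARD_C, FORWARD_G)
primer_length = len(FORWARD_C.replace(" ", ""))
print(positions, primer_length)  # [29] 30
```

The only mismatch is the last base of the 30-mer, which is consistent with the two primers being specific for the C and G alleles respectively.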
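The band interpretation described above can be written down as a small decision rule. The sketch below is an illustration only: the function name `call_genotype` and its boolean inputs (presence of the 280 bp allele band and the 500 bp internal control band in each of the two reactions) are assumptions made for this example, and the handling of failed reactions is a choice of the sketch rather than something stated in the text.

```python
from typing import Optional


def call_genotype(c_band: bool, g_band: bool,
                  c_control: bool, g_control: bool) -> Optional[str]:
    """Call the rs1333049 genotype from two allele-specific reactions.

    c_band / g_band: 280 bp product seen in the C- or G-specific reaction.
    c_control / g_control: 500 bp internal control seen in that reaction.
    Returns "CC", "CG", "GG", or None when no call can be made.
    """
    # Assumption of this sketch: a reaction lacking its internal control
    # band is treated as failed, so the sample is not called.
    if not (c_control and g_control):
        return None
    if c_band and g_band:
        return "CG"   # bands for both alleles: heterozygous
    if c_band:
        return "CC"
    if g_band:
        return "GG"
    return None       # controls amplified but neither allele did


print(call_genotype(True, True, True, True))
```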

Statistical analysis
Continuous variables that were not normally distributed were expressed as the mean ± standard deviation (SD). Categorical variables were reported as counts and percentages. The chi-square test was used to compare baseline characteristics between the groups. To investigate the association between the SNP and susceptibility to CAD, multivariate logistic regression was applied, adjusting for age and gender. Furthermore, recessive and dominant models were analysed. Stratified analysis for gender and age was also done to assess the association, which was expressed as an odds ratio (OR) with its 95% confidence interval (CI). A p value < 0.05 was considered statistically significant. All data were analysed using SPSS version 20.0 (SPSS, Inc., Chicago, IL) and Epi Info version 3.4.7 (CDC, Atlanta, GA).
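To make the OR and 95% CI concrete, the following Python sketch computes a crude (unadjusted) odds ratio from a 2x2 table with a Woolf (log-based) confidence interval. It is not the adjusted logistic regression used in the study, so it would not reproduce the reported values. The function name `odds_ratio_ci` and the counts are hypothetical and chosen only for illustration.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Crude odds ratio with a Woolf confidence interval.

    a: cases with the genotype,    b: cases without it,
    c: controls with the genotype, d: controls without it.
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not data from this study.
or_value, ci_low, ci_high = odds_ratio_ci(30, 70, 15, 85)
print(round(or_value, 3), round(ci_low, 3), round(ci_high, 3))
```

A confidence interval that excludes 1 corresponds to significance at roughly the 5% level, which is how the ORs in the Results are read.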

Results
The distribution of allele frequencies of the selected polymorphism followed the Hardy-Weinberg equilibrium. The baseline parameters of patients and controls are listed in Table 1. The results revealed a statistically significant variation between the two groups with respect to age, gender, smoking, drinking, waist to hip ratio, lifestyle, family history, dyslipidemia, diabetes, diet, hypertension, occupation, exercise, fasting blood sugar, uric acid, TC, VLDL, LDL, Apo A1, Apo B, but not with total lipids, triglycerides, BMI, HDL and hsCRP.
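A Hardy-Weinberg check of the kind mentioned above can be sketched as a one degree of freedom chi-square goodness-of-fit test on genotype counts. The function name `hwe_chi_square` and the example counts are hypothetical. The text does not state which test or software option was used, so this is one common approach rather than the authors' procedure.

```python
import math
from typing import Tuple


def hwe_chi_square(n_cc: int, n_cg: int, n_gg: int) -> Tuple[float, float]:
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Returns (chi-square statistic, p value). Both alleles must be present.
    """
    n = n_cc + n_cg + n_gg
    p = (2 * n_cc + n_cg) / (2 * n)   # frequency of the C allele
    q = 1.0 - p                       # frequency of the G allele
    if p == 0.0 or q == 0.0:
        raise ValueError("test is undefined for a monomorphic sample")
    observed = (n_cc, n_cg, n_gg)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, not data from this study.
chi2_stat, p_val = hwe_chi_square(25, 50, 25)
print(chi2_stat, p_val)
```

A p value above 0.05 would be read as no significant deviation from Hardy-Weinberg equilibrium.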
The genotypic frequencies revealed the wild (CC) genotype to have a higher frequency in controls (49.2%) than in CAD patients (33.2%). The heterozygous genotype (CG) was more prevalent among the cases (62.8%) than among the controls (34.0%), with OR=2.883, 95% CI=1.475-5.638 and p=0.002. For the homozygous mutant (GG) genotype, frequencies of 16.8% in controls and 4.0% in cases were recorded, and the genotype was associated with an increased risk at high significance (p < 0.001, OR=6.717 and 95% CI=3.444-13.102) (Table 2). The dominant and recessive models were also analysed to see the association of the polymorphism with CAD. A protective association was seen in the dominant model (OR=0.582, 95% CI=0.402-0.842 and p=0.004), whereas an elevated risk association with CAD was observed in the recessive model (OR=4.609, 95% CI=2.431-8.741 and p < 0.001).
The data were stratified on the basis of age, i.e. below 40 years and above 40 years (Table 3). High risk association in subjects above 40 years was documented for both the heterozygous and the mutant genotypes, with OR=2.647, 95% CI=1.287-5.447 and p=0.008 and OR=5.506, 95% CI=2.688-11.278 and p < 0.001, respectively. The G allele also showed risk association (OR=1.600, 95% CI=1.226-2.088 and p < 0.001). For the age group below 40 years, there was no mutant GG genotype among the cases. Therefore, calculations regarding allelic frequencies were not possible because one value was entirely missing. Nevertheless, a nonsignificant association could be seen in subjects below 40 years of age with the selected polymorphism. Stratification of the data on the basis of gender was also done (
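The text does not define the dominant and recessive models. The sketch below assumes the conventional coding with G as the risk allele (dominant: CG+GG versus CC; recessive: GG versus CC+CG) and shows how genotype counts collapse into the 2x2 tables from which crude ORs follow. The names `collapse` and `crude_or` and all counts are hypothetical, and the crude ORs are not the adjusted estimates reported in Table 2.

```python
from typing import Dict, Tuple


def collapse(counts: Dict[str, int], model: str) -> Tuple[int, int]:
    """Collapse CC/CG/GG counts into (risk group, reference group)."""
    if model == "dominant":    # carriers of at least one G vs CC
        return counts["CG"] + counts["GG"], counts["CC"]
    if model == "recessive":   # GG homozygotes vs all others
        return counts["GG"], counts["CC"] + counts["CG"]
    raise ValueError(f"unknown model: {model}")


def crude_or(cases: Dict[str, int], controls: Dict[str, int],
             model: str) -> float:
    """Unadjusted odds ratio for the chosen genetic model."""
    a, b = collapse(cases, model)      # cases: risk, reference
    c, d = collapse(controls, model)   # controls: risk, reference
    return (a * d) / (b * c)


# Hypothetical genotype counts, not data from this study.
cases = {"CC": 40, "CG": 40, "GG": 20}
controls = {"CC": 50, "CG": 40, "GG": 10}

print(crude_or(cases, controls, "dominant"))   # 1.5
print(crude_or(cases, controls, "recessive"))  # 2.25
```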

Discussion
This study aimed to understand the risk association of ANRIL rs1333049 C/G with CAD in a North Indian population. Sequence-specific ARMS-PCR was used for genotyping, and the results showed a considerable risk association with CAD; the same was observed in the recessive model (Table 1). Moreover, the allelic frequencies also showed a significant association with CAD (p < 0.05). The rs1333049 C/G polymorphism also showed risk towards CAD for age above 40 years, males and females, obesity, sedentary lifestyle, family history, diabetes and smoking.
A few studies from India have tried to explore genetic polymorphisms at this selected locus. A GWA study done with a South Indian population on the 9p21 locus reported two SNPs (rs2383207 and rs10757278) conferring elevated risk of CAD (AshokKumar et al., 2011). Also, the work done by Kumar et al. (2011) on the North Indian population reports three SNPs (rs2383206, rs1333040 and rs10116277) at the 9p21 locus to be associated with CAD risk. The rs10757278 polymorphism at the same locus also correlates with CAD risk, as reported in two studies (Maitra et al., 2008; Bhanushali et al., 2011). Only two studies report data on rs1333049 C/G and CAD risk in West and North Indian populations. Bhanushali et al. (2013) recruited 229 CAD patients and 136 controls from West India and revealed an association with CAD with an OR=2.460 and 95% CI=1.139-5.314, and Kashyap et al. (2018), in their study on a North Indian population, reported risk for both the allelic and genotypic frequencies. The present study also showed an association with CAD, with an OR=6.717, 95% CI=3.444-13.102 and p < 0.001. Thus, the results point towards the fact that both North Indians and West Indians are susceptible to CAD due to this polymorphic change in ANRIL rs1333049.
The multiple conventional risk factors for CAD, such as diabetes, hypertension and dyslipidemia, become additive with increasing age, thereby contributing to atherosclerosis leading to CAD. A positive risk association in subjects above 40 years of age was seen in the study, with the mutant genotype having an OR=5.506 and a highly significant p < 0.001 (Table 3). However, Bhanushali et al. (2013) reported the SNP to be robustly associated with premature or early onset CAD, which is also supported by the results of Meng et al. (2008) and Ellis et al. (2010) and by the meta-analysis of Palomaki et al. (2010). In the present study, however, no GG mutant was found among the CAD patients below 40 years of age in the pre-specified subgroup analysis based on age. The overall frequency of the GG mutant genotype in our selected population is 10.4%, i.e. only 104 individuals have the GG genotype, thereby pointing towards its low prevalence in our selected North Indian population.
The discrepancy in results emphasizes the need to genotype all the risk variants, particularly at this locus, as this will help in delineating the varied risk associations with CAD in different populations. However, the impact of the polymorphism on disease extent and severity is disputed: Ye et al. (2008) and Dandona et al. (2010) state it to be a predictor of severity, whereas Anderson et al. (2008) and Chen et al. (2009) contradict this. Additionally, the present study showed a strong association with family history, which is in accordance with previous work (Preuss et al., 2010; Scheffold et al., 2011). Gender-stratified analysis showed a significant association in both genders, which is in agreement with the results reported by Ahmed et al. (2013) for a Northern Pakistani population.
In summary, we conclude that ANRIL rs1333049 C/G is associated with susceptibility to CAD in the North Indian population, and associations with many risk factors have also been documented. Although the association of the 9p21 locus with CAD risk is very well recognized, its relationship with clinical outcomes remains unclear. The chosen SNP is intronically located but can still affect gene expression. Therefore, future studies with larger sample sizes, multiple SNPs from the locus and linkage studies are needed to authenticate our results, and these might lead to the identification of more SNPs at this particular locus as biomarkers for CAD predisposition.
Analyzing the SNPs that are substantially associated with CAD in the North Indian population will be useful for identifying promising SNP-CAD associations unique to the population. Moreover, CAD poses a threat not only to an individual and his family but also to the community and the nation as a whole, as the most productive years of one's life are spent struggling with the disease. The drastic change in lifestyle and eating habits and the increased tendency to rely on machines and other forms of assistance have substantially decreased physical effort and rendered individuals highly susceptible to CAD. Comprehending the genetic foundation of CAD is much needed; it will help in screening individuals at high risk and will also lay the groundwork for the integration of genetic data and routine clinical practice, which can one day spearhead the arena of "personalized medicine". 6 Kaur et al.