Gene expression profile analysis of human intervertebral disc degeneration

In this study, we used microarray analysis to investigate the biogenesis and progression of intervertebral disc degeneration. The gene expression profiles of 37 disc tissue samples obtained from patients with herniated discs and degenerative disc disease collected by the National Cancer Institute Cooperative Tissue Network were analyzed. Differentially expressed genes between more and less degenerated discs were identified by significant analysis of microarray. A total of 555 genes were significantly overexpressed in more degenerated discs with a false discovery rate of < 3%. Functional annotation showed that these genes were significantly associated with membrane-bound vesicles, calcium ion binding and extracellular matrix. Protein-protein interaction analysis showed that these genes, including previously reported genes such as fibronectin, COL2A1 and β-catenin, may play key roles in disc degeneration. Unsupervised clustering indicated that the widely used morphology-based Thompson grading system was only marginally associated with the molecular classification of intervertebral disc degeneration. These findings indicate that detailed, systematic gene analysis may be a useful way of studying the biology of intervertebral disc degeneration.


Introduction
The intervertebral disc (IVD) is a cartilaginous structure that shows degenerative and age-related changes earlier than other connective tissues in the body (Urban and Roberts, 2003). Disc degeneration occurs when its cells die or become dysfunctional, especially in an acidic environment. During degeneration, the IVD becomes dehydrated and vascularized, and there is an ingrowth of nerves (Hughes et al., 2012). As a multi-factorial process, disc degeneration is influenced by factors such as genetic predisposition, lifestyle conditions, including obesity, occupation, smoking and alcohol consumption (Samartzis et al., 2012), aging phenomena (Gruber and Hanley, 2007) and other health factors, such as diabetes. Disc degeneration is important clinically because of a strong association between disc degeneration and back pain (Luoma et al., 2000), which is a major public health problem in modern industrialized societies. In addition, the altered physiology of the IVD during degeneration is associated with many other clinical symptoms or conditions, such as lower limb pain, paraesthesia, spinal stenosis and disc herniation (Hughes et al., 2012).
Previous studies have identified some molecular markers associated with disc degeneration. For example, asporin was found to be expressed at higher levels in degenerated human IVD (Gruber et al., 2009). The mRNA expression levels of ADAMTS-7 and ADAMTS-12 were significantly higher in endplate cells of degenerative discs compared with those in non-degenerative discs . In addition, genetic polymorphisms in collagen I, collagen IX, collagen XI, aggrecan, extracellular matrix-degrading enzymes, inflammatory cytokines, such as IL (interleukin)-1, IL-6 and TNFa (tumor necrosis factors a), Fas/FasL and vitamin D receptors have been associated with the development of intervertebral disc degeneration (IVDD) (Kalb et al., 2012).
In this study, we compared the gene expression profile data of IVDD patients with that of healthy controls. The gene profiles were downloaded from the gene expression omnibus (GEO) database and the intervertebral disc data were classified into two groups according to the Thompson grading system (Thompson et al., 1990).
2009) and GSE23130 (Gruber et al., 2012)). The signal intensities were re-calculated with the robust multi-array average (RMA) (Irizarry et al., 2003) using the Entrez Gene-based center custom Chip Description File (CDF) developed by University of Michigan (Dai et al., 2005). The single sample classified as Thompson grade I was considered as healthy (Gruber et al., 2009) and ignored before merging the GSE15227 and GSE23130 data. ComBat (Johnson et al., 2007) was applied to the merged dataset to remove potential unwanted batch effects. The samples were divided into two groups based on the Thompson grading system. Grade IV and V samples were considered as more degenerated (Gruber et al., 2009). Significant analysis of microarray (SAM) (Tusher et al., 2001) was applied to identify differentially expressed genes using a delta of 1. Functional annotation was done on DAVID (Huang da et al., 2009) using significantly differentially expressed genes.

Protein-protein interaction analysis
Differentially expressed genes were submitted to STRING 9.0 (Szklarczyk et al., 2011) to detect possible protein-protein interactions. EntrezGene IDs were translated into official gene symbols by DAVID before submitting. A threshold score of 0.4 was applied as default. Only those interactions that were proven experimentally were considered. A simple Perl script was used to exclude genes/proteins not in the query gene list.

Cluster analysis
The data for each gene was median-centered before clustering. Centered correlation was applied as a similarity metric and average linkage as a clustering method. The analysis was done in Cluster 3.0 and a heatmap was generated by TreeView. A one-sided Fisher's exact test was used to assess the consistence of the molecular and Thompson grade classifications.

Differentially expressed genes in IVD degeneration
Eleven disc samples (eight Grade IV and three Grade V) were classified as the more degenerated group while the remaining 26 samples (10 Grade II and 16 Grade III) were classified as the less degenerated group. One thousand sample permutations were used to estimate the false discovery rate (FDR). When delta was set to 1,555 genes were overexpressed in the more degenerated disc samples (Table 1) and 98 genes were underexpressed in the more degenerated disc samples (Table 2). Both groups had an FDR < 3%. The overexpressed gene list was referred to as IVDD_UP and the underexpressed gene list as IVDD_DN. Several of the IVDD_UP genes have been reported before. For example, asporin, which is present in the cartilage extracellular matrix (ECM) (Henry et al., 2001) and may be genetically as-sociated with osteoarthritis (Loughlin, 2005), showed higher expression levels in Thompson Grade IV human discs compared to lower grade discs (Gruber et al., 2009). A highly conserved secreted serine protease, HTRA1, which degrades numerous extracellular matrix proteins, was also found in IVDD_UP. HTRA1 mRNA and protein levels are significantly elevated in degenerated disc tissue (Tiaden et al., 2012). The gene expression of cathepsin K, a cysteine protease that cleaves the triple helical domains of collagen types I and II, was significantly greater in more degenerated discs (grades III and IV) compared to healthier discs (grades I and II) (Gruber et al., 2011).

Functional annotation of differentially expressed genes
Differentially expressed genes were submitted to DAVID (Huang da et al., 2009) for functional annotation. Databases for disease information, functional categories, gene ontology, curated pathways and protein domains were included to provide a comprehensive knowledge source. For the IVDD_UP genes, 195 functional annotation clusters were identified. The top three clusters ranked by the enrichment score were mainly associated with membranebounded vesicles, calcium ion binding and extracellular matrix, with enrichment scores of 4.7, 3.61 and 3.25, respectively. Noticeably, these enriched genes were also related to the biological process of skeletal system development, including cartilage development. Among the IVDD_DN genes, there were only 19 clusters with the highest enrichment score being only 0.87; this score was for a cluster involving mainly zinc fingers and the Krueppelassociated box.

Protein-protein interactions among differentially expressed genes
STRING showed that 159 experimentally proven protein-protein interactions were formed by 175 genes in IVDD_UP (Figure 1). Fibronectin 1, COL2A1 (collagen, type II, a1) and b-catenin were the three genes/proteins that had the most interacting partners (Figure 2). Of these 175 genes, 107 had only one interacting partner in IVDD_UP and thus resembled the phenomenon of scale-free biological networks (Barabasi and Albert, 1999). Surprisingly, no interactions were observed amongst IVDD_DN genes.

Cluster analysis of IVD tissues
Unsupervised hierarchical clustering was done on 22 disc tissues of GSE23130. The experimental results obtained using all 18,818 genes or only 1,984 genes with a minimum sample standard deviation of 1 were highly similar ( Figure 3). A one-tailed Fisher's exact test showed a marginally significant association between the Thompsonbased classification and the unsupervised molecular classification (p = 0.04799 for the 18,818-gene case and 0.09133 for the 1,984-gene case).

Discussion
Our microarray analysis of the gene expression profile of degenerated IVD detected hundreds of differentially expressed genes that may be associated with IVD degeneration. Functional annotation supported the biological relevance of our findings. Protein-protein interaction analysis of IVDD_UP revealed genes that may have key roles in IVD degeneration. Fibronectin 1 (FN1), a molecular marker for fibrosis (Leask and Abraham, 2004), was upregulated in punctured mouse tail discs that showed progressive degenerative changes and were fibrocartilaginous (Yang et al., 2009). Earlier studies reported that the degradation fragments of fibronectin (Fn-f) accumulate in the disc during degeneration (Oegema et al., 2000) and induce IVD degeneration in rabbits (Greg Anderson et al., 2003). Given that HTRA1 can induce fibronectin proteolysis and one of the resultant fragments was found to be increased in HTRA1-treated IVD cell cultures as well as in disc tissue from patients with IVD degeneration, it was hypothesized that HTRA1 promoted IVD degeneration through the proteolytic cleavage of fibronectin and subsequent activation of resident disc cells (Tiaden et al., 2012). The overexpression of these two genes was consistent with this hypothesis. Interestingly, fibronectin mRNA and protein levels were significantly down-regulated following the upregulation of HTRA1 in rhesus monkey choroid-retina endothelial cells (RF/6A) and human umbilical vein endothelial cells (HUVECs) (Jiang et al., 2012), suggesting that some other unknown factors may be involved in the interaction between HTRA1 and fibronectin in IVD tissues. 450 Gene analysis of intervertebral discs The IVDD_UP group of genes contained four collagen genes: COL11A1, COL1A2, COL3A1 and COL2A1. COL2A1 interacted with eight other proteins encoded by genes in IVDD_UP. Mutations in COL2A1 give rise to a spectrum of phenotypes that predominantly affecting cartilage and bone. These chondrodysplasias are typically characterized by a disproportionately short stature, eye abnormalities, cleft palate and hearing loss. Increasing evidence has also implicated COL2A1 mutations in individuals with isolated degenerative joint disease (Kannu et al., 2010). The relationship between COL2A1 gene polymorphisms and knee osteoarthritis were also investigated in Han Chinese women (Xu et al., 2011).
Mutations in the COL11A1 gene (also known as STL2) have been identified in some people with Stickler syndrome (Martin et al., 1999). Other mutations in this gene cause segments of DNA to be skipped when the protein is being made, resulting in an abnormally short proalpha chain. Alterations in COL11A1 impair collagen func-tion and can lead to hearing loss, tearing of the lining of the eye, and bone and joint abnormalities (Rodriguez et al., 2004). COL1A2 is a protein found in most connective tissues. Mutations in this gene are associated with asteogenesis imperfecta, Ehlers-Danlos syndrome, idiopathic osteoporosis and atypical Marfan syndrome (Vasan et al., 1991;Ward et al., 2001;Hartikka et al., 2004). However, the symptoms associated with mutations in this gene tend to be less severe than with mutations in the gene for a1 type I collagen since the a2 form is less abundant (Bou-Gharios et al., 2004). Multiple messages for this gene result from multiple polyadenylation signals, a feature shared with most of the other collagen genes (Zuo et al., 2012).
COL3A1 is a protein that, in humans, is encoded by the COL3A1 gene located on chromosome 2 (Cutting et al., 1990). COL3A1 is a fibrillar collagen found in extensible connective tissues such as skin, lung and the vascular system, frequently in association with type 1 collagen. Mutations in this gene often lead to the exclusion of multi-exons    Superti-Furga et al., 1988). Other studies have shown that (1) IVD cells respond strongly to changes in the osmotic environment by altering their mRNA expression, e.g., human cells cultured over five days showed increased expression of aggrecan and collagen II in the nucleus and annulus cells in the presence of elevated osmolarity (Wuertz et al., 2007), and (2) there is an increase in collagen-II, aggrecan and Sox-9 protein expression in the nucleus pulposus (NP) and anulus fibrosus (AF) regions of discs from running exercised rats compared with non-exercised controls (Brisby et al., 2010). Our result may indicate an as yet unknown role for COL2A1 in degenerated IVD. Proper regulation of Wnt/b-catenin signaling is required for development and organization of the IVD (Kondo et al., 2011). Furthermore, b-catenin was overexpressed in IVD extracted from patients with IVD degeneration and X-ray and micro-CT analyses revealed osteophyte formation and narrowing of the disc space in three-month-old b-catenin conditional activation (cAct) mice (Tang et al., 2012). Our results may provide clues for studying the biological mechanism of b-catenin in human disc tissue.
Unsupervised clustering has provided insights into the molecular heterogeneity of complex diseases such as cancer and is useful in cancer diagnosis and therapy. The application of this analysis in our study showed that the widely used morphology-based Thompson grading system was only marginally associated with the molecular classification of IVDD. This interesting finding indicates that there is scope for improving the clinical assessment of the progress of IVDD.