Gene expression profiling of clinical stages II and III breast cancer

Clinical stage (CS) is an established indicator of breast cancer outcome. In the present study, a cDNA microarray platform containing 692 genes was used to identify molecular differences between CSII and CSIII disease. Tumor samples were collected from patients with CSII or CSIII breast cancer, and normal breast tissue was collected from women without invasive cancer. Seventy-eight genes were deregulated in CSIII tumors and 22 in CSII tumors when compared to normal tissue, and 20 of them were differentially expressed in both CSII and CSIII tumors. In addition, 58 genes were specifically altered in CSIII, and the expression of 6 of them was tested by real-time RT-PCR in another cohort of patients with CSII or CSIII breast cancer and in women without cancer. Among these genes, MAX, KRT15 and S100A14, but not APOBEC3G or KRT19, were differentially expressed in both CSIII and CSII tumors as compared to normal tissue. Increased HMOX1 levels were detected only in CSIII tumors and may represent a molecular marker of this stage. A clear difference in gene expression pattern occurs at the normal-to-cancer transition; however, most of the differentially expressed genes are deregulated in tumors of both CS (II and III) compared to normal breast tissue.


Introduction
A current hypothesis of tumorigenesis suggests that cancer cells sequentially acquire hallmarks of malignancy, which reflect genetic alterations that drive the progressive transformation of normal cells into highly malignant derivatives. During breast cancer development this process seems to be clinically and pathologically manifested as a sequence of defined stages according to the extent of disease, which is determined on the basis of information about tumor size, nodal status and distant metastases, i.e., the most powerful indicators of disease prognosis (1). However, individual prognosis varies significantly within each subgroup, indicating the heterogeneity of the current tumor stages. The USA Surveillance, Epidemiology and End Results (SEER) cancer registries show that 5-year breast cancer-specific mortality rates range from 4.7 to 30.2% for clinical stage II (CSII) and from 17.8 to 54.5% for CSIII, suggesting the overlapping nature of these tumors (2).
Recent findings indicate that the potential for distant metastasis and overall survival probability may be attributable to biological characteristics of the primary tumor, reflected by a specific gene expression signature present upon diagnosis (3-11). Hence, new evidence suggests that tumor gene profiling might be viewed as a valuable source of additional information supplementing clinical and pathobiological markers.
Biologically relevant genes and biochemical pathways involved in breast tumor development are not completely understood and their elucidation may allow tailored molecular-based preventive and therapeutic approaches. Thus, in the present study we investigated the gene profile of breast tumor samples obtained from CSII or CSIII patients compared to normal breast tissue.

Patients and Methods
Forty-seven patients were prospectively studied at three reference centers for cancer treatment in São Paulo State, Brazil: Instituto Brasileiro de Controle do Câncer, São Paulo, Hospital do Câncer A.C. Camargo, São Paulo, and Hospital Amaral Carvalho, Jaú, from January 2002 to August 2003. These patients were primarily included in a study to determine the gene expression profile associated with the response to neoadjuvant chemotherapy based on doxorubicin (12). The study was approved by the Ethics Committee of each Institution and written informed consent was obtained from all participants.
Invasive breast cancer was confirmed histopathologically in samples obtained by core or incisional biopsy. Clinical staging included physical examination with inspection and palpation of the skin, mammary gland, and lymph nodes (axillary, supraclavicular and cervical). Imaging studies included a chest X-ray, abdominal ultrasound and bone scintigraphy. According to the classification of the American Joint Committee on Cancer (1997), 5 patients were clinically staged as IIA, 2 patients as IIB, 24 patients as IIIA, and 16 patients as IIIB. Patient characteristics according to CS (II versus III) are presented in Table 1. Most of the patients were post-menopausal (>50%). Previous invasive cancer had not been detected in any patient, except in one who had undergone unilateral mastectomy for invasive ductal carcinoma 37 months before presenting with contralateral breast cancer and had already received adjuvant chemotherapy.
Infiltrating ductal carcinoma was diagnosed in most of the patients (>82%) regardless of clinical stage. Other types of invasive carcinoma detected in CSIII patients were lobular (N = 2), mixed ductal and lobular (N = 1), cribriform (N = 1), medullary (N = 1), and apocrine (N = 1). Histological type was not defined in two patients, one of them clinically staged as II and the other as III. Most tumors were of high histological grade (II or III), with only 16% of CSII patients and 6% of CSIII patients presenting a grade I tumor. There were no differences in immunohistochemical expression of estrogen receptor, p53 or ErbB2 between CSII and CSIII patients (Table 1). Nine normal samples were obtained from perilesional mammary tissue from patients submitted to resection of benign lesions or noninvasive carcinoma (5 fibroadenomas and 3 in situ ductal carcinomas). One patient underwent reduction mammoplasty. The median age of the patients without invasive cancer was 42 years (range: 21-73 years).

cDNA microarray assembly, hybridization and analysis
To assemble the cDNA microarray glass slides, literature and SAGE libraries were reviewed in order to select genes expressed in mammary tissue and breast cancer. Some open reading frame-expressed sequence tags (13) identified as being expressed in other cancer types, such as head and neck and stomach, were also added to the slides, some of them corresponding to unknown genes.
Sequences representing 692 genes were then chosen from the Human Cancer Genome Project bank (Fundação de Amparo à Pesquisa do Estado de São Paulo, FAPESP/Instituto Ludwig de Pesquisa sobre o Câncer), or synthesized by PCR (14). Inserts were amplified by PCR using M13 reverse and forward primers from the cDNA clones. Amplicons were purified by gel filtration and printed in three or six replicates onto Corning slides using a Flexys Robot (Genomic Solutions, Ann Arbor, MI, USA). Some genes were represented by two clones corresponding to different regions of the cDNA. The cDNA microarray platform and data, complying with MIAME format, have been submitted to the Gene Expression Omnibus data repository under accession numbers GPL1727 and GSE2048 (www.ncbi.nlm.nih.gov/projects/geo).
Samples obtained from tumor biopsies were hand dissected to eliminate normal tissue, fibrosis, and adipose tissue and, after microscopic analysis, only samples composed of at least 80% malignant cells were further processed. Histologic analysis was performed to select only normal tissue, and non-tumor samples were also hand dissected to discard adipose tissue and fibrosis.
Total RNA from frozen specimens was isolated using Trizol reagent (Invitrogen Corporation, Carlsbad, CA, USA) according to the manufacturer's protocol. RNA quality was confirmed by agarose gel electrophoresis after visualization with ethidium bromide. Only RNA samples with a >1 ratio for 28S/18S ribosomal RNA were further processed. A two-round RNA amplification procedure was carried out by combining antisense RNA amplification with a template-switching effect according to a previously described protocol (15). At the start, 3 µg total RNA was used to yield about 60 µg amplified RNA. Three to 5 µg amplified RNA were then used in a reverse transcriptase reaction in the presence of Cy3- or Cy5-labeled dCTP (GE Healthcare Life Sciences, Little Chalfont, St. Giles, UK) and SuperScript II (Invitrogen-Life Technology, Carlsbad, CA, USA). The HB4A normal epithelial mammary cell line, kindly donated by Drs. Mike O'Hare and Alan Mackay (Ludwig Institute for Cancer Research, University College of London, London, UK) (16), was used as reference for the hybridizations. These cells were processed in the same manner as the breast tissue samples.
A mixture of equal amounts of labeled cDNA probes from breast tumor samples and HB4A cells was hybridized on cDNA microarray slides. Dye swap was performed for each sample analyzed to control for dye bias. Prehybridization was carried out in a humidified chamber at 42°C for 16-20 h and hybridization at 65°C in a GeneTac Hybridization Station (Genomic Solutions).
Hybridized arrays were scanned with a confocal laser scanner (Arrayexpress, Perkin-Elmer, Wellesley, MA, USA), using identical photomultiplier voltage for all slides, and data were recovered with the Quantarray software (Perkin Elmer) using histogram methods. After image acquisition and quantification, saturated spots (signal intensity higher than 63,000) as well as unreliable low-intensity spots, defined as those within the 95th percentile of the intensity distribution of known empty spots, were removed from the analysis. Replicate spots (3-6 per gene) representing the same gene were identified, their average signal intensity was determined, spots with low reproducibility between technical replicates (mean plus 2 SD cut-off) were excluded, and the average signal was then recalculated without these spot values. Transcripts missing in at least two arrays were also eliminated from analysis. Quantified signals were submitted to log transformation and to Lowess normalization.
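The spot-filtering rules described above can be sketched as follows. This is only an illustration, not the authors' pipeline: the function names are hypothetical, the percentile uses linear interpolation, and the "mean plus 2 SD" rule is read here as dropping replicates lying more than 2 SD from the replicate mean, which is an assumption about an ambiguous description.

```python
import math
from statistics import mean, stdev

SATURATION = 63_000  # saturation threshold stated in the text


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def filter_spots(replicates, empty_spots):
    """Mean signal of the reliable replicate spots of one gene, or None."""
    # Low-intensity floor: 95th percentile of the known empty spots.
    floor = percentile(empty_spots, 95)
    kept = [s for s in replicates if floor < s < SATURATION]
    if len(kept) >= 3:
        m, sd = mean(kept), stdev(kept)
        # Assumed reading of the reproducibility rule: keep spots within 2 SD.
        kept = [s for s in kept if abs(s - m) <= 2 * sd]
    # Average recomputed without the excluded spots.
    return mean(kept) if kept else None
```

With only 3 replicates no spot can lie more than 2 sample SDs from the mean, so under this reading the reproducibility rule can only act on genes printed in more replicates.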
The permuted Student t-test (10,000 permutations) was used to determine the level of significance of the difference in expression of each individual gene, and false discovery ratio (FDR) was employed as a test for multiple analysis correction. Genes were considered to be differentially expressed if they satisfied an FDR level of 0.05. Hierarchical clustering analysis based on Euclidean distance and complete linkage was performed using the differentially expressed genes. Reliability of the clustering was assessed by the bootstrap technique using TMEV software (17).
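A minimal sketch of a permuted Student t-test followed by an FDR correction is given below. The text does not state which FDR procedure was used, so the Benjamini-Hochberg step-up adjustment is an assumption, as are the function names and the (hits + 1)/(permutations + 1) estimate of the P value.

```python
import math
import random
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    diff = mean(a) - mean(b)
    if pooled == 0:  # degenerate case: no within-group spread
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(pooled * (1 / na + 1 / nb))


def permutation_p(a, b, n_perm=10_000, seed=0):
    """Two-sided P value obtained by shuffling the group labels."""
    rng = random.Random(seed)
    observed = abs(student_t(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(student_t(pooled[:len(a)], pooled[len(a):])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Under this sketch, a gene would be called differentially expressed when its adjusted P value is at most 0.05, one P value being computed per gene from the log-transformed, normalized signals of the tumor and normal groups.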
Meta-analyses of our results along with those of public data (4,9,10) found in the Oncomine website http://141.214.6.50/oncomine/main/index.jsp were performed following a previously described procedure (18). Briefly, gene libraries were downloaded, data format was standardized, and common gene sets were determined. Duplicate gene entries were reduced to one by evaluating the mean expression in the same sample. Next, individual analysis of each data set was performed by evaluating the level of significance of the difference (P value, Student t-test) of each gene, as done in our own study. A proper meta-analysis was then carried out by determining a summary statistic for each gene (which accounts for all P values in all studies) and 100,000 randomly permuted summary statistics obtained by randomly choosing P values from each study. The summary P value for each gene is the fraction of randomly permuted summary statistics that are equal to or lower than the "real" summary statistic.
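The permutation step of the meta-analysis can be illustrated with the sketch below. The formula of the summary statistic is not given in the text; the sum of log P values is assumed here because a lower value then means stronger combined evidence, which matches the "equal to or lower" rule. Function and variable names are hypothetical.

```python
import bisect
import math
import random


def summary_stat(pvalues):
    """Assumed summary statistic: sum of log P (lower = stronger evidence)."""
    return sum(math.log(p) for p in pvalues)


def meta_p_values(studies, n_perm=100_000, seed=0):
    """Summary P value per gene.

    studies: list of dicts mapping gene -> P value (> 0), one dict per study,
    all restricted to the same common gene set.
    """
    rng = random.Random(seed)
    columns = [list(study.values()) for study in studies]
    # Null distribution: one randomly chosen P value from each study.
    null = sorted(
        summary_stat([rng.choice(col) for col in columns])
        for _ in range(n_perm)
    )
    result = {}
    for gene in studies[0]:
        real = summary_stat([study[gene] for study in studies])
        # Fraction of random statistics equal to or lower than the real one.
        result[gene] = bisect.bisect_right(null, real) / n_perm
    return result
```

Sorting the null distribution once and using a binary search keeps the per-gene lookup cheap even with 100,000 random statistics.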

Real-time quantitative reverse transcription-polymerase chain reaction
Two micrograms of total RNA was reverse-transcribed using the oligo(dT) primer and Superscript II (Invitrogen). Real-time RT-PCR was performed using SYBR-green I (Sigma, St. Louis, MO, USA) in a LightCycler™ system (Roche Diagnostics, Mannheim, Germany) or a Rotor-gene system (Corbett Research, Mortlake, Australia), or alternatively using Taqman chemistry and an ABI Prism 7700 sequence detection system (Applied Biosystems, Foster City, CA, USA).
For the Taqman assay, primers and probes were synthesized by the custom TaqMan® Gene Expression Assays service (Applied Biosystems) (Table 2). Two microliters of the diluted cDNA was amplified in a final volume of 25 µL with 1X Mix Assays-by-Design™ (Applied Biosystems). Thermal cycling consisted of 2 min at 95°C, followed by 40 cycles of 95°C for 15 s and 60°C for 1 min. Normal human mammary glands (RNA from 2 Caucasian women 26 and 27 years old) (BD Biosciences Clontech, Palo Alto, CA, USA) were used as reference.
Experiments were performed in duplicate and relative expression of the genes of interest was normalized to that of β-actin (ACTB). Gene expression in each sample was then compared with expression in normal human mammary glands or HB4A cells, as indicated. The comparative $C_T$ method ($\Delta\Delta C_T$) was used for quantification of gene expression and relative expression was calculated as $2^{-\Delta\Delta C_T}$ (19). The relationship between gene expression of samples analyzed by both cDNA microarray and quantitative RT-PCR was determined by Pearson correlation. At least 19 samples were analyzed, and variables were considered to be significantly correlated when Pearson r ≥ 0.456 (the critical value) and P ≤ 0.05 (two-tailed).
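As a worked illustration of the comparative threshold-cycle calculation and of the Pearson criterion, the sketch below uses hypothetical function names and made-up threshold-cycle values. It also shows where r = 0.456 comes from: for 19 samples (17 degrees of freedom) it corresponds to a t statistic close to 2.11, the two-tailed 5% critical value found in standard t tables.

```python
import math


def relative_expression(ct_target, ct_actb, ct_target_ref, ct_actb_ref):
    """Fold change by the comparative method: 2 ** -(delta-delta CT)."""
    delta_sample = ct_target - ct_actb        # sample, normalized to ACTB
    delta_ref = ct_target_ref - ct_actb_ref   # reference, normalized to ACTB
    return 2 ** -(delta_sample - delta_ref)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) equivalent to a Pearson r."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Illustrative values only: target CT 24 vs ACTB 18 in the sample,
# target CT 26 vs ACTB 18 in the reference, i.e. a 4-fold increase.
fold = relative_expression(24, 18, 26, 18)
t_crit = t_from_r(0.456, 19)  # about 2.11
```

A larger cohort lowers the critical r, so 0.456 is the strictest threshold implied by "at least 19 samples".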
Gene expression in samples from a second cohort of patients, evaluated only by quantitative RT-PCR, was analyzed by the Kruskal-Wallis test, and groups were considered to be statistically different at a two-tailed level of significance ≤0.05. The Mann-Whitney U-test was subsequently used to detect differences between groups.
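The two-step non-parametric comparison can be sketched in plain Python as below. This is an illustration under stated assumptions: three groups are compared (normal, CSII, CSIII), so the Kruskal-Wallis P value uses the chi-square distribution with 2 degrees of freedom, whose tail probability is exp(-H/2); the Mann-Whitney P value uses a normal approximation without tie or continuity correction, which is only reasonable for cohorts of the size described here. Function names are hypothetical.

```python
import math
from collections import Counter


def midranks(values):
    """Map each distinct value to its average 1-based rank."""
    ordered = sorted(values)
    ranks, i = {}, 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    return ranks


def kruskal_wallis_3(g1, g2, g3):
    """Tie-corrected H and its chi-square (2 df) P value for three groups."""
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = midranks(pooled)
    rank_term = sum(sum(ranks[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    h /= 1 - ties / (n ** 3 - n)  # undefined if every value is identical
    return h, math.exp(-h / 2)


def mann_whitney(a, b):
    """U (pairs where a exceeds b, ties count 0.5) and approximate P value."""
    u = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    mu = len(a) * len(b) / 2
    sd = math.sqrt(len(a) * len(b) * (len(a) + len(b) + 1) / 12)
    z = (u - mu) / sd
    return u, math.erfc(abs(z) / math.sqrt(2))  # two-tailed, normal approx.
```

In practice the pairwise Mann-Whitney comparisons would be run only for genes whose three-group Kruskal-Wallis P value is at most 0.05.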

Results
Twenty-two genes were differentially expressed between CSII tumors and normal breast tissue, representing approximately 3.2% of the genes analyzed, and unsupervised hierarchical clustering permitted the proper discrimination of normal and tumor samples (Figure 1).
When CSIII tumors were compared to normal breast tissue, 78 differentially expressed genes were identified (11.4% of the genes) which could correctly cluster 100% of the normal samples as well as 92.5% of the tumor samples (Figure 2). Three of 40 tumors seemed to cluster inappropriately, two of them grouping with the normal ones: I20, obtained from a 48-year-old patient with a T3N1M0 ductal carcinoma, histological grade 2, ER positive (80% of the cells), PR positive, ErbB2 negative, and p53 negative, and I23, obtained from a 63-year-old patient with a T3N2M0 invasive cribriform carcinoma. After mastectomy, the tumor of the latter patient was found to be a histological grade 3 invasive ductal carcinoma negative for ER, PR, ErbB2, and p53 expression as determined by immunohistochemistry staining. Another sample (Q144) from a woman with an apocrine estrogen receptor-negative CSIII carcinoma presented a peculiar gene expression that prevented its correct clustering among the other tumors. Among the 78 genes differentially expressed in CSIII tumors as compared to normal breast tissue, 20 were also deregulated in CSII tumors in relation to normal tissue (Table 3). Therefore, two of 22 genes, JUNB and MUC4, were under-expressed only in CSII tumors compared to normal tissue. In addition, 58 genes were exclusively deregulated in CSIII tumor samples in relation to normal breast (but not in the CSII tumor and normal tissue comparison; Table 4).
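The gene counts and the clustering accuracy quoted above follow from simple arithmetic, reproduced in the short sketch below (the function name is hypothetical; all numbers are taken from the text).

```python
def partition(n_csii, n_csiii, n_shared):
    """Split two overlapping gene lists into stage-specific and total counts."""
    return {
        "CSII only": n_csii - n_shared,
        "CSIII only": n_csiii - n_shared,
        "either stage": n_csii + n_csiii - n_shared,
    }


# 22 genes in CSII vs normal, 78 in CSIII vs normal, 20 common to both.
counts = partition(22, 78, 20)

# 3 of the 40 CSIII tumors clustered inappropriately.
correctly_clustered_pct = 100 * (40 - 3) / 40
```

This gives 2 genes specific to CSII (JUNB and MUC4), 58 specific to CSIII, 80 genes deregulated in at least one stage, and 92.5% of CSIII tumors correctly clustered.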
The reliability of the cDNA microarray analysis was evaluated by quantitative RT-PCR for 5 selected genes (FOS, REL, SLC9A3R1, GATA3, FN3KRP), and 21 breast tumor samples were re-analyzed. The first 3 genes were considered to be differentially expressed in CSIII tumors (Table 4) on the basis of the criteria described in Methods for cDNA microarray analysis. Although this cDNA microarray platform also contained spotted clones of FN3KRP and GATA3, these genes were not found to be differentially expressed in the tumor/normal tissue comparison. The correlation between the two assays was significantly positive for 3 of the 5 genes (60%), FOS, GATA3, and SLC9A3R1 (Table 5), indicating that RT-PCR data agreed with the cDNA microarray data.
To further determine whether some genes found to be primarily deregulated only in CSIII tumors by cDNA microarray analysis (S100A14, MAX, KRT19, HMOX1, APOBEC3G, and KRT15; Table 4) could be markers of this clinical stage or were also deregulated in CSII tumors, we analyzed their expression by quantitative RT-PCR in an independent cohort of patients. Tumor samples from 31 patients clinically staged as CSII breast cancer and 22 patients staged as CSIII, median age 51 and 52 years, respectively, were then analyzed. Tumors were mainly invasive ductal carcinomas (90 and 81% of CSII and CSIII patients, respectively). Normal breast tissue obtained from another group of 18 women without cancer was used for comparison. The median age of these women was 37 years and they had undergone breast surgery for benign conditions, mainly fibroadenomas or fibrocystic conditions (83.3%).
Note to Table 3: These genes were found to be differentially expressed by tumors from both clinical stages (CSII and CSIII) when compared to normal breast tissue, at a false discovery ratio level of 0.05. Genes printed in bold type and with an asterisk were also found to be differentially expressed between tumor and normal tissue in the meta-analysis.
Note to Table 4: Genes exclusively differentially expressed in CSIII tumors and in normal tissue at a false discovery ratio level of 0.05. Genes printed in bold type and with an asterisk were also found to be differentially expressed in tumor and normal tissue in the meta-analysis.
Note to Table 6: Meta-analysis of public data (4,9,10) and of our own data. Genes printed in bold type and with an asterisk were differentially expressed in tumor and normal tissue in our analysis.
Confirming our cDNA microarray results, 4 genes, MAX, HMOX1, KRT15, and S100A14, were differentially expressed, and another, APOBEC3G, showed a trend towards a differential expression between CSIII tumors and normal breast tissue, as analyzed by real-time RT-PCR (Figure 3). KRT19 expression, however, was similar in normal samples and CSIII tumors. On the other hand, MAX, KRT15 and S100A14 were also differentially expressed in CSII tumors and normal breast tissue. Therefore, among the 6 genes analyzed by real-time RT-PCR, only increased levels of HMOX1 may be a molecular marker of CSIII breast cancer.
We also performed a meta-analysis of the genes differentially expressed in breast tumors versus normal tissue, identified in some other published microarray reports (4,9,10) including our own data. Among the 692 genes represented on our cDNA microarray slides, 198 elements overlapped the elements of the large-scale array platforms used by these other three groups. In the meta-analysis, 14 genes (Table 6) were differentially expressed in breast tumors and in normal tissue (summary P value <0.05), 9 of which were also found to be differentially expressed in tumors analyzed by us, indicating the sensitivity of the present cDNA microarray approach.

Discussion
To evaluate whether gene profile could characterize a particular clinical stage, transcripts from CSII and CSIII breast cancer samples were compared. The pattern of expression of 22 genes in CSII tumors and 78 genes in CSIII tumors best distinguished them from normal breast tissue. However, most of the genes differentially expressed by CSII tumors (20/22) could not be considered specific markers of this stage since they were also found to be deregulated in the CSIII tumor/normal tissue comparison.
In general, tumor cells progressively lose the expression of genes involved in cell adhesion and in the maintenance of myoepithelial cell layers, such as several laminin chains and P-cadherin (CDH3), a specific myoepithelial marker (20,21). Other markers of myoepithelial cells, such as Serpin B5, a protease inhibitor, and caveolin-1, were down-regulated in CSIII tumors, suggesting a reduced proportion of these normal cells within tumors upon stage progression (22). Accordingly, a lower expression of a group of basal cytokeratins, generally expressed by myoepithelial cells (KRT5, 14, 15, 17), was observed in these advanced tumors. On the other hand, another keratin typical of luminal cells such as KRT8, but not KRT19, was over-expressed in tumors as compared to normal tissue. KRT19 (23) is expressed by both normal mammary glands and breast adenocarcinomas, and even though our cDNA microarray data showed a higher expression in CSIII tumors as compared to normal tissue, these values were greatly variable, and such results were not later confirmed by quantitative RT-PCR in another cohort of CSIII patients.
We also detected a reduced proliferation signature in tumors. Some growth factors and receptors, such as fibroblast growth factor receptors or substrates (FGFR1, FGFR2, ABL1, EPS8) and early transcription genes with transcription factor activity (JUN, JUNB, FOS, MYC, MAX), were down-regulated in malignant disease compared to normal breast tissue. In contrast, other genes affecting proliferation emerged as being more expressed in the CSIII tumor profile, such as REL and proliferating cell nuclear antigen, as opposed to TGFβ receptor 2, which was less expressed in CSIII tumors than in normal tissue, and may inhibit epithelial cell proliferation.
Genes linked to cytoskeleton regulation and signal transduction through lipids were up-modulated in tumors, such as CAPZA2, a member of the F-actin capping protein α subunit family, thought to modulate second messenger generation through the phosphoinositide cycle and ultimately controlling cell survival and cell cycle, members of the tetraspanin superfamily (sarcoma amplified sequence) which can be associated with phosphatidylinositol kinase, as well as CCT3, a molecular chaperone involved in actin and tubulin folding, previously reported to be up-regulated in ovarian carcinoma (24-26). In addition, lysophospholipase 1, which hydrolyzes lysophosphatidylcholine, as well as the Na+/H+ exchanger regulatory factor (SLC9A3R1), which induces cytoskeleton reorganization, were preferentially overexpressed in CSIII tumors (27).
Enhanced tumor expression of genes characteristically expressed by stromal cells, such as collagen 1α2, metalloproteases 11, 12, hepsin (a trypsin-like protease) and fibronectin 1, may be derived from non-epithelial components of the tumors, as previously suggested (28).
Several other genes were more expressed in CSIII tumors than in normal tissue, including ch-TOG (29), coding for a protein which could potentially lead to chromosome segregation defects, APOBEC3G (30), which can act as a DNA mutator, and CBX3 (chromobox homolog 3), linked to transcriptionally repressed heterochromatin structure (31).
Heme oxygenase-1 (HMOX1) is an inducible enzyme that catalyzes oxidative degradation of heme to form biliverdin (subsequently reduced to bilirubin by the enzyme biliverdin reductase), carbon monoxide, and free iron, representing a key enzyme for the protection of cells against "stress". In cancer, elevated HMOX1 expression has been described as protumoral because of its anti-apoptotic effects in colon cancer (32) and in gastric cancer cells (33), its pro-angiogenic effects in human pancreatic cancer (34), and its role as a BCR-ABL-dependent survival factor in chronic myeloid leukemia (35). In addition, inhibition of HMOX1 expression may sensitize pancreatic cancer cells to chemotherapy (36). In contrast, in breast cancer cell lines and animal models of breast cancer, HMOX1 induction inhibits cell proliferation (37). Although HMOX1 expression in cancers is still not well defined, our results suggest that its overexpression occurs in more advanced stages of breast cancer. In addition, an increased biliverdin reductase A expression in tumors indicates that malignant cells may acquire a protective response to cellular stress.
A clear difference in gene expression pattern occurs at the normal-to-cancer transition; however, most of the differentially expressed genes are deregulated in tumors of both clinical stages (II and III) compared to normal breast tissue. Differential expression of 4 of 6 genes found to be differentially expressed by cDNA microarray analysis in the CSIII tumor versus normal tissue comparison was validated by quantitative RT-PCR in another cohort of CSIII patients, but only one of them, HMOX1, was identified as differentially expressed in CSIII tumors exclusively by both assays. Although the gene expression profile suggests the overlapping nature of CSII and CSIII breast cancer, it seems that about 15% of genes may be characteristically modulated in the latter specific clinical stage.

Figure 1. Unsupervised hierarchical clustering of 7 clinical stage II breast cancer samples and 9 normal breast samples (N) based on 22 differentially expressed genes at a false discovery ratio level of 0.05. Tumor identification (beginning with I, Q) appears at the top of the figure and each column represents gene expression of a single tumor. UniGene cluster ID or gene ID or ORESTES is shown in each row. The upper colored bar indicates the variation in gene expression in target samples as compared to reference cells (HB4A), i.e., red, more expressed and green, less expressed in target samples. The colored lines of the dendrogram stand for the support for each clustering, black and gray meaning more reliable and yellow and red less reliable. The metric used was Euclidean distance, with complete linkage for distance between clusters.

Figure 2. Unsupervised hierarchical clustering of 40 clinical stage III breast cancer tumors and 9 normal breast samples (N) based on 78 differentially expressed genes at a false discovery ratio level of 0.05. Tumor and gene identification, upper colored bar, and colored lines of the dendrogram, as described in Figure 1.

Table 1. Breast cancer patient characteristics.

Table 2. Sequences of primers and probes and conditions of quantitative PCR.

Table 3. Genes commonly differentially expressed in CSII and CSIII tumors compared to normal breast tissue according to their Gene Ontology annotation (biological process).

Table 4. Genes differentially expressed in clinical stage III (CSIII) tumors and in normal tissue according to their Gene Ontology annotation (biological process).

Table 5. Correlation between gene expression evaluated by cDNA microarray and quantitative RT-PCR.

Table 6. Fourteen genes differentially expressed in breast cancer and in normal tissue as determined by a meta-analysis.