Gene trio signatures as molecular markers to predict response to doxorubicin cyclophosphamide neoadjuvant chemotherapy in breast cancer patients

In breast cancer patients submitted to neoadjuvant chemotherapy (4 cycles of doxorubicin and cyclophosphamide, AC), expression of groups of three genes (gene trio signatures) could distinguish responsive from non-responsive tumors, as demonstrated by cDNA microarray profiling in a previous study by our group. In the current study, we determined whether the expression of the same genes would retain its predictive strength when analyzed by a more accessible technique (real-time RT-PCR). We evaluated 28 samples already analyzed by cDNA microarray, as a technical validation procedure, and 14 tumors, as an independent biological validation set. All patients received neoadjuvant chemotherapy (4 AC). Among five trio combinations previously identified, defined by nine genes individually investigated (BZRP, CLPTM1, MTSS1, NOTCH1, NUP210, PRSS11, RPL37A, SMYD2, and XLHSRF-1), the most accurate were the trios based on RPL37A and XLHSRF-1, with NOTCH1 or NUP210 as the third gene. Both trios correctly separated 86% of tumors according to their response to chemotherapy (87% sensitivity and 80% specificity for predicting response; 82% in a leave-one-out cross-validation method). Using the pre-established features obtained by linear discriminant analysis, 71% of samples from the biological validation set were also correctly classified by both trios (72% sensitivity; 66% specificity). Furthermore, we explored other gene combinations to achieve a higher accuracy in the technical validation group (as a training set). A new trio, MTSS1, RPL37A and SMYD2, correctly classified 93% of samples from the technical validation group (95% sensitivity and 80% specificity; 86% accuracy by the cross-validation method) and 79% from the biological validation group (72% sensitivity and 100% specificity). Therefore, the combined expression of MTSS1, RPL37A and SMYD2, as evaluated by real-time RT-PCR, is a potential candidate to predict response to neoadjuvant doxorubicin and cyclophosphamide in breast cancer patients.


Introduction
Correspondence: M.A.A.K. Folgueira, Disciplina de Oncologia, Departamento de Radiologia e Oncologia, Faculdade de Medicina, USP, Av. Dr. Arnaldo, 455, 4º andar, Sala 4112, 01246-903 São Paulo, SP, Brasil. Fax: +55-11-3082-6580. E-mail: makoike@lim24.fm.usp.br
A major benefit of primary chemotherapy in breast cancer is the opportunity to increase breast-conserving surgery rates (1,2). However, for a more precise indication of this treatment, it is crucial to identify responsive and non-responsive patients. In this ideal situation, responsive patients, who might present a reduction in tumor dimension, would be offered primary chemotherapy, while non-responsive patients, who might present stable or progressive disease, would be offered breast surgery at once.
Much work has been done to identify predictive markers of tumor response to primary chemotherapy; proliferation index, tumor grade, and expression of hormone receptors and HER2, among others (3,4), all seem to play a role. In addition, basal-like and HER2(+)/ER(-) subtypes are more sensitive to anthracycline-based neoadjuvant chemotherapy than luminal breast cancers (4).
www.bjournal.com.br Braz J Med Biol Res 43 (12) 2010
Moreover, many studies have been carried out to identify a gene expression profile predictive of drug response in breast cancer patients employing cDNA microarray (5-10) or RT-PCR techniques (11). Most investigators have tried to detect expression profiles associated with pathological complete response (7-9,11), which is a surrogate marker for improved overall survival. Others, however, have searched for expression profiles related to clinical response (5,6,10), which allow identification of patients who may benefit from tumor reduction and enhanced possibility of breast-conserving surgery. Differential expression profiles have been identified, and the response to various regimens, based on anthracycline (5,8), taxanes (6) or both drugs (7,9-11), has been analyzed. Some of these studies included samples only in a training group (8,10), while others assessed the reproducibility of the model in an independent group of patients, regardless of the use of cross-validation analysis (5,6,9,11). In common, these studies have employed the same technique (cDNA microarray or RT-PCR) to detect gene expression in both the training and validation groups.
Our objective was to determine whether expression of some of the gene trios identified in our previous cDNA microarray study (5), as evaluated by real-time RT-PCR, which is a more accessible mRNA identification and quantification technique, would retain its predictive strength to separate tumors according to response to primary chemotherapy based on doxorubicin and cyclophosphamide.

Patients
The study was approved by the local Ethics Committee of the Instituto Brasileiro de Controle do Câncer and Hospital das Clínicas da Faculdade de Medicina da USP, São Paulo, and breast cancer patients, candidates for neoadjuvant chemotherapy on a routine basis, gave written informed consent to participate.
Twenty-eight patients whose samples had been studied by cDNA microarray in a previous study (5) had enough material available for the determination of gene expression by real-time RT-PCR and their samples were included in a technical validation group. Another 14 patients were prospectively accrued, and their tumors were analyzed in a biological validation group (Table 1).
Patients included in the technical and biological validation groups mainly presented with locally advanced disease and were treated with 4 cycles of neoadjuvant doxorubicin and cyclophosphamide. There were no differences between groups, except for estrogen receptor immuno-expression, which was detected in a higher proportion of tumors from the technical validation group (Table 1).
Breast tumor dimension was evaluated by clinical examination before the first cycle and approximately three weeks after the last cycle of chemotherapy. Responsive patients were those whose tumor achieved a reduction of at least 30% of the longest diameter of the primary tumor (Table 1). Around 80% of patients from both technical and biological validation groups presented at least a partial response (Table 1). No association was detected between response to chemotherapy and menopausal status, clinical stage, histological type, histological grade, hormone receptor status, HER2 and P53 immuno-expression (data not shown).
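The response criterion above can be written as a one-line rule. The following is a minimal sketch only; the function name and the example diameters are hypothetical and are not taken from the study data.

```python
def is_responsive(baseline_mm, post_treatment_mm, cutoff=0.30):
    """Return True when the longest tumor diameter shrank by at least `cutoff`.

    `cutoff=0.30` mirrors the 30% reduction criterion stated in the text;
    the diameters passed in are illustrative values, not patient data.
    """
    if baseline_mm <= 0:
        raise ValueError("baseline diameter must be positive")
    reduction = (baseline_mm - post_treatment_mm) / baseline_mm
    return reduction >= cutoff


# Hypothetical example: an 80 mm tumor measuring 50 mm after the last cycle
print(is_responsive(80, 50))
```

A tumor that grows or stays stable yields a reduction of zero or less and is therefore classified as non-responsive under this rule.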

RNA extraction, reverse transcription and real-time PCR
Tumor samples collected before chemotherapy were examined by light microscopy and hand dissected if necessary, to guarantee the presence of at least 80% invasive tumor cells. Total RNA was isolated using Trizol reagent (Invitrogen Corporation, USA). RNA quality was determined by ratio of absorbance at 260 and 280 nm (higher than 1.6) and by 28S/18S rRNA band intensities, as verified by 1% agarose gel electrophoresis under denaturing conditions and visualization through ethidium bromide (>1.5 ratio).
Total RNA (2 µg) was reverse transcribed using the oligo dT 12-18 primer (GE Healthcare Life Sciences, UK) and Superscript III (Invitrogen Corporation). Real-time RT-PCR was carried out using SYBR-green I (Sigma, USA) in a Rotor-gene system (Corbett Research, Australia). Primer sets were designed based on a coding region closer to the 3' end of the gene, using Primer3 (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi) (Table 2). Sequences present in different exons, preferentially separated by long introns, were selected according to sequences deposited at http://www.ncbi.nlm.nih.gov/nucleotide. To avoid nonspecific product formation, BLAST alignment analysis (http://blast.ncbi.nlm.nih.gov/Blast.cgi) was carried out.
All samples were analyzed in duplicate and cycling conditions were 95°C for 5 min, followed by 40 cycles at 95°C for 20 s, 60°C for 15 s and 72°C for 20 s. PCR assays were analyzed with the Rotor-Gene 6 System software (Corbett Research). Average values were used for quantification and cDNA obtained from a pool of breast samples was used as reference.

Reference gene selection and real-time RT-PCR normalization
Expression stability of four housekeeping genes, including ACTB (actin, beta), GUSB (glucuronidase, beta), PPIA [peptidylprolyl isomerase A (cyclophilin A)], and RPLP0 (ribosomal protein, large, P0) was determined in target samples using the geNorm software tool (available at http://medgen.ugent.be/~jvdesomp/genorm/). All candidate genes presented an M value below the 1.5 cut-off, indicating a small variability of their expression among samples. ACTB, PPIA and RPLP0 presented similar expression in responsive and non-responsive samples, whereas GUSB was significantly less expressed in non-responsive samples (P = 0.006, Mann-Whitney test), being excluded as a reference gene from this analysis (data not shown).
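For readers unfamiliar with the M value, the sketch below illustrates the stability measure as it is usually described for geNorm: for each candidate gene, the average over all other candidates of the standard deviation of the log2 expression ratios across samples. It is not the geNorm tool itself, and the expression values are hypothetical numbers chosen only to make the behavior visible.

```python
import math
from statistics import mean, stdev


def genorm_m(expr):
    """Compute a geNorm-style M value per gene.

    expr: dict mapping gene name -> list of relative expression values,
          with samples in the same order for every gene.
    """
    m_values = {}
    for gene_j in expr:
        pairwise_sd = []
        for gene_k in expr:
            if gene_k == gene_j:
                continue
            # log2 ratio of the two genes in each sample
            log_ratios = [math.log2(a / b)
                          for a, b in zip(expr[gene_j], expr[gene_k])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene_j] = mean(pairwise_sd)
    return m_values


# Hypothetical relative expression values in four samples (not study data)
example = {
    "ACTB": [1, 2, 4, 8],
    "PPIA": [2, 4, 8, 16],
    "RPLP0": [1, 2, 4, 8],
    "GUSB": [1, 4, 2, 16],
}
m = genorm_m(example)
for gene, value in m.items():
    print(gene, round(value, 3))
```

In this toy input three genes vary in parallel across samples and receive a low M, whereas the fourth deviates from them and receives the highest M, which is the pattern the 1.5 cut-off is meant to screen.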
The relative normalized expression ratio was calculated from the real-time PCR efficiency and the crossing point deviation of an unknown sample versus a control divided by the normalization factor, which was the geometric mean of the relative expression values of ACTB, PPIA and RPLP0.
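The calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: efficiency is expressed as the amplification factor per cycle (2.0 for perfect doubling), the relative quantity of each gene is taken as efficiency raised to the crossing point difference (control minus sample), and all crossing point values are hypothetical.

```python
from statistics import geometric_mean


def relative_quantity(efficiency, cp_control, cp_sample):
    """Efficiency-corrected quantity of a sample relative to the control."""
    return efficiency ** (cp_control - cp_sample)


def normalized_ratio(target, references):
    """Target relative quantity divided by the normalization factor.

    target and each item of references: (efficiency, cp_control, cp_sample).
    The normalization factor is the geometric mean of the reference genes'
    relative quantities.
    """
    factor = geometric_mean([relative_quantity(*ref) for ref in references])
    return relative_quantity(*target) / factor


# Hypothetical crossing points for one tumor sample (not study data)
target_gene = (2.0, 25.0, 23.0)            # relative quantity 4
reference_genes = [
    (2.0, 20.0, 19.0),                     # e.g. ACTB, relative quantity 2
    (2.0, 18.0, 18.0),                     # e.g. PPIA, relative quantity 1
    (2.0, 22.0, 20.0),                     # e.g. RPLP0, relative quantity 4
]
ratio = normalized_ratio(target_gene, reference_genes)
print(round(ratio, 3))
```

Using the geometric rather than the arithmetic mean keeps a single highly expressed reference gene from dominating the normalization factor.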

Statistical analysis
Statistical analysis was performed using the SPSS software, version 11.0 (SPSS, USA). Relative normalized expression data were first transformed to their natural logarithm (ln) format and the Shapiro-Wilk test was performed to determine the distribution of the values. The results obtained by cDNA microarray and real-time RT-PCR were then compared using the Spearman or Pearson correlation test, as appropriate. The level of significance of the differential expression between responsive and non-responsive samples was determined by the Mann-Whitney test and a two-sided P ≤ 0.05 value was considered to be significant.
Predictive models using gene expression values were designed using linear discriminant analysis. Gene sets showing greater accuracy in leave-one-out cross-validation analysis were further tested with samples from the biological validation group.
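The analyses in this study were run in SPSS. Purely as an illustration of what the rank-based statistics compute, the sketch below implements the ln transformation, the Spearman coefficient (Pearson correlation of average ranks) and the Mann-Whitney U statistic in plain Python. It does not compute P values or the Shapiro-Wilk test, and all input values are hypothetical.

```python
import math


def ln_transform(values):
    """Natural logarithm of each (positive) expression value."""
    return [math.log(v) for v in values]


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def mann_whitney_u(group_a, group_b):
    """Smaller of the two U statistics for two independent groups."""
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:len(group_a)])
    u_a = rank_sum_a - len(group_a) * (len(group_a) + 1) / 2
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


# Hypothetical expression values for one gene measured by two techniques
microarray = ln_transform([0.5, 1.2, 3.0, 0.8])
rt_pcr = ln_transform([0.4, 1.0, 2.5, 0.9])
print(spearman(microarray, rt_pcr))
```

Because both statistics depend only on ranks, the ln transformation does not change them; it matters for the Pearson coefficient and for the normality check.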
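The modelling strategy can be sketched as a two-class linear discriminant with leave-one-out cross-validation. The version below is a simplified illustration, not the SPSS procedure used in the study: it assumes equal prior probabilities, a pooled within-class covariance matrix, and a decision threshold at the midpoint between the projected class means. The toy coordinates stand in for ln-transformed expression of a gene trio and are entirely hypothetical.

```python
def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(groups):
    dim = len(groups[0][0])
    total = [[0.0] * dim for _ in range(dim)]
    dof = 0
    for rows in groups:
        mu = mean_vector(rows)
        for row in rows:
            d = [a - b for a, b in zip(row, mu)]
            for i in range(dim):
                for j in range(dim):
                    total[i][j] += d[i] * d[j]
        dof += len(rows) - 1
    return [[v / dof for v in row] for row in total]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_lda(responsive, non_responsive):
    mu_r, mu_n = mean_vector(responsive), mean_vector(non_responsive)
    cov = pooled_covariance([responsive, non_responsive])
    weights = solve(cov, [a - b for a, b in zip(mu_r, mu_n)])
    midpoint = [(a + b) / 2 for a, b in zip(mu_r, mu_n)]
    threshold = sum(w * v for w, v in zip(weights, midpoint))
    return weights, threshold


def predict(model, x):
    """True means the sample falls on the responsive side of the plane."""
    weights, threshold = model
    return sum(w * v for w, v in zip(weights, x)) > threshold


def loocv_accuracy(samples, labels):
    correct = 0
    for i in range(len(samples)):
        train = [(s, l) for k, (s, l) in enumerate(zip(samples, labels)) if k != i]
        resp = [s for s, l in train if l]
        non = [s for s, l in train if not l]
        correct += predict(fit_lda(resp, non), samples[i]) == labels[i]
    return correct / len(samples)


def toy_class(center):
    """Six hypothetical points, one unit either side of the center per axis."""
    points = []
    for axis in range(3):
        for sign in (1.0, -1.0):
            p = list(center)
            p[axis] += sign
            points.append(p)
    return points


samples = toy_class([2.0, 2.0, -2.0]) + toy_class([-2.0, -2.0, 2.0])
labels = [True] * 6 + [False] * 6
print(loocv_accuracy(samples, labels))
```

Each sample is classified by a plane fitted without it, which is why cross-validated accuracy is usually somewhat lower than the accuracy obtained when the same samples are used for both fitting and testing.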

Results
We first determined whether there was a correlation between the mRNA quantification techniques (cDNA microarray vs real-time RT-PCR) for the expression values of all nine genes studied. Samples from the technical validation group (N = 28) were used in this analysis, as their gene expression had already been evaluated by cDNA microarray. A significant positive correlation between values was found , XLHSRF-1 (heat shock regulated 1), no significant negative correlation was detected. Among the correlated genes, RPL37A was the only one with significant differential expression in both techniques for the same 28 samples, indicating lower expression in non-responsive than in responsive tumors (cDNA microarray, P = 0.021 vs RT-PCR, P = 0.011; Mann-Whitney test).
Our next step was to examine whether real-time RT-PCR expression values from these nine genes grouped into trios could provide results similar to those obtained by cDNA microarray in separating responsive from non-responsive tumors, using linear discriminant analysis. Initially, we evaluated the three-dimensional distribution of tumors according to the expression of five gene trios previously identified. Samples from the technical validation group were used to generate the best separation plane to discriminate responsive from non-responsive tumors. Among the five trios, the best separation of tumors was achieved by the RPL37A, XLHSRF-1 based trios (with NOTCH1 or NUP210, as third genes), which could correctly classify 86% (21/23 responsive) and 60% (3/5 non-responsive) of the technical validation group. In a cross-validation analysis (leave-one-out) 82% of samples maintained the adequate separation considering both gene trios (20/23 responsive and 3/5 non-responsive). Conversely, the other three trios presented less than 50% correct sample discrimination in cross-validation analysis. Next, samples from the biological validation group were spatially distributed using the pre-established features from these two trios, resulting in a 71% (8/11 responsive and 2/3 non-responsive) correct classification by both of them (Figure 1A and B).
Since neither of these trios could discriminate samples with the same accuracy as in previous cDNA microarray analysis, we decided to search for other predictive trios, combining the real-time RT-PCR expression values of genes with at least a trend towards a positive correlation between techniques (MTSS1, NUP210, PRSS11, RPL37A, and SMYD2). Expression of RPL37A, SMYD2 and MTSS1 presented the highest accuracy, correctly separating 93% of the technical validation group samples (22/23 responsive; 4/5 non-responsive) and 86% accuracy by cross-validation analysis (21/23 responsive and 3/5 non-responsive). This trio was further tested using samples from the biological validation group, which were spatially distributed using the pre-established features, resulting in 79% proper classification (8/11 responsive and 3/3 non-responsive) (Figure 1C). Hence, this predictive model presented 72% sensitivity, 100% specificity, and positive and negative predictive values of 100 and 50%, respectively, in discriminating responsive tumors from an independent sample group.
As this newly identified trio was not one of the top 10 trios detected by cDNA microarray analysis, we re-evaluated its strength in separating responsive from non-responsive samples from all 44 patients included in our previous study (5). cDNA microarray expression for the 28 patients included in the technical validation group (who had samples analyzed by both cDNA microarray and RT-PCR) was used to delineate a separating plane, which correctly classified 89% of the samples (21/23 responsive and 4/5 non-responsive). Cross-validation analysis also demonstrated 86% accuracy (20/23 responsive and 4/5 non-responsive). Thus, using the same samples and gene expression evaluated by cDNA microarray, four samples were erroneously classified. However, only two of them were misclassified by both RT-PCR and cDNA microarray analysis (one of them responsive and the other non-responsive). The remaining 16 samples of the 44 included in our previous study, which were analyzed only by cDNA microarray (there was not enough material available for RT-PCR analysis), were later included as another validation group. The median age of this group of patients was 46 years, 94% had invasive ductal carcinoma (75% ER+, 69% PR+ and 56% ErbB2+), clinical stage III disease (mean dimension of the primary tumor before chemotherapy: 82 mm), and 75% of them were considered to be responsive after treatment. In this new biological validation group of 16 patients, 81% of the samples were properly separated (11/12 responsive and 2/4 non-responsive) (Figure 1D). This model presented 92% sensitivity, 50% specificity, and positive and negative predictive values of 85 and 67%, respectively, in discriminating responsive tumors.
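The performance figures reported for the RPL37A, SMYD2 and MTSS1 trio follow directly from the classification counts given in the text, with response treated as the positive class. The short sketch below recomputes them; the function name is hypothetical, while the counts are those reported for the two biological validation groups (8/11 and 3/3 by RT-PCR; 11/12 and 2/4 by cDNA microarray).

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard two-class performance measures from confusion-matrix counts.

    tp: responsive tumors classified as responsive
    fn: responsive tumors classified as non-responsive
    tn: non-responsive tumors classified as non-responsive
    fp: non-responsive tumors classified as responsive
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# RT-PCR, biological validation group (N = 14)
rt_pcr = diagnostic_metrics(tp=8, fn=3, tn=3, fp=0)
# cDNA microarray, additional validation group (N = 16)
microarray = diagnostic_metrics(tp=11, fn=1, tn=2, fp=2)

for name, result in (("RT-PCR", rt_pcr), ("microarray", microarray)):
    print(name, {k: round(100 * v, 1) for k, v in result.items()})
```

With only three or four non-responsive tumors per validation group, specificity and negative predictive value shift by 25 to 33 percentage points when a single sample changes class, so these estimates should be read with that in mind.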

Discussion
We previously performed a more comprehensive analysis of tumor gene expression profiles using cDNA microarray, in which ten trios of genes presented high accuracy in discriminating samples according to drug response (5). We also examined whether expression of some of these trios could predict the response of canine mammary carcinomas using an in vitro tissue slice culture and RT-PCR assay. In this model, however, they could not predict in vitro responsiveness to doxorubicin. In this case, inter-species genetic heterogeneity may have contributed to a distinct gene expression pattern associated with tumor response (12).
In the present study, we observed that expression of the same trios of genes previously identified and now evaluated by real-time RT-PCR did not provide predictive information with the same accuracy as that obtained with cDNA microarray data, even when the same 28 samples (of 44) previously analyzed were used. This can be explained in part by the fact that only one among the five trios included all three genes (PRSS11, SMYD2 and MTSS1) with at least a trend towards a positive correlation, considering the expression values obtained by the two techniques (cDNA microarray and real-time RT-PCR).
This difficulty in reproducing, by real-time RT-PCR, gene expression profiles identified by cDNA microarray has been reported by other investigators, and a wide range of validation rates, represented by significant positive correlations between the techniques, was detected (34-71%) (5,6,13-16). This can be explained by disparities in technical principles and normalization approaches between these two methods. Microarray data are frequently normalized globally, and therefore the overall expression level across all genes is assumed to be constant among samples. Conversely, RT-PCR expression data are normalized by a much smaller number of reference genes (often only one), assumed to be constant, despite the fact that no single gene is expressed at a constant level in all biological samples. Based on these observations, we took extreme care in choosing reference genes for RT-PCR in order to select those with stable expression among all cancer samples and between groups (response vs non-response).
Using RT-PCR expression values of the pre-selected genes, we identified the RPL37A, SMYD2 and MTSS1 trio as a new, potential predictive marker. This discriminating power could represent the combined effect of these three genes within the process of response to chemotherapy. A higher expression of RPL37A and MTSS1 and a lower expression of SMYD2 contributed to graphic localization of responsive compared to non-responsive samples, using both procedures (cDNA microarray and RT-PCR). RPL37A encodes a ribosomal protein that is a component of the 60S subunit. In agreement with our data, higher expression of structural constituents of ribosomes was described in breast cancer specimens responsive to letrozole in the neoadjuvant setting (17). SMYD2 may repress transcriptional p53 activity by lysine methylation (Lys172), exerting an oncogenic and drug resistance action through inhibition of p53-mediated cell death pathways (18). MTSS1 codes for a metastasis suppressor protein and participates in the assembly of actin filaments. Its negative regulation correlates with proliferation, adhesion loss and invasion, and a higher breast tumor expression of MTSS1 was correlated with increased patient overall survival and disease-free survival (19).
Although the RPL37A, SMYD2, MTSS1 trio does not overlap with other identified predictive gene expression profiles, we believe that a wide range of gene combinations can be used to predict tumor response to chemotherapy. Accordingly, differential profiles have been shown to identify samples with the same prognosis, indicating that various gene groups can distinguish specific tumor behaviors (20).
In conclusion, expression of the RPL37A, SMYD2, MTSS1 gene trio, as evaluated by RT-PCR, is a potential candidate for a predictive marker of response to neoadjuvant AC chemotherapy in breast cancer patients.

Figure 1. Three-dimensional distribution of samples from technical and biological validation groups according to the expression of gene trios. A, RPL37A, XLHSRF-1, NOTCH1. B, RPL37A, XLHSRF-1, NUP210. C and D, RPL37A, SMYD2, MTSS1. A, B, C, D, The expression of gene trios was evaluated in samples from a technical validation group (N = 28, circles) and biological validation group (N = 14, crosses). Expression values from samples of the technical validation (training) group were first used to generate a separation plane (green line) between responsive (green) and non-responsive tumors (red), using linear discriminant analysis. Then, samples of the biological validation group were spatially distributed according to these pre-established features to determine the accuracy of the model. A and B, Gene trios previously identified in cDNA microarray analysis had their expression now evaluated by RT-PCR and these results are shown as their natural logarithm values. C and D, Gene trio newly identified by RT-PCR analysis. C, Expression values, as evaluated by RT-PCR, are shown for samples of the technical (N = 28) and biological validation groups (N = 14). D, Expression values evaluated by cDNA microarray experiments are shown for the technical validation group (N = 28) and for another 16 samples from patients submitted to neoadjuvant chemotherapy, who had their tumors analyzed only by cDNA microarray, and who constituted another biological validation group. Gene names shown on the axis indicate increasing values of gene expression.

Table 1. Patient characteristics and response to neoadjuvant chemotherapy.