Impact of the integration of proton magnetic resonance imaging spectroscopy to PI-RADS 2 for prediction of high grade and high stage prostate cancer

Objective To compare the predictions of dominant Gleason pattern ≥ 4 or non-organ confined disease with Prostate Imaging Reporting and Data System (PI-RADS v2) with or without proton magnetic resonance spectroscopic imaging (1H-MRSI). Materials and Methods Thirty-nine men underwent 3-tesla endorectal multiparametric MRI including 1H-MRSI and prostatectomy. Two radiologists assigned PI-RADS v2 and 1H-MRSI scores to index lesions. Statistical analyses used logistic regressions, receiver operating characteristic (ROC) curves, and 2x2 tables for diagnostic accuracies. Results Values are given for reader 1, with reader 2 in parentheses. For high-grade prostate cancer (PCa), the sensitivities of 1H-MRSI and PI-RADS v2 were 85.7% (57.1%) and 92.9% (100%), and the specificities were 56.0% (68.0%) and 24.0% (24.0%). For extra-prostatic extension (EPE), the sensitivities of 1H-MRSI and PI-RADS v2 were 64.0% (40.0%) and 20.0% (48.0%), and the specificities were 50.0% (57.1%) and 71.4% (64.3%). The areas under the ROC curves (AUC) for prediction of high-grade prostate cancer were 0.65 and 0.61 for PI-RADS v2 and 0.72 and 0.70 when combined with 1H-MRSI (readers 1 and 2, p = 0.04 and 0.21). For prediction of EPE the AUCs were 0.54 and 0.60 for PI-RADS v2 and 0.55 and 0.61 when combined with 1H-MRSI (p > 0.05). Conclusion 1H-MRSI might improve the discrimination of high-grade prostate cancer when combined with PI-RADS v2, particularly for PI-RADS v2 score 4 lesions, but it does not affect the prediction of EPE.


INTRODUCTION
Prostate cancer (PCa) is diagnosed in approximately 230,000 men in the United States each year (1), the majority of whom have favorable-risk disease for which conservative approaches, including active surveillance, may be prudent (2). Multiparametric magnetic resonance imaging (mpMRI) of the prostate is increasingly used in newly diagnosed disease to identify occult higher-grade or higher-stage elements missed by conventional biopsy (3,4). Moreover, when coupled with real-time ultrasonography, fusion mpMRI biopsy has demonstrated superior PCa detection rates compared with traditional template-guided biopsy (5).
With growing integration of mpMRI as an adjunct diagnostic modality, the need to standardize acquisition protocols and study reporting is evident, as standardization may facilitate benchmarks for consistency in both clinical care and research settings (6). The American College of Radiology, the AdMeTech Foundation, and the European Society of Urogenital Radiology have partnered and recently presented a new version of the Prostate Imaging Reporting and Data System (PI-RADS v2), which integrates results of T2-weighted (T2W), high b-value diffusion-weighted imaging (DWI), and dynamic contrast-enhanced (DCE) MRI (7). Proton MR spectroscopic imaging (1H-MRSI), previously an optional tool, was not included in the current version of the document. 1H-MRSI has, though, been recognized as a useful non-invasive method for evaluating metabolic characteristics of prostatic lesions, yielding identifiable signatures that may allow for the discrimination of high-grade tumors (8). However, 1H-MRSI is susceptible to false positives related to choline contamination from the seminal vesicles or urethra (9), or to prostatitis (10). Furthermore, the ACRIN 6659 study published by Weinreb et al. found no added benefit for 1H-MRSI compared with T2W alone to localize PCa to the gland sextant (11).
In this context, we sought to compare the diagnostic performance of PI-RADS v2 with or without 1H-MRSI for predicting PCa with dominant Gleason pattern ≥ 4 or non-organ confined disease at the time of surgery.

MATERIALS AND METHODS
The Institutional Review Board approved this retrospective single-center study. Informed consent was prospectively obtained from all patients authorizing the use of clinical data in future studies. Consecutive subjects were identified through searches of our Urological Oncological Database, Prostate MR Imaging Database, and electronic medical records. Inclusion criteria: biopsy-proven PCa; 3-tesla endorectal prostate mpMRI, including 1H-MRSI; radical prostatectomy within six months of imaging; no treatments between imaging and surgery.
Forty patients seen between January 2013 and December 2014 fulfilled these criteria, but one was excluded because of a hip replacement that distorted the 1H-MRSI data. Therefore, 39 men formed the study population. Patients were clinically risk stratified using the Cancer of the Prostate Risk Assessment score (CAPRA) (12). CAPRA is an easy-to-calculate, validated nomogram that predicts outcomes across multiple treatment approaches, including an individual's likelihood of metastasis, cancer-specific mortality, and overall mortality. The score is calculated using points assigned to: age at diagnosis, PSA at diagnosis, Gleason score of the biopsy, clinical stage, and percent of biopsy cores involved with cancer. Three categories were assigned: low (scores 0-2), intermediate (scores 3-5), and high risk (scores 6-10).
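The three CAPRA risk categories described above can be sketched as a simple mapping from the total score to a category. This is a minimal illustration only: the function name is hypothetical, and the point assignments that produce the total score are not reproduced here because they are not given in this text.

```python
def capra_risk_category(score: int) -> str:
    """Map a total CAPRA score (0-10) to the risk category used in this study.

    Hypothetical helper for illustration; the cut-offs (0-2, 3-5, 6-10)
    are the ones stated in the text.
    """
    if not 0 <= score <= 10:
        raise ValueError("CAPRA score must be between 0 and 10")
    if score <= 2:
        return "low"
    if score <= 5:
        return "intermediate"
    return "high"


# Example usage with illustrative scores (not patient data)
for s in (1, 4, 7):
    print(s, capra_risk_category(s))
```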

MRI technique
Scans were acquired on a 3-tesla scanner (GE Healthcare, Waukesha, WI, USA) using the body coil for excitation and an endorectal coil (E-Coil; Medrad, Pittsburgh, PA, USA) filled with perfluorocarbon (Flutech_T14™; F2 Chemicals, UK) and a phased-array coil for reception. Images were post-processed to compensate for the reception profile of the endorectal coil (13).
3D 1H-MRSI was acquired using a water- and lipid-suppressed double-spin-echo point-resolved spectroscopy sequence (MLEV-PRESS) with spectral-spatial pulses for the two 180° pulses, and outer-voxel saturation pulses (thickness/gap = 3 mm/0 mm; TR/TE = 2000 ms/85 ms; NEX = 1; phase encoding steps = 16 × 10 × 8; FOV = 86 × 54 × 43 mm³, yielding a nominal spatial resolution of 0.16 cm³). A PRESS volume was selected using the oblique axial T2W images that incorporated the entire prostate while minimizing inclusion of the rectum and peri-prostatic lipids. The PRESS volume was shimmed using an automated phase mapping algorithm, followed by manual shimming of the x, y and z gradients until a water line-width of ≤ 12 Hz was obtained. An interleaved flyback echo-planar spectroscopic readout with a spectral bandwidth of 1012 Hz was used in the left-right dimension. Acquisition time = 7 min 50 s.
The 1H-MRSI data were processed using custom processing software (14). The raw data acquired with the modified PRESS incorporating the flyback echo-planar readout trajectory were reordered as previously described (15) and processed in the same manner as the conventional 4D 1H-MRSI dataset (14). The spectral data were apodized with a 2-Hz Lorentzian function in the frequency domain, with no filtering in the spatial dimensions. Data were Fourier transformed in the time domain and in three spatial domains. Spectral phase, baseline, and frequency corrections were iteratively made and metabolite peak areas calculated as previously described (14). The 3D 1H-MRSI spectral arrays and associated metabolite peak area ratios were overlaid on the corresponding transverse T2W images using the open-source spectral processing package SIVIC.
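As a quick check of the nominal spatial resolution quoted above, the voxel volume follows from dividing the field of view by the number of phase encoding steps in each dimension. The sketch below uses only the values stated in the text; the function name is illustrative.

```python
def nominal_voxel_volume_cm3(fov_mm, matrix):
    """Nominal voxel volume in cm^3 from FOV (mm) and encoding steps per axis."""
    volume_mm3 = 1.0
    for extent_mm, steps in zip(fov_mm, matrix):
        volume_mm3 *= extent_mm / steps  # voxel edge length along this axis
    return volume_mm3 / 1000.0  # 1 cm^3 = 1000 mm^3


# FOV = 86 x 54 x 43 mm, phase encoding steps = 16 x 10 x 8 (from the text)
volume = nominal_voxel_volume_cm3((86, 54, 43), (16, 10, 8))
print(f"{volume:.2f} cm^3")  # prints "0.16 cm^3"
```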
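Lorentzian line broadening of this kind is commonly applied by multiplying the time-domain signal by a decaying exponential before the Fourier transform. The sketch below illustrates that general technique under stated assumptions; it is not the authors' custom software, the function name is hypothetical, and the synthetic signal is purely illustrative. The 2-Hz broadening and 1012-Hz spectral bandwidth are taken from the text.

```python
import math


def lorentzian_apodize(fid, spectral_bandwidth_hz, line_broadening_hz):
    """Multiply a time-domain signal by exp(-pi * LB * t).

    In the frequency domain this corresponds to convolution with a
    Lorentzian of width LB (Hz). Sample n is assumed to be acquired at
    t = n / spectral_bandwidth_hz.
    """
    dwell_s = 1.0 / spectral_bandwidth_hz
    return [
        sample * math.exp(-math.pi * line_broadening_hz * n * dwell_s)
        for n, sample in enumerate(fid)
    ]


# Illustrative input: a constant complex signal, 1013 points (t = 0 to 1 s)
synthetic_fid = [complex(1.0, 0.0)] * 1013
apodized = lorentzian_apodize(synthetic_fid, 1012.0, 2.0)
```

The first point is left unchanged, and later points are progressively attenuated, which trades some spectral resolution for a smoother, less noisy spectrum.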

Image interpretation
Two radiologists (8 and 5 years of experience with 1H-MRSI and 2 years of experience with PI-RADS v2, i.e. since its initial publication), unaware of the clinical and pathologic data, independently reviewed all scans on a PACS workstation (Impax; Agfa, Mortsel, Belgium) in a single session. To mimic clinical practice, the radiologists could review the T2W, DWI, and DCE sequences in any order, alone or in conjunction. The radiologists had no access to 1H-MRSI images at this stage. Up to four suspicious foci were identified and PI-RADS v2 scores assigned to each (Table 1) (7).
Next, the radiologists reviewed the 3D spectral arrays to assign a 1H-MRSI score to all suspicious lesions previously assigned a PI-RADS v2 score, i.e. lesions that received a PI-RADS v2 score ranging from 3 to 5. All usable voxels were scored using the five-point scale based on the area ratio of the citrate and choline peaks (Table 1).

Surgical technique and histologic evaluation
Experienced urologists performed all radical prostatectomies. Pelvic lymph node dissection was performed based on pre-operative surgical risk. Prostatectomy specimens were marked with ink and fixed overnight in 10% buffered formalin. The glands were sectioned using whole-mount histology at 3 mm intervals in a plane perpendicular to the prostatic urethra. Experienced academic pathologists, unaware of imaging findings, reviewed the histological slides in all cases. The size, location, and Gleason score of all cancer foci seen in the prostate, and the presence, location, and extent of extra-prostatic disease were recorded.

Statistical analysis
The primary outcomes were the predictions of high-grade PCa, defined as Gleason score ≥ 4+3, and high-stage disease, defined as extra-prostatic extension (EPE) (≥ T3a) at radical prostatectomy on a per patient basis. In the event of multiple lesions, only the index lesion was considered for analyses. The index lesion was defined as the lesion with the highest overall PI-RADS score. If two or more lesions received the same score, the index lesion was the one associated with clear EPE. If none of the lesions demonstrated EPE, the index lesion was the largest one. We assessed the sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and accuracy of the overall PI-RADS v2 score and 1H-MRSI score assigned to suspicious lesions for the detection of these outcomes. For high-grade disease, the overall PI-RADS v2 scores 1 to 3 were considered a negative result. For non-organ confined PCa, the overall PI-RADS v2 scores 1 to 4 were considered a negative result. This was because the presence of EPE on mpMRI determines an overall PI-RADS v2 score of 5. For both analyses, 1H-MRSI was dichotomized as negative (score ≤ 3) or positive (score 4 or 5).
We compared the areas under the receiver operating characteristic (ROC) curve of univariate logistic regression models that included the overall PI-RADS v2 score or 1H-MRSI score; and those to the area under the ROC curve derived from the multivariate models that included the overall PI-RADS v2 and 1H-MRSI scores. As mentioned above, if more than one lesion was suspected on mpMRI, only the index lesion was utilized in the analyses.
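The index lesion rule, the dichotomization thresholds, and the 2x2 table metrics described above can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis code (which used Stata): the `Lesion` fields and function names are hypothetical, ties among lesions that both show clear EPE are broken by size (the text does not specify this case), and the example inputs are made up rather than study data.

```python
from dataclasses import dataclass


@dataclass
class Lesion:
    pirads: int      # overall PI-RADS v2 score (1-5)
    mrsi: int        # 1H-MRSI score (1-5)
    clear_epe: bool  # lesion associated with clear EPE on mpMRI
    size_mm: float   # greatest dimension (hypothetical field)


def pick_index_lesion(lesions):
    """Highest PI-RADS score; ties go to clear EPE, then to the largest lesion."""
    return max(lesions, key=lambda les: (les.pirads, les.clear_epe, les.size_mm))


def pirads_positive(score, outcome):
    """Scores 1-3 are negative for high-grade disease; 1-4 are negative for EPE."""
    if outcome not in ("high_grade", "epe"):
        raise ValueError("outcome must be 'high_grade' or 'epe'")
    return score >= (4 if outcome == "high_grade" else 5)


def mrsi_positive(score):
    """1H-MRSI is positive for scores 4 or 5, for both outcomes."""
    return score >= 4


def diagnostic_metrics(test_positive, disease_present):
    """Sensitivity, specificity, PPV, NPV and accuracy from paired booleans."""
    pairs = list(zip(test_positive, disease_present))
    tp = sum(1 for t, d in pairs if t and d)
    fp = sum(1 for t, d in pairs if t and not d)
    fn = sum(1 for t, d in pairs if not t and d)
    tn = sum(1 for t, d in pairs if not t and not d)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


# Illustrative lesions for one hypothetical patient (not study data)
lesions = [Lesion(4, 3, False, 12.0), Lesion(4, 5, False, 9.0), Lesion(3, 4, False, 20.0)]
index = pick_index_lesion(lesions)

# Illustrative per-patient test results and outcomes (not study data)
metrics = diagnostic_metrics([True, True, True, False], [True, True, False, False])
```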
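For a univariate logistic model with a single ordinal predictor and a positive coefficient, the predicted probability increases monotonically with the score, so the area under the ROC curve equals the rank-based (Mann-Whitney) AUC of the score itself. The sketch below computes that quantity; it is an assumption-laden illustration rather than the Stata procedure used in the study, the function name is hypothetical, and the example values are made up. For the combined model, the same function could be applied to the model's fitted probabilities.

```python
def rank_auc(scores, labels):
    """Probability that a random positive case scores higher than a random
    negative case, counting ties as one half (Mann-Whitney AUC)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustrative scores and outcomes (not study data)
example_scores = [3, 4, 4, 5, 5]
example_labels = [False, False, True, True, True]
example_auc = rank_auc(example_scores, example_labels)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, which is the scale on which the reported values should be read.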
All analyses were performed using Stata version 13.1 (College Station, TX). P values < 0.05 were considered statistically significant.
RESULTS
The specificity of 1H-MRSI (assigned to a suspicious lesion) to predict Gleason pattern ≥ 4+3 was higher than the specificity of the overall PI-RADS v2 score (56.0%, reader 1, and 68.0%, reader 2, versus 24%, both readers). For the detection of stage ≥ T3a, the use of 1H-MRSI scores to further characterize suspicious lesions led, for reader 1, to an increase in sensitivity (64% versus 20%) associated with a decrease in specificity (50% versus 71.4%). No clear differences were seen for reader 2. The performance characteristics are outlined in Table 3. Table 4 details the AUCs for the prediction of Gleason pattern ≥ 4+3 and extraprostatic disease. These results are also illustrated in Figure 2.
Analysis of the shape of the ROC curves shows that the addition of 1H-MRSI to PI-RADS v2 improves the prediction of high-grade PCa when lesions are characterized as PI-RADS v2 score 4. There were no statistically significant differences between the AUC of overall PI-RADS v2 (0.65, reader 1; and 0.61, reader 2) and 1H-MRSI (0.75, reader 1; and 0.70, reader 2) for either reader. The AUC of overall PI-RADS v2 combined with 1H-MRSI was significantly higher than the AUC of overall PI-RADS v2 alone for reader 1 (0.77; p = 0.04), but not for reader 2 (0.70; p = 0.21).

Table 1 (partial) - PI-RADS v2 assessment criteria
T2W
4 - Lenticular or non-circumscribed, homogeneous, moderately hypointense, and < 1.5 cm in greatest dimension
5 - As above, but ≥ 1.5 cm in greatest dimension or definite extraprostatic extension/invasive behavior
DWI
3 - Focal mildly/moderately hypointense on ADC and isointense/mildly hyperintense on high b-value DWI
4 - Focal markedly hypointense on ADC and markedly hyperintense on high b-value DWI; < 1.5 cm in greatest dimension
5 - As above, but ≥ 1.5 cm in greatest dimension or definite extraprostatic extension/invasive behavior
DCE
(-) No early enhancement; or diffuse enhancement not corresponding to a focal finding on T2 and/or DWI; or focal enhancement corresponding to a lesion demonstrating features of benign prostatic hyperplasia on T2WI
(+) Focal, and; earlier than or contemporaneously with enhancement of adjacent normal prostate tissue, and; corresponds to suspicious finding on T2W and/or DWI
Overall score
1 - Very low (clinically significant cancer is highly unlikely to be present)
2 - Low (clinically significant cancer is unlikely to be present)
3 - Intermediate (the presence of clinically significant cancer is equivocal)
4 - High (clinically significant cancer is likely to be present)
5 - Very high (clinically significant cancer is highly likely to be present)

DISCUSSION
Our results suggest that the addition of 1H-MRSI to PI-RADS v2 might improve the detection of PCa with Gleason pattern ≥ 4+3, in particular of PI-RADS v2 score 4 lesions; however, it does not seem to increase the detection of high stage (≥ T3a) disease.
Different from its initial version, PI-RADS v2 does not include 1H-MRSI. Yet, the PI-RADS Steering Committee encourages "the continued development of promising MRI methodologies", including 1H-MRSI (16), and states that these technologies will be considered for inclusion in future versions, pending new data. While the PI-RADS v2 document does not provide specific reasons for not including 1H-MRSI, it is known that it is a complex technique with limited acceptance outside specialized centers due to its long acquisition time, need for local expertise, and general reliance on endorectal coil imaging. Yet, 1H-MRSI warrants continued attention; new hardware and software developments may make it more manageable.
Based on previous studies, 1H-MRSI improves tumor localization (17,18), volume estimation (19,20), staging (21), tissue characterization (22), and identification of recurrent disease after therapy (23,24). A multicenter study showed that positive MR spectroscopy findings are likely to reflect higher tumor grade and/or volume (25). These studies, though, do have limitations, and there are also those with less encouraging results; the ACRIN study published in 2009, for example, showed no difference in accuracy when comparing combined T2W and 1H-MRSI with T2W alone (11).
Our results show that the overall PI-RADS v2 score is very sensitive for detecting Gleason pattern ≥ 4+3, but its specificity is very low. This suggests PI-RADS v2 is a good option to detect the disease, but not necessarily to characterize it. The use of 1H-MRSI, however, led to a 50% increase in specificity, and might at least in some cases help to identify men with high-grade PCa. Our results showed this is particularly true when a lesion receives a PI-RADS v2 score of 4. These results are aligned with those of Giusti et al., who showed that metabolic ratios correlate with Gleason scores (26), and they are similar to those of a meta-analysis in which 1H-MRSI had a higher specificity than T2W and increased the specificity of the combination of T2W and DWI (18). While the comparison of overall AUCs (i.e. summary of data for all lesions) found an improvement of discrimination between men with and without PCa Gleason pattern ≥ 4+3 using the combined PI-RADS v2 and 1H-MRSI for one reader only, the assessment of the shape of the curves shows a clear separation between the lines of the ROC curves of PI-RADS v2 alone and PI-RADS v2 combined with 1H-MRSI at the segment that includes only PI-RADS v2 score 4 lesions for both readers. It is possible that this discrepancy in results is due to differences in readers' experience. 1H-MRSI is a complex technique and interpretation can be challenging. It is important to make note of this fact, as these same challenges are likely to be found at other sites that lack radiologists with experience with 1H-MRSI.
The metabolic nature of 1H-MRSI might explain why it did not improve the detection of EPE, as EPE is typically detected on anatomical images. The overall PI-RADS v2 score, however, does include an anatomical assessment. Furthermore, both readers assigned an overall PI-RADS score of 4 or 5 to more than 80% of these suspicious lesions, and an overall PI-RADS v2 score of 5, at least in some instances, characterizes definite EPE (16). An increase in specificity after utilizing 1H-MRSI [...] (29) and, more recently, Polanec et al. (30) found that 1H-MRSI neither increased the detection nor improved the grading of PCa. While several possible explanations exist for these discrepancies, the exercise of explaining them is likely not warranted, as the first version of PI-RADS is quite different from PI-RADS v2 and should no longer be utilized. More important, perhaps, is to recognize that considerable interest exists in optimizing the identification of high-grade or high-stage disease among men with clinically localized PCa, as such determinations may improve management decisions, and that other imaging techniques, including 1H-MRSI, may be helpful.
This study has limitations. First, this was a retrospective study with the limitations inherent to this type of design. The population studied was highly selected and included only men who had endorectal mpMRI and radical prostatectomy. We, therefore, probably incurred selection bias and our patients may not fully represent all men with PCa. This is illustrated by the fact that most of our lesions were characterized as PI-RADS v2 4 and 5, as men with lower scores are less likely to have cancer and to undergo surgery. However, we considered the need for an adequate standard of reference more important than the limited generalizability. One possible alternative to prostatectomy as the standard of reference is MR-guided biopsy, which can be performed in-bore or by fusion with ultrasound. Accordingly, some of our results, in particular the positive and negative predictive values, do not apply to all men with suspected PCa nor to all men who are under active surveillance and typically have low-grade low-volume disease. Second, we examined endpoints of high-grade and/or non-organ confined disease, but not more distant oncologic endpoints including biochemical recurrence or metastatic progression. Prospective studies with extended follow-up may be warranted to definitively evaluate the role of 1H-MRSI in improved delineation of PCa outcomes. Third, PI-RADS was designed with the intent to improve the detection of tumors with Gleason score ≥ 3+4, and not ≥ 4+3 as we proposed. This could, perhaps, explain the low specificity of PI-RADS v2 found in this study. More important, though, is that we may have overestimated the diagnostic performance of both PI-RADS v2 and 1H-MRSI due to spectrum bias. Spectrum bias refers to the fact that it is usually easier to detect advanced disease than early-stage disease, as subtle abnormalities can be hard to distinguish from normal findings. This typically leads to a higher diagnostic accuracy when a study includes a population with advanced disease than when the subjects have less severe disease. We opted for characterizing as high-grade tumors only tumors with Gleason score ≥ 4+3 because many institutions consider men with Gleason 3+4 as candidates for active surveillance, while a Gleason score ≥ 4+3 is universally considered an indication for definitive therapy.
In summary, 1H-MRSI might improve the discrimination of pathological Gleason score ≥ 4+3 when added to PI-RADS v2, in particular for lesions that receive a score of 4, but it does not affect the prediction of PCa stage ≥ T3a.