Differentiation of relapsing-remitting and secondary progressive multiple sclerosis: a magnetic resonance spectroscopy study based on machine learning

ABSTRACT Introduction: Magnetic resonance imaging (MRI) is the most important tool for diagnosis and follow-up in multiple sclerosis (MS). Discriminating relapsing-remitting MS (RRMS) from secondary progressive MS (SPMS) is clinically difficult, and the approach proposed in this study may support that process. Objective: This study aimed to automatically classify healthy controls, RRMS, and SPMS using MR spectroscopy and machine learning methods. Methods: MR spectroscopy (MRS) was performed on 91 participants: healthy controls (n=30), RRMS (n=36), and SPMS (n=25). First, MRS metabolites were identified using signal processing techniques. Second, features were extracted from the MRS spectra. N-acetylaspartate (NAA) was the most significant metabolite in differentiating MS types. Last, binary classifications (healthy controls-RRMS and RRMS-SPMS) were carried out with the Support Vector Machine algorithm using the extracted features. Results: RRMS cases were differentiated from healthy controls with 85% accuracy, 90.91% sensitivity, and 77.78% specificity. RRMS and SPMS were classified with 83.33% accuracy, 81.81% sensitivity, and 85.71% specificity. Conclusions: A combined analysis of MRS and computer-aided diagnosis may be useful as a complementary imaging technique to determine MS types.


INTRODUCTION
Multiple sclerosis (MS) is an inflammatory autoimmune disorder of the central nervous system. In 2013, Lublin et al. revised the 1996 classification of MS phenotypes. They described the MS phenotypes as clinically isolated syndrome (CIS), relapsing-remitting multiple sclerosis (RRMS), and progressive multiple sclerosis (PMS). RRMS is characterized as active or non-active. PMS, which can be primary progressive (PP) or secondary progressive (SP), has four possible sub-classifications considering the disability level 1 . Clinical symptoms and findings, cerebrospinal fluid (CSF) examinations, and magnetic resonance imaging (MRI) findings have been used to diagnose MS 2,3 . In particular, the widespread use of MRI has revolutionized the diagnosis and monitoring of MS.
Recent studies have emphasized that MR spectroscopy (MRS) is a convenient alternative method to analyze MS, understand its pathogenesis, and determine its course 4,5,6,7 . Vingara et al. calculated metabolic changes in MRS data obtained from RRMS and healthy control groups by statistical methods and concluded that MRS data would be useful in clinical trials 8 . Kirov et al. characterized and followed metabolic changes between the control group and early RRMS patients using MRS data 9 . Furthermore, the N-acetylaspartate (NAA) peak decreases when the disease course shifts from RRMS to SPMS 10,11 . Pan et al. determined MRS metabolic values in RRMS, SPMS, and PPMS groups and calculated metabolic changes by statistical methods 12 . Narayana et al. compared metabolic values between PPMS and control groups using automatic analysis software 13 . Changes in NAA levels outside MS brain lesions and inside spinal plaques were studied in benign versus non-benign MS, and the values obtained were compared with healthy controls 14,15 . The current literature mostly reports metabolite changes in MS; however, only a few studies have tried to simultaneously obtain an MS diagnosis and a quantitative prediction of disease severity with the aid of MRS. The reason for this scarcity may be the difficulty in examining and interpreting MRS signals. Therefore, artificial intelligence and computer-aided diagnosis (CAD) are novel and effective methods that can help overcome these problems 16,17,18 . To the best of our knowledge, very few studies have addressed the determination of MS types with combined approaches that adopt both MRS and advanced machine learning algorithms. Ion-Margineanu et al. classified CIS, RRMS, PPMS, and SPMS patients using machine learning algorithms trained on clinical data (e.g., patient age, disease duration, and Expanded Disability Status Scale, EDSS) combined with lesion loads and magnetic resonance metabolic features 19 .
This study examined the combination of MRS and a machine learning method for binary classification of healthy controls-RRMS and RRMS-SPMS. Moreover, we discussed the effectiveness of MRS in MS diagnosis.

Patient population and imaging
MS data were obtained from 61 consecutive MS patients who voluntarily participated in this study, which was conducted in the MS Clinic of the Neurology Department at the Bezmialem University Hospital between June and December 2015. Diagnoses followed the McDonald criteria (2010) 20 . In addition, a healthy control group was created with demographic characteristics similar to those of the RRMS group. Two neurologists with clinical experience in MS, blinded to each other's assessments, confirmed the classification of participants as healthy controls, RRMS, or SPMS. Among the 61 MS patients, 36 were diagnosed with RRMS and the remaining 25 with SPMS. The healthy control group consisted of 30 participants whose age did not differ significantly from that of the RRMS group (p=0.18). Four patients with additional neurological disorders (migraine, brain tumor, etc.) were excluded from the patient group. Table 1 presents demographic and clinical features of the study population.
MRS was performed with a 1.5T Siemens Avanto® MRI scanner. MRS data were obtained from short echo time single-voxel 1H spectroscopy (SE) signals in STEAM sequence, and the parameters used were: repetition time (TR)=2000 ms, echo time (TE)=32 ms, and spectral width (SW)=1000 Hz.
For automatic identification of MS types via single-voxel spectroscopy, a CAD system was designed, consisting of four basic steps described in Figure 2: data acquisition, signal processing, feature extraction, and classification, each comprising its own sub-processing steps. All experiments were performed on a laptop running the Windows 7 operating system with a 4-core 2.4 GHz i7 processor and 16 GB of memory. The TARQUIN (Version 4.3.6) and MATLAB (Version R2010b) programs were used in all experiments.

Statistical analysis
This study used the SPSS software (Version 20.0) for all statistical analyses. Concentrations of the NAA, choline (Cho), creatine (Cr), and myo-inositol (MI) metabolites measured in the healthy control and MS groups, as well as the ratios of these metabolites, were statistically analyzed. Statistical significance was set at p<0.05. Normality was assessed using the Kolmogorov-Smirnov test and histograms. Normally distributed data were analyzed with Student's t-test, and non-normally distributed data with the Mann-Whitney U-test. In addition, boxplots of the metabolites and their ratios were produced with the OriginPro software (Version 9.3).

Single-voxel spectroscopy processing
Single-voxel spectroscopy (SVS) data were obtained from Siemens .rda files. SVS raw data were analyzed with the TARQUIN software. TARQUIN is an accurate and robust algorithm for assessing and quantifying single-voxel MRS data in the time domain 21 . TARQUIN has pre-processing and fitting modules for quantifying MRS metabolites. Eddy current correction using Klose's method 22 , water removal by Hankel singular value decomposition (HSVD), phase correction, automatic referencing, basis-set simulation, signal modeling, and constrained fitting were applied by TARQUIN for pre-processing and quantitation in either the time or frequency domain. Time-domain signals were transformed into frequency-domain ones using the Fourier transform for the actual quantification. The main metabolites of interest in this study range from 5.5 to 9.0 ppm 19,23,24 .
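The time-to-frequency step described above can be illustrated with a minimal sketch. It is not TARQUIN's implementation. It assumes a nominal 1H Larmor frequency of 63.87 MHz at 1.5 T, a spectrum centred on the water resonance at 4.7 ppm, and one particular sign convention for the ppm axis; only the 1000 Hz spectral width comes from the acquisition parameters. The function names and the synthetic signal are hypothetical.

```python
import cmath
import math


def dft_shifted(fid):
    """Discrete Fourier transform with zero frequency moved to index n // 2."""
    n = len(fid)
    spectrum = [
        sum(s * cmath.exp(-2j * math.pi * k * m / n) for m, s in enumerate(fid))
        for k in range(n)
    ]
    half = n // 2
    return spectrum[half:] + spectrum[:half]


def ppm_axis(n, sw_hz, f0_mhz, ref_ppm):
    """Chemical shift of each shifted bin (assumed sign convention)."""
    return [ref_ppm + (j - n // 2) * sw_hz / n / f0_mhz for j in range(n)]


SW_HZ = 1000.0   # spectral width reported in the acquisition parameters
F0_MHZ = 63.87   # nominal 1H Larmor frequency at 1.5 T (assumption)
REF_PPM = 4.7    # water resonance taken as the spectrum centre (assumption)
N = 256          # fewer points than a real acquisition, to keep the naive DFT fast

# Synthetic decaying signal with a single resonance placed at 2.01 ppm,
# a typical literature value for NAA (not a value taken from the study).
offset_hz = (2.01 - REF_PPM) * F0_MHZ
fid = [
    cmath.exp(2j * math.pi * offset_hz * (m / SW_HZ)) * math.exp(-(m / SW_HZ) / 0.1)
    for m in range(N)
]

magnitude = [abs(v) for v in dft_shifted(fid)]
axis = ppm_axis(N, SW_HZ, F0_MHZ, REF_PPM)
peak_ppm = axis[magnitude.index(max(magnitude))]
```

In practice a fast Fourier transform routine would replace the quadratic-time loop; the naive version is used here only to keep the sketch free of external dependencies.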
Conventional MRI and SVS data were examined by two radiology experts with at least 10 years of experience in the field. All SVS data were reviewed against quality control (QC) criteria. Following the experts' opinion, SVS spectra of insufficient quality were not included in the final data set. In addition, all SVS data met the TARQUIN quality control values for two parameters: full width at half maximum (FWHM) and signal-to-noise ratio (SNR). The FWHM obtained from TARQUIN was ≤0.15 ppm. The SNR obtained from TARQUIN was >5 19,24 .
MRS spectra and metabolite changes in healthy controls and RRMS and SPMS patients were identified with the help of these procedures. Figure 3 shows sample MR images and MRS spectra of healthy controls and RRMS and SPMS patients.

Feature extraction/selection
Each SVS spectrum comprises 1024 data points in the TARQUIN software. In this step, SVS data features were extracted, and the most representative ones were determined. The current study used the peak integration (PI) method in MATLAB to obtain significant features 25 . The PI method calculates the peak values of the most important metabolites, such as NAA, Cho, Cr, and MI, and the area under these peaks for each selected metabolite resonance. Fifteen ranges were used for short TE spectra, each integrated over a window of 0.15 ppm around the expected chemical shift of the main resonance of the metabolite 26 . These values were used as classification input.
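As an illustration of peak integration, the sketch below computes the trapezoidal area of a spectrum inside a window around each expected chemical shift. It is a minimal example under stated assumptions, not the MATLAB routine used in the study: the chemical shifts are typical literature values, the 0.15 ppm window is interpreted as total width (±0.075 ppm), only four of the fifteen ranges are shown, and the spectrum is synthetic.

```python
# Typical literature chemical shifts in ppm (assumptions, not values from the study).
SHIFTS_PPM = {"NAA": 2.01, "Cr": 3.03, "Cho": 3.21, "MI": 3.56}


def peak_integral(ppm, amplitude, center, width=0.15):
    """Trapezoidal area of the spectrum inside a window centred on `center`."""
    lo, hi = center - width / 2, center + width / 2
    pts = sorted((p, a) for p, a in zip(ppm, amplitude) if lo <= p <= hi)
    return sum((p1 - p0) * (a0 + a1) / 2 for (p0, a0), (p1, a1) in zip(pts, pts[1:]))


def lorentzian(x, center, height, hwhm=0.03):
    """Lorentzian line shape used only to build the synthetic spectrum."""
    return height / (1 + ((x - center) / hwhm) ** 2)


# Synthetic spectrum on a 0 to 4.5 ppm grid with arbitrary peak heights.
ppm = [i * 0.005 for i in range(901)]
heights = {"NAA": 9.0, "Cr": 6.0, "Cho": 6.0, "MI": 4.0}
amplitude = [
    sum(lorentzian(x, SHIFTS_PPM[m], heights[m]) for m in SHIFTS_PPM) for x in ppm
]

# One feature per metabolite: the area under its peak window.
features = {m: peak_integral(ppm, amplitude, c) for m, c in SHIFTS_PPM.items()}
```

The resulting `features` dictionary plays the role of the feature vector passed to the classifier; a real pipeline would use the processed spectrum in place of the synthetic one.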

Classification
The feature vectors obtained in the feature extraction step were used to classify healthy controls-RRMS and RRMS-SPMS. Feature standardization was carried out for each classification task. Four-fold cross-validation was used in the feature standardization step. In this method, the feature sets of the patients were randomly divided into four parts, one used for testing and the remaining three for training. This process was repeated until each of the four folds had been used as the testing set, so that the feature sets from all patients were tested.
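The fold rotation and standardization described above can be sketched as follows. This is a minimal illustration, not the study's code: the function names are hypothetical, and computing the mean and standard deviation on the training folds only is an assumption, since the text does not specify how the scaling statistics were obtained.

```python
import random
from statistics import mean, pstdev


def four_fold_indices(n_samples, k=4, seed=0):
    """Randomly split sample indices into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def standardize(train, test):
    """Z-score each feature column using training statistics only (assumption)."""
    columns = list(zip(*train))
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]  # guard against zero variance

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

    return scale(train), scale(test)


# Rotate through the folds: each fold is the test set exactly once.
folds = four_fold_indices(61)  # 61 MS patients in the RRMS-SPMS task
splits = []
for test_idx in folds:
    train_idx = [i for fold in folds if fold is not test_idx for i in fold]
    splits.append((train_idx, test_idx))
```

Each `(train_idx, test_idx)` pair would then select the rows passed to `standardize` before the classifier is trained and evaluated.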
We used a Support Vector Machine (SVM), which is frequently adopted in fields such as image processing, statistics, and machine learning. This method can classify two or more classes of linearly or non-linearly separable data. It relies on optimization techniques that attempt to find the optimal separating plane between the two classes. The SVM algorithm uses kernel functions to classify features that cannot be separated linearly. Linear, radial basis, polynomial, and Gaussian kernel functions are commonly used 27,28 . This study used the quadratic kernel function, which is a popular form of the polynomial kernel function. A polynomial kernel function whose "d" value is 1 is called a linear kernel function; when this value is 2, it is called a quadratic kernel function 29 . Determining the hyperparameters is critical to the performance of quadratic kernel functions. This study adopted grid-search and k-fold cross-validation methods to tune the hyperparameters (C, γ, r, and d). In the hyperparameter optimization process via grid-search with cross-validation, results were observed for all combinations of values in a determined interval, and the best combination was chosen for the hyperparameter group. In the grid-search method, the intervals C (2^-10, 2^-9, ..., 2^1), γ (2^-10, 2^-9, ..., 2^1), r (2^-10, 2^-9, ..., 2^1), and d (0, 1, 2, 3, 4, 5) were chosen for hyperparameter tuning. Class imbalance is a common problem in machine learning algorithms. Thus, we set the class_weight parameter to 'balanced' to adjust for class imbalance.
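The polynomial kernel and the search grid can be made concrete with a short sketch. The kernel form K(x, y) = (γ·x·y + r)^d is the convention used by common SVM libraries and is assumed here, since the text names the hyperparameters but does not state the formula; the function name is hypothetical, and training and scoring of the SVM itself are omitted.

```python
from itertools import product


def polynomial_kernel(x, y, gamma=1.0, r=0.0, d=2):
    """Polynomial kernel (gamma * <x, y> + r) ** d; d=1 is linear, d=2 quadratic."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + r) ** d


# Candidate values as listed in the text: powers of two from 2^-10 to 2^1.
powers_of_two = [2.0 ** e for e in range(-10, 2)]
C_values = powers_of_two
gamma_values = powers_of_two
r_values = powers_of_two
d_values = [0, 1, 2, 3, 4, 5]

# Every (C, gamma, r, d) combination that an exhaustive grid search would score
# with cross-validation before keeping the best-performing one.
grid = list(product(C_values, gamma_values, r_values, d_values))
```

With twelve candidates for each of C, γ, and r and six for d, the exhaustive grid contains 12 × 12 × 12 × 6 = 10,368 combinations, each of which would be evaluated by cross-validation.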

RESULTS
In this study, SVS data obtained from 30 healthy controls and 36 RRMS and 25 SPMS patients were used as datasets. First, we assessed the detectability of MS types according to metabolite changes by performing a basic statistical analysis of the dataset. Based on the analysis, the mean levels of NAA peaks were 5.93±2.92, 9.24±2.01, and 7.70±2.85 in healthy controls, RRMS patients, and SPMS patients, respectively. These values may reflect a decreasing trend in the NAA peak in progressive forms of MS. In healthy controls, the mean levels of the Cr and Cho metabolites were 2.93±1.75 and 2.83±1.86, respectively. The mean levels of the Cr and Cho metabolites were 5.88±1.41 and 5.89±1.42, respectively, in RRMS patients and 4.93±1.95 and 4.93±2.11, respectively, in SPMS patients. Figure 4 shows a box-plot with the statistical details of the dataset used in the study. As seen in Figure 4, the metabolite ranges of the RRMS and SPMS groups are close to each other. Therefore, differentiating MS types with the help of basic statistical methods is difficult.
Second, the performance of the proposed CAD system, which was developed to overcome the mentioned limitation in the differentiation of MS types, was evaluated according to accuracy (Acc), sensitivity (Sen), and specificity (Spe) parameters.
We used binary classification (healthy controls-RRMS and RRMS-SPMS) to differentiate MS types. In the first evaluation, healthy controls and RRMS patients were categorized in binary classification. Forty-six SVS data randomly selected from the dataset were used for training, and the remaining 20 were used for tests (70% training, 30% test). Table 2 presents the results obtained.
According to the test results, 10 of the 11 patients diagnosed with RRMS and 7 of the 9 individuals considered healthy controls by neurologists were correctly classified by the proposed CAD system. Acc, Sen, and Spe of the CAD system were 85%, 90.91%, and 77.78%, respectively. Furthermore, RRMS and SPMS patients were classified using the SVM method. Forty-three MRS data were used for training, and the remaining 18 MRS data were used for testing. Table 3 reports the test results of the RRMS and SPMS classification.
According to the experiments, 9 of the 11 patients diagnosed with RRMS and 6 of the 7 patients diagnosed with SPMS by neurologists were correctly classified by the proposed CAD system. Consequently, Acc of the system was 83.33%.
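The reported figures follow directly from the counts given above. The short check below treats RRMS as the positive class in both tasks, which is the reading consistent with the reported sensitivity and specificity values; the function name is hypothetical.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as percentages."""
    total = tp + fn + tn + fp
    accuracy = 100.0 * (tp + tn) / total
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return accuracy, sensitivity, specificity


# Healthy controls vs RRMS: 10 of 11 RRMS and 7 of 9 controls correct.
hc_vs_rrms = binary_metrics(tp=10, fn=1, tn=7, fp=2)

# RRMS vs SPMS: 9 of 11 RRMS and 6 of 7 SPMS correct.
rrms_vs_spms = binary_metrics(tp=9, fn=2, tn=6, fp=1)
```

For the first task this gives 17/20 = 85% accuracy, 10/11 ≈ 90.91% sensitivity, and 7/9 ≈ 77.78% specificity; for the second, 15/18 ≈ 83.33% accuracy, 9/11 ≈ 81.8% sensitivity, and 6/7 ≈ 85.71% specificity.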
The second evaluation used a k-fold (k=4) cross-validation technique 30 . In this method, the SVS dataset was randomly divided into four parts, one used for testing and the remaining three for training. Tables 4 and 5 describe the binary classification results of the 4-fold cross-validation.
As shown in Table 5, the 4-fold cross-validation results of RRMS and SPMS were: Acc: 81.96±4.91%, Sen: 83.33±5.55%, and Spe: 80±5.15%.
Our study examined the usability of MRS in MS and identified MS types using machine learning approaches. The literature has few studies addressing MS detection and classification based on machine learning and MRS data 19 . For example, Vingara et al. distinguished MRS data from RRMS and control groups with an accuracy of 86% using advanced statistics 8 . In contrast to their methodology, we applied machine learning algorithms instead of statistical methods, which allowed us to differentiate between healthy controls, RRMS patients, and SPMS patients. To the best of our knowledge, our study is the first to demonstrate the possibility of automatic differentiation between healthy controls, RRMS cases, and SPMS cases with high accuracy using machine learning methods. According to our results, SVS combined with machine learning approaches has the potential to contribute further to identifying MS types. Differentiating between healthy controls, RRMS cases, and SPMS cases is clinically important since the type of MS determines the treatment strategy. If the RRMS-SPMS differentiation occurs at a very early stage, the treatment algorithm can be organized accordingly 36 .
Corroborating other studies 5,6 , we also found that the most determining metabolite in distinguishing MS types is NAA. Abd El-Rahman et al. have stated that RRMS and SPMS patients can be identified with the help of MRS; however, they did not use a computer-aided machine learning method. The same study detected a significant decrease in NAA and Cr peaks in the MS plaques of SPMS patients, whereas the Cho peak showed no significant changes 11 . Similarly, a study presented by Aboul-Enein reported decreases in NAA, Cho, and Cr peaks in parallel with increasing disease severity 10 . In our study, the most significant change was observed in the NAA peak. Certain decreases were found in Cho and Cr peaks, but in contrast to the studies mentioned above, the NAA/Cr and NAA/Cho ratios showed no significant differences. Furthermore, MI peak levels decreased with the progress of the disease.
Some limitations of our study and areas for future research should be mentioned. The most important factor determining the success of our approach is the training dataset. If the MRS dataset is enriched with more healthy control, RRMS, and SPMS samples, the performance of our method is expected to increase because the classifier can learn MS cases better. Another limitation of the study was that MRS data were obtained from a single MR scanner. In future studies, the proposed CAD system can be evaluated with MRS data collected from different MR scanners. Moreover, a future study is planned in which RRMS, SPMS, and PPMS will be compared separately with sufficient numbers of patients in each group. Also, a new feature extraction method can be proposed for MRS data.
In conclusion, we have investigated the ability of SVS associated with a machine learning approach in differentiating between healthy controls, RRMS cases, and SPMS cases.

DISCUSSION
The literature has emphasized that MRS may be used as a complementary imaging technique in the follow-up and for understanding the disease mechanisms 4,7 . Moreover, the NAA metabolite should be taken into consideration when determining MS types. Other metabolites do not demonstrate any significant change regarding disease classification 5,6 . Many studies have analyzed the changes in metabolite levels to diagnose MS. However, MRS is still not a preferred imaging technique for MS diagnosis. This may be related to several factors, such as the difficulty radiologists face in analyzing and interpreting MRS signals, the lack of precise imaging standardization, and the inability to achieve the intended specificity and sensitivity in clinical practice 31 .
CAD approaches based on MRS are generally recommended to detect tumors, determine tumor grades, and differentiate tumors from other brain lesions 32,33,34,35 . We found that healthy controls-RRMS and RRMS-SPMS can be differentiated with a moderate degree of sensitivity and specificity. In future work, novel CAD approaches combined with MRS might provide supportive means for MRI to diagnose and classify different MS types.