An Efficient Skin Cancer Diagnostic System Using Bendlet Transform and Support Vector Machine

Skin is the outermost and largest organ of the human body and protects it from external agents. Among the diseases affecting the skin, melanoma (skin cancer) is the most dangerous and deadliest. Although it is one of the most dangerous forms of cancer, it has a high survival rate provided it is diagnosed early. In this study, a skin cancer classification (SCC) system is developed using dermoscopic images. Diagnosis is treated as a classification problem, with Bendlet Transform (BT) coefficients as features and a Support Vector Machine (SVM) as the classifier. First, unwanted information such as hair and noise is removed using median filtering. Then, a directional representation based feature extraction system that precisely classifies curvature, location and orientation is employed. Finally, two SVM classifiers are designed for the classification. The performance of the Bendlet based SCC system is superior to that of other image representation systems such as Wavelets, Curvelets, Contourlets and Shearlets.


INTRODUCTION
Nowadays, the incidence and mortality rate of skin cancer are increasing worldwide. Hence, an automated SCC system is required for early detection and prevention of skin cancer. Dermoscopic imaging is the primary technique for the diagnosis of skin cancer. Ruiz et al. (2011) developed a decision support system for SCC. It uses the ABCD (Asymmetry, Border, Colour and Diameter) rule with descriptors such as modification ratio, anisotropy, sharpness variation, roundness and the average of the colour components. Three classifiers based on k-Nearest Neighbour (kNN), Bayesian and multilayer perceptron models were then employed for classification using the selected features. A deep neural network based SCC system was developed by Esteva et al. (2017). Only biopsy-proven images were used to train and test the Google Inception architecture.
A texture and colour feature based SCC system was implemented by Almansour et al. (2016). Colour features were extracted from different colour spaces such as RGB, YCbCr, CIE Lab and HSV. Texture features were extracted using local binary pattern and grey level co-occurrence matrix techniques. Finally, an SVM classifier classifies the dermoscopic images. A combination of three different feature types was used for SCC by Nasir et al. (2018). Colour features from four colour spaces, shape features from the histogram of oriented gradients, and fractal features were combined. From the combination, dominant features were selected and classified using an SVM classifier.
The ABCD rule was applied to an SCC system by Ozkan & Koklu (2017). It uses four different classifiers, namely kNN, SVM, decision tree and Artificial Neural Network (ANN), to classify the dermoscopic images. The parameters of the ABCD rule were determined by image processing approaches such as de-noising, segmentation and enhancement (Zaqout 2016). Then, a total dermoscopic score was computed and used for the classification. Different combinations of features from the ABCD rule were analyzed by Ma et al. (2017) for SCC using an SVM classifier. The ABCD rule with a 7-point checklist (De vita et al. 2012) was used for the diagnosis of skin cancer.
Sonia (2016) developed a Contourlet Transform based SCC system. It uses a non-sub-sampled version of the transform for feature extraction and a Bayesian classifier. 10-fold cross-validation was used for performance analysis. A wavelet transform based SCC system was developed by Jain et al. (2012). The approximation coefficients from the decomposed dermoscopic images were given to a probabilistic neural network. A Wavelet and Curvelet transform based SCC system was described by Abu Mahmoud et al. (2013). Statistical features were extracted from the sub-band coefficients, and the feature space dimension was then reduced by principal component analysis. It uses an ANN trained with the back propagation algorithm for the classification.
In this paper, an SCC system for dermoscopic images using BT and an SVM classifier is presented. The rest of this article is organized as follows. The SCC system with BT features and SVM classifier, together with the materials used, is described in Materials and Methods. The results obtained by the SCC system are presented in Results and Discussions, and finally the conclusion of this study is given.

MATERIALS AND METHODS
The SCC system consists of three modules: preprocessing, feature (information) extraction and classification. The features extracted from the dermoscopic images degrade the overall performance if they contain redundant data such as noise and hair. Hence, this unwanted information is removed by median filtering in the preprocessing module. The preprocessing step should preserve edges and curves in dermoscopic images while removing noise. Non-linear filtering offers a potential improvement over linear filtering (Arias-Castro & Donoho 2009) where the preservation of edges and curves is critically important, as in medical image analysis. Hence, median filtering is used in this study, which removes not only noise but also hair in the dermoscopic images. In the feature extraction module, the preprocessed dermoscopic image is represented by BT and then the energies of the sub-bands are extracted as features. In the classification module, the extracted features are classified using the SVM classifier. The framework of the proposed skin cancer diagnostic system is shown in Figure 1.
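The median filtering step in the preprocessing module can be sketched as follows. This is a minimal illustration on a grayscale image stored as a list of rows; the 3 x 3 window and the edge-clamping border rule are assumptions, since the text does not state the window size or border handling, and the function name `median_filter` is hypothetical.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2-D grayscale image (list of rows).

    Border pixels are handled by clamping coordinates to the image edge
    (an assumption; the paper does not state its border handling).
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr = min(max(r + dr, 0), rows - 1)  # clamp row index
                    cc = min(max(c + dc, 0), cols - 1)  # clamp column index
                    window.append(image[rr][cc])
            out[r][c] = median(window)
    return out


# Toy image: a flat region with one impulse pixel (255) next to a vertical edge.
noisy = [
    [10, 10, 10, 200, 200],
    [10, 255, 10, 200, 200],
    [10, 10, 10, 200, 200],
]
clean = median_filter(noisy)
```

In this toy case the isolated impulse is replaced by its neighbourhood median while the step between the 10-valued and 200-valued columns stays in place, which is the edge-preserving behaviour the paragraph relies on.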

BT based feature extraction
Medical images consist of regions with different kinds of tissue that are separated by smooth curves. These curves carry much of the information in the image and are piecewise in nature, i.e., defined by multiple functions. Hence, extracting these curves helps to classify the images. Several multi-resolution analyses have been used in recent years to extract such curves. The directional representation systems (Curvelet (Donoho et al. 2005), Contourlet (Do & Vetterli 2005) and Shearlet (Lim 2010)) differ from regular wavelets (Mallat 2008) in that the number of directions or orientations varies with the level of decomposition. This property locates the boundary curves precisely and also provides rich directional information. One advantage of Shearlets is that they detect non-smooth corner points. However, the drawback of the existing systems is that they cannot classify curvature precisely. BT overcomes this by adding a further parameter to Shearlets, called bending. The construction of BT from Shearlets is as follows. The Shearlet Transform $ST(\psi)$ can be written as

$$\psi_{a,s,t} = T_t \, D_{A_a} \, D_{S_s} \, \psi \tag{1}$$

where $T_t$ is the translation operator, $D_{A_a}$ is the dilation operator and $D_{S_s}$ is the shearing operator. The operators $T$ and $D$ are defined by Lim (2010) as

$$T_t f(x) = f(x - t) \tag{2}$$

$$D_M f(x) = |\det M|^{-1/2} f(M^{-1} x) \tag{3}$$

The scaling matrix and the shearing matrix are given below:

$$A_a = \begin{pmatrix} a & 0 \\ 0 & \sqrt{a} \end{pmatrix} \tag{4}$$

$$S_s = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix} \tag{5}$$

where $s$ is an integer. The scaling operator in (4) is called parabolic scaling (Lessig et al. 2017). The main difference between BT and the Shearlet transform lies in the scaling operation, where parabolic scaling is replaced by the α-scaling method. It is defined by Lessig et al. (2017) as

$$A_{a,\alpha} = \begin{pmatrix} a & 0 \\ 0 & a^{\alpha} \end{pmatrix}, \quad a > 0 \ \text{and} \ \alpha \in [0, 1] \tag{6}$$

where the scaling anisotropy is defined by the α parameter (Lessig et al. 2017). With this α-scaling method, the decay rate of bendlets characterizes the curvature and direction of different regions in an image more accurately. More information about the different image representation systems can be found in Wavelets (Mallat 2008), Curvelet (Donoho et al. 2005), Contourlet (Do & Vetterli 2005), Shearlet (Lim 2010) and Bendlet (Lessig et al. 2017).
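As a small worked check of how α-scaling relates to parabolic scaling, assume the usual forms of the two scaling matrices, $A_a = \mathrm{diag}(a, \sqrt{a})$ for parabolic scaling and $A_{a,\alpha} = \mathrm{diag}(a, a^{\alpha})$ for α-scaling. Setting α = 1/2 then recovers parabolic scaling, while other values of α change how strongly the second axis is scaled. The values a = 1/4 and α = 1/3 below are chosen only for illustration.

```latex
A_{a,\alpha} = \begin{pmatrix} a & 0 \\ 0 & a^{\alpha} \end{pmatrix},
\qquad
A_{a,1/2} = \begin{pmatrix} a & 0 \\ 0 & \sqrt{a} \end{pmatrix} = A_a .

% Illustrative values with a = 1/4:
A_{1/4,\,1/2} = \begin{pmatrix} 1/4 & 0 \\ 0 & 1/2 \end{pmatrix},
\qquad
A_{1/4,\,1/3} = \begin{pmatrix} 1/4 & 0 \\ 0 & 4^{-1/3} \end{pmatrix}
\approx \begin{pmatrix} 0.25 & 0 \\ 0 & 0.63 \end{pmatrix}.
```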
After the dermoscopic image is represented by the image representation system, characterizing the texture in each sub-image is very important. This step also reduces the dimension of the feature space. In this study, the mean of the magnitude of each BT sub-image is extracted for the SCC system. It is defined by

$$E = \frac{1}{R \times C} \sum_{i=1}^{R} \sum_{j=1}^{C} |B(i, j)|$$

where $B$ represents the sub-image of size R x C and (i, j) are the sub-image coordinates.
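The feature described here, the mean of the coefficient magnitudes of each sub-image, can be sketched as below. The sub-bands are assumed to be already available as lists of rows (the bendlet decomposition itself is not shown), and the names `mean_magnitude` and `feature_vector` are hypothetical.

```python
def mean_magnitude(subband):
    """Mean of |coefficient| over an R x C sub-band given as a list of rows."""
    rows = len(subband)
    cols = len(subband[0])
    total = sum(abs(value) for row in subband for value in row)
    return total / (rows * cols)


def feature_vector(subbands):
    """One feature per sub-band: its mean coefficient magnitude."""
    return [mean_magnitude(sb) for sb in subbands]


# Two toy 2 x 2 sub-bands standing in for BT sub-images.
subbands = [
    [[1.0, -3.0], [2.0, -2.0]],
    [[0.5, 0.5], [-0.5, -0.5]],
]
features = feature_vector(subbands)
```

Because each sub-band contributes a single number, the length of the feature vector equals the number of sub-bands rather than the number of pixels, which is how this step shrinks the feature space.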

SVM Classifier
The decision function of a linear SVM classifier is

$$f(t) = w \cdot t + b$$

where $w$ and $b$ are the weight and bias values, respectively, computed using the training set $T = \{(t_i, c_i),\ i = 1, 2, 3, \ldots, N\}$, in which $t_i$ is a feature vector and $c_i$ is its class label. The optimal separating hyperplane using these parameters can be written (Erasto 2001) as

$$\min_{w,\,b,\,\xi} \ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i \quad \text{subject to} \quad c_i (w \cdot t_i + b) \ge 1 - \xi_i, \ \ \xi_i \ge 0$$

where $C$ is the user-defined parameter that controls the trade-off between the empirical risk and the model complexity. The above formulation can be rewritten for the non-linear case (Erasto 2001) as

$$f(t) = \sum_{i=1}^{N_s} \beta_i \, K(s_i, t) + b$$

where $s_i$ are support vectors that form a subset of $T$. The support vectors and the coefficients $\beta_i$ are computed from $T$ via structural risk minimization. The Radial Basis Function (RBF) kernel, which is Gaussian in nature, is used in this study. The RBF with a window width of σ (Erasto 2001) is defined as

$$K(x, y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right)$$

More information about SVM can be found in Cortes & Vapnik (1995). The classification system is implemented using the dermoscopic images in the PH2 database. This database comprises three classes of dermoscopic images (normal, benign and malignant) and therefore requires a 3-class classification system. Though the original SVM classifier deals with binary (2-class) classification, it can be extended to multiclass classification by single machine methods (modifying the optimization problem) and decomposition strategies (one-vs-all and one-vs-one techniques). However, these methods involve a larger optimization problem, which is computationally more expensive (Rosales-Perez et al. 2018). Hence, a simple strategy of 2-phase classification (Thivya et al. 2016) is employed, where the classification scenario in each phase is a 2-class problem.
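To make the kernel form of the classifier concrete, here is a minimal sketch of a Gaussian RBF kernel with window width σ and of a kernel SVM decision function evaluated on given support vectors. It assumes the common form exp(-||x - y||^2 / (2σ^2)) for the kernel; the support vectors, coefficients and bias are hypothetical toy values, not a trained model, and no training procedure is shown.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decision_value(x, support_vectors, coeffs, bias, sigma):
    """Kernel SVM decision function: sum_i coeff_i * K(s_i, x) + bias.

    Each coefficient is assumed to already include the label of its support vector.
    """
    return bias + sum(
        c * rbf_kernel(s, x, sigma) for s, c in zip(support_vectors, coeffs)
    )


def predict(x, support_vectors, coeffs, bias, sigma):
    """Return +1 or -1 according to the sign of the decision value."""
    return 1 if decision_value(x, support_vectors, coeffs, bias, sigma) >= 0 else -1


# Hypothetical toy model: one support vector per class.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
coeffs = [-1.0, 1.0]
bias = 0.0
sigma = 1.0

label_near_negative = predict([0.1, 0.0], support_vectors, coeffs, bias, sigma)
label_near_positive = predict([1.9, 2.0], support_vectors, coeffs, bias, sigma)
```

Libraries that parameterize the kernel as exp(-γ ||x - y||^2) express the same kernel with γ = 1 / (2σ^2).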
In the 1st phase, the available images in the PH2 database are grouped into two classes: one with negative labels, consisting of normal dermoscopic images, and another with positive labels, consisting of abnormal (benign and malignant) dermoscopic images. The task is to classify the given image as either normal or abnormal (NoAb) using an SVM classifier (SVM-I in Figure 1). The performance of the SVM model is validated using k-fold cross-validation.
The k-fold cross-validation procedure is as follows. First, the dermoscopic images in the two classes (normal and abnormal) are randomly divided into k groups (folds) such that the number of images in each fold is approximately equal. The images in one of the k folds are treated as the test set and the images in the remaining k-1 folds are treated as the training set. The SVM classifier is trained using the training set and then evaluated using the test set. The performance metrics are computed and retained for that fold. The same steps are repeated for every fold, and the performance metrics of all folds are then summarized. In machine learning, the commonly used k value is 10 (Ozkan & Koklu 2017, Sonia 2016). Figure 2 shows the cross-validation procedure with 10 folds.
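The fold construction and the train/test loop described above can be sketched as follows. This is an illustrative outline only: the split is a plain random split over sample indices (whether the original split was stratified by class is not stated), the per-fold scores are summarized by a simple mean, and `train_and_score` is a hypothetical callable standing in for training the SVM and computing one metric.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, train_and_score, k=10, seed=0):
    """Run k-fold cross-validation.

    train_and_score(train_idx, test_idx) must return one score for the fold.
    Returns the mean score and the list of per-fold scores.
    """
    folds = kfold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # All folds except the i-th one form the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / len(scores), scores


# With 200 images and k = 10, each fold holds 20 test images.
folds = kfold_indices(200, k=10)


def dummy_scorer(train_idx, test_idx):
    """Stand-in scorer: 1.0 if the split is a valid 180/20 partition."""
    return float(len(train_idx) == 180 and set(train_idx).isdisjoint(test_idx))


mean_score, fold_scores = cross_validate(200, dummy_scorer, k=10)
```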
The next phase of classification is activated only when the result of the 1st phase classification is abnormal. In the 2nd phase of classification (SVM-II), the benign dermoscopic images are considered as one class with negative labels and the malignant images as another class with positive labels. In this phase, the abnormal image is further classified as benign or malignant (BeMa). The performance of the system at this phase is also evaluated using k-fold cross-validation, which uses only the benign and malignant dermoscopic images.
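The 2-phase decision logic can be summarized in a few lines. In this sketch the two classifiers are passed in as hypothetical callables that return -1 or +1, matching the negative/positive labelling described above; the toy threshold rules only stand in for trained SVM-I and SVM-II models.

```python
def two_phase_classify(features, svm_noab, svm_bema):
    """Cascade two binary classifiers into a 3-class decision.

    svm_noab: returns -1 for normal, +1 for abnormal (phase 1).
    svm_bema: returns -1 for benign, +1 for malignant (phase 2).
    Phase 2 runs only when phase 1 reports an abnormal image.
    """
    if svm_noab(features) < 0:
        return "normal"
    if svm_bema(features) < 0:
        return "benign"
    return "malignant"


# Toy stand-ins for the two trained classifiers (for illustration only).
def toy_noab(features):
    return 1 if features[0] > 0.5 else -1


def toy_bema(features):
    return 1 if features[1] > 0.5 else -1


samples = [[0.1, 0.9], [0.8, 0.2], [0.8, 0.9]]
labels = [two_phase_classify(f, toy_noab, toy_bema) for f in samples]
```

The first sample is labelled normal even though its second feature would trigger the phase-2 rule, because the second classifier is never consulted for images that phase 1 marks as normal.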

PH2 Database
The proposed skin cancer diagnostic system is analyzed using the PH2 database, which is freely downloadable (Mendonca et al. 2013; PH2 database link). It contains 200 dermoscopic colour images of melanocytic lesions. The resolution of the dermoscopic images is 768 x 560 pixels. There are 80 normal and 120 abnormal images available for the classification. Figure 3 shows sample normal and abnormal images in the PH2 database.

RESULTS AND DISCUSSIONS
The performance of the system is validated using k-fold (k=10) cross-validation and measured using accuracy, sensitivity, specificity and the confusion matrix. Table I shows the basic performance metrics of the SCC system. The SVM parameters C and σ are tuned by a grid search with ten candidate values for each parameter. Hence, for each phase of classification, 100 (10 x 10) combinations are tried. For each pair (C, σ), the performance is measured by 10-fold cross-validation. The best pair is found to be $(2^4, 2^2)$ for the 1st phase of classification and $(2^7, 2^4)$ for the 2nd phase of classification. Tables II and III report the performance of the SCC system for the first and second phases of classification respectively. The measurements in these tables are computed using the formulae in Table I. For a particular decomposition level (from 1 to 4), the BT sub-band features are extracted with different numbers of directions (2, 4, 8, 16 and 32). These features are classified using the 2-phase classification process.
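The 10 x 10 parameter search can be sketched as an exhaustive loop over (C, σ) pairs. The grid below, powers of two with exponents 0 to 9 for both parameters, is an assumption: the text gives only the number of combinations and the best pairs, not the candidate values. `cv_score` is a hypothetical callable that would return the 10-fold cross-validated score for one pair, and `toy_score` is an artificial stand-in used only to exercise the loop, not a reported result.

```python
from itertools import product


def grid_search(c_values, sigma_values, cv_score):
    """Return (best_score, best_C, best_sigma) over all (C, sigma) pairs.

    cv_score(C, sigma) is expected to return the cross-validated score
    for one parameter pair; the first best pair found is kept on ties.
    """
    best = None
    for c, sigma in product(c_values, sigma_values):
        score = cv_score(c, sigma)
        if best is None or score > best[0]:
            best = (score, c, sigma)
    return best


# Assumed grid: 2^0 ... 2^9 for both C and sigma (10 x 10 = 100 pairs).
c_values = [2 ** e for e in range(10)]
sigma_values = [2 ** e for e in range(10)]
pairs = list(product(c_values, sigma_values))


def toy_score(c, sigma):
    """Artificial score that peaks at (2^4, 2^2); for demonstration only."""
    return -abs(c - 2 ** 4) - abs(sigma - 2 ** 2)


best_score, best_c, best_sigma = grid_search(c_values, sigma_values, toy_score)
```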
It is observed from Table II that the 3rd level BT features provide better performance than the features extracted at other levels. The maximum sensitivity, specificity and accuracy of the NoAb phase reach 97.5%, 100% and 98.5% respectively at 3rd level BT with 8 directions. The performance of the SCC system decreases when the level of bendlet decomposition is above the 3rd level or the number of directional sub-bands for a particular decomposition level is above 8. A highly redundant decomposition produces a larger number of sub-bands that do not have any discriminating power for the classification. Including the features of these sub-bands makes the classification less effective and reduces the system performance.
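The three metrics can be computed from confusion-matrix counts as sketched below, using the standard definitions (the formulae of Table I are not reproduced in this text, so these are assumed to match). The counts in the example are not reported values: they are inferred from the 80 normal and 120 abnormal PH2 images and the reported NoAb percentages, assuming the metrics are pooled over all 200 images with abnormal as the positive class.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Inferred counts: 117 of 120 abnormal and 80 of 80 normal images correct.
sens, spec, acc = binary_metrics(tp=117, fn=3, tn=80, fp=0)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Under these assumptions the reported 97.5%, 100% and 98.5% correspond to three abnormal images being classified as normal and no normal image being classified as abnormal.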
All the important performance metrics, namely sensitivity, specificity and accuracy, of the BeMa phase reach 100% at 3rd level BT with 8 directions. The obtained results demonstrate the effectiveness of BT as a feature extraction technique for the SCC system. A comparison of performance metrics with different image representation techniques is shown in Figures 4 and 5. It is observed from these comparisons that BT performs well. The wavelet transform is the weakest performer in this study, as it provides less directional representation. Also, the performance of Contourlet is better than that of Curvelet, as the approximation provided by Contourlet is more accurate than that of Curvelet.
The reason for the poor performance of wavelets is that they cannot characterize the boundary curves that separate different regions in dermoscopic images. These boundary curves are efficiently extracted by directional representation systems such as Curvelet and Contourlet, which improves their performance over wavelets. The advantage of Shearlet over Curvelet and Contourlet is the detection of non-smooth boundary curves between the image regions. BT is superior to Shearlet as it uses α-scaling instead of parabolic scaling, which helps to classify curvature precisely. Hence, the features extracted from the BT sub-bands are more significant than those of the other image representation systems.
Table IV shows the performance comparison of the system with existing approaches for SCC using the PH2 dataset. As the system operates in 2 phases, the performance metrics shown in the table are the averages over the 2 phases. The comparison shows that the BT and SVM based system performs well compared with existing SCC systems in the literature.
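As a worked illustration of the 2-phase averaging, assume a simple unweighted mean of the best NoAb results (97.5%, 100%, 98.5%) and the best BeMa results (100% for all three metrics) quoted in the text. The entries of Table IV are not reproduced here, so the figures below are only what that assumption would give.

```latex
\text{Sensitivity: } \frac{97.5\% + 100\%}{2} = 98.75\%, \qquad
\text{Specificity: } \frac{100\% + 100\%}{2} = 100\%, \qquad
\text{Accuracy: } \frac{98.5\% + 100\%}{2} = 99.25\%.
```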

CONCLUSION
In this paper, a framework for the SCC system based on BT and SVM classifier is presented. It includes directional representation based feature extraction and two SVM classifiers for classification. Compared to wavelets and Shearlets, BT can classify singularities in images more precisely. From the sub-bands of BT, energies are extracted. The 2-phase classification designed in this study classifies the dermoscopic images into normal/abnormal and then cancerous/noncancerous. Results show that the SCC system using BT and SVM classifier provides better accuracy, specificity and sensitivity than wavelets and Shearlets. Also, the SCC system does not require any manual intervention for the diagnosis. The validation of the SCC system on a larger database will be done in the near future.

Figure 1. BT and SVM based SCC system.

Figure 3. Sample images in the PH2 database (a) Normal images (b) melanoma images.

Accuracy is defined as the ratio of the number of correctly classified skin cancer images to the total number of images tested.

Figure 4. Performance comparisons of different image representation techniques for the classification of NoAb phase.

Table I. Performance metrics of the SCC system.

Table II. SVM-RBF classifier performance for the NoAb phase.

Table III. SVM-RBF classifier performance for the BeMa phase.

Table IV. Comparison of BT and SVM based SCC system results with existing systems.

REFERENCES
ABU MAHMOUD M, AL-JUMAILY A & TAKRURI MS. 2013. Wavelet and curvelet analysis for automatic identification of