Acessibilidade / Reportar erro

Diabetes classification using a redundancy reduction preprocessor

Introduction

Diabetes patients can benefit significantly from early diagnosis. Thus, accurate automated screening is becoming increasingly important due to the wide spread of that disease. Previous studies in automated screening have found a maximum accuracy of 92.6%.

Methods

This work proposes a classification methodology based on efficient coding of the input data, which is carried out by decreasing input data redundancy using well-known ICA algorithms, such as FastICA, JADE and INFOMAX. The classifier used in the task to discriminate diabetics from non-diaibetics is the one class support vector machine. Classification tests were performed using noninvasive and invasive indicators.

Results

The results suggest that redundancy reduction increases one-class support vector machine performance when discriminating between diabetics and nondiabetics up to an accuracy of 98.47% while using all indicators. By using only noninvasive indicators, an accuracy of 98.28% was obtained.

Conclusion

The ICA feature extraction improves the performance of the classifier in the data set because it reduces the statistical dependence of the collected data, which increases the ability of the classifier to find accurate class boundaries.

Diabetes; Clustering; Efficient coding; Independent Component Analysis; Support Vector Machine


Sociedade Brasileira de Engenharia Biomédica Centro de Tecnologia, bloco H, sala 327 - Cidade Universitária, 21941-914 Rio de Janeiro RJ Brasil, Tel./Fax: (55 21)2562-8591 - Rio de Janeiro - RJ - Brazil
E-mail: rbe@rbejournal.org