EVALUATION OF DATA MINING CLASSIFICATION AND CLUSTERING TECHNIQUES FOR DIABETES

Tuba PALA and Ali Yilmaz Camurcu

Duzce University, Golyaka Vocational School,
Fatih Sultan Mehmet Vakif University, Department of Computer Engineering

Abstract

Computer-based analysis of diagnosis and treatment records supports medical treatment, and, as in many other fields, the use of data mining methods in medicine is increasing. This study aims to design a medical decision support system based on data mining methods, so that results obtained from medical data sets can help doctors with early diagnosis and effective treatment. After the pre-processing stage of the data mining process, the Support Vector Machines (SVM), Naive Bayes, Decision Trees, Artificial Neural Networks (ANN), Multilayer Perceptron (MLP), and Logistic Regression (LR) algorithms were used in the classification stage. The success of these classification algorithms was evaluated with the data mining programs Weka and RapidMiner. In both programs, the Multilayer Perceptron algorithm achieved the highest success percentage and Decision Trees the lowest. The study indicates that data mining can be a useful tool in the medical field. Predicting in advance whether a patient will develop diabetes, a disease that affects many people in our country and around the world, can make it easier for doctors to follow the progress of the disease and plan treatment for those patients.

Keywords: Medical Decision Support System, Data Mining, Classification, K-Means Clustering, Diabetes Data Set
