PREDICTION OF DIABETIC RETINOPATHY AMONG TYPE II DIABETIC PATIENTS USING DATA MINING TECHNIQUES
Zuraida Khairudin1, Nurfatin Adila Abdul Razak2, Hezlin Aryani Abd Rahman3, Norbaizura Kamarudin4 and Nor Azimah Binti Abd Aziz5
1,2,3,4 Faculty of Computer and Mathematical Sciences, UiTM Shah Alam, Selangor, Malaysia
5 Faculty of Medicine, UiTM Sg Buloh, Selangor, Malaysia
Diabetic retinopathy is one of the leading causes of visual disability and blindness worldwide. Globally, an estimated 4.8% of the 37 million cases of blindness are due to diabetic retinopathy. It affects patients with prolonged diabetes and can result in permanent blindness. The earliest symptoms surface as vision problems. Regular eye examinations and early intervention can therefore usually control the disease. Many studies on early intervention and prevention of diabetic retinopathy use predictive models. The growth of database and digital storage technology has created an abundance of health records, and data mining techniques help uncover meaningful patterns in them while attending to the sensitivity of health records. This study therefore took a data mining approach to predicting the presence of diabetic retinopathy, restricted to Type II diabetic patients, and to determining the risk factors that contribute to its presence. The data mining models selected for this study are Logistic Regression, Decision Tree and Artificial Neural Network. The dataset comprises 361 Type II diabetic patients from the Ophthalmology Clinic, UiTM Medical Specialist Centre, selected between January 2014 and December 2018, and consists of 17 variables. The results show that, among the Logistic Regression selection options, the model using the Forward selection method is the best, since it had the highest sensitivity (Sen=50.0%), specificity (Spe=79.03%) and accuracy rate (Acc=66.36%) on the validation dataset. Among the Decision Tree models, the Decision Tree using the Gini criterion is the best. Logistic Regression (Forward) and Decision Tree (Gini) were then compared with the Artificial Neural Network model (Sen=56.25%, Spe=70.97%, Acc=64.55%).
The results demonstrated that Logistic Regression using the Forward selection method was the best of the compared models for predicting the presence of diabetic retinopathy among Type II diabetic patients. The significant risk factors associated with the presence of diabetic retinopathy are duration of diabetes, HbA1C level, diabetic foot ulcer, nephropathy, and neuropathy.
Keywords: Data Mining, Diabetic Retinopathy, Predictive Modelling, Type II Diabetes Mellitus.
Published On: 29 September 2020
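The sensitivity, specificity and accuracy figures quoted in the abstract can be illustrated with a minimal Python sketch. The abstract does not report the confusion matrices, so the counts below are assumptions: they were chosen only because they reproduce the reported percentages for a validation set of 110 patients (48 with and 62 without diabetic retinopathy), and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)   # true positive rate
    specificity = 100.0 * tn / (tn + fp)   # true negative rate
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# ASSUMED counts, not taken from the paper: they are one set of integers
# consistent with the percentages reported in the abstract.
lr_forward = classification_metrics(tp=24, fn=24, tn=49, fp=13)
ann = classification_metrics(tp=27, fn=21, tn=44, fp=18)

for name, (sen, spe, acc) in [("LR (Forward)", lr_forward), ("ANN", ann)]:
    print(f"{name}: Sen={sen:.2f}%, Spe={spe:.2f}%, Acc={acc:.2f}%")
```

Under these assumed counts, the Artificial Neural Network identifies more true cases (higher sensitivity), while Logistic Regression (Forward) has the higher specificity and overall accuracy.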