Type 2 Diabetes Mellitus Screening and Risk Factors Using Decision Tree: Results of Data Mining

  •  Shafi Habibi    
  •  Maryam Ahmadi    
  •  Somayeh Alizadeh    


OBJECTIVES: The aim of this study was to develop and examine a predictive model for type 2 diabetes mellitus (T2DM) based on features related to its risk factors.

METHODS: The data were obtained from the database of a diabetes control system in Tabriz, Iran, and included all people referred for diabetes screening between 2009 and 2011. The input features were age, sex, systolic and diastolic blood pressure, family history of diabetes, and body mass index (BMI); the diagnosis served as the class. We developed the model with the J48 decision tree algorithm in WEKA (version 3.6.10).
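As an illustration of the data layout described above, the following minimal sketch writes records with the six inputs and the class into ARFF, the text format WEKA reads. The attribute names, the nominal value sets, and the two example rows are hypothetical; the source does not give the actual schema or any individual records.

```python
# Hypothetical attribute names and nominal values (assumptions, not the
# study's real schema). Nominal attributes are given as lists of values.
ATTRIBUTES = [
    ("age", "numeric"),
    ("sex", ["male", "female"]),
    ("systolic_bp", "numeric"),
    ("diastolic_bp", "numeric"),
    ("family_history", ["yes", "no"]),
    ("bmi", "numeric"),
    ("diagnosis", ["patient", "healthy"]),  # class attribute, listed last
]


def to_arff(relation, attributes, records):
    """Render a list of dict records as ARFF text."""
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        # Nominal attributes are declared as {a,b,...}; numeric ones as "numeric".
        spec = "{" + ",".join(kind) + "}" if isinstance(kind, list) else kind
        lines.append(f"@attribute {name} {spec}")
    lines += ["", "@data"]
    for rec in records:
        lines.append(",".join(str(rec[name]) for name, _ in attributes))
    return "\n".join(lines) + "\n"


# Made-up rows for illustration only; they are not taken from the study.
records = [
    {"age": 52, "sex": "female", "systolic_bp": 135, "diastolic_bp": 85,
     "family_history": "yes", "bmi": 31.2, "diagnosis": "patient"},
    {"age": 34, "sex": "male", "systolic_bp": 118, "diastolic_bp": 76,
     "family_history": "no", "bmi": 23.4, "diagnosis": "healthy"},
]

arff_text = to_arff("t2dm_screening", ATTRIBUTES, records)
print(arff_text)
```

A file with this content could be loaded into WEKA and passed to J48, with `diagnosis` selected as the class attribute.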

RESULTS: After data preprocessing and preparation, 22,398 records were used for data mining. The precision of the model in identifying patients was 0.717. Age was placed at the root node of the tree because it had the highest information gain. The ROC curve shows how well the model distinguishes patients from healthy individuals; it indicates a high capability, especially in identifying healthy persons.
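The two quantities reported above, information gain and precision, can be sketched in a few lines. This is a simplified illustration, not the WEKA implementation: J48 implements C4.5, which normalizes information gain into a gain ratio and searches thresholds on numeric attributes, whereas the code below computes plain information gain on already-discretized values. The toy records and the `age_group` binning are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on nominal `values`."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def precision(predicted, actual, positive):
    """TP / (TP + FP) for the class named `positive`."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    return tp / (tp + fp) if tp + fp else 0.0


# Made-up toy data (8 records), for illustration only.
age_group = ["older"] * 4 + ["younger"] * 4
sex = ["f", "m", "f", "m", "f", "m", "f", "m"]
labels = ["patient", "patient", "patient", "healthy",
          "healthy", "healthy", "healthy", "patient"]

gain_age = information_gain(age_group, labels)
gain_sex = information_gain(sex, labels)

# A one-node "tree" that splits on the attribute with the larger gain.
predicted = ["patient" if a == "older" else "healthy" for a in age_group]
toy_precision = precision(predicted, labels, positive="patient")

print(f"gain(age_group)={gain_age:.3f}, gain(sex)={gain_sex:.3f}")
print(f"precision for 'patient' = {toy_precision:.2f}")
```

In this toy set the age split reduces entropy while the sex split does not, so a gain-based learner would place age at the root, which mirrors the reasoning given for the reported tree.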

CONCLUSIONS: We developed a decision tree model for T2DM screening that does not require laboratory tests.

This work is licensed under a Creative Commons Attribution 4.0 License.