Abstract
This dissertation investigates the application of machine learning to early-stage diabetes prediction by comparing the performance of three classification models: Logistic Regression, Decision Tree, and Random Forest. Motivated by the rising prevalence of diabetes worldwide, the research aims to facilitate early diagnosis by providing interpretable and accurate predictive models. Using the Pima Indians Diabetes Dataset, which contains 768 clinical records, the study applied data preprocessing that included KNN imputation, outlier handling, feature scaling, and the creation of interaction features. A quantitative, comparative approach was used to train and test the models, which were evaluated on accuracy, precision, recall, F1 score, and ROC-AUC. Logistic Regression offered the best balance of accuracy and interpretability (AUC 0.88), making it a reasonable choice for clinical practice. Random Forest achieved the highest predictive accuracy but lacked transparency. The most influential features for prediction were glucose, BMI, and age. The research recommends Logistic Regression as the solution to deploy in mobile and community-based screening devices. Future research should use larger and more heterogeneous datasets, apply interpretability techniques such as SHAP or LIME, and extend the framework to predict other chronic conditions, including hypertension.
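The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the function names `classification_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only (not from the Pima dataset).
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

print(classification_metrics(toy_labels, toy_preds))
print(roc_auc(toy_labels, toy_scores))
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas ROC-AUC is computed from the ranking of the scores alone, which is one reason it is often reported alongside the thresholded metrics when models are compared.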
Library of Congress Subject Headings
Diabetes--Forecasting--Data processing; Diabetes--Diagnosis; Machine learning
Publication Date
12-2025
Document Type
Thesis
Student Type
Graduate
Department, Program, or Center
Graduate Programs & Research
Advisor
Sanjay Modak
Advisor/Committee Member
Khalil Al Hussaeni
Recommended Citation
Bawazir, Ahmed Abdelqadir, "EARLY DIABETES PREDICTION USING MACHINE LEARNING: A COMPARATIVE STUDY OF CLASSIFICATION MODELS" (2025). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/12361
Campus
RIT Dubai
Plan Codes
PROFST-MS
