Assessing Diabetes Risk Factors Using Logistic Regression: A Kaggle-Based Study

Kale Kawu Kale

Department of Mathematics and Statistics, Computational Lab, Yobe State University, Nigeria.

Babagana Modu *

Department of Mathematics and Statistics, Yobe State University, Nigeria.

*Author to whom correspondence should be addressed.


Abstract

This study investigates the relationship between diabetes and its associated risk factors, with the aim of identifying key predictors and evaluating their impact on disease progression. Age, smoking status, cholesterol level, blood pressure, BMI, and glucose level were analyzed in relation to diabetes outcomes. A total of 4,240 records were preprocessed: missing values were handled by imputation and outliers by the Z-score method. The class imbalance ratio was 35.8, indicating a substantial imbalance favoring the diabetes-positive class. Logistic regression was used as the modeling technique. Glucose level and age were the most significant predictors of diabetes, suggesting that individuals with higher glucose levels or advancing age are at greater risk. Other factors also contributed to the model, but their influence varied and was comparatively moderate. The model achieved an accuracy of 97.2%, a sensitivity of 98.5%, and a specificity of 4%. These results should be interpreted with caution given the high class imbalance: most cases in the binary classification were diabetes-positive, which is consistent with the very low specificity. In conclusion, the study highlights the importance of regular health monitoring and early intervention, particularly for older individuals.
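The interplay between class imbalance and the reported metrics can be illustrated with a minimal Python sketch. It is not the authors' code and does not use the study's data: the labels are toy values, and the helper names `imbalance_ratio` and `confusion_metrics` are hypothetical. It assumes the imbalance ratio is the majority class count divided by the minority class count, and it shows how a classifier that always predicts the majority class can score high accuracy and sensitivity while specificity collapses.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Majority class count divided by minority class count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def confusion_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of actual positives predicted positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of actual negatives predicted negative.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy data (illustrative only): 95 majority-class and 5 minority-class cases.
y_true = [1] * 95 + [0] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [1] * 100

ratio = imbalance_ratio(y_true)
metrics = confusion_metrics(y_true, y_pred)
print(ratio, metrics)
```

On these toy labels the accuracy is high even though no minority-class case is identified, which is why sensitivity and specificity are worth reporting alongside accuracy when classes are imbalanced.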

Keywords: Logistic regression, diabetes, models, risk factors, predictors, confusion matrix


How to Cite

Kale, Kale Kawu, and Babagana Modu. 2025. “Assessing Diabetes Risk Factors Using Logistic Regression: A Kaggle-Based Study”. Asian Journal of Probability and Statistics 27 (4):132-41. https://doi.org/10.9734/ajpas/2025/v27i4745.
