A Machine Learning Approach to Care Optimization for UTI Patients: A Case Study of the Tamale Teaching Hospital, Northern Region, Ghana

Mohammed Fuseini Dokurugu

Department of Statistics, Faculty of Physical Sciences, University for Development Studies, Tamale, West Africa, Ghana.

Amadu Yakubu *

Department of Statistics, Faculty of Physical Sciences, University for Development Studies, Tamale, West Africa, Ghana.

Zakaria Abdul-Ganiu

Department of Population and Reproductive Health, School of Public Health, University for Development Studies, Tamale, West Africa, Ghana.

*Author to whom correspondence should be addressed.


Abstract

Background: Urinary tract infections (UTIs) are a common clinical concern with significant public health implications. While sociodemographic factors are believed to influence UTI risk, their predictive value remains inconsistent across populations. This study aimed to identify sociodemographic predictors of UTI status using both traditional statistical methods and machine learning (ML) models.

Methods: A cross-sectional analysis was conducted in the northern region of Ghana on a sample of 2,598 individuals. Descriptive statistics and chi-square tests were used to examine associations between UTI status and sociodemographic variables. Five ML classification models (boosting, random forest, decision tree, k-nearest neighbours, and support vector machine) were trained and evaluated using accuracy, precision, recall, and F1 score. Data were split into training (64%), validation (16%), and test (20%) sets.
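The split proportions and evaluation metrics described above can be illustrated with a minimal sketch. It assumes the 64/16/20 partition arises from holding out 20% for testing and then taking 20% of the remainder for validation; the abstract does not state how the split was actually produced. The function names `three_way_split` and `binary_metrics` are hypothetical, and the labels in the example are toy values, not study data.

```python
import random


def three_way_split(records, seed=0):
    """Shuffle records and split them into train/validation/test parts.

    Assumption: 20% is held out for testing, then 20% of the remaining
    80% is used for validation, giving roughly 64% / 16% / 20%.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_test = round(0.20 * n)
    n_val = round(0.20 * (n - n_test))
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / denom if denom else 0.0,
    }


# Index positions stand in for the 2,598 participants.
train, val, test = three_way_split(range(2598))
print(len(train), len(val), len(test))

# Toy labels (1 = UTI positive), purely illustrative.
toy_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(binary_metrics(toy_true, toy_pred))
```

Under this rounding convention the sketch partitions 2,598 records into 1,662, 416, and 520; the study's actual subset sizes are not reported in the abstract and may differ. An "average" F1 score, as reported in the results, would typically be obtained by computing the metric per class and averaging, but the averaging scheme is likewise not specified here.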

Results: Age and gender were significantly associated with UTI status (p < 0.001), while education, occupation, religion, and marital status showed no significant relationships. Among the ML models, the boosting classifier achieved the highest average F1 score (0.675), followed by decision tree (0.671) and random forest (0.666). Age (58%) and gender (42%) were the only variables with meaningful predictive importance in the boosting model. The boosting model demonstrated moderate accuracy (68%) on the test set.
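For readers unfamiliar with the association test behind the p-values above, the following is a minimal sketch of a Pearson chi-square test of independence on a contingency table. The counts are invented for illustration and are not taken from the study. The helper name `chi_square_independence` is hypothetical, no continuity correction is applied, and the p-value is computed only for one degree of freedom (a 2x2 table such as gender by UTI status), where it has a closed form through the complementary error function.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square test of independence for an r x c count table.

    Returns (statistic, degrees_of_freedom, p_value). The p-value is only
    computed when df == 1; for larger tables it is returned as None, since
    the general case needs the chi-square distribution from a statistics
    library.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    # For df = 1, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2)) if df == 1 else None
    return statistic, df, p_value


# Hypothetical counts: rows = two groups, columns = UTI positive / negative.
toy_table = [[30, 10],
             [20, 40]]
stat, df, p = chi_square_independence(toy_table)
print(f"chi2 = {stat:.3f}, df = {df}, p = {p:.2e}")
```

A variable with more than two categories, such as age group, gives more than one degree of freedom; in that case a library routine such as `scipy.stats.chi2_contingency` would normally be used. Note that this routine applies Yates' continuity correction to 2x2 tables by default, so its result can differ from this uncorrected sketch.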

Conclusion: Age and gender are the most influential sociodemographic predictors of UTI status in this population. The boosting ML model outperformed other classifiers, offering a moderately accurate tool for UTI risk prediction. These findings support the development of targeted, demographically informed prevention strategies and highlight the potential of ensemble ML methods in clinical prediction tasks. Future studies should incorporate clinical, behavioural, and microbiological data to improve predictive performance and generalizability.

Keywords: Urinary tract infection, machine learning, antimicrobial resistance, K-nearest neighbours, support vector classification, decision tree


How to Cite

Dokurugu, Mohammed Fuseini, Amadu Yakubu, and Zakaria Abdul-Ganiu. 2026. “A Machine Learning Approach to Care Optimization for UTI Patients: A Case Study of the Tamale Teaching Hospital, Northern Region, Ghana”. Asian Journal of Probability and Statistics 28 (2):10-18. https://doi.org/10.9734/ajpas/2026/v28i2862.
