Performance Evaluation of Hybrid SVM-RF and XGBoost-RF Architectures for Classifying Gender-Based Violence Tweets on X

Dan Kipkosgei Kogei *

Department of Mathematics, Physics and Computing, Moi University, Eldoret, Kenya.

Subby Mackenzie Mino

Department of Mathematics, Physics and Computing, Moi University, Eldoret, Kenya.

*Author to whom correspondence should be addressed.


Abstract

The rapid growth of social media platforms such as Twitter (now X) has changed how individuals communicate, share experiences, and engage in public discourse. These platforms have become significant spaces where survivors and advocates of gender-based violence (GBV) can share their experiences, raise awareness, and mobilize action against GBV. Despite recent advances in machine learning, several challenges persist in the classification of GBV-related tweets. Tweets are limited in length and often contain informal language, abbreviations, emojis, and misspellings, which makes feature extraction and semantic understanding more difficult than for longer, well-structured texts. Most existing studies rely on either traditional machine learning models or deep learning approaches independently, with limited research on hybrid models that combine algorithms, such as SVM-RF and XGBoost-RF, for GBV classification. This study therefore compared the hybrid SVM-RF and XGBoost-RF models for classifying GBV tweets. The study used secondary data from the Zindi platform comprising 39,650 observations; data exploration, cleaning, and analysis were done in Jupyter Notebook. The data were partitioned into 80% for training and 20% for testing. Because of class imbalance, weighted precision, weighted recall, weighted F1-score, and the confusion matrix were used as evaluation metrics. The Hybrid XGBoost-RF classifier outperformed the Hybrid SVM-RF classifier, achieving a weighted precision, recall, and F1-score of 0.9991 each. The study demonstrates that combining models through voting can lead to more reliable and accurate predictions in sensitive application areas such as GBV and in other text classification tasks.
Based on these findings, the Hybrid XGBoost-RF classifier is recommended for GBV classification tasks because of its superior performance on the weighted evaluation metrics. Its effectiveness suggests that it can be applied in real-world scenarios where precise classification is critical. This hybrid approach may also be extended to other complex classification problems, particularly those involving imbalanced or sensitive datasets.
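The support-weighted metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `weighted_scores`, the class labels "A", "B", "C", and the toy predictions are hypothetical, chosen only to show how per-class precision, recall, and F1 are averaged with weights proportional to each class's share of the true labels.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return support-weighted precision, recall and F1 (hypothetical helper)."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        # True positives: instances of this class that were predicted correctly.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0  # per-class precision
        r_c = tp / n                                # per-class recall
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n / total  # class share of the true labels
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Toy, imbalanced example (labels and predictions are made up for illustration).
y_true = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
y_pred = ["A"] * 5 + ["B"] + ["B"] * 3 + ["A"]

precision, recall, f1 = weighted_scores(y_true, y_pred)
print(f"weighted precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

In this toy case the minority class "C" is never predicted correctly, yet it lowers the weighted scores only in proportion to its small support. Weighted recall is also always equal to overall accuracy, so on an imbalanced dataset the confusion matrix remains useful for inspecting per-class errors.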

Keywords: Machine learning, gender-based violence, support vector machines, random forest, XGBoost, hybrid models, precision, recall, F1-score


How to Cite

Kogei, Dan Kipkosgei, and Subby Mackenzie Mino. 2026. “Performance Evaluation of Hybrid SVM-RF and XGBoost-RF Architectures for Classifying Gender-Based Violence Tweets on X”. Asian Journal of Probability and Statistics 28 (5):61-72. https://doi.org/10.9734/ajpas/2026/v28i5895.
