Diagnosing Multicollinearity of Logistic Regression Model


N. A. M. R. Senaviratna
T. M. J. A. Cooray

Abstract

A key problem in binary logistic regression arises when the explanatory variables considered for the model are highly correlated among themselves. Multicollinearity causes unstable estimates and inaccurate variances, which affect confidence intervals and hypothesis tests. The aim of this study was to discuss some diagnostic measures for detecting multicollinearity, namely tolerance, Variance Inflation Factor (VIF), condition index and variance proportions. The adapted diagnostics are illustrated with data from a study of road accidents. The secondary data used in this study, covering 2014 to 2016, were acquired from the Traffic Police headquarters, Colombo, Sri Lanka. The response variable is accident severity, which has two levels: grievous and non-grievous. Multicollinearity is identified by the correlation matrix, tolerance and VIF values, and confirmed by condition index and variance proportions. The range of solutions available for logistic regression includes increasing the sample size, dropping one of the correlated variables and combining variables into an index. It is concluded that, without increasing the sample size, omitting one of the correlated variables can reduce multicollinearity considerably.
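The four diagnostics named above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' procedure: the data are synthetic (not the road accident data), the function names `tolerance_and_vif` and `condition_diagnostics` are hypothetical, and the diagnostics are computed in their usual unweighted form from the predictors alone, so the binary response does not enter. Commonly quoted rules of thumb (tolerance below 0.1, VIF above 10, condition index above 30) are conventions from the wider literature and are not taken from this abstract.

```python
import numpy as np

def tolerance_and_vif(X):
    """Tolerance and VIF for each column of X (n x p, no intercept column)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    tol = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress predictor j on an intercept plus all other predictors.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        tol[j] = 1.0 - r2          # tolerance_j = 1 - R_j^2
    return tol, 1.0 / tol          # VIF_j = 1 / tolerance_j

def condition_diagnostics(X):
    """Condition indices and variance-decomposition proportions.

    Follows the usual Belsley-style recipe: add an intercept column,
    scale every column to unit length, then take the SVD.
    """
    X = np.asarray(X, dtype=float)
    Z = np.column_stack([np.ones(X.shape[0]), X])
    Z = Z / np.linalg.norm(Z, axis=0)
    _, d, Vt = np.linalg.svd(Z, full_matrices=False)
    cond_index = d.max() / d
    # phi[j, k] = v_jk^2 / d_k^2 for coefficient j and dimension k.
    phi = (Vt.T ** 2) / d ** 2
    props = phi / phi.sum(axis=1, keepdims=True)
    # Rows: dimensions (one per singular value); columns: coefficients.
    return cond_index, props.T

# Synthetic predictors (an assumption for illustration only):
# x2 is nearly a copy of x1, x3 is unrelated to both.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + 0.05 * rng.normal(size=500)
x3 = rng.normal(size=500)
X = np.column_stack([x1, x2, x3])

tol, vif = tolerance_and_vif(X)
cond_index, props = condition_diagnostics(X)

print("tolerance:", np.round(tol, 4))
print("VIF:", np.round(vif, 2))
print("condition indices:", np.round(cond_index, 2))
print("variance proportions (rows = dimensions):")
print(np.round(props, 3))
```

In this sketch the near-duplicate pair x1 and x2 should show low tolerance and high VIF, while x3 should not. In the variance proportions table, a dimension with a large condition index that also carries a large share of the variance of two or more coefficients points to the variables involved in the near-dependency.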

Keywords:
Logistic regression, multicollinearity, tolerance, variance inflation factor, condition index

Article Details

How to Cite
Senaviratna, N. A. M. R., & Cooray, T. M. J. A. (2019). Diagnosing Multicollinearity of Logistic Regression Model. Asian Journal of Probability and Statistics, 5(2), 1-9. https://doi.org/10.9734/ajpas/2019/v5i230132
Section
Original Research Article
