Dispersion of Count Data: A Case Study of Poisson Distribution and Its Limitations

Xhavit Bektashi *

Department of Applied Software Engineering, Faculty of Informatics, Mother Teresa University-Skopje, 1000 Skopje, Republic of North Macedonia.

Shpëtim Rexhepi

Department of Applied Software Engineering, Faculty of Informatics, Mother Teresa University-Skopje, 1000 Skopje, Republic of North Macedonia.

Nora Limani–Bektashi

Department of Food Technology, Faculty of Technological Science, Mother Teresa University-Skopje, 1000 Skopje, Republic of North Macedonia.

*Author to whom correspondence should be addressed.


Abstract

The Poisson distribution is one of the most widely known distributions in probability and statistics. It has been widely applied to modelling discrete observations, including but not limited to the number of customers in a shop, the number of accidents, or the number of claims experienced by an insurance company within a specified period of time. The Poisson regression model has been widely used in settings where a response variable is influenced directly by other independent variables. The Poisson model is strict about dispersion: it assumes that count data are equidispersed, that is, that the variance of the counts equals the mean, which is rarely the case in practice. In most cases, the variance of real count data is greater than the mean, a phenomenon described as overdispersion. This limits the usefulness of the Poisson model for modelling count observations. This paper studies the concept of dispersion, how Poisson regression is applied, and its possible limitations. The Poisson model is studied in depth and its properties up to the fourth moment are outlined. Its probability mass function is plotted from simulated data under different rates, and its shape is noted to approach symmetry as the rate increases. A histogram is also presented. An application to real data is carried out in the R programming language, and it is shown that Poisson regression performs very poorly in this analysis. Finally, an alternative distribution suited to handling overdispersion, the negative binomial (NB), is analyzed and the results compared. The AIC is used to conclude that the NB model is better than the Poisson regression model.
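The equidispersion property and the moment behaviour described in the abstract can be checked numerically. The following is a minimal sketch in Python using only the standard library; it is not the authors' R analysis. The function names (`poisson_pmf`, `poisson_moments`, `dispersion_index`), the truncation point `k_max`, and the sample counts are all assumptions introduced here for illustration.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), evaluated in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_moments(lam, k_max=500):
    """Mean, variance, skewness and excess kurtosis from a truncated pmf sum.

    k_max is an assumed truncation point; it is ample for moderate rates.
    """
    ks = range(k_max + 1)
    p = [poisson_pmf(k, lam) for k in ks]
    mean = sum(k * pk for k, pk in zip(ks, p))
    m2 = sum((k - mean) ** 2 * pk for k, pk in zip(ks, p))
    m3 = sum((k - mean) ** 3 * pk for k, pk in zip(ks, p))
    m4 = sum((k - mean) ** 4 * pk for k, pk in zip(ks, p))
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def dispersion_index(counts):
    """Sample variance divided by sample mean.

    Values near 1 are consistent with equidispersion; values well above 1
    suggest overdispersion.
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean


# Mean and variance both equal the rate; skewness and excess kurtosis
# shrink as the rate grows, so the shape approaches symmetry.
for lam in (1.0, 4.0, 25.0):
    mean, var, skew, kurt = poisson_moments(lam)
    print(f"rate={lam}: mean={mean:.4f} var={var:.4f} "
          f"skew={skew:.4f} excess_kurt={kurt:.4f}")

# Made-up counts (not the paper's data) whose variance far exceeds the mean.
illustrative_counts = [0, 1, 0, 2, 9, 0, 1, 12, 0, 3]
print("dispersion index:", round(dispersion_index(illustrative_counts), 2))
```

For a Poisson variable with rate λ, the skewness is 1/√λ and the excess kurtosis is 1/λ, which is why the distribution looks increasingly symmetric at higher rates. A dispersion index well above 1 on real counts is the usual informal signal that a Poisson model is too restrictive and that an alternative such as the negative binomial should be considered.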

Keywords: Poisson, dispersion, binomial regression, overdispersion, equidispersion, maximum likelihood estimation


How to Cite

Bektashi, Xhavit, Shpëtim Rexhepi, and Nora Limani–Bektashi. 2022. “Dispersion of Count Data: A Case Study of Poisson Distribution and Its Limitations”. Asian Journal of Probability and Statistics 19 (2):18-28. https://doi.org/10.9734/ajpas/2022/v19i230464.
