23º SINAPE - Simpósio Nacional de Probabilidade e Estatística


Paper Details



Degree date

February 2017

Degree-granting institution

Universidade Federal de Minas Gerais

ABSTRACT

When the response variable is a count, the Poisson regression model is commonly used, as described by Lawless (1987), Karlis (2001) and Sellers and Shmueli (2010), for instance. A drawback of this model, however, is that the Poisson distribution is equidispersed: its variance equals its mean. In practice, overdispersed count data are often observed. To handle this problem, different kinds of mixed Poisson regression models, in which the variance is larger than the mean, have been introduced in the literature, such as the widely used negative binomial (NB) regression model [Lawless (1987)] and the Poisson-inverse Gaussian (PIG) regression model [Dean et al. (1989), Holla (1967)]. A general class of mixed Poisson regression models was proposed by Barreto-Souza and Simas (2016).
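To see why mixing produces overdispersion, consider a short derivation under one common parametrization. The symbols $\mu$, $Z$ and $\sigma^2$ are assumptions introduced here for illustration and are not necessarily the notation of the cited papers: the count $Y$ is Poisson given a positive latent variable $Z$ with unit mean.

```latex
\begin{align*}
Y \mid Z &\sim \mathrm{Poisson}(\mu Z), \qquad E(Z) = 1, \quad \mathrm{Var}(Z) = \sigma^2 > 0, \\
E(Y) &= E\{E(Y \mid Z)\} = \mu, \\
\mathrm{Var}(Y) &= E\{\mathrm{Var}(Y \mid Z)\} + \mathrm{Var}\{E(Y \mid Z)\}
               = \mu + \sigma^2 \mu^2 > \mu .
\end{align*}
```

Under this setup, a gamma-distributed $Z$ yields the NB model and an inverse Gaussian $Z$ yields the PIG model.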

As mentioned by Garay et al. (2011) and Barreto-Souza and Simas (2016), one factor that can lead to overdispersion is an excess of zeros in count data. However, Dean and Nielsen (2007) point out that mixed Poisson regression models may not be suitable for data with a high zero-inflation rate, since overdispersion may remain. To deal with this issue, several zero-inflated models for count data are available in the literature, such as the zero-inflated Poisson (ZIP) regression model, introduced by Lambert (1992), the zero-inflated negative binomial (ZINB) regression model [see, for example, Yau et al. (2003)] and the zero-inflated generalized Poisson (ZIGP) regression model, proposed by Famoye and Singh (2006).

Data with too many zeros arise in several fields, such as biology [Oliveira et al. (2016)], manufacturing and engineering [Lambert (1992), Li et al. (1999)], agriculture [Ridout et al. (2001)], health [Mwalili et al. (2008), Lim et al. (2014)], social sciences [Famoye and Singh (2006)] and many other disciplines. Thus, to deal with both overdispersion and excess zeros in count data, the goal of this work is to provide appropriate support for modeling this kind of data. For this purpose, we propose a general regression model based on a class of zero-inflated mixed Poisson distributions. This approach unifies some existing models, such as the ZINB and zero-inflated Poisson-inverse Gaussian (ZIPIG) regression models, and opens the possibility of introducing new zero-inflated models. We explore computational resources in R to estimate the model parameters through the EM algorithm, which can handle the latent variables, and we also develop residual analysis and diagnostics for assessing global influence.
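As a rough illustration of how the EM algorithm treats the latent zero-inflation indicator, the sketch below fits the simplest special case: a plain ZIP distribution with no covariates and no mixing variable. It is not the model proposed in this work, and it is written in Python rather than R. The function names `rpois`, `simulate_zip` and `em_zip` and the parameters `pi` (zero-inflation probability) and `lam` (Poisson mean) are hypothetical, chosen only for this example.

```python
import math
import random


def rpois(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_zip(n, pi, lam, seed=0):
    """Simulate n counts: a structural zero with probability pi, else Poisson(lam)."""
    rng = random.Random(seed)
    return [0 if rng.random() < pi else rpois(rng, lam) for _ in range(n)]


def em_zip(y, n_iter=500, tol=1e-10):
    """EM estimates of (pi, lam) for an intercept-only ZIP model (illustrative)."""
    n = len(y)
    pi, lam = 0.5, max(sum(y) / n, 1e-8)  # crude starting values
    for _ in range(n_iter):
        # E-step: posterior probability that an observed zero is structural.
        p0 = pi / (pi + (1.0 - pi) * math.exp(-lam))
        w = [p0 if yi == 0 else 0.0 for yi in y]
        # M-step: closed-form updates given the expected indicators.
        new_pi = sum(w) / n
        new_lam = sum((1.0 - wi) * yi for wi, yi in zip(w, y)) / sum(1.0 - wi for wi in w)
        converged = abs(new_pi - pi) + abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam
```

For example, `em_zip(simulate_zip(5000, 0.3, 2.5, seed=1))` should return estimates near the simulated values. In a regression setting, the closed-form M-step would be replaced by weighted fits for the covariate effects, and a mixed Poisson version would carry a second latent variable in the E-step.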

Keywords: zero-inflation, count regression models, overdispersion, EM algorithm.