What is underdispersion in Poisson regression?

June 4, 2021 by Rhyley Bryan

What is underdispersion in Poisson regression?

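In statistics, overdispersion is the presence of greater variability (statistical dispersion) in a data set than would be expected based on a given statistical model. Conversely, underdispersion means that the data show less variation than the model predicts. In Poisson regression, where the model implies a variance equal to the mean, underdispersion means the observed variance of the counts is smaller than their mean.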
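A quick way to see the idea is the dispersion index, the sample variance divided by the sample mean, which sits near 1 for Poisson-like counts. The sketch below is only an illustration: the function name `dispersion_index` and both lists of counts are made up for this example.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 for Poisson-like data)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Made-up illustrative counts, not real data.
spread_out = [0, 0, 1, 0, 9, 2, 0, 12, 1, 5]  # many zeros and a few large values
tight = [3, 4, 3, 4, 3, 4, 3, 4, 3, 4]        # counts that barely move

print(dispersion_index(spread_out))  # above 1: overdispersed relative to Poisson
print(dispersion_index(tight))       # below 1: underdispersed relative to Poisson
```

This ignores covariates entirely, so it only describes raw counts; in a regression the comparison is made against the fitted means instead.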

What are the assumptions of Poisson regression?

Poisson regression makes the following assumptions:

Changes in the rate from the combined effects of different explanatory variables are multiplicative;
At each level of the covariates, the number of cases has variance equal to the mean (as in the Poisson distribution);
Errors are independent of each other.

How do you interpret Poisson regression coefficients?

Poisson regression coefficients are interpreted as the difference between the logs of expected counts. Formally, this can be written as β = log(μ_{x+1}) – log(μ_x), where β is the regression coefficient, μ is the expected count and the subscripts represent where the predictor variable, say …
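A small numeric check can make the formula concrete. The sketch below assumes a single-predictor model with a log link, log(μ) = b0 + b1·x; the coefficient values and the helper name `expected_count` are hypothetical and chosen only for illustration.

```python
import math


def expected_count(b0, b1, x):
    """Expected count under a log-link model: mu = exp(b0 + b1 * x)."""
    return math.exp(b0 + b1 * x)


# Hypothetical coefficients, not estimates from any real data set.
b0, b1 = 0.5, 0.2

mu_x = expected_count(b0, b1, 3)
mu_x_plus_1 = expected_count(b0, b1, 4)

# Difference of logs recovers the coefficient itself.
log_difference = math.log(mu_x_plus_1) - math.log(mu_x)
# Ratio of expected counts equals exp(b1), often called a rate ratio.
rate_ratio = mu_x_plus_1 / mu_x

print(log_difference, rate_ratio, math.exp(b1))
```

Because the difference is on the log scale, exponentiating the coefficient gives the multiplicative change in the expected count for a one-unit increase in the predictor.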

How does Poisson deal with overdispersion?

Three common ways to deal with overdispersion in Poisson regression are quasi-likelihood, a negative binomial GLM, or a subject-level random effect:

Use a quasi model;
Use negative binomial GLM;
Use a mixed model with a subject-level random effect.
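The first option can be sketched in a few lines. A quasi-Poisson fit keeps the Poisson coefficient estimates but multiplies their standard errors by the square root of an estimated dispersion, commonly taken as the Pearson chi-squared statistic divided by the residual degrees of freedom. The function name `quasi_poisson_se` and all numbers below are assumptions for illustration, not output from a real model.

```python
import math


def quasi_poisson_se(poisson_se, pearson_chi2, resid_df):
    """Scale Poisson standard errors by sqrt(dispersion).

    Returns the adjusted standard errors and the dispersion estimate.
    """
    phi = pearson_chi2 / resid_df
    adjusted = [se * math.sqrt(phi) for se in poisson_se]
    return adjusted, phi


# Hypothetical Poisson-fit summary values.
poisson_se = [0.10, 0.05]   # standard errors of two coefficients
pearson_chi2 = 400.0        # Pearson chi-squared of the Poisson fit
resid_df = 100              # residual degrees of freedom

adjusted_se, phi = quasi_poisson_se(poisson_se, pearson_chi2, resid_df)
print(phi)          # dispersion estimate
print(adjusted_se)  # standard errors inflated by sqrt(phi)
```

With a dispersion above 1 the intervals widen; with a dispersion below 1 (underdispersion) the same scaling narrows them.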

What causes underdispersion?

Underdispersion can occur when adjacent subgroups are correlated with each other, also known as autocorrelation. When data exhibit underdispersion, the control limits on a traditional P chart or U chart may be too wide.

What is the difference between Poisson and negative binomial?

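The Poisson distribution can be considered a limiting case of the negative binomial distribution. The negative binomial models a series of trials that each end in either a success or a failure. A parameter ψ is introduced to indicate the number of failures that stops the count. As ψ grows large with the mean held fixed, the negative binomial approaches the Poisson; for any finite ψ its variance exceeds its mean, which is why it suits overdispersed counts.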
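The relationship can be checked numerically. The sketch below uses the mean-and-size form of the negative binomial, in which the variance is mu + mu²/size; the argument name `size` is an assumed label for the stopping parameter, and the function names are hypothetical.

```python
import math


def poisson_pmf(y, mu):
    """Poisson probability of observing count y with mean mu."""
    return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))


def negbin_pmf(y, mu, size):
    """Negative binomial probability of count y, mean-and-size parameterization."""
    log_p = (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def negbin_variance(mu, size):
    """Variance of the negative binomial: always larger than the mean."""
    return mu + mu ** 2 / size


mu = 4.0
for size in (1.0, 10.0, 1e6):
    # As size grows, the variance approaches mu and the pmf approaches Poisson.
    print(size, negbin_variance(mu, size), negbin_pmf(2, mu, size), poisson_pmf(2, mu))
```

Since the variance never drops below the mean, the negative binomial can absorb overdispersion but not underdispersion.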

What are the four model assumptions for Poisson regression?

Poisson response: The response variable is a count per unit of time or space, described by a Poisson distribution.
Independence: The observations must be independent of one another.
Mean = variance: By definition, the mean of a Poisson random variable must be equal to its variance.
Linearity: The log of the mean rate, log(λ), must be a linear function of x.
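To make the linearity assumption concrete, here is a minimal sketch of fitting log(λ) = b0 + b1·x by Newton-Raphson, using only the standard library. The function name `fit_poisson`, the fixed iteration count, and the data are all assumptions for illustration; a statistics package would normally be used instead.

```python
import math


def fit_poisson(x, y, iterations=25):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: X^T (y - mu)
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information: X^T diag(mu) X
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Solve the 2x2 system (information) * step = score
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1


# Made-up counts that grow roughly exponentially with x.
x = [0, 1, 2, 3, 4, 5]
y = [1, 2, 4, 7, 12, 20]

b0, b1 = fit_poisson(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]
print(b0, b1)
print(fitted)
```

On the log scale the fitted values lie on a straight line in x, which is exactly what the linearity assumption asks of the data.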

How do you check for overdispersion in Poisson regression?

When the response variable is a count but the mean μ does not equal the variance σ², the Poisson distribution is not appropriate. Overdispersion can be detected by dividing the residual deviance by the residual degrees of freedom. If this quotient is much greater than one, a model that allows extra variation, such as the negative binomial, should be used.
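The check described here can be sketched as follows, assuming the fitted means from a Poisson model are already available. The function names and the observed and fitted values are hypothetical; the deviance formula is the standard one for the Poisson distribution.

```python
import math


def poisson_deviance(y, mu):
    """Residual deviance: 2 * sum(y * log(y / mu) - (y - mu)), with 0 * log(0) = 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total


def deviance_ratio(y, mu, n_params):
    """Residual deviance divided by residual degrees of freedom (n - p)."""
    return poisson_deviance(y, mu) / (len(y) - n_params)


# Hypothetical observed counts and fitted means from a two-parameter model.
observed = [0, 3, 1, 10, 2, 15, 0, 9]
fitted_means = [2.0, 4.0, 3.0, 5.0, 4.0, 6.0, 3.0, 5.0]

ratio = deviance_ratio(observed, fitted_means, n_params=2)
print(ratio)  # values well above 1 point to overdispersion
```

A ratio well below 1 would instead point toward underdispersion.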

What is Underdispersion in statistics?

Underdispersion exists when data exhibit less variation than you would expect based on a binomial distribution (for defectives) or a Poisson distribution (for defects).

How do you identify Overdispersion?

Overdispersion can be detected by dividing the residual deviance by the residual degrees of freedom. If this quotient is much greater than one, the negative binomial distribution should be used. There is no hard cutoff for “much greater than one”, but one rule of thumb treats a ratio of 1.10 or greater as large.

What is the value of Poisson regression in SAS?

I was performing a Poisson regression in SAS and found that the Pearson chi-squared value divided by the degrees of freedom was around 5, indicating significant overdispersion. So, I fit a negative binomial model with proc genmod and found the Pearson chi-squared value divided by the degrees of freedom is 0.80.
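The statistic mentioned in this account, the Pearson chi-squared divided by the degrees of freedom, can be computed by hand for a Poisson fit. The sketch below is not SAS output: the function name `pearson_ratio` and the counts and fitted means are made up for illustration.

```python
def pearson_ratio(y, mu, n_params):
    """Pearson chi-squared for a Poisson fit, and its ratio to residual df.

    For the Poisson distribution the variance equals the mean, so each
    squared residual (y - mu)^2 is divided by mu.
    """
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2, chi2 / (len(y) - n_params)


# Hypothetical observed counts and fitted means from a two-parameter model.
counts = [4, 0, 7, 1, 12, 3]
means = [3.0, 2.0, 4.0, 2.0, 6.0, 4.0]

chi2, chi2_per_df = pearson_ratio(counts, means, n_params=2)
print(chi2, chi2_per_df)  # a ratio near 1 is what a well-fitting Poisson model gives
```

Read this way, a ratio around 5 signals strong overdispersion, while a ratio somewhat below 1, such as 0.80, suggests the fitted model allows slightly more variation than the data show.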

Can a Poisson distribution be wrong in a regression?

The Poisson assumption that the variance equals the mean can be wrong for many different reasons. For instance, count data are often overdispersed, with a variance larger than the Poisson distribution dictates. In a regression context, deviations from the variance assumption can take several forms.

Which is the best model to handle underdispersed Poisson data?

The best, and standard, ways to handle underdispersed Poisson data are to use a generalized Poisson model or perhaps a hurdle model. Three-parameter count models can also be used for underdispersed data, e.g. Faddy-Smith, Waring, Famoye, Conway-Maxwell and other generalized count models. The only drawback with these is interpretability.
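As one illustration of how such models accommodate underdispersion, the sketch below evaluates the Conway-Maxwell-Poisson distribution, whose probabilities are proportional to lam^y / (y!)^nu. Setting nu = 1 gives the ordinary Poisson, and nu > 1 gives less variance than the mean. The function name `com_poisson_moments`, the truncation point `max_y`, and the parameter values are assumptions made for this example.

```python
import math


def com_poisson_moments(lam, nu, max_y=200):
    """Mean and variance of a Conway-Maxwell-Poisson distribution.

    The infinite support is truncated at max_y, which is adequate only
    when the probability beyond that point is negligible.
    """
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1) for y in range(max_y + 1)]
    shift = max(log_w)                       # subtract the max for numerical stability
    w = [math.exp(v - shift) for v in log_w]
    z = sum(w)                               # normalizing constant (up to the shift)
    mean = sum(y * wi for y, wi in enumerate(w)) / z
    var = sum((y - mean) ** 2 * wi for y, wi in enumerate(w)) / z
    return mean, var


for nu in (0.5, 1.0, 2.0):
    m, v = com_poisson_moments(4.0, nu)
    print(nu, m, v, v / m)  # v / m below 1 means underdispersion
```

The extra parameter is what buys the flexibility, and it is also why the coefficients of such models are harder to interpret than those of a plain Poisson regression.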

When to use a dispersion parameter in SAS?

Once the most appropriate functional form of the variance function is determined, a dispersion parameter can be included, if needed, in either model to adjust the statistical inference for any additional over- or underdispersion. How to do that easily in SAS, say, is unfortunately not something I can help with.