# [R] Overdispersed poisson - negative observation

David Firth david.firth at nuffield.oxford.ac.uk
Thu Jan 16 22:06:03 CET 2003

```Peter

You are right to suppose that negative observations should not be a
problem for a quasi-Poisson fit (though negative *fitted* values would
be, in a log-linear model).

The problem is that the "quasipoisson" function is overly restrictive,
in requiring non-negativity of the y variable.  (commenting out the
error trap is not enough to solve the problem).  There are two places
where "quasipoisson" has problems:

dev.resids <- function(y, mu, wt) 2 * wt * (y * log(ifelse(y ==
0, 1, y/mu)) - (y - mu))

fails for negative values of y, while

initialize <- expression({
if (any(y < 0)) stop(paste("Negative values not allowed for",
"the quasiPoisson family"))
n <- rep(1, nobs)
mustart <- y + 0.1
})

would (without the error trap) return negative initial values for
means, which won't work with a log link.

The first problem is the hard one: the deviance isn't defined when y is
negative.  But an alternative is to use the Pearson X^2 statistic,
which is defined for all values of y (and makes more sense anyway for
quasi-likelhood models -- it's what you use to determine the dispersion
constant).

The second problem is easy to fix: for example, just take as starting
values the global mean (corresponding to zero-valued effects in the
model).  I often find that this is a good starting point anyway --
better than the default choice, which typically is not in the model
space (even when there are no negative values of y).

I give below an amended "quasipoisson" which seems to work, ie it fits
the model by the quasi-likelihood method with variance proportional to
mean.  It makes only the two changes just described.  A *slightly*
unsatisfactory aspect is that quantities reported as "deviance" are in
fact Pearson X^2 statistics.  But no other choice really makes sense
there.

David

quasipoisson <- function (link = "log")
## Amended by David Firth, 2003.01.16, at points labelled ###
## to cope with negative y values
##
## Computes Pearson X^2 rather than Poisson deviance
##
## Starting values are all equal to the global mean
{
}
if (any(linktemp == c("log", "identity", "sqrt")))
"family; available links are", "\"identity\", \"log\" and
\"sqrt\""))
variance <- function(mu) mu
validmu <- function(mu) all(mu > 0)
dev.resids <- function(y, mu, wt) wt*(y-mu)^2/mu   ###
aic <- function(y, n, mu, wt, dev) NA
initialize <- expression({
n <- rep(1, nobs)
mustart <- rep(mean(y), length(y))             ###
})
variance,
dev.resids = dev.resids, aic = aic, mu.eta = stats\$mu.eta,
initialize = initialize, validmu = validmu, valideta =
stats\$valideta),
class = "family")
}

On Thursday, Jan 16, 2003, at 15:45 Europe/London,
peter.fledelius at wgo.royalsun.com wrote:

> Dear R users
>
> I have been looking for functions that can deal with overdispersed
> poisson
> models. Some (one) of the observations are negative. According to
> actuarial
> literature (England & Verall, Stochastic Claims Reserving in General
> Insurance , Institute of Actiuaries 2002) this can be handled through
> the
> use of quasi likelihoods instead of normal likelihoods. The presence of
> negatives is not normal in a poisson model, however, we see them
> frequently
> in this type of data, and we would like to be able to fit the model
> anyway.
>
> At the moment R is complaining about negative values and the link
> function
> = log.
>
> My code looks like this. Do any of you know if this problem can be
> solved
> in R? Any suggestions are welcomed.
>
> Kind regards,
>
> Peter Fledelius (new R user)
>
> *********** Code ************
> paym   <- c(5012, 3257, 2638,  898, 1734, 2642, 1828,  599,   54,  172,
>              106, 4179, 1111, 5270, 3116, 1817, -103,  673,  535,
>             3410, 5582, 4881, 2268, 2594, 3479,  649,  603,
>             5655, 5900, 4211, 5500, 2159, 2658,  984,
>             1092, 8473, 6271, 6333, 3786,  225,
>             1513, 4932, 5257, 1233, 2917,
>              557, 3463, 6956, 1368,
>             1351, 5596, 6165,
>             3133, 2262,
>             2063)
> alpha   <- factor(c(1,1,1,1,1,1,1,1,1,1,
>              2,2,2,2,2,2,2,2,2,
>              3,3,3,3,3,3,3,3,
>              4,4,4,4,4,4,4,
>              5,5,5,5,5,5,
>              6,6,6,6,6,
>              7,7,7,7,
>              8,8,8,
>              9,9,
>              10))
> beta    <- factor(c(1,2,3,4,5,6,7,8,9,10,
>              1,2,3,4,5,6,7,8,9,
>              1,2,3,4,5,6,7,8,
>              1,2,3,4,5,6,7,
>              1,2,3,4,5,6,
>              1,2,3,4,5,
>              1,2,3,4,
>              1,2,3,
>              1,2,
>              1))
> d.AD <- data.frame(paym, alpha, beta)
> glm.qD93 <- glm(paym ~ alpha + beta, family=quasipoisson())
> glm.qD93
> ************ Code end ***************
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> http://www.stat.math.ethz.ch/mailman/listinfo/r-help
>

```