[Rd] nobs() with glm(family="poisson")

Milan Bouchet-Valat nalimilan at club.fr
Mon Feb 18 12:22:23 CET 2013


The nobs() method for glm objects always returns the number of cases
with non-zero weights in the data. This does not correspond to the
number of observations for Poisson regression/log-linear models, i.e.
when family="poisson" or family="quasipoisson", where each case is a
cell of the table and the observations are the individuals counted in
the cells.

This sounds dangerous since nobs() is, as the documentation states,
primarily aimed at computing the Bayesian information criterion. Raftery
(1995:20) warned against this:
> What should n be? Once again, it is best to use the actual number of
> individuals, i.e. the sum of the cell counts, and not the number of
> cells (Raftery, 1986a).

Is there a reason why this should not/cannot be done that way?

This behavior can be reproduced with R 3.0.0 from SVN, using the
example from ?glm:
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
nobs(glm.D93)
# 9 == length(counts)
# Should be 150 == sum(counts)
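
To give an idea of how much the choice of n matters for BIC, here is a
rough sketch that computes -2*logLik + k*log(n) by hand for both values
of n. bic_with_n() is a name made up for this illustration, not
something in stats, and the ?glm example is refitted so that the block
runs on its own.

```r
# Refit the ?glm example so this block is self-contained.
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())

ll <- logLik(glm.D93)
k <- attr(ll, "df")   # number of estimated parameters

# Hypothetical helper (not part of stats): BIC for a chosen n.
bic_with_n <- function(ll, k, n) -2 * as.numeric(ll) + k * log(n)

bic_cells <- bic_with_n(ll, k, length(counts))  # n = number of cells
bic_indiv <- bic_with_n(ll, k, sum(counts))     # n = number of individuals

# The penalty grows by k * (log(sum(counts)) - log(length(counts))).
bic_indiv - bic_cells
```

The log-likelihood is the same in both cases; only the penalty term
changes, so models with more parameters are penalized more heavily when
n is the number of individuals.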

FWIW, stats:::nobs.glm is currently defined as:
nobs.glm <- function (object, ...) 
    if (!is.null(w <- object$prior.weights)) sum(w != 0) else length(object$residuals)
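
For discussion, a minimal sketch of what an alternative could look like.
This is only an assumption about one possible behavior, not a proposed
patch: nobs_counts() is a made-up name, and treating prior weights as
multipliers of the counts is a guess that may not suit every use of
weights.

```r
# Hypothetical alternative to stats:::nobs.glm, for illustration only.
nobs_counts <- function(object, ...) {
    w <- object$prior.weights
    if (object$family$family %in% c("poisson", "quasipoisson")) {
        # object$y is stored because glm() defaults to y = TRUE.
        y <- object$y
        # Assumption: weights act as multipliers of the cell counts.
        if (is.null(w)) sum(y) else sum(w * y)
    } else {
        # Same rule as the current definition for other families.
        if (!is.null(w)) sum(w != 0) else length(object$residuals)
    }
}

# Check on the ?glm example (refitted so the block runs on its own).
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
nobs_counts(glm.D93)   # sum(counts) rather than length(counts)
```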


Raftery, Adrian E. 1995. “Bayesian Model Selection in Social Research.”
Sociological Methodology 25:111–96.
