[R-sig-ME] Poisson mixed models: Non-integer response variable in lmer?

Douglas Bates bates at stat.wisc.edu
Tue Mar 22 16:46:42 CET 2011

On Mon, Mar 14, 2011 at 5:26 PM, Daniel Barton
<daniel.barton at umontana.edu> wrote:
> Hello,
>     Thanks to everyone who contributes to this list!  I often find random
> questions I have answered in the archives of this list.

> My specific question of the moment, a simplified example of what I'm doing
> that I hope illustrates my question...

>     If we have a poisson-distributed response variable in a mixed model
> such as called by:

> lmer(amrotot ~ year + (year|route), family=poisson(link=log))

> where amrotot is an integer count, year is, well, the year (as a linear
> predictor, not a factor) and route is a sampling unit.  If 'exposure' varies
> by route, we can define another model with an offset such as:

> lmer(amrotot ~ year + (year|route), offset=effort, family=poisson(link=log))

> This all seems generally good and fine.  A colleague asked me why not
> use (amrotot/effort) as the response variable, but this of course results in
> a non-integer response variable.  Yet it turns out, lmer (or glm, for that
> matter) will indeed estimate a model using the non-integer response variable
> (amrotot/effort) but gives warnings.  I understand that poisson regression
> assumes a poisson-distributed integer response variable, but I was curious
> about *why* lmer would provide results for non-integer response variables
> such as (amrotot/effort) and if these results are valid or somehow
> comparable to results where amrotot is the response and effort is an offset,
> with special reference to the confidence intervals of the random effects.
> Using non-integer response variables in poisson regression looks and seems
> wrong to me, but IANA statistician and maybe lmer is doing something I don't
> quite get to make this work.

Gee, most people complain about lmer/glmer not producing results.  The
glmer function uses the same definitions of the glm family as does the
glm function itself, so whether a particular response type can be used
with a family is more a matter of how the family is defined.  Many of
the d-p-q-r functions (density, cumulative probability, quantile,
random sample) in R are extended to handle non-integer values for
parameters or responses that, according to the original definitions,
should be integers.  For example, non-integer degrees of freedom are
allowed.
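
A small sketch of this behaviour in base R, offered as an illustration
rather than a description of lmer/glmer internals.  The numbers are
arbitrary.  It shows a non-integer degrees-of-freedom argument being
accepted, the poisson family's deviance residuals being computable for a
non-integer response, and dpois() itself warning and returning zero for a
non-integer count (which is probably where the warnings from the fit come
from, since the family's aic component calls dpois()).

```r
# Non-integer degrees of freedom are accepted by the t distribution functions.
p <- pt(1.5, df = 2.5)

# The poisson family's deviance residuals are defined for any non-negative
# response, integer or not, so the fitting iterations can proceed.
fam <- poisson(link = "log")
d <- fam$dev.resids(y = 2.5, mu = 3, wt = 1)

# dpois() does complain about a non-integer count: it warns and returns 0.
dens <- suppressWarnings(dpois(2.5, lambda = 3))

print(c(p = p, dev_resid = d, density = dens))
```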

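Myself, I would stick with the use of the offset.  In this case you
have the log link, and the offset is added to the linear predictor, so
it must be on the log scale: an offset of log(effort), not effort
itself, is equivalent to modeling the rate amrotot/effort and, perhaps,
easier to explain.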
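
A minimal sketch of that equivalence, using glm() on simulated data so
that it runs without lme4 (glmer uses the same family definitions).  The
data, the coefficient values and the use of weights = effort in the rate
model are assumptions made for the illustration and are not part of the
original question.  The offset is given as log(effort) because it is
added to the linear predictor on the log scale.

```r
set.seed(1)
n <- 200
effort <- runif(n, 1, 10)            # hypothetical exposure per observation
year <- rep(0:9, length.out = n)     # year as a linear predictor
amrotot <- rpois(n, lambda = effort * exp(0.5 - 0.05 * year))

# Count response with the exposure entering as an offset on the log scale.
fit_offset <- glm(amrotot ~ year + offset(log(effort)),
                  family = poisson(link = "log"))

# Rate response (non-integer), weighted by effort.  glm() warns about
# non-integer counts, so the warnings are suppressed here.
fit_rate <- suppressWarnings(
  glm(I(amrotot / effort) ~ year, weights = effort,
      family = poisson(link = "log"))
)

# Largest absolute difference between the two sets of coefficients.
max_diff <- max(abs(coef(fit_offset) - coef(fit_rate)))
print(max_diff)
```

With the weights the two fits solve the same estimating equations, so the
coefficients agree up to convergence tolerance.  Without the weights the
rate model generally gives different estimates, which is one more reason
to prefer the offset.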

> Thank you!
>
> Best,
> Dan Barton