[R-sig-ME] prediction from glmer.nb

Ben Bolker bbolker at gmail.com
Sat Mar 1 22:11:52 CET 2014


Thomas Lee Anderson <anderstl at ...> writes:

> 
> Greetings,
> 
> I have a general and somewhat basic question about how glmer.nb works in
> terms of what scale the parameter estimates are on, and predicting from
> glmer.nb. My understanding from reading through some of the help files in
> the glmm FAQ wiki page is that in order to predict from the glmer models,
> predicted values need to be back transformed to place them on the same
> scale as the response variable (accomplished by using type="response" in
> the predict command).  Otherwise, the predicted values would be on the log
> scale in my case (using glmer.nb). Am I interpreting this correctly,
> and am I correct about what scale the predicted values are on? I am
> getting rather nonsensical values (at least in my estimation) from the
> predict command when I use type="response". The raw data for the
> response variable range from 0-10, and I am predicting over a range of
> values that we actually observed for the covariates, but the predicted
> values are sometimes more than 10^27 times larger than the actual
> data. Any thoughts as to what I'm doing wrong, or my ignorance of what
> is actually going on, would be appreciated.

  Maybe there's a bug in glmer.nb?  It's somewhat experimental (I thought
the documentation said that: it doesn't, but the brevity of the help page
might be a hint to that effect ...)
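
  (For what it's worth, your understanding of the scales is basically
right.  For a log link, predictions on the link scale and on the
response scale should be related as in this sketch -- 'newdat' here is
a hypothetical data frame of covariate values, not taken from your
data:

    ## hypothetical covariate values to predict at
    newdat <- data.frame(AMOP = seq(0, 5, length.out = 20),
                         Structure = 50, Year = 2012)
    ## re.form = NA sets the random effects to zero (population level)
    p_link <- predict(aman.d5, newdata = newdat, re.form = NA)
    p_resp <- predict(aman.d5, newdata = newdat, re.form = NA,
                      type = "response")
    all.equal(exp(p_link), p_resp)   ## TRUE: response = exp(link)

If exp() of the link-scale predictions reproduces the huge values, then
the problem lies in the fitted coefficients rather than in the
back-transformation.)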
> 
> I understand the etiquette is to give a reproducible example but I wasn't
> sure if it was necessary for the question, and I honestly wasn't sure the
> best way to accomplish that, and was having difficulty replicating the
> issue with simulated data.

  This actually strongly suggests that we're going to need something
more like your real data in order to figure out the problem.

> Below is a sample of the data, and the output of the
> model summary if that is helpful. Let me know if other information is
> needed.
> 
>      Site Year      AMAN      AMOP  Structure
> 1   0.002 2012 3.7500000 0.0000000      13.0
> 2   0.002 2013 1.9666667 0.0000000      13.0
> 3  0.002A 2012 0.3333333 2.7777778       0.0
> 4  0.002A 2013 0.0000000 0.0000000       0.0
> 5   0.003 2012 0.0000000 0.2500000     100.0
> 6   0.003 2013 0.0000000 0.1904762     100.0
> 7      10 2012 0.0000000 0.0000000      95.0
> 8      10 2013 0.2333333 0.0000000      95.0
> 9     103 2012 0.0000000 4.6666667      32.5
> 10    103 2013 0.0000000 1.9166667      32.5
> 
> > summary(aman.d5)
> Generalized linear mixed model fit by maximum likelihood ['glmerMod']
>  Family: Negative Binomial(0.6346) ( log )
> Formula: AMAN ~ AMOP * Structure + (1 | Year)
>    Data: ..2
> 
>       AIC       BIC    logLik  deviance
>  917.1346  940.7914 -452.5673  905.1346
> 
> Random effects:
>  Groups   Name        Variance  Std.Dev.
>  Year     (Intercept) 2.953e-10 1.718e-05
>  Residual             9.836e-01 9.918e-01
> Number of obs: 381, groups: Year, 2

  Notice here that the random effects variance is practically
zero, which suggests that you will do equally well (i.e. get
almost identical results) with MASS::glm.nb -- a reasonable
workaround.
  In general this will almost always happen with a grouping
variable that has only 2 levels.  See http://glmm.wikidot.com/faq for
more discussion.  If all your data sets look like this, you
should really save yourself some trouble and just fit Year as
a fixed effect (i.e., use MASS::glm.nb).
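
  Something like this (a sketch, assuming your full data frame is
called 'dat'):

    library(MASS)
    ## negative binomial GLM with Year as a fixed effect
    aman.nb <- glm.nb(AMAN ~ AMOP * Structure + factor(Year), data = dat)
    ## predictions back on the response (count) scale:
    predict(aman.nb, newdata = newdat, type = "response")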

  It's also quite weird that you have non-integer response
values (AMAN): the negative binomial is a distribution for counts, so
except possibly in some very particular situations, non-integer
responses don't make sense.  Are these densities rather than counts?
(If so you should consider using the raw counts as the response
variable, with log(sampling area) as an offset term.)
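
  Sketch (here 'count' and 'area' are hypothetical columns holding the
raw counts and the sampling area for each observation):

    ## glm.nb from MASS, as above; offset puts predictions per unit area
    aman.nb2 <- glm.nb(count ~ AMOP * Structure + factor(Year) +
                         offset(log(area)), data = dat)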

> 
> Fixed effects:
>                 Estimate Std. Error t value Pr(>|z|)
> (Intercept)    -0.124857   0.130868  -0.954 0.340047
> AMOP           -0.558918   0.208893  -2.676 0.007459 **
> Structure      -0.004156   0.003175  -1.309 0.190474
> AMOP:Structure  0.013938   0.004152   3.357 0.000788 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

  Besides this, nothing looks particularly odd.


