[R-sig-ME] R: Re: Overdispersion and model selection: glmmadmb vs. glmer

Luca Corlatti luca.corlatti at boku.ac.at
Mon Sep 2 13:27:54 CEST 2013


Thanks, Paul, for the precious info and for the nice piece of code!I
tried to run it using the dataset I posted a few hours ago, and with the
following models:
mod.nb<-glmmadmb(larvae~age+mat_be+elevation+log(cortisol)+log(testosterone)+log(hr)+(1|year:month)+(1|id),
family="nbinom", zeroInflation=FALSE, data=dataset)
mod.pois<-glmer(larvae~age+mat_be+elevation+log(cortisol)+log(testosterone)+log(hr)+(1|year:month)+(1|id)+(1|obs),
family=poisson, data=dataset)
and the results are attached to this email as .png file. 
Residuals for log-normal poisson surely look much better than before!
Though, am I right assuming that the glmmadmb plot look a bit more
convincing than the glmer one?
Best wishes, 
Luca
 

>>> Paul Johnson <paul.johnson at glasgow.ac.uk> 02/09/13 13.04 >>>
PS This paper discusses in detail how to assess Poisson-lognormal model
fit by plotting residuals at each heirarchical level: 
Elston, D.A., Moss, R., Boulinier, T., Arrowsmith, C. & Lambin, X.
(2001). Analysis of aggregation, a worked example: numbers of ticks on
red grouse chicks. Parasitology, 122, 563–9.
http://abdn.ac.uk/lambin-group/Papers/Paper%2041%202001%20Elston%20Tick%20aggregation%20Parasitology.pdf



On 2 Sep 2013, at 11:34, Paul Johnson <Paul.Johnson at glasgow.ac.uk>
wrote:

> Hi Luca,
> 
> (To avoid confusion, I'm the Glasgow Paul Johnson, not the Kansas one
- incidentally there's also a statistician of the same name in Oxford, a
former colleague, whom we can only hope stays out of this discussion).
> 
> Below is some code for comparing residuals between Poisson-lognormal
glmer and NB glmmadmb fits. I find this useful as a quick way of
spotting bad fits (aiming, roughly, for homoscedasticity and absence of
a trend). However I agree with the other PJ that plotting the
predictions and prediction intervals from both models over the data is a
better way to compare the two fits.
> 
> Best wishes,
> Paul
> 
> 
>  plot.glmer<-
>    function(mer.fit,type="pearson",overdispersion.term=NULL)
>    {
>      require(AICcmodavg)
>      if(is.null(overdispersion.term))
>      {
>        Fitted<-fitted(mer.fit)
>        Residuals=resid(mer.fit,type)
>      } else
>        {
>          response<-model.frame(mer.fit)[[1]]
>          od.ranef<-lme4::ranef(mer.fit)[[overdispersion.term]][[1]]
>          if(length(response)!=length(od.ranef) ||
fam.link.mer(mer.fit)$family!="poisson" ||
fam.link.mer(mer.fit)$link!="log")
>            stop("Model is not lognormal-Poisson. Cannot use
overdispersion term.")
>          Fitted<-exp(log(fitted(mer.fit))-od.ranef)
>          Residuals<-(response -
Fitted)/sqrt(Fitted+(Fitted^2)*c(exp(lme4::VarCorr(mer.fit)[[overdispersion.term]])-1))
>        }
>      plot.data<-data.frame(Fitted=Fitted,Residuals=Residuals)
>     
plot.data$loess.line<-predict(loess(Residuals~Fitted,data=plot.data))
>      plot.data<-plot.data[order(plot.data$Fitted),]
>      plot(plot.data[,c("Fitted","Residuals")])
>      abline(h=0)
>      points(plot.data[,c("Fitted","loess.line")],type="l",col="red")
>      hist(plot.data$Residuals,xlab="Residuals",main="")
>    }
> 
>  par(mfrow=c(3,2))
>  plot.glmer(pois.glmm)  # overdispersion effects in the fitted values,
so expect a trend
>  plot.glmer(pois.glmm,overdispersion.term="ons")   # 'detrended' by
moving OD effects to the residuals
>  plot.glmer(nb.glmm)  # compare with residuals from NB GLMM
> 
> On 2 Sep 2013, at 08:32, Luca Corlatti <luca.corlatti at boku.ac.at>
wrote:
> 
>> Dear Paul, thanks for your answer. Surely I can offer a link to the
data, here it is:
>> https://www.dropbox.com/s/3i22ovc588kahsk/dataset.csv
>> 
>> 
>> obs=observation level
>> id=animal identity
>> age=age class of the ind.
>> mat_be=individual mating behavior (territorial vs non-territorial)
>> month, year, elevation=when & at which elevation (a.s.l.) the
sampling occurred
>> testosterone, cortisol=testosterone and cortisol levels registered
for each individual
>> hr=home range of each individual (on a monthly basis)
>> larva>> The key point, here, is that not being statistician (I'm a behavioral
ecologist) I don't quite get why 2 methods which should ideally yield
similar results, in fact do not. I realize the question may seem to
vague, and likely fairly trivial, but I guess other researchers, in my
same situation, would simply ask "ok, I have overdispersed data, shall I
go for glmer with (1|obs) or glmmadmb?".
>> 
>> 
>> Without getting too complicated (e.g. model selection and averaging),
even if you run a very simple model, just to check if the number of
counted larvae differ between age classes, the estimates differ between
glmmadmb and glmer:
>> 
>> 
>> mod1<-glmmadmb(larvae~age+(1|year:month)+(1|id), family="nbinom",
zeroInflation=FALSE, data=dataset)
>> mod2<-glmer(larvae~age+(1|year:month)+(1|id)+(1|obs), family=poisson,
data=dataset)
>> 
>> 
>> 
>> If you try with the full additive models:
>> 
>> 
>>
mod1<-glmmadmb(larvae~age+mat_be+elevation+log(cortisol)+log(testosterone)+log(hr)+(1|year:month)+(1|id),
family="nbinom", zeroInflation=FALSE, data=dataset)
>>
mod2<-glmer(larvae~age+mat_be+elevation+log(cortisol)+log(testosterone)+log(hr)+(1|year:month)+(1|id)+(1|obs),
family=poisson, data=dataset)
>> 
>> 
>> 
>> the summary, just as well, gives different results (I used
log-transformation as it seems to linearize the relationship with the
response variable).
>> 
>> 
>> Hope this can help!
>> Luca
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>>>> Paul Johnson <pauljohn32 at gmail.com> 01/09/13 22.04 >>>
>> Hi, I'm a different Paul Johnson :) (Had a shock reviewing this
thread thinking I'd posted a message that I did not recognize at all).
>> 
>> 
>> The question is too vague now.  Aside from agreeing with another Paul
Johnson that there are differences between the negative binomial and
random-effects interpretations, I can't see where this is leading. The
negative binomial model magnifies variance for larger values of the
expected outcome, but it does not, by design, include "clustering" of
variances. On the other hand, the glmer model you describe is a model of
a "clustered" random effect that generates variety among outcomes for a
given set of other predictors.  Both can cause "overdispersion" in the
sense you notice, but they really are substantively different
explanations of it.
>> 
>> How about you show us the estimated models to let us see what's so
different, in your view?
>> 
>> Apart from the residual diagnostics you are concerned about, what
about the substantively important parameters and predicted values?  I
guess that's a little interesting because, when I last checked, glmer
did not have a predict version in the CRAN package (was in testing
Rforge version in 2012).
>> 
>> Are the predicted values of outcomes similar in the 2 models? How
does it look when you create a scatterplot of the predicted values of
the GLMMADMB model versus the glmer model?
>> 
>> 
>> If you post the code that leads to the different diagnostic
scatterplots and offer a link to the data, this might be interesting to
make into one of the working examples for the mixed models wiki.  I
could assign some grad students on this to work it out fully.
>> 
>> 
>> 
>> 
>> 
>> On Wed, Aug 28, 2013 at 4:46 AM, Luca Corlatti
<luca.corlatti at boku.ac.at> wrote:
>> Thanks, Paul, it's somewhat consoling to read somebody else had the
same thoughts!
>> Thanks a lot for your advice. What I miss here, though, is why the
results of the model selection differ so much between glmmadmb and
glmer, hence what is preferable between the two approaches...
>> Cheers,
>> Luca
>> 
>> 
>> 
>> 
>>>>> Paul Johnson <paul.johnson at glasgow.ac.uk> 26/08/13 0.29 >>>
>> Hi Luca,
>> 
>> I've also seen this difference in residuals from a glmmadmb nbinom2
fit and a lognormal-Poisson fit with glmer. Generally the residuals X
fitted plot from glmmadmb looks good (homoscedastic, no trend) while
that from the lognormal-Poisson looks wrong (strong curve climbing
sharply from negat>> The reason for the discrepancy is that the fitted values from the
lognormal-Poisson fit include the overdispersion random effects, i.e.
everything expect the Poisson variation, while the NB fitted values
don't, because overdispersion is intrinsic to the NB, not an added
random effect. Because the obs-level random effect is there to increase
the spread of the Poisson distribution, it's almost inevitable with
strong overdispersion that the residuals with low fitted values will be
negative while those with high fitted values will be positive.
>> 
>> The way to get a fair comparison between the two is to remove the
obs-level random effect from the fitted values of the lognormal-Poisson,
leaving just the fixed effects and any other random effects. I find this
generally gives a residual pattern much more similar to NB in glmmadmb:
>> 
>>  Fitted <- exp(log(fitted(mod)) - ranef(mod)$obs[[1]])
>>  Resid <- (dat$response - Fitted) / sqrt(Fitted + (Fitted^2) *
c(exp(VarCorr(mod)$obs) - 1))
>>  plot(Fitted, Resid)
>> 
>> # obs is the name of the factor used for the obs-level random effect,
>> # mod is the glmer fit of the lognormal-Poisson model, dat$response
are the responses, and
>> # the bit inside the sqrt() is the variance function for the
lognormal-Poisson
>> 
>> Generally I don't find that the model estimates and SEs differ much
though.
>> 
>> Best wishes,
>> Paul
>> 
>> 
>> Paul Johnson
>> 
>> Institute of BAH&CM
>> Graham Kerr Building
>> University of Glasgow
>> Glasgow G12 8QQ
>> 
>> http://www.gla.ac.uk/researchinstitutes/bahcm/staff/pauljohnson/
>> http://www.stats.gla.ac.uk/~paulj/index.html
>> 
>> 
>> ________________________________________
>> From: r-sig-mixed-models-bounces at r-project.org
[r-sig-mixed-models-bounces at r-project.org] On Behalf Of Luca Corlatti
[luca.corlatti at boku.ac.at]
>> Sent: 25 August 2013 20:18
>> To: r-sig-mixed-models at r-project.org
>> Subject: [R-sig-ME] Overdispersion and model selection: glmmadmb vs.
glmer
>> 
>> Dear all,
>> 
>> 
>> I recently ran a model selection (AIC-based) to investigate the role
of several etho-ecological factors in shaping the emission of parasites
in my study species. My data are counts showing overdispersion. I
therefore fitted my models using the function glmmadmb with
family=nbinom. Visual inspection of residuals (normality,
heteroschedasticity, independence) suggested the global model fitted the
data adequately, and I'm pretty happy with the results of my analysis.
For the sake of curiosity, however, I tried to re-run the model
selection using the function glmer, with family=poisson, adding the
observation-level as a random factor (1|obs) to account for
overdispersion, as recently suggested. In this case, however, visual
inspection of residuals for the global model were not very satisfactory.
After running the model selection, the results were quite different from
those obtained with glmmadmb (not dramatically different, but still...).
Given I have no deep knowledge into th!
> e philosophy behind the use of glmer with (1|obs), I was wondering:
>> 1) when one should rely on the use of glmer with (1|obs) to account
for overdispersion? (i.e. is the check of residuals for the global model
the key issue here?)
>> 2) why did I find such a difference in the outcome of the 2 model
selections?
>> Kind regards,
>> Luca
>> 
>> 
>> 
>> 
>> 
>> 
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> 
>> 
>> 
>> 
>> 
>> --
>> Paul E. Johnson
>> Professor, Political Science      Assoc. Director
>> 1541 Lilac Lane, Room 504      Center for Research Methods
>> University of Kansas                 University of Kansas
>> http://pj.freefaculty.org               http://quant.ku.edu
>> 
>> 
>> 
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> _____________> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models





-------------- next part --------------
A non-text attachment was scrubbed...
Name: glmer vs glmmadmb.png
Type: image/png
Size: 83108 bytes
Desc: Portable Network Graphics Format
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20130902/22399ccf/attachment-0001.png>


More information about the R-sig-mixed-models mailing list