[R-sig-ME] wider than expected confidence intervals with lsmeans and predict.glmmadmb
Ben Bolker
bbolker at gmail.com
Mon May 29 01:59:20 CEST 2017
I will look this over more carefully if/when I get time, to see what
the differences are between the ways in which lme4, glmmADMB, and
glmmTMB implement the building blocks that lsmeans uses. But I can't
guarantee I will get to it soon ... essentially, I'll just have to
pick through the implementation of lsm.basis for glmmTMB and glmmADMB
and see what the differences are ... it would help to try this on some
cases without zero-inflation and without any random effects at all, to
see where the discrepancies are coming from.
On Sun, May 28, 2017 at 5:20 PM, Evan Palmer-Young <ecp52 at cornell.edu> wrote:
> Thank you for this suggestion; it looks like you already implemented what
> Prof. Maindonald suggested.
>
> In your (RVL's) J. Stat Software article on lsmeans
> <https://www.jstatsoft.org/article/view/v069i01>, Section 5.1, you wrote:
>
>
> * Note that it is a mistake to try to use confidence intervals to judge
> comparisons. In this example, the standard errors of comparisons are much
> smaller than those of the LS means, because the between-block and
> between-plot variations cancel out in the comparisons. *
>
> I think that this is what John Maindonald indicated, too.
>
> Is it possible that some packages (glmmADMB?) provide predict() estimates
> that include the random-effect variance referred to in the quotation, and
> others do not? Or that some produce confidence intervals whereas others
> produce prediction intervals (i.e., by addition of the residual variance),
> as differentiated in the glmm FAQ
> <https://github.com/bbolker/mixedmodels-misc/blob/master/glmmFAQ.rmd#predictions-andor-confidence-or-prediction-intervals-on-predictions>,
> section on Prediction and Confidence Intervals?
>
> I posted a query to the glmmADMB
> <https://github.com/bbolker/glmmadmb/issues/5> Github page, to see if
> somebody with more familiarity to the package might be able to explain
> nuances or difference. This thread has been cross-referenced with that
> question.
>
> Thank you again for your patience and thorough explanations!
> Much appreciated,
> Evan
>
>
> On Sun, May 28, 2017 at 12:00 AM, Lenth, Russell V <russell-lenth at uiowa.edu>
> wrote:
>
>> If the SE of a mean is exactly 1/2 the SE of the difference of two means
>> -- which is almost never the case -- it would be appropriate to use
>> overlapping confidence intervals to test comparisons of means. So, you
>> should almost never try to do that. In mixed models, it is not at all
>> unusual to have huge discrepancies among standard errors.
>>
>> However, 'lsmeans' does offer an ad hoc method for the graphical
>> comparisons you have in mind. Try this:
>>
>> lsm.TMB<- lsmeans(m.nb2, ~FoodTreatment)
>> plot(lsm.TMB, comparisons = TRUE)
>>
>> This will plot both confidence intervals (in blue) and "comparison arrows"
>> (in red). Non-overlapping comparison arrows will indicate cases where
>> differences are significant. You can have just the comparison arrows by
>> using:
>>
>> plot(lsm.TMB, intervals = FALSE, comparisons = TRUE)
>>
>> In either case, as I say, it is an ad hoc method, and it doesn't always
>> work, especially when there are widely variable standard errors. A warning
>> is issued if it can't figure out a solution.
>>
>> Russ
>> --
>> Russell V. Lenth - Professor Emeritus
>> Department of Statistics and Actuarial Science
>> The University of Iowa - Iowa City, IA 52242 USA
>> Voice (319)335-0712 - FAX (319)335-3017
>> russell-lenth at uiowa.edu - http://www.stat.uiowa.edu/~rlenth/
>>
>>
>>
>> -----Original Message-----
>>
>> Date: Fri, 26 May 2017 17:29:52 -0400
>> From: Evan Palmer-Young <ecp52 at cornell.edu>
>> To: John Maindonald <john.maindonald at anu.edu.au>
>> Cc: R-mixed models mailing list <r-sig-mixed-models at r-project.org>
>> Subject: Re: [R-sig-ME] wider than expected confidence intervals with
>> lsmeans and predict.glmmadmb
>> Message-ID:
>> <CAAge6+7v1KY=8GLNqi1Hzg4zyQY0kfSjGvMXM-rhRFC9ER8Kcw at mail.
>> gmail.com>
>> Content-Type: text/plain; charset="UTF-8"
>>
>> Thanks very much for your reply, Prof. Maindonald.
>>
>> I agree that the pairwise comparisons are informative, but it would be
>> easiest for readers to see the data on the original scale to show
>> differences between groups.
>>
>> When the lsmeans are plotted from glmmTMB, which fits a model with fixed
>> effects identical to those in glmmADMB, the estimates are identical but the
>> SE's differ by a factor of 8.
>>
>> So I am still confused about why the lsmeans plots would reflect pairwise
>> differences with some packages but not with glmmADMB.
>> In my experience, lsmeans plots of group means from glmer() models are
>> also non-overlapping when pairwise comparisons are highly significant.
>>
>> I have extended the code to illustrate the differences.
>>
>> library(glmmADMB)
>>
>> library(lsmeans)
>>
>> #Use data from worked example
>> #http://glmmadmb.r-forge.r-project.org/glmmADMB.html
>>
>> library(glmmADMB)
>> data(Owls)
>> str(Owls)
>> Owls <- transform(Owls,
>> Nest=reorder(Nest,NegPerChick),
>> logBroodSize=log(BroodSize),
>> NCalls=SiblingNegotiation)
>>
>>
>> m.nb<- glmmadmb(NCalls~FoodTreatment+ArrivalTime+
>> +(1|Nest),
>> data=Owls,
>> zeroInflation=FALSE,
>> family="nbinom")
>> summary(m.nb)
>> # Estimate Std. Error z value Pr(>|z|)
>> # (Intercept) 4.2674 0.4705 9.07 < 2e-16 ***
>> # FoodTreatmentSatiated -0.2602 0.0845 -3.08 0.0021 **
>> # ArrivalTime -0.0840 0.0190 -4.42 9.8e-06 ***
>> #Plot lsmeans by FoodTreatment
>> owls.lsm<-lsmeans(m.nb, ~FoodTreatment)
>> owls.lsm
>> # FoodTreatment lsmean SE df asymp.LCL asymp.UCL
>> # Deprived 2.188727 0.7205142 NA 0.7765454 3.600909
>> # Satiated 1.928499 0.7498151 NA 0.4588887 3.398110
>> #SE is much higher than for fixed effects in model
>>
>> plot(owls.lsm)
>> #95% confidence bands overlap almost entirely
>>
>> #Confirm with predict.glmmadmb:
>> New.data<-expand.grid(FoodTreatment= levels(Owls$FoodTreatment),
>> ArrivalTime = mean(Owls$ArrivalTime))
>>
>> New.data$NCalls <- predict(m.nb, New.data, re.form=NA, SE.fit = TRUE)
>>
>> #Get standard errors:
>> calls.pred<- predict(m.nb, New.data, re.form = NA, se.fit = TRUE)
>> calls.pred<-data.frame(calls.pred)
>>
>> New.data$SE<-calls.pred$se.fit
>> New.data
>> # FoodTreatment ArrivalTime NCalls SE
>> # 1 Deprived 24.75763 2.188727 0.7205142
>> # 2 Satiated 24.75763 1.928499 0.7498151
>> #Matches with lsmeans output
>>
>>
>>
>> ################## Compare to glmmTDMB ####################
>> #install.packages("glmmTMB")
>> library(glmmTMB)
>> m.nb2<- glmmTMB(NCalls~FoodTreatment+ArrivalTime+
>> +(1|Nest),
>> data=Owls,
>> family="nbinom2")
>> summary(m.nb2)
>> # Estimate Std. Error z value Pr(>|z|)
>> # (Intercept) 4.91011 0.63343 7.752 9.07e-15 ***
>> # FoodTreatmentSatiated -0.69238 0.10692 -6.476 9.44e-11 ***
>> # ArrivalTime -0.11540 0.02526 -4.569 4.90e-06 ***
>>
>> #Compare to glmmADMB model:Fixed effects are identical
>> summary(m.nb)
>> # Estimate Std. Error z value Pr(>|z|)
>> # (Intercept) 4.9101 0.6334 7.75 9.1e-15 ***
>> # FoodTreatmentSatiated -0.6924 0.1069 -6.48 9.4e-11 ***
>> # ArrivalTime -0.1154 0.0253 -4.57 4.9e-06 ***
>>
>> #Plot lsmeans by FoodTreatment
>> owls.lsm<-lsmeans(m.nb2, ~FoodTreatment) #oops, lsmeans can't use glmmTMB
>> object!
>>
>> ######## Interlude #######
>> #Ben Bolker wrote a function to talk to lsmeans-- incredible!
>> # https://github.com/glmmTMB/glmmTMB/issues/205
>> recover.data.glmmTMB <- function(object, ...) {
>> fcall <- getCall(object)
>> recover.data(fcall,delete.response(terms(object)),
>> attr(model.frame(object),"na.action"), ...) }
>> lsm.basis.glmmTMB <- function (object, trms, xlev, grid, vcov.,
>> mode = "asymptotic", component="cond", ...)
>> {
>> if (mode != "asymptotic") stop("only asymptotic mode is available")
>> if (component != "cond") stop("only tested for conditional component")
>> if (missing(vcov.))
>> V <- as.matrix(vcov(object)[[component]])
>> else V <- as.matrix(.my.vcov(object, vcov.))
>> dfargs = misc = list()
>> if (mode == "asymptotic") {
>> dffun = function(k, dfargs) NA
>> }
>> ## use this? misc = .std.link.labels(family(object), misc)
>> contrasts = attr(model.matrix(object), "contrasts")
>> m = model.frame(trms, grid, na.action = na.pass, xlev = xlev)
>> X = model.matrix(trms, m, contrasts.arg = contrasts)
>> bhat = fixef(object)[[component]]
>> if (length(bhat) < ncol(X)) {
>> kept = match(names(bhat), dimnames(X)[[2]])
>> bhat = NA * X[1, ]
>> bhat[kept] = fixef(object)[[component]]
>> modmat = model.matrix(trms, model.frame(object), contrasts.arg =
>> contrasts)
>> nbasis = estimability::nonest.basis(modmat)
>> }
>> else nbasis = estimability::all.estble
>> list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun,
>> dfargs = dfargs, misc = misc)
>> }
>>
>> ##### End interlude ###
>>
>> lsm.TMB<- lsmeans(m.nb2, ~FoodTreatment)
>> plot(lsm.TMB) #non-overlapping CI's
>>
>> #Compare SE's
>> owls.lsm
>> # FoodTreatment lsmean * SE* df asymp.LCL asymp.UCL
>> # Deprived 2.053073 *0.8952071* NA 0.2984988 3.807646
>> # Satiated 1.360690 *0.9037320 *NA -0.4105918 3.131973
>>
>> lsm.TMB
>> # FoodTreatment lsmean *SE* df asymp.LCL asymp.UCL
>> # Deprived 2.053065 *0.1068562* NA 1.843631 2.262500
>> # Satiated 1.360683 *0.1161322* NA 1.133068 1.588298
>>
>> #lsmeans are identical but SE's differ by factor of 8?!
>>
>>
>> Thank you again.
>> Evan
>>
>
>
>
> --
> Evan Palmer-Young
> PhD candidate
> Department of Biology
> 221 Morrill Science Center
> 611 North Pleasant St
> Amherst MA 01003
> https://scholar.google.com/citations?user=VGvOypoAAAAJ&hl=en
> https://sites.google.com/a/cornell.edu/evan-palmer-young/
> epalmery at cns.umass.edu
> ecp52 at cornell.edu
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
More information about the R-sig-mixed-models
mailing list