[R-sig-ME] population-level predict glmmtmb with poly()

Ben Bolker bbolker @ending from gm@il@com
Wed Oct 24 16:51:02 CEST 2018



On 2018-10-24 10:24 a.m., John Wilson wrote:
> So far it's looking just grand! +1 for "it  wasn't documented until
> about 120 seconds ago".
> Just to push my luck here - in my full model, I have autocorrelation
> structures. When I only set group = NA, the predict() function gave me
> an error due to lack of the autocorrelation variable in the new data. I
> then crossed my fingers and did the same thing as for group - a column
> with NAs. The predict() function had no problem with that. However, I
> wanted to see if that's a (semi) legit way to doing it.

  Seems reasonable (but I can't give any guarantees without spending a
lot more time looking); I would try some simple tests and proceed if the
answers look OK.

> 
> On Tue, Oct 23, 2018 at 8:22 PM Ben Bolker <bbolker using gmail.com
> <mailto:bbolker using gmail.com>> wrote:
> 
> 
>       Naive prediction with data-dependent bases **will not work** with new
>     input variables. (This is a general R/model.matrix() thing ... not
>     glmmTMB's fault.)  The current development version of glmmTMB does some
>     magic (see ?makepredictcall,
>     https://developer.r-project.org/model-fitting-functions.html for more
>     detail) that makes this work.
> 
>       It wasn't documented until about 120 seconds ago, but in order to do
>     population-level predictions with predict() all you need to do is set
>     the group variable to NA.  This has less flexibility than the re.form=
>     argument in lme4 (which allows you to drop a *subset* of the random
>     effects terms for a given grouping variables), but it does handle the
>     most common use case ...
> 
>       Does this work for you (with devel version of glmmTMB) ?
> 
>     # prediction on a new grid
>     newdata <- expand.grid(x = 1:20, z = unique(df$z), group=NA)
>     newdata$y <- predict(m,newdata=newdata, type="response")
> 
>     (ggplot(df, aes(x,y, colour=z))
>         + geom_point()
>         + geom_line(data = newdata, size =2)
>     )
> 
> 
>     On 2018-10-23 6:48 p.m., John Wilson wrote:
>     > Thank you, Dimitris. I poked around a bit but didn't see mention of
>     > autocorrelation structures - are those supported as well?
>     >
>     > I was wondering if this issue was related to the problem logged here:
>     > https://github.com/glmmTMB/glmmTMB/issues/402. Can anyone comment on
>     > whether what I'm seeing is a bug or just a user (=me) mistake? When I
>     > change the toy example to be a simple linear form (rather than
>     poly()), the
>     > two predictions are identical, so I'm pretty sure it's a poly()
>     issue...
>     >
>     > On Tue, Oct 23, 2018 at 4:31 PM D. Rizopoulos
>     <d.rizopoulos using erasmusmc.nl <mailto:d.rizopoulos using erasmusmc.nl>>
>     > wrote:
>     >
>     >> You could fit the same model also using the GLMMadaptive package in
>     >> which the predict() method can produce predictions for the mean
>     subject
>     >> (i.e., the one with random effects values equal to 0), marginal
>     >> predictions (i.e., averaged over the subjects), and subject-specific
>     >> predictions.
>     >>
>     >> For more info you can have a look at:
>     >>
>     >> https://drizopoulos.github.io/GLMMadaptive/
>     >>
>     >>
>     >>
>     https://drizopoulos.github.io/GLMMadaptive/articles/Methods_MixMod.html#predictions
>     >>
>     >> and potentially also
>     >>
>     >>
>     >>
>     https://drizopoulos.github.io/GLMMadaptive/articles/Dynamic_Predictions.html
>     >>
>     >> if you're interested in dynamic predictions.
>     >>
>     >> Best,
>     >> Dimitris
>     >>
>     >>
>     >>
>     >> On 10/23/2018 7:46 PM, John Wilson wrote:
>     >>> Hello,
>     >>>
>     >>> I'm working on a glmmtmb() model with multiple continuous and
>     categorical
>     >>> predictors. Two of the predictors are orthogonal polynomials (I
>     just saw
>     >>> that the package was updated yesterday (!) to correctly handle
>     those).
>     >> One
>     >>> of the polynomials has an interaction with another covariate.
>     >>>
>     >>> Since predict(re.form = 0) doesn't work just yet and one has to use
>     >>> the model.matrix(lme4::nobars(formula(mod1)[-2]), newdata)
>     approach - how
>     >>> do I get the correct polynomial predictions out? It looks like
>     my results
>     >>> depend on how I structure the newdata data frame - when I use
>     >>> expand.grid(), the predictions are wrong, but when I subset the
>     original
>     >>> data, the predictions are correct.
>     >>>
>     >>> Thanks so much!
>     >>> John
>     >>>
>     >>> Here's an example:
>     >>>
>     >>> library(ggplot2)
>     >>> library(glmmTMB)
>     >>>
>     >>> set.seed(0)
>     >>> x <- 1:20
>     >>> z <- sample(c("a", "b"), length(x), replace = TRUE)
>     >>> y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x),
>     10, 0.03))
>     >>> group <- sample(c("i", "ii"), length(x), replace = TRUE)
>     >>> df <- data.frame(x = x, y = y, z = z, group = group)
>     >>> m <- glmmTMB(y ~ poly(x, 3) * z +
>     >>> (1 | group),
>     >>> family = nbinom2,
>     >>> data = df)
>     >>> # prediction on a new grid
>     >>> newdata <- expand.grid(x = 1:20, z = unique(df$z))
>     >>> X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
>     >>> beta.cond = fixef(m)$cond
>     >>> newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
>     >>> # prediction in original data frame
>     >>> X.cond = model.matrix(lme4::nobars(formula(m)[-2]))
>     >>> beta.cond = fixef(m)$cond
>     >>> df$Pred1 = as.numeric(X.cond %*% beta.cond)
>     >>>
>     >>> # the newdata preds are obviously off
>     >>> ggplot(df) +
>     >>> geom_point(aes(x = x, y = y, colour = z)) +
>     >>> geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour =
>     z), size =
>     >> 2)
>     >>> +
>     >>> geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
>     >>>
>     >>> # if the new grid is defined like this, then the predictions are ok
>     >>> newdata <- unique(select(df, x, z))
>     >>> X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
>     >>> beta.cond = fixef(m)$cond
>     >>> newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
>     >>> ggplot(df) +
>     >>> geom_point(aes(x = x, y = y, colour = z)) +
>     >>> geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour =
>     z), size =
>     >> 2)
>     >>> +
>     >>> geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
>     >>>
>     >>>       [[alternative HTML version deleted]]
>     >>>
>     >>> _______________________________________________
>     >>> R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org> mailing list
>     >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     >>>
>     >>
>     >> --
>     >> Dimitris Rizopoulos
>     >> Professor of Biostatistics
>     >> Department of Biostatistics
>     >> Erasmus University Medical Center
>     >>
>     >> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
>     >> Tel: +31/(0)10/7043478
>     >> Fax: +31/(0)10/7043014
>     >> Web (personal): http://www.drizopoulos.com/
>     >> Web (work): http://www.erasmusmc.nl/biostatistiek/
>     >> Blog: http://iprogn.blogspot.nl/
>     >
>     >       [[alternative HTML version deleted]]
>     >
>     > _______________________________________________
>     > R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org> mailing list
>     > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     >
> 
>     _______________________________________________
>     R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org> mailing list
>     https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>



More information about the R-sig-mixed-models mailing list