[R-sig-ME] Log likelihood of a glmer() binomial model .

Tue Feb 2 08:31:08 CET 2021

Dear Juho,

The log-likelihood function in GLMMs does not have a closed-form. For example, see slides 345-347 in my notes. I.e., when we combine a Binomial probability mass function (pmf) for the response given the random effects, with the normal probability density function for the random effects, we do not get back a Binomial pmf.

Hence, AFAIK you cannot use dbinom() to calculate the marginal deviance in a mixed-effects logistic regression.

You could calculate the “deviance” from the first term in slide 345, but this doesn’t correspond to a log-likelihood, to my view (i.e., the log-likelihood is typically calculated based on observed data alone).

Best,
Dimitris

From: Juho Kristian Ruohonen <juho.kristian.ruohonen using gmail.com>
Sent: Tuesday, February 2, 2021 8:13 AM
To: D. Rizopoulos <d.rizopoulos using erasmusmc.nl>
Cc: Ben Bolker <bbolker using gmail.com>; R-mixed models mailing list <r-sig-mixed-models using r-project.org>
Subject: Re: [R-sig-ME] Log likelihood of a glmer() binomial model .

Dear Dimitris/List,

Slide 360 in Dimitris' teaching materials<https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.drizopoulos.com%2Fcourses%2FEMC%2FCE08.pdf&data=04%7C01%7Cd.rizopoulos%40erasmusmc.nl%7Cd2d097e45e61476cd4fd08d8c749f2f8%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637478467710678229%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=8SrRl9g%2FkefFyyERBIkVL%2F1T3Li7oyCqgRbKdonrarM%3D&reserved=0> appears to detail how to compute marginal LL, yet my attempts to manually reproduce the marginal LL/deviance reported by any GLMM software continue to fail.

The code below uses Dimitris' dataset with Ben's software. The software choice is because with Ben's software I can at least replicate the conditional deviance even though I cannot replicate the marginal deviance, as shown below:

data("pbc2", package = "JM")

pbc2$serCholD <- as.numeric(pbc2$serChol > 210)

lme4mod <- glmer(serCholD ~ (1|id) + year*drug + I(age-50)*sex, family = binomial, data = pbc2, nAGQ = 15)

deviance(lme4mod)

[1] 351.5973

-2*sum(dbinom(lme4mod using resp$y, size = 1, prob = predict(lme4mod, re.form = NULL, type = "response"), log = T))

[1] 351.5973

But I cannot replicate the marginal deviance, which is evidently not the same as the "mean-subject" deviance based on only the fixed effects:

-2*logLik(lme4mod)

'log Lik.' 702.6521 (df=8)

-2*sum(dbinom(lme4mod using resp$y, size = 1, prob = predict(lme4mod, re.form = NA, type = "response"), log = T))

[1] 1233.415

So, let's try calculating the marginal deviance from marginal probabilities created using the procedure detailed by Dimitris' Slide 360:

marginalFitted <- plogis( # On the probability scale,

  sapply(rownames(lme4mod using frame), function(i) # for each row of the dataset,

    # the "marginal" prediction is the average of 10,000 "mean-subject" predictions of which each has been augmented by a quantity randomly generated from the estimated N(0, lme4mod using theta) distribution:

    mean(

      predict(lme4mod, newdata = pbc2[i,], re.form = NA) + rnorm(10000, sd = lme4mod using theta)

      )))

marginalDeviance <- -2*sum(dbinom(lme4mod using resp$y, size = 1, prob = marginalFitted, log = T))

But the resulting quantity spectacularly fails to match the marginal likelihood reported by Ben's software:

-2*logLik(lme4mod)

'log Lik.' 702.6521 (df=8)

marginalDeviance

[1] 1232.663

Hence, I remain at my wits' end.

Best,

J

su 31. tammik. 2021 klo 11.57 D. Rizopoulos (d.rizopoulos using erasmusmc.nl<mailto:d.rizopoulos using erasmusmc.nl>) kirjoitti:
If I may add to this, GLMMadaptive does produce marginal predictions, i.e., predictions based on the marginal coefficients; for example, see below. These predictions will have the same interpretation as the one you would get from a GEE approach. You may find more information on this also in my slide (Chapter 5): http://www.drizopoulos.com/courses/EMC/CE08.pdf<https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.drizopoulos.com%2Fcourses%2FEMC%2FCE08.pdf&data=04%7C01%7Cd.rizopoulos%40erasmusmc.nl%7Cd2d097e45e61476cd4fd08d8c749f2f8%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637478467710678229%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=8SrRl9g%2FkefFyyERBIkVL%2F1T3Li7oyCqgRbKdonrarM%3D&reserved=0>

Also, a small comment on Ben's code: GLMMadaptive does not use a Monte Carlo simulation for calculating the marginal coefficients - this is in closed-form. The Monte Carlo simulation is used for calculating the standard errors of the marginalized coefficients.

Best,
Dimitris

library("GLMMadaptive")
library("lattice")
data("pbc2", package = "JM")

pbc2$serCholD <- as.numeric(pbc2$serChol > 210)

fm_s52_pbc <- mixed_model(fixed = serCholD ~ year * drug + I(age - 50) * sex,
                          random = ~ 1 | id,
                          family = binomial(), data = pbc2, nAGQ = 15)

summary(fm_s52_pbc)

# the data frame that contains the combination of values to
# create the plot
newDF <- with(pbc2, expand.grid(
    year = seq(0, 12, length.out = 15),
    age = 49,
    drug = levels(drug),
    sex = levels(sex)
))

# log odds for average/median subject
# note: you need to use the effectPlotData() function
# from the GLMMadaptive package (i.e., in case of problems use
# 'GLMMadaptive::effectPlotData')
xyplot(pred + low + upp ~ year | sex * drug,
       data = effectPlotData(fm_s52_pbc, newDF),
       type = "l", lty = c(1, 2, 2), col = c(2, 1, 1), lwd = 2,
       xlab = "Follow-up time (years)", ylab = "Conditional Log Odds")

# marginal probabilities and conditional probabilities corresponding to
# the average/median individual (i.e., the one with random effects value equal to zero)
plot_data_marg <- effectPlotData(fm_s52_pbc, newDF, marginal = TRUE)
plot_data_marg$pred0 <- effectPlotData(fm_s52_pbc, newDF)$pred

key <- simpleKey(c("marginal probabilities", "probabilities average patient"),
                 points = FALSE, lines = TRUE)
key$lines$col <- c("red", "blue")
key$lines$lwd <- c(2, 2)
key$lines$lty <- c(1, 1)
expit <- function (x) exp(x) / (1 + exp(x))
xyplot(expit(pred) + expit(pred0) + expit(low) + expit(upp) ~ year | sex * drug,
       data = plot_data_marg, key = key,
       type = "l", lty = c(1, 1, 2, 2), lwd = 2,
       col = c("red", "blue", "black", "black"),
       xlab = "Follow-up time (years)", ylab = "Probabilities")

-----Original Message-----
From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org<mailto:r-sig-mixed-models-bounces using r-project.org>> On Behalf Of Juho Kristian Ruohonen
Sent: Sunday, January 31, 2021 9:11 AM
To: Ben Bolker <bbolker using gmail.com<mailto:bbolker using gmail.com>>
Cc: R-mixed models mailing list <r-sig-mixed-models using r-project.org<mailto:r-sig-mixed-models using r-project.org>>
Subject: Re: [R-sig-ME] Log likelihood of a glmer() binomial model .

Dear list/Ben,

GLMMadaptive doesn't do marginal _predictions_ but it does do
> marginal _coefficients_.
>

Thanks for clearing that up!

> If you average across simulations with newdata= set that should give
> you marginal predictions ...
>

With *newdata *set to what? I assume I'd have to use the original dataset while somehow emulating the estimated RE distribution. However, I can't seem to make anything work. A simple example with a simple dataset:

>
> *(cluster <- rep(LETTERS[1:8], each = 20))(x <- rep(rep(c(0,1), each =
> 10), times = 8))(y <- as.vector(sapply(2:9, function(x) c(rep(c(1,0),
> c(x,
> 10-x)) ,rep(c(1,0), c(x-1, 10-(x-1)))) )))* *glmm <- glmer(y ~
> (1|cluster) + x, family = binomial, nAGQ = 10)*
> *deviance(glmm) *# "Conditional" deviance obtained by plugging in the
> BLUPs.
>
> [1] 185.4821
>
> *-2*logLik(glmm) *# "Marginal" deviance from averaging the fitted
> values w.r.t the estimated N(0, σ) distribution of the REs.
>
> 'log Lik.' 203.891 (df=3)
>
>
What I'm trying to learn is how to calculate or, failing that, simulate this "marginal" deviance. My first impulse is to simply hand-calculate a deviance from fitted values with all RE's set to their mean of 0. Given that the assumed RE distribution is normal and hence symmetric, my intuition is that the result should equal the result from integrating over the estimated RE distribution. But that fails to happen:

*-2*sum(dbinom(y, size = 1, prob = predict(glmm, re.form = NA,   type =
> "response"), log = T))*
>
> [1] 220.2704
>
>
So my next avenue is to do as Ben seems to suggest, i.e. use the means of simulated "marginal" responses from the model as fitted values. Indeed,
*simulate.merMod()* with *use.u = FALSE* should be exactly what we want, given that its documentation says the following:

use.u
>
> (logical) if TRUE, generate a simulation conditional on the current
> random-effects estimates; if FALSE generate new Normally distributed
> random-effects values.
>

But alas:

*-2*sum(dbinom(y, 1, prob = rowMeans(simulate(glmm, nsim = 1e4, seed =
> 2021, use.u = FALSE)), log = T))*
>
> [1] 220.3949
>
>
The resulting deviance is off no matter what random seed I set. As a sidenote, it seems to equal the deviance obtained by setting all random effects to 0, just as my unreliable intuition suggested. But it's the wrong result nonetheless.

Hence I'm at my wits' end. How does one calculate or simulate the marginal deviance of a binary GLMM?

Best,

J

la 30. tammik. 2021 klo 22.55 Ben Bolker (bbolker using gmail.com<mailto:bbolker using gmail.com>) kirjoitti:

>    "marginal" is unfortunately a word that has different meanings in
> different contexts, I'm still trying to sort it out in my own mind.
>
>    GLMMadaptive doesn't do marginal _predictions_ but it does do
> marginal _coefficients_.
>
>    I put up some example workings at
>
>
> https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fbbolk
> er.github.io<https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fer.github.io%2F&data=04%7C01%7Cd.rizopoulos%40erasmusmc.nl%7Cd2d097e45e61476cd4fd08d8c749f2f8%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637478467710678229%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=B7ccPD%2FUqXJ7vHrl3QLedWIV9f1U%2FPial2nEIDUaLEM%3D&reserved=0>%2Fmixedmodels-misc%2Fnotes%2Fmarginal_ex.html%3Fde49a7770
> f7&data=04%7C01%7Cd.rizopoulos%40erasmusmc.nl<https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2F40erasmusmc.nl%2F&data=04%7C01%7Cd.rizopoulos%40erasmusmc.nl%7Cd2d097e45e61476cd4fd08d8c749f2f8%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637478467710688175%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=EsCKF4EqxsgPfABTJd9LMrHeDw8bTBl4pGum9ordGMM%3D&reserved=0>%7C7d927fd8921241ae84
> 7108d8c5bfca57%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C6374767748
> 20847825%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiL
> CJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=yBBei0xChbPMg0bwvw33t9K
> zYN0n0EISVRexB%2BGARtg%3D&reserved=0
>
>   (the hash at the end may be unnecessary for you).
>
>   If you average across simulations with newdata= set that should give
> you marginal predictions ...
>
> On 1/30/21 9:23 AM, Juho Kristian Ruohonen wrote:
> > Sorry to revive the thread after 21 months, but the topic has just
> > become highly relevant. My question concerns the following remark by Ben:
> >
> >     As far as what predict() does: depending on the value of re.form, it
> >     either gives a population-level prediction (R=0) or a conditional
> >     prediction (R set to its conditional mode).  You might be looking for
> >     *marginal* predictions (which Rizopoulous's GLMMadaptive package can
> >     do ...)
> >
> >
> > Does "marginal predictions" here mean predictions based exclusively
> > on the "population-averaged" fixed-effect estimates that are
> > equivalent to onescalculated by *gee()* from the /gee/ library and
> > *geeglm()* from the /geepack /library? Or does it mean predictions
> > calculated using the GLMM estimates of the fixed effects and then
> > averaged with respect to the estimated random-effects distribution?
> > If it means the former, aren't those predictions easily obtainable
> > through a standard GLM that ignores the clustering? If it means the
> > latter, could someone please point out how exactly to calculate such marginal predictions using GLMMadaptive?
> > I've been trying to figure out how to manually calculate the
> > marginal (latter sense) LL for binomial GLMMs, with no success so far...
> >
> > Best,
> >
> > J
> >
> > la 20. huhtik. 2019 klo 17.51 Ben Bolker (bbolker using gmail.com<mailto:bbolker using gmail.com>
> > <mailto:bbolker using gmail.com<mailto:bbolker using gmail.com>>) kirjoitti:
> >
> >
> >        I agree with Juho that there's a typo -- would be something like.
> >
> >       sum((y==1)*log(pred_prob) + (y==0)*log(1-pred_prob))
> >
> >        As far as what predict() does: depending on the value of
> > re.form,
> it
> >     either gives a population-level prediction (R=0) or a conditional
> >     prediction (R set to its conditional mode).  You might be looking for
> >     *marginal* predictions (which Rizopoulous's GLMMadaptive package can
> >     do ...)
> >
> >
> >     On 2019-04-20 3:42 a.m., Juho Kristian Ruohonen wrote:
> >      > Rolf: Forgive my ignorance, but isn't the relevant log-likelihood
> >     here
> >      > the log-likelihood of the observed responses in the validation
> >     set given
> >      > the model-predicted probabilities? I.e. sum(dbinom(VS$y, size
> > =
> VS$n,
> >      > prob = predict(fit, newdata = VS, type = "response"), log =
> >     TRUE))? Even
> >      > this would be somewhat off because dbinom() isn't aware of the
> >      > random-effects integral business. But it looks to me like your
> >     current
> >      > call is calculating the log-sum of the predicted probabilities of
> >     y = 1
> >      > in the validation set, not the loglikelihood of the observed
> >     responses
> >      > in the validation set.
> >      >
> >      > la 20. huhtik. 2019 klo 8.51 Rolf Turner (r.turner using auckland.ac.nz<mailto:r.turner using auckland.ac.nz>
> >     <mailto:r.turner using auckland.ac.nz<mailto:r.turner using auckland.ac.nz>>
> >      > <mailto:r.turner using auckland.ac.nz<mailto:r.turner using auckland.ac.nz>
> >     <mailto:r.turner using auckland.ac.nz<mailto:r.turner using auckland.ac.nz>>>) kirjoitti:
> >      >
> >      >
> >      >     On 20/04/19 12:44 PM, Ben Bolker wrote:
> >      >
> >      >     >    This seems wrong.
> >      >
> >      >     Yeah, that figures.
> >      >
> >      >     > The GLMM log-likelihood includes an integral over
> >      >     > the distribution of the random effects.
> >      >
> >      >     I was aware of this.  I guess what I was naïvely expecting
> >     was that
> >      >     predict.merMod() would handle this.  I.e. that this predict
> >     method
> >      >     (with type = "response") would return, for each observed y_i
> >     in the
> >      >     (new) data set
> >      >
> >      >        Pr(Y = y_i) = \int_0 Pr(Y = y_i | R = r) f(r) dr
> >      >
> >      >     where R is the vector of random effects and f(r) is its
> >     probability
> >      >     density function (multivariate normal, with mean 0 and some
> >     covariance
> >      >     matrix, which has been estimated in the fitting process.
> >      >
> >      >     I guess that this is *not* what predict.merMod() returns ---
> >     but I
> >      >     don't
> >      >     understand why not.  It seems to me that this is what it
> "should"
> >      >     return.
> >      >
> >      >     I'm probably misunderstanding something, possibly simple,
> >     possibly
> >      >     subtle.
> >      >
> >      >     Apropos of nothing much, what *does* predict.merMod() return?
> >      >     Maybe Pr(Y = y_i | R = 0) ???
> >      >
> >      >     <SNIP>
> >      >     >   Here is an **inefficient** method for computing the
> >     likelihood
> >      >     >
> >      >     >     coefs <- unlist(getME(fit,c("theta","beta"))
> >      >     >     newdev <- update(fit, data=VS, devFunOnly=TRUE)
> >      >     >     newdev(coefs)
> >      >     >
> >      >     > This is slow because it has to reconstruct all of the
> >     random-effects
> >      >     > matrices, do permutations to reorder the relevant matrices
> >     to be as
> >      >     > sparse as possible, etc. etc.
> >      >
> >      >     Thanks for this.  I'll give it a go.  I think that the
> >     slowness may not
> >      >     be an overwhelming drawback.  Anyhow I shall try to test it
> out.
> >      >
> >      >     Thanks again.
> >      >
> >      >     cheers,
> >      >
> >      >     Rolf
> >      >
> >      >     --
> >      >     Honorary Research Fellow
> >      >     Department of Statistics
> >      >     University of Auckland
> >      >     Phone: +64-9-373-7599 ext. 88276
> >      >
> >      >     _______________________________________________
> >      > R-sig-mixed-models using r-project.org<mailto:R-sig-mixed-models using r-project.org>
> >     <mailto:R-sig-mixed-models using r-project.org<mailto:R-sig-mixed-models using r-project.org>>
> >      >     <mailto:R-sig-mixed-models using r-project.org<mailto:R-sig-mixed-models using r-project.org>
> >     <mailto:R-sig-mixed-models using r-project.org<mailto:R-sig-mixed-models using r-project.org>>> mailing list
> >      > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=04%7C01%7Cd.rizopoulos%40erasmusmc.nl%7C7d927fd8921241ae847108d8c5bfca57%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637476774820847825%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=BXuC6CVsrp3tlibaDkAnYC3BxFz%2Bfv3W85OMQn6Gjng%3D&reserved=0<https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=04%7C01%7Cd.rizopoulos%40erasmusmc.nl%7Cd2d097e45e61476cd4fd08d8c749f2f8%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637478467710688175%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=8Hg%2FlYRcCggPDya9X%2BGI0r0reWmQKpjdGj71eflNK34%3D&reserved=0>
> >     <https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=04%7C01%7Cd.rizopoulos%40erasmusmc.nl%7C7d927fd8921241ae847108d8c5bfca57%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637476774820847825%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=BXuC6CVsrp3tlibaDkAnYC3BxFz%2Bfv3W85OMQn6Gjng%3D&reserved=0<https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=04%7C01%7Cd.rizopoulos%40erasmusmc.nl%7Cd2d097e45e61476cd4fd08d8c749f2f8%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637478467710698129%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=jfLIQQC79l0lr3qJTpGgX2Ucu8r%2FhWuDHKxlJZseAnU%3D&reserved=0>>
> >      >
> >
>

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models using r-project.org<mailto:R-sig-mixed-models using r-project.org> mailing list
https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=04%7C01%7Cd.rizopoulos%40erasmusmc.nl%7C7d927fd8921241ae847108d8c5bfca57%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637476774820847825%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=BXuC6CVsrp3tlibaDkAnYC3BxFz%2Bfv3W85OMQn6Gjng%3D&reserved=0<https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=04%7C01%7Cd.rizopoulos%40erasmusmc.nl%7Cd2d097e45e61476cd4fd08d8c749f2f8%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637478467710698129%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=jfLIQQC79l0lr3qJTpGgX2Ucu8r%2FhWuDHKxlJZseAnU%3D&reserved=0>

	[[alternative HTML version deleted]]