# [R-sig-ME] Log likelihood of a glmer() binomial model .

Juho Kristian Ruohonen juho@kr|@t|@n@ruohonen @end|ng |rom gm@||@com
Sun Jan 31 09:10:37 CET 2021

Dear list/Ben,

GLMMadaptive doesn't do marginal _predictions_ but it does do
> marginal _coefficients_.
>

Thanks for clearing that up!

> If you average across simulations with newdata= set that should give
> you marginal predictions ...
>

With *newdata *set to what? I assume I'd have to use the original dataset
while somehow emulating the estimated RE distribution. However, I can't
seem to make anything work. A simple example with a simple dataset:

>
> *(cluster <- rep(LETTERS[1:8], each = 20))(x <- rep(rep(c(0,1), each =
> 10), times = 8))(y <- as.vector(sapply(2:9, function(x) c(rep(c(1,0), c(x,
> 10-x)) ,rep(c(1,0), c(x-1, 10-(x-1)))) )))*
> *glmm <- glmer(y ~ (1|cluster) + x, family = binomial, nAGQ = 10)*
> *deviance(glmm) *# "Conditional" deviance obtained by plugging in the
> BLUPs.
>
> [1] 185.4821
>
> *-2*logLik(glmm) *# "Marginal" deviance from averaging the fitted values
> w.r.t the estimated N(0, σ) distribution of the REs.
>
> 'log Lik.' 203.891 (df=3)
>
>
What I'm trying to learn is how to calculate or, failing that, simulate
this "marginal" deviance. My first impulse is to simply hand-calculate a
deviance from fitted values with all RE's set to their mean of 0. Given
that the assumed RE distribution is normal and hence symmetric, my
intuition is that the result should equal the result from integrating over
the estimated RE distribution. But that fails to happen:

*-2*sum(dbinom(y, size = 1, prob = predict(glmm, re.form = NA,   type =
> "response"), log = T))*
>
> [1] 220.2704
>
>
So my next avenue is to do as Ben seems to suggest, i.e. use the means of
simulated "marginal" responses from the model as fitted values. Indeed,
*simulate.merMod()* with *use.u = FALSE* should be exactly what we want,
given that its documentation says the following:

use.u
>
> (logical) if TRUE, generate a simulation conditional on the current
> random-effects estimates; if FALSE generate new Normally distributed
> random-effects values.
>

But alas:

*-2*sum(dbinom(y, 1, prob = rowMeans(simulate(glmm, nsim = 1e4, seed =
> 2021, use.u = FALSE)), log = T))*
>
> [1] 220.3949
>
>
The resulting deviance is off no matter what random seed I set. As a
sidenote, it seems to equal the deviance obtained by setting all random
effects to 0, just as my unreliable intuition suggested. But it's the wrong
result nonetheless.

Hence I'm at my wits' end. How does one calculate or simulate the marginal
deviance of a binary GLMM?

Best,

J

la 30. tammik. 2021 klo 22.55 Ben Bolker (bbolker using gmail.com) kirjoitti:

>    "marginal" is unfortunately a word that has different meanings in
> different contexts, I'm still trying to sort it out in my own mind.
>
>    GLMMadaptive doesn't do marginal _predictions_ but it does do
> marginal _coefficients_.
>
>    I put up some example workings at
>
>
> http://bbolker.github.io/mixedmodels-misc/notes/marginal_ex.html?de49a7770f7
>
>   (the hash at the end may be unnecessary for you).
>
>   If you average across simulations with newdata= set that should give
> you marginal predictions ...
>
> On 1/30/21 9:23 AM, Juho Kristian Ruohonen wrote:
> > Sorry to revive the thread after 21 months, but the topic has just
> > become highly relevant. My question concerns the following remark by Ben:
> >
> >     As far as what predict() does: depending on the value of re.form, it
> >     either gives a population-level prediction (R=0) or a conditional
> >     prediction (R set to its conditional mode).  You might be looking for
> >     *marginal* predictions (which Rizopoulous's GLMMadaptive package can
> >     do ...)
> >
> >
> > Does "marginal predictions" here mean predictions based exclusively on
> > the "population-averaged" fixed-effect estimates that are equivalent to
> > onescalculated by *gee()* from the /gee/ library and *geeglm()* from the
> > /geepack /library? Or does it mean predictions calculated using the GLMM
> > estimates of the fixed effects and then averaged with respect to the
> > estimated random-effects distribution? If it means the former, aren't
> > those predictions easily obtainable through a standard GLM that ignores
> > the clustering? If it means the latter, could someone please point out
> > how exactly to calculate such marginal predictions using GLMMadaptive?
> > I've been trying to figure out how to manually calculate the marginal
> > (latter sense) LL for binomial GLMMs, with no success so far...
> >
> > Best,
> >
> > J
> >
> > la 20. huhtik. 2019 klo 17.51 Ben Bolker (bbolker using gmail.com
> > <mailto:bbolker using gmail.com>) kirjoitti:
> >
> >
> >        I agree with Juho that there's a typo -- would be something like.
> >
> >       sum((y==1)*log(pred_prob) + (y==0)*log(1-pred_prob))
> >
> >        As far as what predict() does: depending on the value of re.form,
> it
> >     either gives a population-level prediction (R=0) or a conditional
> >     prediction (R set to its conditional mode).  You might be looking for
> >     *marginal* predictions (which Rizopoulous's GLMMadaptive package can
> >     do ...)
> >
> >
> >     On 2019-04-20 3:42 a.m., Juho Kristian Ruohonen wrote:
> >      > Rolf: Forgive my ignorance, but isn't the relevant log-likelihood
> >     here
> >      > the log-likelihood of the observed responses in the validation
> >     set given
> >      > the model-predicted probabilities? I.e. sum(dbinom(VS$y, size = > VS$n,
> >      > prob = predict(fit, newdata = VS, type = "response"), log =
> >     TRUE))? Even
> >      > this would be somewhat off because dbinom() isn't aware of the
> >      > random-effects integral business. But it looks to me like your
> >     current
> >      > call is calculating the log-sum of the predicted probabilities of
> >     y = 1
> >      > in the validation set, not the loglikelihood of the observed
> >     responses
> >      > in the validation set.
> >      >
> >      > la 20. huhtik. 2019 klo 8.51 Rolf Turner (r.turner using auckland.ac.nz
> >     <mailto:r.turner using auckland.ac.nz>
> >      > <mailto:r.turner using auckland.ac.nz
> >     <mailto:r.turner using auckland.ac.nz>>) kirjoitti:
> >      >
> >      >
> >      >     On 20/04/19 12:44 PM, Ben Bolker wrote:
> >      >
> >      >     >    This seems wrong.
> >      >
> >      >     Yeah, that figures.
> >      >
> >      >     > The GLMM log-likelihood includes an integral over
> >      >     > the distribution of the random effects.
> >      >
> >      >     I was aware of this.  I guess what I was naïvely expecting
> >     was that
> >      >     predict.merMod() would handle this.  I.e. that this predict
> >     method
> >      >     (with type = "response") would return, for each observed y_i
> >     in the
> >      >     (new) data set
> >      >
> >      >        Pr(Y = y_i) = \int_0 Pr(Y = y_i | R = r) f(r) dr
> >      >
> >      >     where R is the vector of random effects and f(r) is its
> >     probability
> >      >     density function (multivariate normal, with mean 0 and some
> >     covariance
> >      >     matrix, which has been estimated in the fitting process.
> >      >
> >      >     I guess that this is *not* what predict.merMod() returns ---
> >     but I
> >      >     don't
> >      >     understand why not.  It seems to me that this is what it
> "should"
> >      >     return.
> >      >
> >      >     I'm probably misunderstanding something, possibly simple,
> >     possibly
> >      >     subtle.
> >      >
> >      >     Apropos of nothing much, what *does* predict.merMod() return?
> >      >     Maybe Pr(Y = y_i | R = 0) ???
> >      >
> >      >     <SNIP>
> >      >     >   Here is an **inefficient** method for computing the
> >     likelihood
> >      >     >
> >      >     >     coefs <- unlist(getME(fit,c("theta","beta"))
> >      >     >     newdev <- update(fit, data=VS, devFunOnly=TRUE)
> >      >     >     newdev(coefs)
> >      >     >
> >      >     > This is slow because it has to reconstruct all of the
> >     random-effects
> >      >     > matrices, do permutations to reorder the relevant matrices
> >     to be as
> >      >     > sparse as possible, etc. etc.
> >      >
> >      >     Thanks for this.  I'll give it a go.  I think that the
> >     slowness may not
> >      >     be an overwhelming drawback.  Anyhow I shall try to test it
> out.
> >      >
> >      >     Thanks again.
> >      >
> >      >     cheers,
> >      >
> >      >     Rolf
> >      >
> >      >     --
> >      >     Honorary Research Fellow
> >      >     Department of Statistics
> >      >     University of Auckland
> >      >     Phone: +64-9-373-7599 ext. 88276
> >      >
> >      >     _______________________________________________
> >      > R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org>
> >      >     <mailto:R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org>> mailing list
> >      > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> >      >
> >
>

[[alternative HTML version deleted]]