[R-sig-ME] Log likelihood of a glmer() binomial model .

Sat Jan 30 15:23:20 CET 2021

Sorry to revive the thread after 21 months, but the topic has just become
highly relevant. My question concerns the following remark by Ben:

As far as what predict() does: depending on the value of re.form, it
> either gives a population-level prediction (R=0) or a conditional
> prediction (R set to its conditional mode).  You might be looking for
> *marginal* predictions (which Rizopoulous's GLMMadaptive package can do
> ...)
>

Does "marginal predictions" here mean predictions based exclusively on the
"population-averaged" fixed-effect estimates that are equivalent to ones
calculated by *gee()* from the *gee* library and *geeglm()* from the *geepack
*library? Or does it mean predictions calculated using the GLMM estimates
of the fixed effects and then averaged with respect to the estimated
random-effects distribution? If it means the former, aren't those
predictions easily obtainable through a standard GLM that ignores the
clustering? If it means the latter, could someone please point out how
exactly to calculate such marginal predictions using GLMMadaptive? I've
been trying to figure out how to manually calculate the marginal (latter
sense) LL for binomial GLMMs, with no success so far...

Best,

J

la 20. huhtik. 2019 klo 17.51 Ben Bolker (bbolker using gmail.com) kirjoitti:

>
>   I agree with Juho that there's a typo -- would be something like.
>
>  sum((y==1)*log(pred_prob) + (y==0)*log(1-pred_prob))
>
>   As far as what predict() does: depending on the value of re.form, it
> either gives a population-level prediction (R=0) or a conditional
> prediction (R set to its conditional mode).  You might be looking for
> *marginal* predictions (which Rizopoulous's GLMMadaptive package can do
> ...)
>
>
> On 2019-04-20 3:42 a.m., Juho Kristian Ruohonen wrote:
> > Rolf: Forgive my ignorance, but isn't the relevant log-likelihood here
> > the log-likelihood of the observed responses in the validation set given
> > the model-predicted probabilities? I.e. sum(dbinom(VS$y, size = VS$n,
> > prob = predict(fit, newdata = VS, type = "response"), log = TRUE))? Even
> > this would be somewhat off because dbinom() isn't aware of the
> > random-effects integral business. But it looks to me like your current
> > call is calculating the log-sum of the predicted probabilities of y = 1
> > in the validation set, not the loglikelihood of the observed responses
> > in the validation set.
> >
> > la 20. huhtik. 2019 klo 8.51 Rolf Turner (r.turner using auckland.ac.nz
> > <mailto:r.turner using auckland.ac.nz>) kirjoitti:
> >
> >
> >     On 20/04/19 12:44 PM, Ben Bolker wrote:
> >
> >     >    This seems wrong.
> >
> >     Yeah, that figures.
> >
> >     > The GLMM log-likelihood includes an integral over
> >     > the distribution of the random effects.
> >
> >     I was aware of this.  I guess what I was naïvely expecting was that
> >     predict.merMod() would handle this.  I.e. that this predict method
> >     (with type = "response") would return, for each observed y_i in the
> >     (new) data set
> >
> >        Pr(Y = y_i) = \int_0 Pr(Y = y_i | R = r) f(r) dr
> >
> >     where R is the vector of random effects and f(r) is its probability
> >     density function (multivariate normal, with mean 0 and some
> covariance
> >     matrix, which has been estimated in the fitting process.
> >
> >     I guess that this is *not* what predict.merMod() returns --- but I
> >     don't
> >     understand why not.  It seems to me that this is what it "should"
> >     return.
> >
> >     I'm probably misunderstanding something, possibly simple, possibly
> >     subtle.
> >
> >     Apropos of nothing much, what *does* predict.merMod() return?
> >     Maybe Pr(Y = y_i | R = 0) ???
> >
> >     <SNIP>
> >     >   Here is an **inefficient** method for computing the likelihood
> >     >
> >     >     coefs <- unlist(getME(fit,c("theta","beta"))
> >     >     newdev <- update(fit, data=VS, devFunOnly=TRUE)
> >     >     newdev(coefs)
> >     >
> >     > This is slow because it has to reconstruct all of the
> random-effects
> >     > matrices, do permutations to reorder the relevant matrices to be as
> >     > sparse as possible, etc. etc.
> >
> >     Thanks for this.  I'll give it a go.  I think that the slowness may
> not
> >     be an overwhelming drawback.  Anyhow I shall try to test it out.
> >
> >     Thanks again.
> >
> >     cheers,
> >
> >     Rolf
> >
> >     --
> >     Honorary Research Fellow
> >     Department of Statistics
> >     University of Auckland
> >     Phone: +64-9-373-7599 ext. 88276
> >
> >     _______________________________________________
> >     R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org> mailing list
> >     https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
>

	[[alternative HTML version deleted]]