[R-sig-ME] nAGQ > 1 in lme4::glmer gives unexpected likelihood
Martin Maechler
maechler at stat.math.ethz.ch
Sat Apr 25 14:33:29 CEST 2020
> I think they're both 'correct' in the sense of being proportional to
> the likelihood but use different baselines, thus incommensurate (see the
> note pointed out below).
Well, I'm sorry to be a spoiler here, but I think we should
acknowledge that the (log-)likelihood is a uniquely well-defined
function.
I remember how we fought with the many definitions of deviance
and, in our still unfinished GLMM paper (hint!!), tried very
nicely to define the many versions (conditional/unconditional
and more), but I find it too sloppy to claim that likelihoods
are only defined up to a constant factor -- even though, of
course, it does not matter for ML (and REML) if the objective
function is "off by a constant factor".
Martin
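To make the invariance point concrete, here is a small base-R sketch (my own illustration, not part of the original thread): adding a constant to a log-likelihood shifts the objective everywhere but does not move the maximizer, which is why an objective that is "off by a constant" still yields the same ML estimates.

```r
# Maximize a simple Poisson log-likelihood with and without an additive
# constant; the fitted parameter is identical either way.
set.seed(1)
y <- rpois(50, lambda = 5)

negll <- function(log_lambda, const = 0)
  -(sum(dpois(y, exp(log_lambda), log = TRUE)) + const)

fit_plain  <- optimize(negll, c(-5, 5))
fit_offset <- optimize(negll, c(-5, 5), const = 1000)  # objective shifted

# Same maximizer (the log of the sample mean), objectives differ by 1000:
all.equal(fit_plain$minimum, fit_offset$minimum)  # TRUE
```

The two optimizations evaluate objectives that differ by exactly the constant, so every comparison made during the search is unchanged and the estimates coincide; only the reported objective values differ.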
> On 4/24/20 5:05 PM, Vaida, Florin wrote:
> > One can figure out which likelihood version is correct (Laplace or nAGQ>1) by doing a simulation with random effects variance -> 0.
> > I'll do that if I find time.
> > Florin
> >
> >> On Apr 24, 2020, at 7:59 AM, Ben Bolker <bbolker at gmail.com> wrote:
> >>
> >>
> >> I found the note I was looking for. In ?deviance.glmerMod it says:
> >>
> >> If adaptive Gauss-Hermite quadrature is used, then
> >> ‘logLik(object)’ is currently only proportional to the
> >> absolute-unconditional log-likelihood.
> >>
> >> (see the discussion on this page for more context); see also https://github.com/lme4/lme4/blob/master/misc/logLikGLMM/logLikGLMM.R
> >>
> >>
> >> On 4/24/20 10:24 AM, Douglas Bates wrote:
> >>> Having said that, I do see that the fits in the MixedModels package for
> >>> Julia produce similar values of the deviance with the Laplace approximation
> >>> and nAGQ = 7
> >>>
> >>> julia> m1 = fit(MixedModel, @formula(y ~ 1 + (1|group)), dd, Poisson())
> >>> Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
> >>> y ~ 1 + (1 | group)
> >>> Distribution: Poisson{Float64}
> >>> Link: LogLink()
> >>>
> >>> Deviance: 193.5587
> >>>
> >>> Variance components:
> >>> Column Variance Std.Dev.
> >>> group (Intercept) 3.9577026 1.9893975
> >>>
> >>> Number of obs: 100; levels of grouping factors: 10
> >>>
> >>> Fixed-effects parameters:
> >>> ──────────────────────────────────────────────────
> >>> Estimate Std.Error z value P(>|z|)
> >>> ──────────────────────────────────────────────────
> >>> (Intercept) 2.65175 0.632317 4.19 <1e-4
> >>> ──────────────────────────────────────────────────
> >>>
> >>> julia> m1 = fit(MixedModel, @formula(y ~ 1 + (1|group)), dd, Poisson(),
> >>> nAGQ=7)
> >>> Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 7)
> >>> y ~ 1 + (1 | group)
> >>> Distribution: Poisson{Float64}
> >>> Link: LogLink()
> >>>
> >>> Deviance: 193.5104
> >>>
> >>> Variance components:
> >>> Column Variance Std.Dev.
> >>> group (Intercept) 3.9577026 1.9893975
> >>>
> >>> Number of obs: 100; levels of grouping factors: 10
> >>>
> >>> Fixed-effects parameters:
> >>> ──────────────────────────────────────────────────
> >>> Estimate Std.Error z value P(>|z|)
> >>> ──────────────────────────────────────────────────
> >>> (Intercept) 2.65175 0.632317 4.19 <1e-4
> >>> ──────────────────────────────────────────────────
> >>>
> >>> As the person who wrote the first version of the nAGQ code in R, I would
> >>> not be surprised if a constant was dropped somewhere. It is difficult
> >>> code.
> >>>
> >>> And the results here in the Julia package make me uncomfortable because the
> >>> values of the parameter estimates are identical in the two fits. I would
> >>> expect them to be close but not identical.
> >>>
> >>> Isn't it good to know that there is still room for research in this area?
> >>> :-)
> >>>
> >>> On Fri, Apr 24, 2020 at 9:05 AM Douglas Bates <bates at stat.wisc.edu> wrote:
> >>>
> >>>> There's a lot of variability in your lambdas
> >>>>
> >>>>> exp(3 + random_effects)
> >>>> [1]  91.5919358   6.9678749   4.1841478  78.0771666 890.6931394  20.8558107
> >>>> [7]   3.0037864   0.3049416   2.1675995  40.6209684
> >>>>
> >>>> Do you really expect that some groups will have a mean count of nearly 900
> >>>> whereas others will have a mean count less than 1?
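A quick back-of-the-envelope check (my own addition, not from the thread) of why the spread is so large: with an intercept of 3 and a random-effect standard deviation of 2, group means within just two standard deviations of the random effect already span roughly four orders of magnitude.

```r
# Range of group-level Poisson means implied by the simulation settings,
# taking the random effect at -2 and +2 standard deviations (sd = 2):
range_means <- exp(3 + c(-2, 2) * 2)
round(range_means, 2)  # c(0.37, 1096.63)
```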
> >>>>
> >>>>
> >>>> On Wed, Apr 22, 2020 at 5:58 PM Ben Goldstein
> >>>> <ben.goldstein at berkeley.edu> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I'm using lme4::glmer to estimate Poisson mixed models in a very simple
> >>>>> context (a single random effect). I'm interested in the model
> >>>>> likelihood/AIC across many simulated datasets.
> >>>>>
> >>>>> To investigate whether the Laplace approximation was appropriate for my
> >>>>> data context, I explored using the argument nAGQ to improve the accuracy
> >>>>> of the likelihood estimation. When I changed nAGQ to a value > 1, I saw
> >>>>> an unexpectedly large change in the likelihood; log-likelihoods tended
> >>>>> to be off by ~200. Other statistics packages (e.g. GLMMadaptive) yield
> >>>>> estimates that agree with lme4's Laplace approximation, as did a manual
> >>>>> likelihood estimate, but not with the nAGQ > 1 estimate.
> >>>>>
> >>>>> The following code reproduces the problem I'm encountering.
> >>>>>
> >>>>> # r-sig-mixed-models GLMM question
> >>>>> library(lme4)
> >>>>> set.seed(51)
> >>>>>
> >>>>> # Simulate some random-effect-driven Poisson data
> >>>>> random_effects <- rnorm(10, 0, 2)
> >>>>> group <- rep(1:10, 10)
> >>>>> simulated_data <- data.frame(
> >>>>>   y = rpois(n = 100, lambda = exp(3 + random_effects[group])),
> >>>>>   group = group)
> >>>>>
> >>>>> # Fit models with Laplace (nAGQ = 1) and nAGQ = 11
> >>>>> fit_Laplace <- glmer(y ~ (1|group), data = simulated_data,
> >>>>>                      family = poisson())
> >>>>> fit_AGQ <- glmer(y ~ (1|group), data = simulated_data,
> >>>>>                  family = poisson(), nAGQ = 11)
> >>>>>
> >>>>> logLik(fit_Laplace)
> >>>>> logLik(fit_AGQ)
> >>>>> logLik(fit_Laplace) - logLik(fit_AGQ)  # Huge difference!
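As an aside, the "manual likelihood estimate" the poster mentions can be sketched in base R by integrating the conditional Poisson likelihood against the random-intercept density, one group at a time. This is my own illustrative helper (`manual_logLik` is not an lme4 function), tuned to the simulation settings above:

```r
# Marginal log-likelihood of a Poisson model with one random intercept:
# for each group g, integrate over b the product
#   prod_i dpois(y_gi, exp(beta0 + b)) * dnorm(b, 0, sigma)
# and sum the logs of the group-level integrals.
manual_logLik <- function(beta0, sigma, y, group) {
  sum(sapply(split(y, group), function(yg) {
    logf <- function(b)                       # log integrand for this group
      sum(dpois(yg, exp(beta0 + b), log = TRUE)) +
        dnorm(b, 0, sigma, log = TRUE)
    # Centre and scale on the group's likelihood mode, and factor out the
    # peak value, so the adaptive quadrature neither misses a narrow peak
    # (large counts) nor underflows.
    m   <- log((sum(yg) + 0.5) / length(yg)) - beta0
    s   <- 1 / sqrt(sum(yg) + 0.5)
    lfm <- logf(m)
    val <- integrate(function(u)
             sapply(u, function(ui) s * exp(logf(m + s * ui) - lfm)),
             lower = -20, upper = 20)$value
    lfm + log(val)
  }))
}

# Evaluate at the true simulation parameters, for data generated as above:
set.seed(51)
random_effects <- rnorm(10, 0, 2)
group <- rep(1:10, 10)
y <- rpois(100, lambda = exp(3 + random_effects[group]))
manual_logLik(beta0 = 3, sigma = 2, y = y, group = group)
</imports>
```

Evaluating this at the parameter values returned by glmer (the fixed intercept and fitted random-effect standard deviation) and comparing with `logLik(fit_Laplace)` and `logLik(fit_AGQ)` is the kind of cross-check described in the thread: both approximations should land close to the brute-force value, so a discrepancy of ~200 log-units points to a dropped constant rather than approximation error.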
> >>>>>
> >>>>> When I execute the above code, I see a difference in log-likelihood of
> >>>>> -218.8894. I've tested across many simulations and on two different
> >>>>> machines (Mac and Linux). My version of lme4 is up to date.
> >>>>>
> >>>>> Has anyone run into this issue before? Am I using the glmer function
> >>>>> incorrectly, or is it possible there's something going on under the hood?
> >>>>>
> >>>>> Thanks,
> >>>>> Ben
> >>>>>
> >>>>> _______________________________________________
> >>>>> R-sig-mixed-models at r-project.org mailing list
> >>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models