[R-sig-ME] nAGQ > 1 in lme4::glmer gives unexpected likelihood
Martin Maechler
maechler at stat.math.ethz.ch
Sat Apr 25 14:33:29 CEST 2020
> I think they're both 'correct' in the sense of being proportional to
> the likelihood but use different baselines, thus incommensurate (see the
> note pointed out below).
Well, I'm sorry to be a spoiler here, but I think we should
acknowledge that the (log-)likelihood is a uniquely well-defined
function.
I remember how we fought with the many definitions of deviance
and, in our still unfinished GLMM paper (hint!!), tried very
nicely to define the many versions (conditional/unconditional
and more), but I find it too sloppy to claim that likelihoods
are only defined up to a constant factor -- even though, of
course, it does not matter for ML (and REML) if the objective
function is "off by a constant factor".
Martin
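To make the invariance point concrete, here is a small base-R sketch (my own illustration, not part of the original thread): adding a constant to a log-likelihood shifts the objective everywhere but does not move the maximizer, which is why an objective that is "off by a constant" still yields the same ML estimates.

```r
# Maximize a simple Poisson log-likelihood with and without an additive
# constant; the fitted parameter is identical either way.
set.seed(1)
y <- rpois(50, lambda = 5)

negll <- function(log_lambda, const = 0)
  -(sum(dpois(y, exp(log_lambda), log = TRUE)) + const)

fit_plain  <- optimize(negll, c(-5, 5))
fit_offset <- optimize(negll, c(-5, 5), const = 1000)  # objective shifted

# Same maximizer (the log of the sample mean), objectives differ by 1000:
all.equal(fit_plain$minimum, fit_offset$minimum)  # TRUE
```

The two optimizations evaluate objectives that differ by exactly the constant, so every comparison made during the search is unchanged and the estimates coincide; only the reported objective values differ.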
> On 4/24/20 5:05 PM, Vaida, Florin wrote:
> > One can figure out which likelihood version is correct (Laplace or nAGQ>1) by doing a simulation with random effects variance -> 0.
> > I'll do that if I find time.
> > Florin
> >
> >> On Apr 24, 2020, at 7:59 AM, Ben Bolker <bbolker at gmail.com> wrote:
> >>
> >>
> >> I found the note I was looking for. In ?deviance.glmerMod it says:
> >>
> >> If adaptive Gauss-Hermite quadrature is used, then
> >> ‘logLik(object)’ is currently only proportional to the
> >> absolute-unconditional log-likelihood.
> >>
> >> (see the discussion on this page for more context); see also https://github.com/lme4/lme4/blob/master/misc/logLikGLMM/logLikGLMM.R
> >>
> >>
> >> On 4/24/20 10:24 AM, Douglas Bates wrote:
> >>> Having said that, I do see that the fits in the MixedModels package for
> >>> Julia produce similar values of the deviance with the Laplace approximation
> >>> and nAGQ = 7
> >>>
> >>> julia> m1 = fit(MixedModel, @formula(y ~ 1 + (1|group)), dd, Poisson())
> >>> Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
> >>> y ~ 1 + (1 | group)
> >>> Distribution: Poisson{Float64}
> >>> Link: LogLink()
> >>>
> >>> Deviance: 193.5587
> >>>
> >>> Variance components:
> >>> Column Variance Std.Dev.
> >>> group (Intercept) 3.9577026 1.9893975
> >>>
> >>> Number of obs: 100; levels of grouping factors: 10
> >>>
> >>> Fixed-effects parameters:
> >>> ──────────────────────────────────────────────────
> >>> Estimate Std.Error z value P(>|z|)
> >>> ──────────────────────────────────────────────────
> >>> (Intercept) 2.65175 0.632317 4.19 <1e-4
> >>> ──────────────────────────────────────────────────
> >>>
> >>> julia> m1 = fit(MixedModel, @formula(y ~ 1 + (1|group)), dd, Poisson(),
> >>> nAGQ=7)
> >>> Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 7)
> >>> y ~ 1 + (1 | group)
> >>> Distribution: Poisson{Float64}
> >>> Link: LogLink()
> >>>
> >>> Deviance: 193.5104
> >>>
> >>> Variance components:
> >>> Column Variance Std.Dev.
> >>> group (Intercept) 3.9577026 1.9893975
> >>>
> >>> Number of obs: 100; levels of grouping factors: 10
> >>>
> >>> Fixed-effects parameters:
> >>> ──────────────────────────────────────────────────
> >>> Estimate Std.Error z value P(>|z|)
> >>> ──────────────────────────────────────────────────
> >>> (Intercept) 2.65175 0.632317 4.19 <1e-4
> >>> ──────────────────────────────────────────────────
> >>>
> >>> As the person who wrote the first version of the nAGQ code in R, I would
> >>> not be surprised if a constant was dropped somewhere. It is difficult
> >>> code.
> >>>
> >>> And the results here in the Julia package make me uncomfortable because the
> >>> values of the parameter estimates are identical in the two fits. I would
> >>> expect them to be close but not identical.
> >>>
> >>> Isn't it good to know that there is still room for research in this area?
> >>> :-)
> >>>
> >>> On Fri, Apr 24, 2020 at 9:05 AM Douglas Bates <bates at stat.wisc.edu> wrote:
> >>>
> >>>> There's a lot of variability in your lambdas
> >>>>
> >>>>> exp(3 + random_effects)
> >>>> [1]  91.5919358   6.9678749   4.1841478  78.0771666 890.6931394  20.8558107
> >>>> [7]   3.0037864   0.3049416   2.1675995  40.6209684
> >>>>
> >>>> Do you really expect that some groups will have a mean count of nearly 900
> >>>> whereas others will have a mean count less than 1?
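A quick back-of-the-envelope check (my own addition, not from the thread) of why the spread is so large: with an intercept of 3 and a random-effect standard deviation of 2, group means within just two standard deviations of the random effect already span roughly four orders of magnitude.

```r
# Range of group-level Poisson means implied by the simulation settings,
# taking the random effect at -2 and +2 standard deviations (sd = 2):
range_means <- exp(3 + c(-2, 2) * 2)
round(range_means, 2)  # c(0.37, 1096.63)
```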
> >>>>
> >>>>
> >>>> On Wed, Apr 22, 2020 at 5:58 PM Ben Goldstein
> >>>> <ben.goldstein at berkeley.edu> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I'm using lme4::glmer to estimate Poisson mixed models in a very simple
> >>>>> context (a single random effect). I'm interested in the model
> >>>>> likelihood/AIC across many simulated datasets.
> >>>>>
> >>>>> To investigate whether the Laplace approximation was appropriate for my
> >>>>> data context, I explored using the argument nAGQ to improve the accuracy
> >>>>> of the likelihood estimation. When I changed nAGQ to a value > 1, I saw
> >>>>> an unexpectedly large change in the likelihood; log-likelihoods tended
> >>>>> to be off by ~200. Other statistics packages (e.g. GLMMadaptive) yield
> >>>>> estimates that agree with lme4's Laplace approximation, as did a manual
> >>>>> likelihood estimate, but not with the nAGQ > 1 estimate.
> >>>>>
> >>>>> The following code reproduces the problem I'm encountering.
> >>>>>
> >>>>> # r-sig-mixed-models GLMM question
> >>>>> library(lme4)
> >>>>> set.seed(51)
> >>>>>
> >>>>> # Simulate some random-effect-driven Poisson data
> >>>>> random_effects <- rnorm(10, 0, 2)
> >>>>> group <- rep(1:10, 10)
> >>>>> simulated_data <- data.frame(
> >>>>>   y = rpois(n = 100, lambda = exp(3 + random_effects[group])),
> >>>>>   group = group)
> >>>>>
> >>>>> # Fit models with Laplace (nAGQ = 1) and nAGQ = 11
> >>>>> fit_Laplace <- glmer(y ~ (1|group), data = simulated_data,
> >>>>>                      family = poisson())
> >>>>> fit_AGQ <- glmer(y ~ (1|group), data = simulated_data,
> >>>>>                  family = poisson(), nAGQ = 11)
> >>>>>
> >>>>> logLik(fit_Laplace)
> >>>>> logLik(fit_AGQ)
> >>>>> logLik(fit_Laplace) - logLik(fit_AGQ)  # Huge difference!
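As an aside, the "manual likelihood estimate" the poster mentions can be sketched in base R by integrating the conditional Poisson likelihood against the random-intercept density, one group at a time. This is my own illustrative helper (`manual_logLik` is not an lme4 function), tuned to the simulation settings above:

```r
# Marginal log-likelihood of a Poisson model with one random intercept:
# for each group g, integrate over b the product
#   prod_i dpois(y_gi, exp(beta0 + b)) * dnorm(b, 0, sigma)
# and sum the logs of the group-level integrals.
manual_logLik <- function(beta0, sigma, y, group) {
  sum(sapply(split(y, group), function(yg) {
    logf <- function(b)                       # log integrand for this group
      sum(dpois(yg, exp(beta0 + b), log = TRUE)) +
        dnorm(b, 0, sigma, log = TRUE)
    # Centre and scale on the group's likelihood mode, and factor out the
    # peak value, so the adaptive quadrature neither misses a narrow peak
    # (large counts) nor underflows.
    m   <- log((sum(yg) + 0.5) / length(yg)) - beta0
    s   <- 1 / sqrt(sum(yg) + 0.5)
    lfm <- logf(m)
    val <- integrate(function(u)
             sapply(u, function(ui) s * exp(logf(m + s * ui) - lfm)),
             lower = -20, upper = 20)$value
    lfm + log(val)
  }))
}

# Evaluate at the true simulation parameters, for data generated as above:
set.seed(51)
random_effects <- rnorm(10, 0, 2)
group <- rep(1:10, 10)
y <- rpois(100, lambda = exp(3 + random_effects[group]))
manual_logLik(beta0 = 3, sigma = 2, y = y, group = group)
</imports>
```

Evaluating this at the parameter values returned by glmer (the fixed intercept and fitted random-effect standard deviation) and comparing with `logLik(fit_Laplace)` and `logLik(fit_AGQ)` is the kind of cross-check described in the thread: both approximations should land close to the brute-force value, so a discrepancy of ~200 log-units points to a dropped constant rather than approximation error.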
> >>>>>
> >>>>> When I execute the above code, I see a difference in log-likelihood of
> >>>>> -218.8894. I've tested across many simulations and on two different
> >>>>> machines (Mac and Linux). My version of lme4 is up to date.
> >>>>>
> >>>>> Has anyone run into this issue before? Am I using the glmer function
> >>>>> incorrectly, or is it possible there's something going on under the hood?
> >>>>>
> >>>>> Thanks,
> >>>>> Ben
> >>>>>
> >>>>> _______________________________________________
> >>>>> R-sig-mixed-models at r-project.org mailing list
> >>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models