[R-sig-ME] nAGQ > 1 in lme4::glmer gives unexpected likelihood

Douglas Bates bates at stat.wisc.edu
Sat Apr 25 17:29:06 CEST 2020


There shouldn't be discussions of "off by" with respect to the
log-likelihood. As Martin pointed out, if you know the probability model
then you have one and only one likelihood.  In the case of a generalized
linear mixed model the likelihood is well-defined but does not have a
closed-form expression because the integral with respect to the random
effects doesn't have a closed-form expression.  Nevertheless the likelihood
is, in the mathematical sense, well-defined.  So the only question about
different ways of approximating the log-likelihood is how close it gets to
this well-defined value.
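
To make that concrete, here is a minimal sketch (base R, not lme4 internals) of the quantity being approximated: for a Poisson model with a single normal random intercept per group, each group's likelihood contribution is a one-dimensional integral that can be evaluated numerically.

```r
## Minimal sketch (not lme4 internals): the marginal likelihood of one
## group in a Poisson GLMM with random intercept b ~ N(0, sigma^2) is
##   integral over b of  prod_j dpois(y_j, exp(beta0 + b)) * dnorm(b, 0, sigma)
group_loglik <- function(y, beta0, sigma) {
  integrand <- function(b)
    sapply(b, function(bi) prod(dpois(y, exp(beta0 + bi))) * dnorm(bi, 0, sigma))
  log(integrate(integrand, -Inf, Inf)$value)
}
## Laplace and adaptive Gauss-Hermite quadrature are just different
## approximations to this same well-defined integral.
```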


On Sat, Apr 25, 2020 at 9:41 AM <fvaida using health.ucsd.edu> wrote:

> Looks like you guys thought about deviance issues a lot:
> https://github.com/lme4/lme4/issues/375
>
> Speaking of the principle of "least surprise", it is surprising to get
> completely different deviance behavior for nAGQ=1 vs nAGQ=2 (say) in the
> same model, which is what Ben Goldstein pointed out.
>
> A couple further questions:
> 1. Is the "off by" constant the same, for a given model, if I only change
> nAGQ?  E.g., comparing nAGQ=5 and nAGQ=20?
> 2. If so, can you calculate the "off by" constant by comparing the
> formula-based Laplace approximation with a bona fide adaptive Gaussian
> quadrature, using the existing algorithm with nAGQ = 1?
>
> Florin
>
> On Apr 24, 2020, at 7:59 AM, Ben Bolker <bbolker using gmail.com> wrote:
>
>
>    I found the note I was looking for.  In ?deviance.glmerMod it says:
>
>  If adaptive Gauss-Hermite quadrature is used, then
>           ‘logLik(object)’ is currently only proportional to the
>           absolute-unconditional log-likelihood.
>
> (see the discussion on this page for more context); see also
> https://github.com/lme4/lme4/blob/master/misc/logLikGLMM/logLikGLMM.R
>
>
> On 4/24/20 10:24 AM, Douglas Bates wrote:
> Having said that, I do see that the fits in the MixedModels package for
> Julia produce similar values of the deviance with the Laplace approximation
> and nAGQ = 7
>
> julia> m1 = fit(MixedModel, @formula(y ~ 1 + (1|group)), dd, Poisson())
> Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
>   y ~ 1 + (1 | group)
>   Distribution: Poisson{Float64}
>   Link: LogLink()
>
>   Deviance: 193.5587
>
> Variance components:
>          Column    Variance  Std.Dev.
> group (Intercept)  3.9577026 1.9893975
>
>  Number of obs: 100; levels of grouping factors: 10
>
> Fixed-effects parameters:
> ──────────────────────────────────────────────────
>              Estimate  Std.Error  z value  P(>|z|)
> ──────────────────────────────────────────────────
> (Intercept)   2.65175   0.632317     4.19    <1e-4
> ──────────────────────────────────────────────────
>
> julia> m1 = fit(MixedModel, @formula(y ~ 1 + (1|group)), dd, Poisson(),
> nAGQ=7)
> Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 7)
>   y ~ 1 + (1 | group)
>   Distribution: Poisson{Float64}
>   Link: LogLink()
>
>   Deviance: 193.5104
>
> Variance components:
>          Column    Variance  Std.Dev.
> group (Intercept)  3.9577026 1.9893975
>
>  Number of obs: 100; levels of grouping factors: 10
>
> Fixed-effects parameters:
> ──────────────────────────────────────────────────
>              Estimate  Std.Error  z value  P(>|z|)
> ──────────────────────────────────────────────────
> (Intercept)   2.65175   0.632317     4.19    <1e-4
> ──────────────────────────────────────────────────
>
> As the person who wrote the first version of the nAGQ code in R I would not
> be surprised if there was a constant dropped somewhere.  It is difficult
> code.
>
> And the results here in the Julia package make me uncomfortable because the
> values of the parameter estimates are identical in the two fits.  I would
> expect them to be close but not identical.
>
> Isn't it good to know that there is still room for research in this area?
> :-)
>
> On Fri, Apr 24, 2020 at 9:05 AM Douglas Bates <bates using stat.wisc.edu> wrote:
>
> There's a lot of variability in your lambdas
>
> exp(3 + random_effects)
>  [1]  91.5919358   6.9678749   4.1841478  78.0771666 890.6931394
> 20.8558107
>  [7]   3.0037864   0.3049416   2.1675995  40.6209684
>
> Do you really expect that some groups will have a mean count of nearly 900
> whereas others will have a mean count less than 1?
>
>
> On Wed, Apr 22, 2020 at 5:58 PM Ben Goldstein <ben.goldstein using berkeley.edu> wrote:
>
> Hi all,
>
> I'm using lme4::glmer to estimate Poisson mixed models in a very simple
> context (single random effect). I'm interested in the model
> likelihood/AIC
> across many simulated datasets.
>
> To investigate whether the Laplace approximation was appropriate for my
> data context, I explored using the argument nAGQ to improve the accuracy
> of
> the likelihood estimation. When I changed nAGQ to a value > 1, I saw an
> unexpectedly huge change in the likelihood; log-likelihoods tended to be
> off by ~200. Other statistics packages (e.g. GLMMadaptive) yield
> estimates that agree with lme4's Laplace approximation, as does a manual
> likelihood estimate, but not with the nAGQ > 1 estimate.
>
> The following code reproduces the problem I'm encountering.
>
> # r-sig-mixed-models GLMM question
> library(lme4)
> set.seed(51)
>
> # Simulate some random-effect-driven Poisson data
> random_effects <- rnorm(10, 0, 2)
> group <- rep(1:10, 10)
> simulated_data <- data.frame(
>   y = rpois(n = 100, lambda = exp(3 + random_effects[group])),
>   group = group)
>
> # Fit models with Laplace (nAGQ = 1) and adaptive quadrature (nAGQ = 11)
> fit_Laplace <- glmer(y ~ (1|group), data = simulated_data,
>                      family = poisson())
> fit_AGQ <- glmer(y ~ (1|group), data = simulated_data,
>                  family = poisson(), nAGQ = 11)
>
> logLik(fit_Laplace)
> logLik(fit_AGQ)
> logLik(fit_Laplace) - logLik(fit_AGQ)  # Huge difference!
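>
> For reference, the manual likelihood estimate I mentioned can be
> sketched like this (not my exact code; it integrates the random
> intercept out numerically at the parameters estimated by fit_Laplace,
> assuming the fits from the code above):
>
> ```r
> ## Manual marginal log-likelihood: numerically integrate the random
> ## intercept out, group by group, at fit_Laplace's parameter estimates.
> manual_loglik <- function(fit, data) {
>   beta0 <- fixef(fit)[["(Intercept)"]]
>   sigma <- attr(VarCorr(fit)$group, "stddev")[[1]]
>   sum(sapply(split(data$y, data$group), function(y) {
>     f <- function(b) sapply(b, function(bi)
>       prod(dpois(y, exp(beta0 + bi))) * dnorm(bi, 0, sigma))
>     log(integrate(f, -Inf, Inf)$value)
>   }))
> }
> manual_loglik(fit_Laplace, simulated_data)  # compare with logLik(fit_Laplace)
> ```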
>
> When I execute the above code, I see a log-likelihood difference of
> -218.8894. I've tested across many simulations and on 2 different
> machines
> (Mac and Linux). My version of lme4 is up to date.
>
> Has anyone run into this issue before? Am I using the glmer function
> wrong,
> or is it possible there's something going on under the hood?
>
> Thanks,
> Ben
>
>
> _______________________________________________
> R-sig-mixed-models using r-project.org<mailto:R-sig-mixed-models using r-project.org>
> mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models




More information about the R-sig-mixed-models mailing list