[R-sig-ME] Negative response values when simulating glmer with log link

Ben Bolker bbo|ker @end|ng |rom gm@||@com
Tue Apr 29 16:44:36 CEST 2025


    (I never answered the question on the lme4 issues list: I will 
answer here, and copy the information to the issues list.)

    There are two ways one might define a "log normal GLMM": (1) with a 
transformation

    eta = a + b*x + ...  (linear predictor)
    log(y) ~ Normal(eta, sigma^2)

or (2) with a link function:

    eta = a + b*x + ... (same as above)
    y ~ Normal(exp(eta), sigma^2)

These look almost identical, but are quite different.

   The first case is equivalent to

   Y ~ log-Normal(meanlog = eta, sdlog = sigma)

[using R's parameterization based on the mean and standard deviation *on 
the log scale*].  In this case:

   * simulated values of log(y) can be any real number, but y = 
exp(log(y)) will always be positive (possibly zero due to floating point 
underflow in extreme cases)
   * the standard deviation of Y is proportional to its mean (i.e., the 
coefficient of variation is constant)
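
   As a quick numerical check of these two points, here is a minimal 
sketch in base R. The values of eta and sigma are arbitrary choices for 
illustration, not taken from any fitted model. For a log-Normal the 
coefficient of variation is sqrt(exp(sigma^2) - 1), which does not 
depend on eta.

```r
set.seed(1)
sigma <- 0.5            # SD on the log scale (illustrative value)
eta   <- c(-2, 0, 3)    # three linear-predictor values (illustrative)

# theoretical CV of a log-Normal: depends on sigma only
cv_theory <- sqrt(exp(sigma^2) - 1)

# simulate case 1 (transformation) at each value of eta
y_list <- lapply(eta, function(m) rlnorm(1e5, meanlog = m, sdlog = sigma))

# empirical CV at each eta: all should be close to cv_theory
cv_sim <- sapply(y_list, function(y) sd(y) / mean(y))

# every simulated response is strictly positive
all_positive <- all(sapply(y_list, function(y) all(y > 0)))

print(rbind(eta = eta, cv_sim = cv_sim, cv_theory = cv_theory))
print(all_positive)
```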

   In the second case,

   * simulated values of y can be any real number; they could easily be 
negative, for example, if exp(eta) is close to zero and sigma is not too 
small
   * the standard deviation of Y is constant
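
   The same kind of check for the link-function case, again as a 
base-R sketch with arbitrary illustrative values of eta and sigma. The 
probability of a negative draw is pnorm(0, mean = exp(eta), sd = sigma), 
so it is appreciable whenever exp(eta) is not large relative to sigma.

```r
set.seed(2)
sigma <- 0.5            # residual SD on the response scale (illustrative)
eta   <- c(-2, 0, 3)    # three linear-predictor values (illustrative)
mu    <- exp(eta)       # log link: the mean is exp(eta), always positive

# theoretical probability that a simulated response is negative
p_neg_theory <- pnorm(0, mean = mu, sd = sigma)

# simulate case 2 (link function) and count negative responses
p_neg_sim <- sapply(mu, function(m) {
  y <- rnorm(1e5, mean = m, sd = sigma)
  mean(y < 0)
})

print(rbind(eta = eta, mu = mu, p_neg_theory = p_neg_theory,
            p_neg_sim = p_neg_sim))
```

   The mean is positive in every case; it is the additive Normal noise 
around that mean that produces the negative values.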

   Although there are use cases for both models, I would say that case 1 
(transformation) is generally a more natural way to model positive, 
continuous data.
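
   In lme4 terms, the two cases correspond roughly to the two fits 
sketched below. This is only an illustration under stated assumptions: 
the data frame dd, the grouping factor g, the covariate x, and all 
parameter values are made up, and the link-function fit may produce 
convergence warnings on other data.

```r
library(lme4)

set.seed(101)
## hypothetical data: 20 groups of 10 observations each
dd  <- data.frame(g = factor(rep(1:20, each = 10)), x = runif(200))
b   <- rnorm(20, sd = 0.5)                 # random intercepts
eta <- -1 + 1 * dd$x + b[dd$g]             # linear predictor
dd$y <- exp(rnorm(200, mean = eta, sd = 0.5))  # strictly positive response

## case 1: transformation (log-Normal response)
fit_trans <- lmer(log(y) ~ x + (1 | g), data = dd)

## case 2: Gaussian response with a log link
fit_link <- glmer(y ~ x + (1 | g), data = dd,
                  family = gaussian(link = "log"))

## simulate() works on the scale of the model's response:
## log(y) for case 1 (so back-transform with exp()), y for case 2
sim_trans <- exp(simulate(fit_trans, nsim = 100, seed = 1))
sim_link  <- simulate(fit_link, nsim = 100, seed = 1)

min(sim_trans)      # always > 0
mean(sim_link < 0)  # can be > 0: Normal noise is added around exp(eta)
```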

   Does that help?



On 2025-04-29 2:50 a.m., Fiona Scarff wrote:
> I have some data in which the response variable can only be a non-negative
> number. I fitted a log normal glmm using the lme4 package, and simulated
> from the model using simulate.merMod. A very small proportion of the
> simulated values are slightly negative, and I would like to understand how
> that is possible with a log link. I found a post in which Ben Bolker
> observed that:
> "Note that if you did simulate data with a log link and a Gaussian family,
> you could still get negative values if the standard deviation were large
> enough ..."
> https://github.com/lme4/lme4/issues/530
> 
> I thought that the log link would force all the responses to be
> non-negative. It is not especially important in this particular case, but I
> feel I have misunderstood something, either about the way that simulate()
> works for mixed effects models, or perhaps something more fundamental about
> how random effects work in a model with a non-identity link. Apologies
> therefore if this question is misdirected and ought instead to go to
> crossvalidated.
> 
> Many thanks for your help,
> Fiona
> 
> *Dr Fiona Scarff*
> *Harry Butler Institute*
> *Murdoch University*

-- 
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
 > E-mail is sent at my convenience; I don't expect replies outside of 
working hours.


