[R-sig-ME] Negative response values when simulating glmer with log link
Ben Bolker
bbo|ker @end|ng |rom gm@||@com
Tue Apr 29 16:44:36 CEST 2025
(I never answered the question on the lme4 issues list: I will
answer here, and copy the information to the issues list.)
There are two ways one might define a "log normal GLMM": (1) with a
transformation
eta = a + b*x + ... (linear predictor)
log(y) ~ Normal(eta, sigma^2)
or (2) with a link function:
eta = a + b*x + ... (same as above)
y ~ Normal(exp(eta), sigma^2)
These look almost identical, but are quite different.
The first case is equivalent to
Y ~ log-Normal(meanlog = eta, sdlog = sigma)
[using R's parameterization based on the mean and standard deviation *on
the log scale*]. In this case:
* simulated values of log(y) can be any real number, but y =
exp(log(y)) will always be positive (or possibly zero, due to floating-point
underflow in extreme cases)
* the standard deviation of Y is proportional to its mean (== the
coefficient of variation is constant)
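A minimal base-R sketch of case 1, leaving out random effects for simplicity. The values of eta and sigma are made up for illustration. It checks that every simulated value is positive and that the coefficient of variation is the same (in theory sqrt(exp(sigma^2) - 1), about 0.53 here) whatever the mean:

```r
set.seed(101)
sigma <- 0.5              # SD on the log scale (illustrative value)
eta   <- c(-2, 0, 2)      # three hypothetical linear-predictor values
n     <- 1e5

## one column of simulated responses per value of eta
y <- sapply(eta, function(e) rlnorm(n, meanlog = e, sdlog = sigma))

all(y > 0)                            # no zero or negative values
cv <- apply(y, 2, sd) / colMeans(y)   # empirical CV for each eta
cv
sqrt(exp(sigma^2) - 1)                # theoretical CV, independent of eta
```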
In the second case,
* simulated values of y can be any real number: could easily be
negative, for example, if exp(eta) is close to zero and sigma is not too
small
* the standard deviation of Y is constant
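The same kind of sketch for case 2, again in base R with made-up values of eta and sigma and no random effects. When exp(eta) is small relative to sigma a large share of the simulated values is negative (about 39% for eta = -2 here), while the standard deviation stays at sigma regardless of the mean:

```r
set.seed(101)
sigma <- 0.5              # SD on the response scale (illustrative value)
eta   <- c(-2, 0, 2)      # three hypothetical linear-predictor values
n     <- 1e5

## log link: the mean is exp(eta), but the noise is additive and Gaussian
y <- sapply(eta, function(e) rnorm(n, mean = exp(e), sd = sigma))

colMeans(y < 0)                     # observed proportion of negative values
pnorm(0, mean = exp(eta), sd = sigma)   # theoretical Pr(y < 0)
apply(y, 2, sd)                     # all close to sigma
```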
Although there are use cases for both models, I would say that case 1
(transformation) is generally a more natural way to model positive,
continuous data.
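In lme4 terms, the two cases might be fitted and simulated roughly as follows. This is only a sketch: the data frame `d`, the grouping factor `g`, the covariate `x` and all parameter values are invented for illustration, and the glmer() fit may give convergence warnings on other data.

```r
library(lme4)
set.seed(101)

## made-up data: 20 groups of 10 observations, log-Normal response
d <- data.frame(g = factor(rep(1:20, each = 10)), x = runif(200))
b <- rnorm(20, sd = 0.3)                      # group-level random intercepts
d$y <- rlnorm(200, meanlog = 0.5 + d$x + b[as.integer(d$g)], sdlog = 0.4)

## case 1: transform the response, fit a linear mixed model
m1 <- lmer(log(y) ~ x + (1 | g), data = d)
## case 2: Gaussian family with a log link
m2 <- glmer(y ~ x + (1 | g), data = d, family = gaussian(link = "log"))

## simulate() for m1 returns values of log(y); back-transform with exp()
s1 <- exp(as.matrix(simulate(m1, nsim = 100, seed = 1)))
s2 <- as.matrix(simulate(m2, nsim = 100, seed = 1))

all(s1 > 0)   # always TRUE for the transformation model
min(s2)       # can be negative for the log-link Gaussian model
```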
Does that help?
On 2025-04-29 2:50 a.m., Fiona Scarff wrote:
> I have some data in which the response variable can only be a non-negative
> number. I fitted a log normal glmm using the lme4 package, and simulated
> from the model using simulate.merMod. A very small proportion of the
> simulated values are slightly negative, and I would like to understand how
> that is possible with a log link. I found a post in which Ben Bolker
> observed that:
> "Note that if you did simulate data with a log link and a Gaussian family,
> you could still get negative values if the standard deviation were large
> enough ..."
> https://github.com/lme4/lme4/issues/530
>
> I thought that the log link would force all the responses to be
> non-negative. It is not especially important in this particular case, but I
> feel I have misunderstood something, either about the way that simulate()
> works for mixed effects models, or perhaps something more fundamental about
> how random effects work in a model with a non-identity link. Apologies
> therefore if this question is misdirected and ought instead to go to
> crossvalidated.
>
> Many thanks for your help,
> Fiona
>
> *Dr Fiona Scarff*
> *Harry Butler Institute*
> *Murdoch University*
--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
> E-mail is sent at my convenience; I don't expect replies outside of
working hours.