[R-sig-ME] seeking input lme4::glmer with a gamma family: link = log or identity?
@dl@point @ending from gm@il@com
Wed Jul 25 22:05:44 CEST 2018
Thank you Paul, I appreciate your time. And, apologies if my understanding
is often incomplete.
> An incomplete answer…
> > 1. Is a Gamma distribution best for my distance data? If so, which link
> > function is most appropriate? I explored two link functions: identity and
> > log. I have concerns and see potential issues with both (see my comments
> > in the reproducible example below).
> I don’t know (I haven’t run your code) but I’ve always somehow managed to
> avoid gamma regression for strictly positive data by logging the response
> and fitting a model with normal errors.
If possible, I'd rather not transform the raw data, so that the coefficient
estimates stay easy to interpret. I'm likely naive or misunderstanding
something, though. Log-transforming the distance data does produce a
reasonably normal distribution. The following two models have very similar
AIC, BIC, logLik, etc., and the p-values of the fixed effects lead to
similar interpretations. However, the fixed-effect estimates themselves are
quite different.
gammaDist <- glmer(distance ~ CSs.lat + CSdirect + CSstart + year + age*sex
                   + (1|id),
                   data = birds, family = Gamma(link = log), nAGQ = 10,
                   control = glmerControl(optimizer = "bobyqa"))
logGausDist <- glmer(log(distance) ~ CSs.lat + CSdirect + CSstart + year
                     + age*sex + (1|id),
                     data = birds, family = gaussian(link = log), nAGQ = 10,
                     control = glmerControl(optimizer = "bobyqa"))
The interpretations of these two models are mostly the same: only starting
latitude is a marginally significant predictor of bird migration distance.
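
For what it's worth, a large part of the gap in the fixed-effect estimates
may come from the scale each model works on. A Gamma model with a log link
describes log(E[distance]); a normal model for log(distance) with an
identity link describes E[log(distance)]; and gaussian(link = log) applied
to log(distance), as in logGausDist, describes log(E[log(distance)]), so the
log is effectively taken twice. Below is a minimal sketch with simulated
data and plain glm()/lm(), without random effects. The names x and y, the
true slope of 0.3, and the shape of 30 are made up for illustration and are
not taken from the birds data.

```r
set.seed(1)
n     <- 500
x     <- rnorm(n)                      # hypothetical standardized covariate
mu    <- exp(7 + 0.3 * x)              # true mean distance on the raw scale
shape <- 30                            # assumed Gamma shape parameter
y     <- rgamma(n, shape = shape, rate = shape / mu)

# Gamma GLM with log link: linear predictor is log(E[y])
fit_gamma   <- glm(y ~ x, family = Gamma(link = "log"))
# Normal model on the logged response: linear predictor is E[log(y)]
fit_lognorm <- lm(log(y) ~ x)
# Gaussian with log link on the logged response: log(E[log(y)])
fit_loglog  <- glm(log(y) ~ x, family = gaussian(link = "log"))

coefs <- rbind(gamma_log_link      = coef(fit_gamma),
               normal_on_log_y     = coef(fit_lognorm),
               gaussian_log_of_log = coef(fit_loglog))
print(round(coefs, 3))
```

In this setup the first two fits should recover a slope near 0.3, while the
third reports coefficients on a much smaller, doubly logged scale, even
though all three describe the same data.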
> > 2. If the log link is the best or most appropriate to use, then the
> > summary(mDist) produces a sd of the random effect = 0 with the bobyqa
> > optimizer. Switching to Nelder_Mead gives a reasonable sd, but throws a
> > convergence warning.
> (For clarity, I assume that by “sd of the random effect” you mean the
> square root of the variance parameter that gauges residual inter-bird
> variation in mean distance and not the SD of the estimate of that
> parameter, which anyway isn’t output by glmer.)
> Why is a random effect variance estimate of zero implausible? I would
> trust a converged estimate over a non-converged estimate, regardless of
> whether the estimate is zero. Also… you could compare the log-likelihoods
> using logLik() — you’d expect the converged fit to have a higher LL. For
> more general troubleshooting of convergence warnings:
Yes, I believe your assumption is correct. In case I am wrong, I'm
referring to these estimates from the summary(model) output:
 Groups   Name        Variance Std.Dev.
 id       (Intercept) 0.00000  0.0000
 Residual             0.02879  0.1697
Number of obs: 137, groups:  id, 79
I called a Std.Dev. of 0 implausible because the ecologist in me says that
there is no way individual birds do not differ from one another (or even
vary within themselves, for birds with data on multiple migration routes).
Am I misunderstanding the meaning of the Std.Dev. here?
> Another quick check I often do is to fit the non-converged model with
> glmmTMB (which appears to be more robust than lme4), and compare
> likelihoods and estimates with lme4.
> A quick and dirty model fit assessment is to simulate from the fitted
> model (which is as easy as simulate(my.fit)), and see if the simulated
> responses look more or less like the real responses.
> Good luck,
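
The simulate() check and the glmmTMB comparison suggested above could look
roughly like the sketch below. It runs on simulated data, since the birds
data are not reproduced here; sim, x, my.fit, and every parameter value are
hypothetical. The glmmTMB part only runs if that package is installed.

```r
library(lme4)

set.seed(7)
n_id     <- 79
id_index <- c(seq_len(n_id), sample(n_id, 58, replace = TRUE))
sim <- data.frame(id = factor(id_index), x = rnorm(length(id_index)))

bird_effect  <- rnorm(n_id, sd = 0.1)   # assumed between-bird SD (log scale)
mu           <- exp(7 + 0.1 * sim$x + bird_effect[id_index])
shape        <- 35
sim$distance <- rgamma(nrow(sim), shape = shape, rate = shape / mu)

my.fit <- glmer(distance ~ x + (1 | id), data = sim,
                family = Gamma(link = "log"),
                control = glmerControl(optimizer = "bobyqa"))

# Simulate 200 response vectors from the fitted model
sims  <- simulate(my.fit, nsim = 200)
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)

# Compare observed quantiles with the typical simulated quantiles
obs_q <- quantile(sim$distance, probs)
sim_q <- apply(sims, 2, quantile, probs = probs)   # 5 x 200 matrix
print(rbind(observed         = obs_q,
            simulated_median = apply(sim_q, 1, median)))

# Optional cross-check of the same model in glmmTMB
if (requireNamespace("glmmTMB", quietly = TRUE)) {
  fit_tmb <- glmmTMB::glmmTMB(distance ~ x + (1 | id), data = sim,
                              family = Gamma(link = "log"))
  print(c(lme4    = as.numeric(logLik(my.fit)),
          glmmTMB = as.numeric(logLik(fit_tmb))))
}
```

Large gaps between the observed and simulated quantiles would hint that the
chosen family or link does not describe the distances well.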