[R-sig-ME] addressing singularity in lme4 fits caused by subsets of contrasts
bbolker at gmail.com
Thu Jan 21 23:01:58 CET 2021
This is an interesting question ("interesting" means among other
things "I don't know").
If you get a variance estimate of zero for the second contrast, then
removing that term from the model should (I think) give you **exactly
the same** model results. As an analogy: suppose you had the mean model
y = a + b*x + c*z and for some reason got an estimate of c = 0, and then
asked "can I drop z from the model?" Dropping z would leave the fitted
values unchanged.
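As a rough illustration (a sketch on simulated data, not the original
poster's data set; the names subj, c1, c2, full and reduced are made up
here), one way to drop the random slope for a single contrast is to copy
the contrast columns into numeric variables and then compare the two fits:

```r
library(lme4)

set.seed(1)
n_subj <- 30
n_rep  <- 12

## Hypothetical balanced design: every subject sees every condition n_rep times
df <- expand.grid(subj = factor(seq_len(n_subj)),
                  congruent = c(1, 0, -1),
                  rep = seq_len(n_rep))
df$congruent.f <- factor(df$congruent, levels = c(1, 0, -1))
contrasts(df$congruent.f) <- contr.sum(3)

## Numeric copies of the two sum-to-zero contrast columns, so that each
## one can enter (or be left out of) the random-effects term separately
cc <- model.matrix(~ congruent.f, data = df)
df$c1 <- cc[, "congruent.f1"]
df$c2 <- cc[, "congruent.f2"]

## Assumed truth: subjects vary in intercept and c1 slope, but not in c2 slope
s  <- as.integer(df$subj)
b0 <- rnorm(n_subj, sd = 1)
b1 <- rnorm(n_subj, sd = 0.5)
df$y <- 1 + 0.5 * df$c1 - 0.3 * df$c2 +
  b0[s] + b1[s] * df$c1 + rnorm(nrow(df))

## Random slopes for both contrasts vs. for c1 only
full    <- lmer(y ~ c1 + c2 + (1 + c1 + c2 | subj), data = df)
reduced <- lmer(y ~ c1 + c2 + (1 + c1 | subj), data = df)

isSingular(full)                       # is the full fit on the boundary?
VarCorr(full)                          # which component is (near) zero?
c(full = logLik(full), reduced = logLik(reduced))
cbind(full = fixef(full), reduced = fixef(reduced))
```

If the c2 variance in the full fit is estimated as exactly zero, the
log-likelihoods and fixed-effect estimates of the two fits should agree
up to numerical tolerance; if they differ noticeably, the term was not
really contributing nothing.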
More generally, in order to know whether this is OK you have to
define what "OK" means. Trying to avoid philosophical or subjective
statements, you could ask whether following this process gives 'good'
results (unbiased and/or low-error estimates and good coverage of
whichever set of parameters you're interested in). In particular, if
you're interested in inference on fixed effects only, then I'd say you
can do anything to the random effects component of the model as long as
it doesn't mess up your estimation and inference on the fixed effects.
You could try some simulations to test your idea. Note that your
conclusions can only hold for the range of parameters you've actually
simulated; in particular, Bates et al. (2015) criticize the realism of
the simulations from Barr et al. (2013), "keep it maximal":
"First, the simulations implement a factorial contrast that is
atypically large compared to what is found in natural data. Second, and
more importantly, the correlations in the random effects structure range
from −0.8 to +0.8. Such large correlation parameters are indicative of
overparameterization. They hardly ever represent true correlations in the
population. As a consequence, these simulations do not provide a
solid foundation for recommendations about how to fit
mixed-effects models to empirical data."
Bates, Douglas, Reinhold Kliegl, Shravan Vasishth, and Harald Baayen.
“Parsimonious Mixed Models.” ArXiv:1506.04967 [Stat], June 16, 2015.
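A minimal sketch of what such a simulation could look like, checking
coverage of a Wald interval for the c1 fixed effect when the random slope
for c2 is left out. Everything here is an assumption chosen for
illustration: the helper sim_once(), the design, the effect sizes and
variances, and the very small number of replicates (a real study would
need far more, over a range of parameter values):

```r
library(lme4)

## Hypothetical helper: simulate one data set, fit the reduced model,
## and report whether the 95% Wald interval for c1 covers the true value
sim_once <- function(n_subj = 30, n_rep = 8, beta1 = 0.5, sd_slope1 = 0.5) {
  d  <- expand.grid(subj = seq_len(n_subj), cond = 1:3, rep = seq_len(n_rep))
  cs <- contr.sum(3)
  d$c1 <- cs[d$cond, 1]
  d$c2 <- cs[d$cond, 2]

  ## Assumed truth: subject-level variation in intercept and c1 slope only
  b0 <- rnorm(n_subj, sd = 1)
  b1 <- rnorm(n_subj, sd = sd_slope1)
  d$y <- 1 + beta1 * d$c1 - 0.3 * d$c2 +
    b0[d$subj] + b1[d$subj] * d$c1 + rnorm(nrow(d))
  d$subj <- factor(d$subj)

  ## Reduced model: random slope for c1 but not for c2
  fit <- suppressMessages(suppressWarnings(
    lmer(y ~ c1 + c2 + (1 + c1 | subj), data = d)
  ))
  est <- unname(coef(summary(fit))["c1", c("Estimate", "Std. Error")])
  ci  <- est[1] + c(-1, 1) * qnorm(0.975) * est[2]
  ci[1] <= beta1 && beta1 <= ci[2]
}

set.seed(2021)
covered <- replicate(50, sim_once())
mean(covered)   # proportion of intervals covering the true c1 effect
```

To address the question being asked, you would also generate data with a
nonzero c2 slope variance and see how quickly coverage for c1 degrades
when that term is omitted.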
On 1/21/21 11:54 AM, Nathan Tardiff wrote:
> I have encountered an issue a couple times recently when fitting models in
> lme4 that I have not seen addressed in commonly cited papers for dealing w/
> boundary/singular fit issues.
> Say I have a categorical variable representing a set of within-subject
> contrasts, which is entered into the model as a set of effect coded
> variables, e.g.
> df$congruent.f <- factor(df$congruent, levels=c(1,0,-1))
> contrasts(df$congruent.f) <- contr.sum(3)
> which will produce two variables in the model (e.g. congruent.f1,
> congruent.f2). When I fit the model w/ random intercepts and slopes for
> these contrasts (along w/ other control variables), I get a boundary
> (singular) fit warning.
> Examining the correlations and variance components suggests that the
> primary cause of the warning is in one of these contrast variables (e.g.
> congruent.f2). So, would it ever be acceptable in this scenario to remove
> the random effect term ONLY for congruent.f2, not the entire set of
> congruent.f contrasts, where the goal is statistical inference and I do not
> want the p-values/confidence intervals for congruent.f1 to be
> anticonservative when it does in fact show variance across subjects?
> I have to this point assumed that this would be a bad idea and tried to
> simplify such models in other ways (i.e. setting correlations to zero or
> removing other random effects), but this does not always work and seems a
> roundabout method if you are not dealing with the primary problem.