[R-sig-ME] is a mixed effect model appropriate?

Henrik Singmann singmann at psychologie.uzh.ch
Fri Aug 11 12:23:39 CEST 2017

Whereas it is certainly possible to do so, the problem of estimating 
random-effects parameters for a grouping factor with very few levels 
(e.g., 3) is that usually the power for detecting fixed effects suffers 
quite considerably.

Jake Westfall and colleagues show this quite convincingly in the context 
of linear mixed models and models with multiple independent (i.e., 
crossed) random-effects grouping factors:

Obviously the situation here is different, but I would not be surprised 
if a similar problem holds. So one more vote for Phillip's comment from me.
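
A rough way to see why three levels carry so little information about a
variance is sketched below in base R. It is an illustration only, not part
of the analysis discussed in this thread. It assumes the site effects are
independent normal draws that are observed without error, which is more
favourable than any real mixed model; the names variance_interval, few and
many are hypothetical.

```r
# Sampling variability of a variance estimated from k group-level effects.
# Assumption (illustration only): the k site effects are independent draws
# from a normal distribution with true variance sigma2, observed exactly.
variance_interval <- function(k, sigma2 = 1, level = 0.95) {
  alpha <- 1 - level
  df <- k - 1
  # (k - 1) * s^2 / sigma2 follows a chi-squared distribution with k - 1 df,
  # so these are the central quantiles of the sample variance s^2.
  sigma2 * qchisq(c(alpha / 2, 1 - alpha / 2), df) / df
}

few  <- variance_interval(3)   # three sites, as in the survey
many <- variance_interval(30)  # hypothetical design with thirty sites

print(round(few, 3))
print(round(many, 3))
```

Under these assumptions, the central 95% range of the estimated variance
runs from roughly 0.03 to 3.7 times the true value with three sites, and
from roughly 0.55 to 1.6 times the true value with thirty.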


On 10.08.2017 at 00:06, Christian Ritz wrote:
> Dear Tamara,
> in my experience it works fine to fit a linear mixed model with lmer()
> in cases where there are only few levels of a random effect.
> Most of the time the estimated variance (component) (in your case the
> between-site variance) will become 0, most likely reflecting that
> there was very little information in the data (not enough sites) for
> estimation of this parameter.
> I would prefer this approach (including site as a random effect) to
> using a decision rule where the number of levels of the random effect
> determines whether or not a random effect is included in a model.
> Best wishes Christian
> On 09-08-2017 23:41, Alday, Phillip wrote:
>> With only three sites, you don't have enough levels to use site as a
>> grouping variable / random effect. Random effects are *variance*
>> components and it doesn't make too much sense to discuss variance with
>> only three group members.
>> You could include site as a fixed effect, as you're doing now; adding
>> interaction terms would largely address the independence issue. Note
>> however that the inference from fixed and random effects is slightly
>> different: with fixed effects, you get estimates for each level, but
>> for random effects you get an estimate of the variance between / due to
>> sites and, optionally, a prediction for individual sites. So the random
>> effect will tend to generalize better across all possible sites,
>> assuming that you sampled enough sites to begin with, while the fixed
>> effect will better model individual sites.
>> In your case, I would focus on including interaction terms before
>> modelling site. If you are able to do that, I would include site as a
>> fixed effect (too few levels as a random effect), but I suspect site
>> will correlate strongly with some of the other variables and so you
>> might have some issues with collinearity.
>> One final thing: you can fit (Gaussian) linear models with glm(), but
>> lm() will tend to be faster and offer some additional summary info. You
>> of course still need glm() for generalized variants such as logit, etc.
>> For lmer and glmer, the distinction is stricter -- you must use lmer()
>> for the (Gaussian) linear case and glmer() for the generalized case or
>> glmer() will complain.
>> Best,
>> Phillip
>> On Wed, 2017-08-09 at 16:26 -0300, Tamara R wrote:
>>> Hi, I'm working with survey data on leptospirosis knowledge,
>>> attitudes and practices among residents of three slum settlements, and
>>> I'm using socio-demographic indicators, knowledge score and attitude
>>> score as predictors of preventive practices score.
>>> I started analyzing my data as a linear model with both categorical
>>> and continuous predictors:
>>> glm(practices ~ site + sex + education + occupation + knowledge_score +
>>>     attitude_score)
>>> But when discussing the results with my PhD advisor, she suggested that
>>> I put site as a random effect in a linear mixed model because of the
>>> lack of independence between observations from the same site:
>>> lmer(practices ~ sex + education + occupation + knowledge_score +
>>>      attitude_score + (1 | site))
>>> The thing is that I have fewer than 100 observations and the variance
>>> of the random effect is estimated as 0. I read in a previous post on
>>> this group that this indicates the model could be simplified by
>>> removing the random effect, but I wish to know whether simplifying my
>>> model (going back to the original regression model) will be
>>> appropriate to model the lack of independence of the data, or should I
>>> also include random slopes for knowledge and attitude scores in the
>>> model? Thanks in advance
>>> Tamara Ricardo
>>> Lic. en Biodiversidad - Becaria CONICET
>>> FHUC - Universidad Nacional del Litoral
>>> Ciudad Universitaria - Pje. el Pozo
>>> Santa Fe (3000) - Argentina
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
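
Phillip's remark quoted above, that glm() and lm() fit the same Gaussian
linear model, can be checked with a small sketch on simulated data. The
data frame d and its values are made up for illustration and are not the
survey data; knowledge_score and attitude_score stand in for the knowledge
and attitude scores in the original question, and the other predictors are
left out to keep the example short.

```r
# Simulated stand-in for the survey: three sites, fewer than 100 rows.
set.seed(1)
n <- 90
d <- data.frame(
  site            = factor(rep(c("A", "B", "C"), each = n / 3)),
  knowledge_score = rnorm(n),
  attitude_score  = rnorm(n)
)
d$practices <- 1 + 0.5 * d$knowledge_score + 0.3 * d$attitude_score + rnorm(n)

# Site enters as a fixed effect in both fits.
fit_lm  <- lm(practices ~ site + knowledge_score + attitude_score, data = d)
# glm() uses family = gaussian (identity link) by default.
fit_glm <- glm(practices ~ site + knowledge_score + attitude_score, data = d)

# The two fits should agree on the coefficients up to numerical tolerance.
same_coefs <- isTRUE(all.equal(coef(fit_lm), coef(fit_glm)))
print(same_coefs)

# R-squared is one of the extra summaries that lm() reports and glm() does not.
print(summary(fit_lm)$r.squared)
```

Both calls estimate the same model here, so the choice between them for a
Gaussian response is mostly one of speed and of which summary output is
wanted.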
