[R-sig-ME] multiple variance structure in lmer giving zero variances

Sun Jun 3 21:15:54 CEST 2012

Sara Krause <skkrause at ...> writes:

> I'm hoping someone can help me out.  Forgive me if I my mistake is
> something simple.  I am new to mixed models, new to R, and new to lme4 and
> am struggling to figure everything out.  I have two questions that I am
> hoping someone can answer.
> 1) Am I using the correct random structure for my model?
> 2) Can someone help me figure out what is wrong with my syntax to code for
> random effect variance by treatment group?

  (I can understand your desperation, but please don't cross-post ...
I almost answered this on r-help with a suggestion that you try
on r-sig-mixed instead.)

> These questions somewhat go together but let me tackle number one first.  I
> was told by our statistician (who unfortunately doesn't know R well) that
> my model should include random effects for ID, ID*state, and ID*season.
> If my understanding of lme4 code is correct, my random structure would
> appear as it does in this model
> M1<-lmer (y~trt*season*state*site+(1|ID)+(1|state)+(1|season))
> 
 [snipped from below]:

> There are two levels in each of these factors.  Treatment is G or S
> (basically treated or control groups) and state is either
> Pre-treatment or Post-treatment.  My other fixed factors are a site
> factor (two locations) and a season factor (breeding or non-breeding
> season).  Is this correct?  (I am starting with the fully crossed
> fixed effects and I will use iterative model selection to find the
> optimal model after I make sure that I have the correct random
> terms)

   In general, categorical predictors should not appear in both
the fixed and random parts of the model -- that constitutes overfitting.

Depending on the structure of your data you should also strongly
consider including interactions between treatment and your random
factors (see Schielzeth, Holger, and Wolfgang
Forstmeier. 2009. “Conclusions Beyond Support: Overconfident Estimates
in Mixed Models.” Behavioral Ecology 20 (2):416–420.
doi:10.1093/beheco/arn145.
http://www.ncbi.nlm.nih.gov/pubmed/19461866.).  However, you can only
include treatment interactions that make sense. It sounds like each
individual is allocated to a single treatment, so (guessing that your
statistician means that you want the *interactions* of ID with state
and season) I think you should probably use

y~trt*season*state*site+(state|ID)+(0+season|ID)

which includes a random effect of state, a state:ID interaction
(variation of the state effect among individuals), and a season:ID
interaction (variation of seasonal effect among individuals).

  Why are you moving to lmer? is it so that you can handle the
separate effects of state:ID and season:ID ?  

> The second question is a bit more complicated and changes my random
> structure as well.  I had originally built my model in nlme  and used the
> multiple variance (varIdent) function to allow different variance for two
> of my terms (trt and state) in my nlme model because different levels in
> each term had different variances. 
> 
> I found a very helpful post (
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2007q3/000248.html) on
> how to also do this in lmer but it doesn't seem to be working correctly for
> me and I have not been able to figure out why. 

  I'm mostly going to postpone comment on your second question until
your first question is sorted out, although I would also encourage you
to see if your heteroscedasticity problem might be handled by transformation --
most typically, if treatment increases both the mean and variance a lot,
then log transforming the data might take care of the problem with less
hassle ...

> I am also trying to do this
> with 2 factors instead of one.
> The overall structure of my data is
> Structure of my dataset (d)
>   ..@ frame   :'data.frame': 173 obs. of  10 variables:
>   .. ..$ y     : int [1:173] 209 382 448 353 224 112 198 273 495 622 ...
>   .. ..$ trt   : Factor w/ 2 levels "G","S": 1 1 1 1 1 1 1 1 1 1 ...
>   .. ..$ season: Factor w/ 2 levels "Breeding","NonBreeding": 2 2 2 2 2 2 2
> 1 1 1 ...
>   .. ..$ state : Factor w/ 2 levels "Post","Pre": 1 1 1 1 1 1 1 1 1 1 ...
>   .. ..$ site  : Factor w/ 2 levels "Mrak","Orchard": 1 1 1 1 1 2 2 1 1 1
> ...
>   .. ..$ G     : num [1:173] 1 1 1 1 1 1 1 1 1 1 ...
>   .. ..$ ID    : Factor w/ 59 levels "100","103","105",..: 11 19 22 58 6 36
> 50 11 20 24 ...
>   .. ..$ S     : num [1:173] 0 0 0 0 0 0 0 0 0 0 ...
>   .. ..$ Pre   : num [1:173] 0 0 0 0 0 0 0 0 0 0 ...
>   .. ..$ Post  : num [1:173] 1 1 1 1 1 1 1 1 1 1 ...
> 

> m1<-lmer(y~trt*season*state*site+(0+G|ID)+(0+S|ID) +(0+Pre|ID) +
> (0+Post|ID)+(1|season), data=d)

  Your basic problem here is that it doesn't make sense to test the interaction
of treatment with ID (as you have effectively done here), because each
individual is only in one treatment category ...