[R-sig-ME] lmer and standard error

Mon Mar 30 22:44:45 CEST 2015

Alexandre Lafontaine <a_lafontaine at ...> writes:

> Dear R users,

> I have a question regarding how lmer (either lme4 or lmerTest)
> handles the degrees of freedom and calculation of the standard error
> for repeated observations.  I have a dataset in which I have multiple
> observations for 57 different idyears in two different regions
> (range).

  Your formatting got mangled; it's best to try to send to the
mailing list using the simplest format you have available (plain
text, monospace font).  (I'm further mangling it because I'm posting
via Gmane, which doesn't like lines > 80 characters)

> head(database)
>      idyear      range overlapok cut05 c0620 regen water roads  elevaju
> 36 GJ502006 charlevoix         1     0     0     0     0     0 223.8889
> 37 GJ502006 charlevoix         1     0   100     0     0     0 220.5829
> 38 GJ502006 charlevoix         1     0   100     0     0     0 219.4110
> 39 GJ502006 charlevoix         1     0   100     0     0     0 219.4110
> 40 GJ502006 charlevoix         1     0   100     0     0     0 219.4110
> 41 GJ502006 charlevoix         1     0   100     0     0     0 219.0555

> Here is my lmer formula, in which I nested idyear within range as
> random effects:

> fidint5 <- lmer(overlapok ~ natdist + cut05 + c0620 + regen +
>                 (1|range/idyear), data=database)

> summary(fidint5)

> The summary identifies the correct number of groups (57) for the 2
> ranges.  However, the error is computed on between 2404 and 2418 df,
> which gives really high t values and therefore extremely small
> p values.

> Random effects:
>  Groups       Name        Variance Std.Dev.
>  idyear:range (Intercept) 0.16211  0.4026
>  range        (Intercept) 0.00000  0.0000
>  Residual                 0.01709  0.1307
> Number of obs: 2429, groups:  idyear:range, 33; range, 2

   It doesn't make sense to use range as a random effect, since
there are only two levels (note that its variance is estimated as
exactly zero above).  It is most practical to treat it as a fixed
effect instead.
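
   A minimal sketch of that change, assuming simulated stand-in data
(the real 'database' is not available here; the simulated values, the
second range label "other" and the object name fit_fixed are
hypothetical, only the variable names come from the post).  range moves
into the fixed part of the formula and idyear remains the only grouping
factor:

```r
library(lme4)

set.seed(101)
## Simulated stand-in for 'database' (hypothetical values):
## 20 idyears with 30 observations each, 10 idyears per range
n_id  <- 20
n_per <- 30
n_obs <- n_id * n_per
database <- data.frame(
  idyear  = factor(rep(seq_len(n_id), each = n_per)),
  range   = factor(rep(c("charlevoix", "other"), each = n_obs / 2)),
  natdist = rnorm(n_obs),
  cut05   = rnorm(n_obs),
  c0620   = rnorm(n_obs),
  regen   = rnorm(n_obs)
)
## response = intercept + one covariate effect + idyear effect + noise
id_eff <- rnorm(n_id, sd = 0.4)
database$overlapok <- 0.5 + 0.1 * database$natdist +
  id_eff[as.integer(database$idyear)] + rnorm(n_obs, sd = 0.13)

## range as a fixed effect, idyear as the only random effect
fit_fixed <- lmer(overlapok ~ natdist + cut05 + c0620 + regen + range +
                    (1 | idyear), data = database)
print(VarCorr(fit_fixed))
```

   With the real data, only the last two statements would be needed.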

[snip]

> Are the groups specified in the random term considered in this
> result? Is the way I specified the random effects incorrect or is
> this the way lmer function is designed? I am really only beginning
> to use mixed models and would really appreciate any help on this.

  The plain old lme4 package gives no df, leaving it to you to 
work it out for yourself.

  lmerTest uses Satterthwaite approximations, which are generally
pretty good but might have failed you here.  You could try the
pbkrtest and/or afex packages to get Kenward-Roger approximations,
which are slower but more reliable (if they're very close to
the Satterthwaite results you could fall back on the Satterthwaite
approx for practical use rather than slowing yourself down all
the time).
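
   One way to make that comparison is sketched below.  It assumes
lmerTest and pbkrtest are both installed, and it uses the sleepstudy
data that ships with lme4 as a stand-in for the real data; the object
names are hypothetical.

```r
library(lmerTest)   # its lmer() adds df and p-values to summary()

## sleepstudy (from lme4) stands in for the real data set
fit <- lmer(Reaction ~ Days + (1 | Subject), data = lme4::sleepstudy)

## Satterthwaite is the lmerTest default;
## ddf = "Kenward-Roger" relies on the pbkrtest package
satt <- coef(summary(fit))
kr   <- coef(summary(fit, ddf = "Kenward-Roger"))

## side-by-side denominator df for each fixed effect
df_compare <- cbind(df_satt = satt[, "df"], df_kr = kr[, "df"])
print(df_compare)
```

   If the two columns agree closely for your own model, the faster
Satterthwaite version is probably good enough for routine use.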

  If your covariates (natdist + cut05 + c0620 + regen) 
vary within years, then this is more or less a randomized-block
design, in which case the df given will be about right.
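
   To illustrate why that matters, here is a small simulation sketch
(all names and values are hypothetical, and lmerTest is assumed to be
installed).  A covariate that varies within groups should get df close
to the number of observations, while one that is constant within
groups should get df close to the number of groups:

```r
library(lmerTest)

set.seed(1)
n_grp <- 30   # number of groups (playing the role of idyear)
n_per <- 40   # observations per group
d <- data.frame(g = factor(rep(seq_len(n_grp), each = n_per)))
gi <- as.integer(d$g)

d$x_within  <- rnorm(nrow(d))        # varies within each group
d$z_between <- rnorm(n_grp)[gi]      # constant within each group
grp_eff     <- rnorm(n_grp, sd = 1)  # random intercepts
d$y <- 1 + 0.5 * d$x_within + 0.5 * d$z_between +
  grp_eff[gi] + rnorm(nrow(d), sd = 0.5)

fit <- lmer(y ~ x_within + z_between + (1 | g), data = d)

## Satterthwaite denominator df for each fixed effect
dfs <- coef(summary(fit))[, "df"]
print(dfs)
```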

  By treating 'range' as a fixed effect, you won't be able to
make inferences beyond the two ranges considered -- but practically
speaking you wouldn't be able to extrapolate to other ranges if
you had only measured two in the first place ...

  Please try not to post in HTML ...