[R-sig-ME] effective degrees of freedom

Ben Bolker bbolker at gmail.com
Wed Apr 25 15:57:37 CEST 2012


Alan Haynes <aghaynes at ...> writes:

> This isn't so much an R question as a theoretical question, but this is
> probably the best place to find an answer.
> I'm attempting to calculate my effective sample size. Following Zuur et al.
> 2009, for models with a single random effect, this seems to be relatively
> simple:
> 
> Intraclass correlation, ICC:
> 
> ICC = d^2 / (d^2+sigma^2)
> 
> in R:
> ranmod <- summary(mod)@REmat

   I would strongly encourage you to use the accessor
method (VarCorr) provided for this, i.e.

vc <- VarCorr(mod)
v <- vc[[1]][1,1]

 or more explicitly

v.subj <- vc[["Subject"]]["(Intercept)","(Intercept)"]

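  i.e. pull out the variance-covariance matrix of the Subject-level
random effects and extract the intercept variance.  The residual
standard deviation is stored in the "sc" attribute, so squaring it
gives the residual variance: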

v.resid <- attr(vc,"sc")^2
ICC <- v.subj/(v.subj + v.resid)
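
As a minimal sketch of the indexing above, the following builds a mock object with the same shape as the VarCorr() result for a random-intercept model and runs the same extraction on it. The object and its numbers are hypothetical stand-ins chosen for illustration, not output from a fitted model.

```r
# Mock stand-in for VarCorr(mod): a list with one variance-covariance
# matrix per grouping factor, plus the residual SD in the "sc" attribute.
# (Hypothetical values, for illustration only.)
vc <- list(Subject = matrix(4, nrow = 1, ncol = 1,
                            dimnames = list("(Intercept)", "(Intercept)")))
attr(vc, "sc") <- 2                      # residual standard deviation

# Same extraction as above: among-subject variance and residual variance
v.subj  <- vc[["Subject"]]["(Intercept)", "(Intercept)"]
v.resid <- attr(vc, "sc")^2

# Intraclass correlation: share of total variance due to subjects
ICC <- v.subj / (v.subj + v.resid)       # 4 / (4 + 4) = 0.5
```

With a real fit, vc would come from VarCorr(mod) and the last three assignments would stay the same.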

> Using the random effects matrix from the summary.mer object we take the
> St.Dev value for the value of d and the residual St.Dev for sigma.
> 
> ICC is then used to calculate a "design effect" :
> 
> DE = 1 + (n-1) * ICC
> 
> in R:


> DE <- 1 + (length(ranef(mod)[,1])-1)*ICC

  Hmmm.  I don't know what version of lme4 you're using, but for
me 

n <- nrow(ranef(mod)[[1]])

works better (there's no real difference between nrow(x) and length(x[,1]),
but the [[1]] is necessary to pull out the first element of the *list*
of random effects)
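
To illustrate why the [[1]] matters, here is a small sketch using a mock list in place of ranef(mod). The list, its name "Subject", and its values are assumptions made up for the example; only the list-of-data-frames shape is meant to match.

```r
# Mock stand-in for ranef(mod): a named list holding one data frame
# per grouping factor. (Hypothetical data, 18 levels.)
re <- list(Subject = data.frame(`(Intercept)` = seq(-1, 1, length.out = 18),
                                check.names = FALSE))

# Indexing the list itself like a matrix fails ...
res <- tryCatch(re[, 1], error = function(e) e)
inherits(res, "error")        # TRUE

# ... so extract the data frame for the grouping factor first,
# either by position or by name, and then count its rows.
n  <- nrow(re[[1]])
n2 <- nrow(re[["Subject"]])
```

Extracting by name is a little safer than by position once a model has more than one grouping factor.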
  
  
> 
> where n is the number of levels for the random effect. This is then used to
> calculate the effective sample size, N_effective:
> 
> N_effective = (N * n) / DE
> 
> in R:
> N_effective <- N*length(ranef(mod)[,1]) / DE
> (I haven't come up with a good way to find N...)

  Well, this is a balanced design, so N*n is just the total number of
observations.  nobs(mod) **should** work but doesn't (in the stable
version of lme4), but nrow(model.frame(mod)) does ...
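
For concreteness, a worked sketch of the arithmetic for a balanced design, with made-up numbers. It assumes the usual form of the design effect, in which the count entering 1 + (count - 1)*ICC is the number of observations per group; the roles of n and N are easy to swap, so check them against the definitions in your source. All variable names here are hypothetical.

```r
# Illustrative inputs (not from any fitted model)
ICC        <- 0.25   # intraclass correlation
n.groups   <- 10     # number of levels of the random effect
group.size <- 5      # observations per level (balanced design)

# Total number of observations; with a fitted model this would be
# nrow(model.frame(mod))
n.total <- n.groups * group.size          # 50

# Design effect, assuming the count is the within-group sample size
DE <- 1 + (group.size - 1) * ICC          # 1 + 4 * 0.25 = 2

# Effective sample size: total observations deflated by the design effect
N.effective <- n.total / DE               # 50 / 2 = 25
```

At the extremes the result behaves as expected: with ICC = 0 the effective sample size equals the total number of observations, and with ICC = 1 it equals the number of groups.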
> 
> where N is the number of samples per level of the random effect.
> 
> I was wondering whether anyone had any idea how to calculate this for a
> model with 2 random effects. I haven't been able to find any suggestions
> beyond this single random effect. I assume multiple random effects affect
> d, N and n.

  Well, the formulas that Zuur et al. give are from Snijders and Bosker 1999
(Multilevel Analysis: An Introduction to Basic and Advanced Multilevel
Modeling); I suspect you'd have to go there for the case of more than
one random effect.

You can get a glance here:

http://tinyurl.com/snijdersboskerDE

However, this is likely to be a big can of worms.  If your two random
effects are nested, then the design effect should probably be calculated
according to the relevant level for whichever effect you are testing
(e.g. in a split-plot design, different effects are tested at different
levels).  If your two random effects are crossed, there's probably
no good answer.

> Perhaps more relevant, is it even worth worrying about effective sample
> size for use as a denominator DF when calculating P-Values?

   See http://glmm.wikidot.com/faq for some discussion of denominator
df and p-values ...

  Ben Bolker


