[R-sig-ME] Sample size and mixed models

Fri Dec 12 23:24:03 CET 2008

Hi, Ben: 

      In your example with 10091 observations in 444 groups, I think 
the proper "n" depends on the variance components:  If the between-group 
variance dominates and the within-group variance is negligible, then you 
have only 444 "observations".  If the between-group variance is 
negligible relative to the within-group variance, your 10091 observations 
are practically independent, so you should use that number. 
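
      One rough way to see how the variance components move the 
effective "n" between these two extremes is the design-effect 
approximation n_eff = N / (1 + (m - 1)*ICC), where m is the mean group 
size and ICC is the between-group share of the total variance.  This is 
a standard approximation that assumes roughly balanced groups; it is an 
illustration of the argument above, not a formula from the lmer output, 
and the function name and variance values are hypothetical. 

```python
def effective_sample_size(n_obs, n_groups, var_between, var_within):
    """Design-effect approximation to the effective sample size.

    Assumes roughly equal group sizes (an assumption, not something
    lmer reports); var_between and var_within are the two variance
    components of a random-intercept model.
    """
    # Intraclass correlation: share of total variance due to groups.
    icc = var_between / (var_between + var_within)
    mean_group_size = n_obs / n_groups
    # Design effect: variance inflation caused by clustering.
    design_effect = 1.0 + (mean_group_size - 1.0) * icc
    return n_obs / design_effect


# Hypothetical variance components, for illustration only.
for vb, vw in [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]:
    print(vb, vw, effective_sample_size(10091, 444, vb, vw))
```

When the within-group variance is zero the result collapses to the 
number of groups, and when the between-group variance is zero it equals 
the number of observations. 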

      I think the formula you want to use is as follows: 

           IC = (-2)*log(likelihood at MLE) + 2*bias,

where IC = "Information Criterion", discussed by Sadanori Konishi and 
Genshiro Kitagawa (2007) Information Criteria and Statistical Modeling 
(Springer);  this formula appears on p. 55. 

      Later, they derive the following formula for the bias (p. 59): 

           bias = trace(solve(observed information,
                              covariance of the score function)).

      The book develops many variations on this theme, including the 
traditional AIC (sec. 3.4.4), BIC (ch. 9), AICc (p. 191), and a 
Bootstrap Information Criterion (ch. 8). 
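
      To show how much the choice of "n" can matter for the question 
asked, here is a small sketch using the commonly used form of the 
small-sample correction, AICc = AIC + 2k(k+1)/(n - k - 1).  I am 
assuming this common form rather than quoting the book's exact 
expression, and the log-likelihood and parameter count are made-up 
values for illustration. 

```python
def aicc(loglik, k, n):
    """Small-sample corrected AIC in its commonly used form.

    loglik: maximized log-likelihood; k: number of estimated
    parameters; n: the sample size being assumed (must exceed k + 1).
    """
    if n - k - 1 <= 0:
        raise ValueError("n must be larger than k + 1")
    aic = -2.0 * loglik + 2.0 * k
    return aic + 2.0 * k * (k + 1) / (n - k - 1)


# Hypothetical fit: the same model scored under the two candidate n's.
loglik, k = -5000.0, 6
print("n = 444:  ", aicc(loglik, k, 444))
print("n = 10091:", aicc(loglik, k, 10091))
```

The correction term shrinks as n grows, so using 444 rather than 10091 
penalizes additional parameters more heavily. 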

      Hope this helps.
      Spencer     

Ben Zuckerberg wrote:
> A very quick (and possibly silly) question for mixed modelers.  
> Certain metrics such as Nagelkerke's R2 and the sample size adjusted 
> AICc require the user to specify the sample size.  What is the 
> appropriate sample size to use in a mixed model where you might have 
> hundreds of repeat samples on a smaller sample of sites (in this case, 
> the sites are treated as the random factor)?  In my case, the lmer 
> output will produce the following information: Number of obs: 10091, 
> groups: ID, 444.  For calculating sample size adjusted statistics, 
> would you use an effective sample size of 444?  Thank you.
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models