# [R-sig-ME] Sample size and mixed models

Spencer Graves spencer.graves at pdf.com
Fri Dec 12 23:24:03 CET 2008

```
Hi, Ben:

In your example with 10091 observations in 444 groups, I think
the proper "n" depends on the variance components:  If the between-group
variance dominates and the within-group variance is negligible, then you
have only 444 "observations".  If the between-group variance is
negligible relative to the within-group variance, your 10091 observations
are practically independent, so you should use that number.

I think the formula you want to use is as follows:

IC = (-2)*log(likelihood at MLE) + 2*bias,

where IC = "Information Criterion", discussed by Sadanori Konishi and
Genshiro Kitagawa (2007) Information Criteria and Statistical Modeling
(Springer);  this formula appears on p. 55.

Later, they derive the following formula for the bias (p. 59):

bias = trace(solve(observed information, covariance of the
score function)).

The book develops many variations on this theme, including the
traditional AIC (sec. 3.4.4), BIC (ch. 9), AICc (p. 191), and a
Bootstrap Information Criterion (ch. 8).

Hope this helps.
Spencer

Ben Zuckerberg wrote:
> A very quick (and possibly silly) question for mixed modelers.
> Certain metrics such as Nagelkerke's R2 and the sample size adjusted
> AICc require the user to specify the sample size.  What is the
> appropriate sample size to use in a mixed model where you might have
> hundreds of repeat samples on a smaller sample of sites (in this case,
> the sites are treated as the random factor)?  In my case, the lmer
> output will produce the following information: Number of obs: 10091,
> groups: ID, 444.  For calculating sample size adjusted statistics,
> would you use an effective sample size of 444?  Thank you.
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

```
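The two limiting cases in Spencer's reply (444 effective observations when the between-group variance dominates, 10091 when it is negligible) can be illustrated with a rough sketch. The interpolation used here is the survey-sampling "design effect" approximation, n_eff = N / (1 + (m - 1) * ICC), which does not appear in the thread and is only an assumption for illustration. It treats the groups as roughly equal in size (m = N / number of groups) and takes ICC to be the share of total variance that lies between groups. The function name `effective_sample_size` is hypothetical.

```python
def effective_sample_size(n_obs, n_groups, icc):
    """Approximate effective sample size under a design-effect correction.

    Assumptions (not from the original post): groups are treated as equal
    in size, and `icc` is the intraclass correlation, i.e. between-group
    variance divided by (between-group + within-group variance).
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    mean_group_size = n_obs / n_groups
    design_effect = 1.0 + (mean_group_size - 1.0) * icc
    return n_obs / design_effect


# Figures quoted from the lmer output in the thread.
N_OBS, N_GROUPS = 10091, 444

for icc in (0.0, 0.05, 0.5, 1.0):
    n_eff = effective_sample_size(N_OBS, N_GROUPS, icc)
    print(f"ICC = {icc:4.2f}  ->  effective n is roughly {n_eff:8.1f}")
```

With ICC = 0 the sketch returns all 10091 observations, and with ICC = 1 it returns the 444 groups, matching the two extremes described in the reply. Values in between depend on the fitted variance components, so this is a heuristic for intuition rather than a substitute for the bias term in the information criterion.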