[R-sig-ME] Too small a sample size for lmer?
Martin Maechler
maechler at stat.math.ethz.ch
Sat Jul 18 16:59:00 CEST 2009
>>>>> "CG" == Christine Griffiths <Christine.Griffiths at bristol.ac.uk>
>>>>> on Sat, 18 Jul 2009 13:58:36 +0100 writes:
CG> Dear R users,
CG> Many of you may be familiar with my design as I have posted a number of
CG> queries before. Having consulted with someone in my department about
CG> estimating bias corrected confidence intervals for small sample sizes
CG> (rather than MCMC which Baayen et al. 2008 suggest should not be used),
CG> they implied that I should not be using lmer for such a small sample size
CG> as lmer was designed to deal with very large datasets. Is this still the
CG> case? If so what is regarded as a small sample size?
The fact that it was designed *to be able* to deal with big data
sets does not mean that it is not appropriate for small data
sets as well.
It's just that mixed effects models with large data sets and
crossed random effects currently really can *only* be
analyzed with lmer {no other software is available, not even if you
pay a lot}.
Having said all that, I think your situation looks like a case where I
would want to use a (probably parametric) bootstrap,
and interestingly enough, at the UseR! 2009 meeting in Rennes,
10 days ago, there was a nice talk on this topic:
Jose A. Sanchez-Espigares, Jordi Ocaña
An R implementation of bootstrap procedures for mixed models
You can find the abstract *and* slides on
http://www.agrocampus-ouest.fr/math/useR-2009/abstracts/user_author.html
I don't think that their R code is already publicly available,
but I've CC'ed one of the authors, and they may be willing to
let you use their code before release.
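To make the idea concrete, here is a minimal sketch of what a parametric bootstrap does. It is written in Python rather than R, and it assumes a much simpler model than the one in the question: a single random intercept per group (say, per enclosure), fitted by the one-way ANOVA method of moments instead of lmer's ML/REML. It is not the Sanchez-Espigares and Ocaña code, and every function name in it is hypothetical. The steps are: fit the model, simulate new responses from the fitted model, refit each simulated data set, and read a percentile interval off the refitted estimates.

```python
import random
import statistics


def fit_random_intercept(groups):
    """Method-of-moments fit of y_ij = mu + b_i + e_ij (unbalanced groups allowed).

    Returns (mu_hat, var_group_hat, var_resid_hat). Assumes at least two
    groups and more observations than groups.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [statistics.fmean(g) for g in groups]
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ms_within = ss_within / (n_total - k)
    ms_between = ss_between / (k - 1)
    # Effective group size for unbalanced data.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    # Truncate at zero: the moment estimator can go negative in small samples.
    var_group = max(0.0, (ms_between - ms_within) / n0)
    return grand_mean, var_group, ms_within


def simulate(sizes, mu, var_group, var_resid, rng):
    """Draw one data set from the fitted model, keeping the group sizes."""
    data = []
    for n in sizes:
        b = rng.gauss(0.0, var_group ** 0.5)  # group-level random intercept
        data.append([mu + b + rng.gauss(0.0, var_resid ** 0.5) for _ in range(n)])
    return data


def parametric_bootstrap_ci(groups, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for mu under the fitted model."""
    rng = random.Random(seed)
    mu, var_group, var_resid = fit_random_intercept(groups)
    sizes = [len(g) for g in groups]
    boot = sorted(
        fit_random_intercept(simulate(sizes, mu, var_group, var_resid, rng))[0]
        for _ in range(n_boot)
    )
    alpha = (1.0 - level) / 2.0
    lower = boot[int(alpha * n_boot)]
    upper = boot[int((1.0 - alpha) * n_boot) - 1]
    return mu, (lower, upper)


if __name__ == "__main__":
    # Toy data simulated here purely for illustration; not the data from the post.
    toy = simulate([10, 10, 9, 10, 10, 8], 20.0, 4.0, 1.0, random.Random(0))
    print(parametric_bootstrap_ci(toy))
```

With a real mixed model, the fitting step would be the lmer call itself and the quantity of interest could be any fixed effect or variance component. The simulate, refit, and summarize loop stays the same.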
Martin Maechler, ETH Zurich
CG> Below is a description of my data. I have 5/6 enclosures (replicates) per
CG> treatment - Aldabra/Radiata/control. Aldabra and radiata refer to two
CG> different tortoise species, while control lacks tortoises. The enclosures
CG> were assigned to a block: a block containing each of the 3 treatments, i.e.
CG> 6 blocks in total. Each month for ten months I collected data: a repeated
CG> crossed design. Unfortunately, I have non-orthogonal, unbalanced data (5/6
CG> enclosures per treatment) as I cannot use a replicate within the aldabra
CG> and radiata treatments. These are however from different blocks so I am
CG> reluctant to axe them to achieve balanced data as this would leave me only
CG> 4 blocks. I measured various attributes which I think that tortoises would
CG> have an impact on, e.g. plant count, species richness. Because my data are
CG> unbalanced and come from a repeated measures design, I chose lmer to model
CG> them.
CG> For one other aspect, I calculate food web properties, for which I have no
CG> replication, i.e. only one observation per treatment per month. Would lmer
CG> be an acceptable way to analyse this data?
CG> If lmer is not advised for the analysis of these data, what other analysis
CG> techniques should I investigate?
CG> Baayen et al. (2008). Mixed-effects modeling with crossed random effects
CG> for subjects and items. Journal of Memory and Language, 59, 390-412.
CG> Many thanks,
CG> Christine