[R-sig-ME] Too small a sample size for lmer?

Sat Jul 18 19:26:25 CEST 2009

Sorry, I meant mcmcsamp. Page 398, second column, after the table of 
estimated coefficients.

Yes, I have had a lot of problems modelling these data, so I was 
wondering whether there is a better way to analyse them while keeping the 
repeated-measures design.

Martin, thanks for the guidance. I have had a look at the link you 
suggested and will wait and see whether code is released to do this.

Thank you
Christine

--On 18 July 2009 13:18 -0400 Ben Bolker <bolker at ufl.edu> wrote:

>
>   Why do Baayen et al 2008 recommend against MCMC?  Do you mean mcmcsamp
>  (which may or may not be unreliable in this incarnation, I don't know)
> or MCMC in general?  I tried to find it in the paper -- do you mean for
> variance parameters (where the zero component gets in the way)?
>
>   Your response variables are also interesting -- unless both plant
> count and species richness are large numbers, they'll probably have
> non-normal distributions, which adds a complication (it is possible,
> but not very easy, to deal with overdispersed [negative
> binomial / log-normal-Poisson / quasi-Poisson] count data in glmer, and
> species richness often has quite an odd distribution depending on the
> characteristics of the "regional species pool" ...)
>
>   Ben Bolker
>
>
> Martin Maechler wrote:
>>>>>>> "CG" == Christine Griffiths <Christine.Griffiths at bristol.ac.uk>
>>>>>>>     on Sat, 18 Jul 2009 13:58:36 +0100 writes:
>>
>>     CG> Dear R users,
>>     CG> Many of you may be familiar with my design as I have posted a
>>     CG> number of queries before. Having consulted with someone in my
>>     CG> department about estimating bias corrected confidence intervals
>>     CG> for small sample sizes (rather than MCMC which Baayen et al.
>>     CG> 2008 suggest should not be used), they implied that I should
>>     CG> not be using lmer for such a small sample size as lmer was
>>     CG> designed to deal with very large datasets. Is this still the
>>     CG> case? If so what is regarded as a small sample size?
>>
>> The fact that it was designed *to be able* to deal with big data
>> sets does not mean that it was not appropriate for small data
>> sets as well.
>> It's just that mixed effect models with large data sets and
>> crossed random effects can currently *only* be
>> analyzed with lmer {no other software available, not even if you
>> pay much}.
>>
>> Having said all that, I think your situation looks like a case where I
>> would want to use a (probably parametric) bootstrap,
>> and interestingly enough, at the UseR! 2009 meeting in Rennes,
>> 10 days ago, there was a nice talk on this topic:
>>
>>    Jose A. Sanchez-Espigares, Jordi Ocaña 	
>>    An R implementation of bootstrap procedures for mixed models
>>
>> You can find the abstract *and* slides on
>>   http://www.agrocampus-ouest.fr/math/useR-2009/abstracts/user_author.html
>>
>> I don't think that their R code is publicly available yet,
>> but I've CC'ed one of the authors, and they may be willing to
>> let you use their code before release.
>>
>> Martin Maechler, ETH Zurich
>>
>>     CG> Below is a description of my data. I have 5/6 enclosures
>>     CG> (replicates) per treatment - Aldabra/Radiata/control. Aldabra
>>     CG> and radiata refer to two different tortoise species, while
>>     CG> control lacks tortoises. The enclosures were assigned to a
>>     CG> block: a block containing each of the 3 treatments, i.e. 6
>>     CG> blocks in total. Each month for ten months I collected data: a
>>     CG> repeated crossed design. Unfortunately, I have non-orthogonal,
>>     CG> unbalanced data (5/6 enclosures per treatment) as I cannot use
>>     CG> a replicate within the aldabra and radiata treatments. These
>>     CG> are however from different blocks so I am reluctant to axe them
>>     CG> to achieve balanced data as this would leave me only 4 blocks.
>>     CG> I measured various attributes which I think that tortoises would
>>     CG> have an impact on, e.g. plant count, species richness. Because
>>     CG> my data is unbalanced and a repeated measures design I had
>>     CG> chosen lmer to best model this.
>>
>>     CG> For one other aspect, I calculate food web properties, for which
>>     CG> I have no replication, i.e. only one observation per treatment
>>     CG> per month. Would lmer be an acceptable way to analyse this data?
>>
>>     CG> If lmer is not advised for the analysis of these data, what
>>     CG> other analysis techniques should I investigate?
>>
>>     CG> Baayen et al. (2008) Mixed-effects modeling with crossed random
>>     CG> effects for subjects and items. Journal of Memory and Language,
>>     CG> 59, 390-412.
>>
>>     CG> Many thanks,
>>     CG> Christine
>>
>
>
> --
> Ben Bolker
> Associate professor, Biology Dep't, Univ. of Florida
> bolker at ufl.edu / www.zoology.ufl.edu/bolker
> GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc
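
The parametric bootstrap suggested in Martin Maechler's reply can be
sketched roughly as follows. This is only an illustration, not the code
from the UseR! 2009 talk: it assumes a more recent lme4 release that
provides bootMer() (not available when this thread was written), and the
data frame d, the response y, the effect sizes and the model formula are
all made up to loosely mimic the design described above (6 blocks, 3
treatments, 10 monthly visits, one aldabra and one radiata enclosure
missing).

```r
library(lme4)

set.seed(42)

# Hypothetical data: 6 blocks x 3 treatments, measured monthly for 10 months
d <- expand.grid(
  block     = factor(1:6),
  treatment = factor(c("control", "aldabra", "radiata"),
                     levels = c("control", "aldabra", "radiata")),
  month     = 1:10
)
d$enclosure <- interaction(d$block, d$treatment, drop = TRUE)

# Drop one aldabra and one radiata enclosure (from different blocks)
# so that the design is unbalanced: 16 enclosures remain
d <- droplevels(subset(d, !(enclosure %in% c("1.aldabra", "2.radiata"))))

# Simulated response (all effect sizes are invented for illustration)
block_eff <- rnorm(nlevels(d$block), sd = 0.5)
encl_eff  <- rnorm(nlevels(d$enclosure), sd = 1)
d$y <- 10 + 1.5 * (d$treatment == "aldabra") +
  block_eff[as.integer(d$block)] +
  encl_eff[as.integer(d$enclosure)] +
  rnorm(nrow(d), sd = 1)

# Random intercepts for block and for enclosure (the repeated-measures unit)
fit <- lmer(y ~ treatment + (1 | block) + (1 | enclosure), data = d)

# Parametric bootstrap: simulate new responses from the fitted model,
# refit, and keep the fixed-effect estimates from each refit
boot <- bootMer(fit, FUN = fixef, nsim = 200, type = "parametric", seed = 101)

# Simple percentile intervals for the fixed effects
ci <- t(apply(boot$t, 2, quantile, probs = c(0.025, 0.975), na.rm = TRUE))
print(ci)
```

With so few blocks some refits may report a singular fit; that is itself
a hint about how little information the data carry on the variance
components. nsim = 200 keeps the sketch fast; a real analysis would
normally use more replicates.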