[R-sig-ME] unbalanced data in nested lmer model

Tue Mar 30 11:40:11 CEST 2010

Dear all,
thank you for the very helpful comments.
I understand now that my data do not have enough power to separate 
regional and farm effects. It would be possible to leave both random 
effects in the model, as they are features of the study's data structure, 
and then concentrate on the fixed effects for interpretation.

Regards,
Jana

Ben Bolker wrote:
> Luca Borger wrote:
> 
> [Jana Bürger:]
>>> Moreover I don't understand your argument that fitting random effects with 
>>> less than 5 levels was dodgy, as often examples in the books have 3 
>>> samples from one beach, or 3 laboratory workers within one laboratory. 
>>> These are less than 5 levels, are they not?
>> These are usually toy datasets to exemplify how the approach works; I do not 
>> think they make a claim that the resulting variance estimates are very 
>> reliable (I think the Zuur et al. mixed effects book has more 
>> realistic examples, if I remember well). Plus, "level" refers to the number 
>> of beaches or the number of labs etc., not to the samples within them. If 
>> there are fewer than, say, 5 levels, you might be better off fitting the 
>> factor as a fixed effect and not trying to decompose the variance into 
>> between labs and within labs etc. Anyway, just my 2 cents and hope I 
>> explained this correctly... 
>>
>> See also the wiki page set up by Ben Bolker:
>> http://glmm.wikidot.com/faq
>>
>> e.g. you might be interested in this entry therein:
>>
>> Zero or very small random effects variance estimates;
>> (...)
>> Very small variance estimates, or very large correlation estimates, often 
>> indicate unidentifiability/lack of data (either due to exact 
>> unidentifiability [e.g. designs that are not replicated at an important level] 
>> or weak identifiability [designs that would be workable with more data of the 
>> same type])
> 
>   I just added this to the FAQ:
> 
> Should I treat factor xxx as fixed or random?
> 
> This is in general a far more difficult question than it seems on the
> surface. There are many competing philosophies and definitions (see
> Gelman 2xxx). One point of particular relevance to 'modern' mixed model
> estimation (rather than 'classical' method-of-moments estimation) is
> that, for practical purposes, there must be a reasonable number of
> random-effects levels (e.g. blocks) — more than 5 or 6 at a minimum.
> 
>     e.g., from Crawley (2002) p. 670: "Are there enough levels of the
> factor in the data on which to base an estimate of the variance of the
> population of effects? No, means [you should probably treat the variable
> as] fixed effects."
> 
> Some researchers (who treat fixed vs random as a philosophical rather
> than a pragmatic decision) object to this approach.
> 
> Treating factors with small numbers of levels as random will in the best
> case lead to very small estimates of random effects; in the worst case
> it will lead to various numerical difficulties such as lack of
> convergence, zero variance estimates, etc. In the classical
> method-of-moments approach these problems do not arise (because the sums
> of squares are always well defined as long as there are at least two
> units), but the underlying problems of lack of power are there nevertheless.
> 
>    (Contributions welcome!)
> 
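
A rough way to see the point about the number of levels is to simulate it. The sketch below is not an lmer fit: it uses the classical method-of-moments (ANOVA) estimator for a balanced one-way design, and every name and setting in it (`anova_between_variance`, `simulate`, 5 observations per level, true between- and within-level standard deviations of 1) is an assumption chosen for illustration only. It compares how much the between-level variance estimate scatters with 3 levels versus 30 levels.

```python
import random
import statistics


def anova_between_variance(groups):
    """Method-of-moments estimate of the between-group variance (balanced design)."""
    k = len(groups)        # number of levels (e.g. farms)
    n = len(groups[0])     # observations per level
    group_means = [statistics.fmean(g) for g in groups]
    grand_mean = statistics.fmean(group_means)
    # Mean square between levels and mean square within levels
    msb = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    msw = sum((y - m) ** 2
              for g, m in zip(groups, group_means)
              for y in g) / (k * (n - 1))
    # E[MSB] = n * var_between + var_within and E[MSW] = var_within
    return (msb - msw) / n


def simulate(k, n=5, sd_between=1.0, sd_within=1.0, reps=2000, seed=1):
    """Simulate `reps` data sets with k levels and return the variance estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        groups = []
        for _ in range(k):
            effect = rng.gauss(0.0, sd_between)   # random effect of this level
            groups.append([effect + rng.gauss(0.0, sd_within) for _ in range(n)])
        estimates.append(anova_between_variance(groups))
    return estimates


few = simulate(k=3)
many = simulate(k=30)
for label, est in (("3 levels", few), ("30 levels", many)):
    share_nonpositive = sum(e <= 0 for e in est) / len(est)
    print(label,
          "| sd of estimates:", round(statistics.stdev(est), 2),
          "| share of estimates <= 0:", round(share_nonpositive, 3))
```

With only 3 levels the estimates scatter far more widely around the true value, and a noticeable share come out at or below zero, which a likelihood-based fit would typically report as a zero variance estimate. The sums of squares are always computable, but the lack of information is the same.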

-- 
Jana Bürger

Universität Rostock
Agrar- und Umweltwissenschaftliche Fakultät
FG Phytomedizin
Satower Straße 48
18059 Rostock

Tel. 0381-498 31 71
Fax. 0381-498 31 62