[R-sig-eco] which factor to nest?

Mon Jan 26 02:43:09 CET 2009

Thanks Ben for a speedy response...
I agree that a GLMM is probably more prudent and will investigate that 
idea further. My guard is against making the analysis too complicated...

For interest and discussion: In the minutes between my post and the 
response I went 'way back' to consider an even simpler design (if it 
works with unbalanced data). In essence, the beekeepers are blocking 
factors: the treatments were applied at random to colonies within these 
blocks, and the beekeeper is not of any interest, just the parasite 
numbers of the colonies. In fundamental design terms, the randomized 
block design appears to be a viable option. However, there are likely 
issues with the count data (which I will investigate, as I have not seen 
the data themselves, but I do have access to similar data that do, in 
fact, have lots of zeros). My 
'traditional'

thanks,
trevor
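
P.S. For concreteness, here is a rough sketch of the randomized block idea 
in base R. Everything in it is an assumption for illustration: the data are 
simulated, and the names hives, n_hives, parasites and logparasites are 
made up (only Beekeeper and antiG follow the description of the study). 
Beekeeper enters as a fixed block term, and because the design is 
unbalanced the treatment is tested by comparing nested models rather than 
by relying on a single sums-of-squares table.

```r
# Simulated, hypothetical data: 7 beekeepers (blocks) with unequal hive numbers.
set.seed(1)
n_hives <- c(4, 6, 5, 8, 3, 7, 5)   # hives per beekeeper (unbalanced)
hives <- data.frame(
  Beekeeper = factor(rep(paste0("B", 1:7), times = n_hives))
)

# Assign treatments at random to hives within each beekeeper.
hives$antiG <- factor(
  unlist(lapply(n_hives, function(n) {
    sample(rep(c("control", "antiG"), length.out = n))
  })),
  levels = c("control", "antiG")
)

# Made-up counts: a beekeeper (block) effect plus a treatment effect.
block_effect <- rnorm(7, sd = 0.5)[as.integer(hives$Beekeeper)]
mu <- exp(3 + block_effect - 0.7 * (hives$antiG == "antiG"))
hives$parasites <- rpois(nrow(hives), mu)
hives$logparasites <- log1p(hives$parasites)   # log(1 + x) copes with zeros

# Randomized block analysis: block term first, then treatment.
fit_block <- lm(logparasites ~ Beekeeper + antiG, data = hives)
fit_null  <- update(fit_block, . ~ . - antiG)

# F-test for antiG after adjusting for Beekeeper.
cmp <- anova(fit_null, fit_block)
print(cmp)
```

This treats Beekeeper as a fixed block; whether that is preferable to a 
random effect is exactly the question under discussion.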

Ben Bolker wrote:
>   My two cents:
>
>   * a GLMM if parasite numbers are small enough to
> have to deal with them as count data (e.g. lots of zeros).
> Otherwise (if you're lucky, as GLMMs are harder) most
> likely a lognormal -- log-transform data or log(1+x) if
> there are some zeros, and treat as a LMM (nlme or lmer).
>
>   * "Nesting" is more or less a red herring here, only
> really has to do with multiple *random* factors (and
> then more to do with the coding of the random factors
> than with fundamental experimental design distinctions).
>
>   * so: antiG vs control is fixed, Beekeeper is probably
> best treated as random (7 units is enough to make a
> random effect plausible: if you had only 2 or 3 you
> would probably have to treat as a fixed effect to
> make progress)
>
>   * because unbalanced (and possibly GLMM), aov/sums
> of squares approaches are probably not viable
>
>   * fairly straightforward with nlme, something like
>
> lme(logparasites ~ antiG, random = ~1|Beekeeper)
>
> or lme4:
>
> lmer(logparasites ~ antiG + (1|Beekeeper))
>
> or (for a GLMM, fitted to the untransformed counts, here
> called parasites, rather than to the log-transformed values)
>
> glmer(parasites ~ antiG + (1|Beekeeper), family=poisson)
>
>  * Two more things to watch out for:
>
>    - lme (nlme package) will give you p-values, lmer (lme4 package)
> will not
>    - if you end up fitting a GLMM you should definitely
> worry about/check for overdispersion
>
>   Ben Bolker
>
>
> tavery wrote:
>   
>> Hi all,
>> Maybe an expert in this particular design could provide insights into an 
>> interesting question (or possibly just a derailed view). It is possibly 
>> outside of the R world, but it has to be sorted out before R code can be 
>> generated - which should be trivial...
>>
>> - 7 beekeepers each with several hives
>> - some hives treated with antiG, others left as controls
>> - unbalanced design (not an equal number of treated or control sites 
>> among or within beekeepers)
>> - measured parasite numbers (average per hive)
>> Q: want to know if antiG reduces parasite load
>>
>> The initial reaction (from a student) was to consider Beekeeper as a 
>> random factor (although it could be considered fixed), and nest 
>> Treatment (antiG or control) within Beekeeper. This design is intuitive 
>> as Beekeepers are 'groups' and hives are 'subgroups' to which treatments 
>> are applied. Upon some investigation, it appears that the model could be 
>> flipped i.e. consider Treatment as a fixed factor and nest Beekeeper 
>> within Treatment. In this latter case, each Beekeeper would be 
>> represented in each Treatment and a crossed design results i.e. not 
>> nested at all. Various texts appear to 'arbitrarily' designate factors 
>> in similar models (see Zar on drug/drugstore example).
>>
>> a) What design is correct?
>> b) What am I missing in the way of determining groups and the ultimate design?
>>
>> thanks in advance,
>> trevor
>> biology department
>> acadia
>>
>> _______________________________________________
>> R-sig-ecology mailing list
>> R-sig-ecology at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>     
>
>
>
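
A minimal sketch of the models suggested in the reply above, run on 
simulated data. The data frame d and the column parasites are hypothetical 
names; logparasites, antiG and Beekeeper follow the thread. The Poisson 
model is fitted to the raw counts, since a Poisson family expects 
non-negative integers, and that part only runs if lme4 happens to be 
installed. The overdispersion figure is a rough Pearson-based ratio, not a 
formal test.

```r
library(nlme)   # lme(); nlme is a recommended package shipped with R

set.seed(42)
# Hypothetical data: 7 beekeepers, unbalanced hive numbers, raw counts.
n_hives <- c(4, 6, 5, 8, 3, 7, 5)
d <- data.frame(Beekeeper = factor(rep(1:7, times = n_hives)))
d$antiG <- factor(sample(c("control", "antiG"), nrow(d), replace = TRUE),
                  levels = c("control", "antiG"))
b <- rnorm(7, sd = 0.5)[as.integer(d$Beekeeper)]   # beekeeper effects
d$parasites <- rpois(nrow(d), exp(3 + b - 0.7 * (d$antiG == "antiG")))
d$logparasites <- log1p(d$parasites)               # log(1 + x)

# LMM on the log scale: antiG fixed, Beekeeper as a random intercept.
fit_lme <- lme(logparasites ~ antiG, random = ~ 1 | Beekeeper, data = d)
print(summary(fit_lme)$tTable)   # includes a p-value column

# Poisson GLMM on the raw counts, only if lme4 is available.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit_glmm <- lme4::glmer(parasites ~ antiG + (1 | Beekeeper),
                          family = poisson, data = d)
  # Rough check: Pearson chi-square over residual df.
  # Values well above 1 hint at overdispersion.
  n_par <- length(lme4::fixef(fit_glmm)) + 1   # +1 for the random-intercept variance
  disp <- sum(residuals(fit_glmm, type = "pearson")^2) / (nrow(d) - n_par)
  print(disp)
}
```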