[R-sig-eco] which factor to nest?

Kingsford Jones kingsfordjones at gmail.com
Mon Jan 26 03:49:03 CET 2009


On Sun, Jan 25, 2009 at 6:43 PM, tavery <trevor.avery at acadiau.ca> wrote:
> Thanks Ben for a speedy response...
> I agree that a GLMM is probably more prudent and will investigate that idea
> further. My guard is against making the analysis too complicated...
>
> For interest and discussion: In the minutes between my post and the response
> I went 'way back' to consider an even simpler design (if it works with
> unbalanced data). In essence the beekeepers are block factors as the
> treatments were applied within these blocks to colonies at random and the
> beekeeper is not of any interest, just the parasite numbers of the colonies.
> In fundamental design terms, the randomized block design appears a viable
> option. However there are likely issues with the count data (that I will
> investigate as I am unaware of the data per se, but do have access to
> similar data that do, in fact, have lots of zeros).

Hi Trevor,

Yes, it sounds as though you have a nice, simple RBD that can be
analyzed using the code Ben suggested.  The lack of balance shouldn't
be a problem as long as you use one of the mixed-model functions
(lmer, lme, glmmPQL, etc.) rather than aov.  Count data shouldn't
be a problem either, although if you have an excessive number of
zeros you might want to look at the non-CRAN package glmmADMB.
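
For what it's worth, here is a minimal sketch of that kind of analysis
on made-up data. Everything about the data is an assumption for
illustration only: the data frame `hives`, the hive counts per
beekeeper, the simulated effect sizes, and the column `parasites`
(raw counts) are hypothetical and are not Trevor's data. It uses lme
(nlme) for the log(1+x) LMM and glmmPQL (MASS) for a Poisson GLMM,
since both packages ship with R.

```r
library(nlme)   # lme()
library(MASS)   # glmmPQL()

set.seed(1)

# Hypothetical unbalanced design: 7 beekeepers with unequal numbers of hives
n_hives <- c(4, 6, 5, 8, 3, 7, 5)
hives <- data.frame(
  Beekeeper = factor(rep(paste0("bk", 1:7), times = n_hives))
)

# Alternate control/treated within each beekeeper (block); odd hive
# counts leave the design unbalanced within blocks as well
hive_id <- ave(seq_len(nrow(hives)), hives$Beekeeper, FUN = seq_along)
hives$antiG <- factor(ifelse(hive_id %% 2 == 0, "treated", "control"),
                      levels = c("control", "treated"))

# Simulated counts: beekeeper-level random intercept plus an assumed
# treatment effect on the log scale (values chosen arbitrarily)
bk_effect <- rnorm(7, sd = 0.5)
mu <- exp(2 + bk_effect[as.integer(hives$Beekeeper)] -
            0.7 * (hives$antiG == "treated"))
hives$parasites <- rpois(nrow(hives), mu)

# LMM on log(1 + count), beekeeper as a random block effect
fit_lmm <- lme(log1p(parasites) ~ antiG, random = ~ 1 | Beekeeper,
               data = hives)

# Poisson GLMM on the raw counts via penalized quasi-likelihood
fit_glmm <- glmmPQL(parasites ~ antiG, random = ~ 1 | Beekeeper,
                    family = poisson, data = hives, verbose = FALSE)

# Fixed-effect tables (estimate, SE, DF, t-value, p-value)
print(summary(fit_lmm)$tTable)
print(summary(fit_glmm)$tTable)
```

Neither call requires balance; the unequal hive numbers are handled by
the likelihood-based fit. With many zeros the Poisson assumption may
not hold, which is where something like glmmADMB would come in.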

hth,

Kingsford Jones


>
> thanks,
> trevor
>
> Ben Bolker wrote:
>>
>>  My two cents:
>>
>>  * a GLMM if parasite numbers are small enough to
>> have to deal with them as count data (e.g. lots of zeros).
>> Otherwise (if you're lucky, as GLMMs are harder) most
>> likely a lognormal -- log-transform data or log(1+x) if
>> there are some zeros, and treat as a LMM (nlme or lmer).
>>
>>  * "Nesting" is more or less a red herring here, only
>> really has to do with multiple *random* factors (and
>> then more to do with the coding of the random factors
>> than with fundamental experimental design distinctions).
>>
>>  * so: antiG vs control is fixed, Beekeeper is probably
>> best treated as random (7 units is enough to make a
>> random effect plausible: if you had only 2 or 3 you
>> would probably have to treat as a fixed effect to
>> make progress)
>>
>>  * because unbalanced (and possibly GLMM), aov/sums
>> of squares approaches are probably not viable
>>
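>>  * fairly straightforward with nlme, something like
>>
>> lme(logparasites ~ antiG, random = ~1|Beekeeper)
>>
>> or with lme4:
>>
>> lmer(logparasites ~ antiG + (1|Beekeeper))
>>
>> or, for a GLMM, fit to the untransformed counts (here "parasites"
>> stands for the raw count, since the Poisson family already uses a
>> log link):
>>
>> glmer(parasites ~ antiG + (1|Beekeeper), family=poisson)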
>>
>>  * Two more things to watch out for:
>>
>>   - lme (nlme package) will give you p-values, lmer (lme4 package)
>> will not
>>   - if you end up fitting a GLMM you should definitely
>> worry about/check for overdispersion
>>
>>  Ben Bolker
>>
>>
>> tavery wrote:
>>
>>>
>>> Hi all,
>>> Maybe an expert in this particular design could provide insights into an
>>> interesting question (or possibly just a derailed view). Possibly outside of
>>> the R world, but has to be sorted out before R code can be generated - which
>>> should be trivial...
>>>
>>> - 7 beekeepers each with several hives
>>> - some hives treated with antiG, others left as controls
>>> - unbalanced design (not an equal number of treated or control sites
>>> among or within beekeepers)
>>> - measured parasite numbers (average per hive)
>>> Q: want to know if antiG reduces parasite load
>>>
>>> The initial reaction (from a student) was to consider Beekeeper as a
>>> random factor (although it could be considered fixed), and nest Treatment
>>> (antiG or control) within Beekeeper. This design is intuitive as Beekeepers
>>> are 'groups' and hives are 'subgroups' to which treatments are applied. Upon
>>> some investigation, it appears that the model could be flipped i.e. consider
>>> Treatment as a fixed factor and nest Beekeeper within Treatment. In this
>>> latter case, each Beekeeper would be represented in each Treatment and a
>>> crossed design results i.e. not nested at all. Various texts appear to
>>> 'arbitrarily' designate factors in similar models (see Zar on drug/drugstore
>>> example).
>>>
>>> a) What design is correct?
>>> b) What am I missing in way of determining groups and the ultimate
>>> design?
>>>
>>> thanks in advance,
>>> trevor
>>> biology department
>>> acadia
>>>
>>> _______________________________________________
>>> R-sig-ecology mailing list
>>> R-sig-ecology at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>>
>>
>>
>>
>


