[R-sig-ME] VERY simple question about NESTING in experimental designs (for glm, lme, lmer, etc.)
Toby Marthews
toby.marthews at ouce.ox.ac.uk
Thu Jan 20 15:23:55 CET 2011
Dear R-sig-mixed-models,
This is probably a painfully simple question about NESTING, but I can't seem to pin down an answer from any source.
Imagine I carry out a nested experiment on birds. I measure the size (say in cm beak to tail) of birds from 6 different locations in Country A with 20 birds in each location. I then also repeat this experiment in Country B with the same replication. Crucially, because the work at each of the 12 locations was being carried out by a different collaborating group, they all numbered their individual birds simply 1-20. The data I eventually receive for my metastudy is something like:
Country  Location  Birdnum  Birdlength
A        1         1        7.3
A        1         2        6.7
...      ...       ...      ...
A        1         20       7.9
A        2         1        6.7
A        2         2        6.9
...      ...       ...      ...
B        1         1        6.7
B        1         2        6.6
The bird numbers and location numbers here are not unique across the experimental design (bird #1 at location 1 is not the same individual as bird #1 at location 2, and location 1 in Country A is not location 1 in Country B). Following what Prof Bates said about not using "implicit nesting" in 2005 (http://cran.r-project.org/doc/Rnews/Rnews_2005-1.pdf), I construct two new variables: bnum=factor(paste(Country,"-",Location,"-",Birdnum,sep="")), which contains levels "A-1-1","A-1-2", ..., "B-1-2", etc., and locnum=factor(paste(Country,"-",Location,sep="")), which contains levels "A-1","A-2", ..., "B-1", etc. I can then use locnum and bnum instead of Location and Birdnum, which gives me a unique numbering system.
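A minimal sketch of the relabelling just described, using base R only. The data frame d and its simulated Birdlength values are placeholders invented for illustration (they are not the real measurements); only the layout of 2 countries x 6 locations x 20 birds is taken from the description above.

```r
# Hypothetical data with the layout described above:
# 2 countries x 6 locations x 20 birds. Birdlength is simulated.
set.seed(1)
d <- expand.grid(Birdnum = 1:20, Location = 1:6, Country = c("A", "B"))
d$Birdlength <- round(rnorm(nrow(d), mean = 7, sd = 0.5), 1)

# Explicitly unique labels, built with paste() as in the text
d$locnum <- factor(paste(d$Country, d$Location, sep = "-"))
d$bnum   <- factor(paste(d$Country, d$Location, d$Birdnum, sep = "-"))

n_loc  <- nlevels(d$locnum)  # 12 distinct locations
n_bird <- nlevels(d$bnum)    # 240 distinct birds
```

With these labels, "A-1" and "B-1" are different levels of locnum, so nothing downstream has to infer the nesting from the original Location and Birdnum codes.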
Say I am interested in the differences between birds in countries A and B with location and birdnumber being random effects. I believe I should try to use a command like
lme(fixed=Birdlength~Country,random=~1|bnum)
or glm(Birdlength~Country)
or lmer(Birdlength~Country+(1|bnum))
however I have been criticised on two counts for this by colleagues-who-shall-remain-nameless:
(1) This is a nested design so I should replace bnum with Country/locnum/bnum or Country/Location/Birdnum in both the lme and the lmer command. I'm pretty sure I can just use bnum on its own: by knowing bnum I automatically know the corresponding country and location of the measurement, so Country and Location are effectively redundant (surely?). However, if I'm right, then I will only ever need "/" if my nesting is somehow implicit, i.e. because I usually use paste in the way described, I should never have to use "/" even in nested experiments (which seems odd?).
(2) Because Country (my fixed predictor) is being used to calculate bnum, I am mixing fixed and random effects inappropriately. (I am less sure about this one: perhaps "random=~Country|bnum" would be more correct or something else?)
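The claim in (1), that knowing bnum fixes the location and the country, can be checked directly on the labels. The sketch below is only a check of the labelling, not a statement about which model is right. The helper is_nested() and the data frame d are hypothetical names introduced for illustration.

```r
# Rebuild the hypothetical design: 2 countries x 6 locations x 20 birds
d <- expand.grid(Birdnum = 1:20, Location = 1:6, Country = c("A", "B"))
d$locnum <- factor(paste(d$Country, d$Location, sep = "-"))
d$bnum   <- factor(paste(d$Country, d$Location, d$Birdnum, sep = "-"))

# inner is nested in outer if every level of inner occurs with
# exactly one level of outer
is_nested <- function(inner, outer) {
  all(tapply(as.character(outer), inner, function(x) length(unique(x))) == 1)
}

bnum_in_locnum      <- is_nested(d$bnum, d$locnum)     # TRUE
locnum_in_country   <- is_nested(d$locnum, d$Country)  # TRUE
# The original codes are NOT nested: bird "1" appears at every location
birdnum_in_location <- is_nested(factor(d$Birdnum), factor(d$Location))  # FALSE

# Crossing bnum with locnum creates no additional levels
no_new_levels <- nlevels(interaction(d$locnum, d$bnum, drop = TRUE)) ==
  nlevels(d$bnum)  # TRUE
```

The last line is the sense in which the location is redundant once bnum is known: the locnum-by-bnum combinations are just the bnum levels again.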
Even with this very simple experimental setup, a number of possible alternatives have been suggested by my colleagues including
glm(Birdlength~Country+Country/locnum+Country/locnum/bnum)
lmer(Birdlength~Country+Country|locnum+Country|(locnum/bnum))
lmer(Birdlength~Country+Country|locnum+Country|(bnum))
lmer(Birdlength~Country+1|locnum+1|(locnum/bnum))
with the result that I'm getting very confused!
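Part of the confusion may come from how R itself parses these formulas, independently of any modelling package. The sketch below uses base R only to inspect one of the suggested formulas. It shows that "|" binds more loosely than "+", so without parentheses the whole right-hand side becomes a single "|" expression. It also shows how "/" expands in an ordinary fixed-effects formula. The comparison formula g is an illustrative assumption, not one of the suggestions above.

```r
# One of the suggested formulas, written without parentheses around
# the "|" terms. Building a formula does not evaluate the variables.
f   <- Birdlength ~ Country + 1 | locnum + 1 | (locnum/bnum)
rhs <- f[[3]]                         # right-hand side of "~"
top_op_f <- as.character(rhs[[1]])    # "|": the whole RHS is one bar term
left_of_bar  <- rhs[[2]]              # everything before the last bar
right_of_bar <- rhs[[3]]              # the parenthesised (locnum/bnum)

# With parentheses, each bar term is a separate summand (illustrative)
g <- Birdlength ~ Country + (1 | locnum) + (1 | bnum)
top_op_g <- as.character(g[[3]][[1]]) # "+"

# In a fixed-effects formula, a/b is shorthand for a + a:b
fixed_terms <- attr(terms(~ Country/locnum), "term.labels")
# "Country" "Country:locnum"
```

Printing left_of_bar and right_of_bar shows which pieces end up on either side of the final bar.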
I'm pretty sure I'm making a meal of this little question so I'll stop there, but any comments would be very welcome!
Best,
Toby Marthews