[R-sig-ME] VERY simple question about NESTING in experimental designs (for glm, lme, lmer, etc.)

Toby Marthews toby.marthews at ouce.ox.ac.uk
Thu Jan 20 15:23:55 CET 2011


Dear R-sig-mixed-models,

This is probably a painfully simple question, but I can't find a clear answer in any source. It concerns NESTING.

Imagine I carry out a nested experiment on birds. I measure the size (say in cm beak to tail) of birds from 6 different locations in Country A with 20 birds in each location. I then also repeat this experiment in Country B with the same replication. Crucially, because the work at each of the 12 locations was being carried out by a different collaborating group, they all numbered their individual birds simply 1-20. The data I eventually receive for my metastudy is something like:

Country  Location  Birdnum  Birdlength
   A        1         1        7.3
   A        1         2        6.7
   ...      ...       ...      ...
   A        1         20       7.9
   A        2         1        6.7
   A        2         2        6.9
   ...      ...       ...      ...
   B        1         1        6.7
   B        1         2        6.6

The bird numbers and location numbers here are not unique across the experimental design (bird #1 at location 1 is not the same individual as bird #1 at location 2). Following what Prof Bates said in 2005 about not using "implicit nesting" (http://cran.r-project.org/doc/Rnews/Rnews_2005-1.pdf), I construct two new variables: bnum=factor(paste(Country,"-",Location,"-",Birdnum,sep="")), which has levels "A-1-1","A-1-2", ..., "B-1-2", etc., and locnum=factor(paste(Country,"-",Location,sep="")), which has levels "A-1","A-2", ..., "B-1", etc. I can then use locnum and bnum instead of Location and Birdnum, which gives me a unique numbering system.
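For concreteness, here is a minimal base-R sketch of that relabelling. The data frame d and its simulated Birdlength values are purely illustrative assumptions standing in for the real measurements; only the design (2 countries x 6 locations x 20 birds) comes from the description above.

```r
# Toy stand-in for the real data: 2 countries x 6 locations x 20 birds.
# The Birdlength values are simulated and purely illustrative.
set.seed(1)
d <- expand.grid(Birdnum = 1:20, Location = 1:6, Country = c("A", "B"))
d$Birdlength <- round(rnorm(nrow(d), mean = 7, sd = 0.5), 1)

# Build identifiers that are unique across the whole design.
d$locnum <- factor(paste(d$Country, d$Location, sep = "-"))
d$bnum   <- factor(paste(d$Country, d$Location, d$Birdnum, sep = "-"))

nlevels(d$locnum)  # 12 distinct locations
nlevels(d$bnum)    # 240 distinct birds
```

Passing sep="-" directly to paste gives the same labels as pasting the "-" strings in by hand with sep="".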

Say I am interested in the differences between birds in countries A and B, with location and bird number being random effects. I believe I should try to use a command like
  lme(fixed=Birdlength~Country,random=~1|bnum)
  or glm(Birdlength~Country)
  or lmer(Birdlength~Country+(1|bnum))
However, I have been criticised on two counts for this by colleagues-who-shall-remain-nameless:

    (1) This is a nested design so I should replace bnum with Country/locnum/bnum or Country/Location/Birdnum in both the lme and the lmer command. (I'm pretty sure I can just use bnum on its own: by knowing bnum I automatically know the corresponding country and location of the measurement, so Country and Location are effectively redundant (surely?). However, if I'm right, then I will only ever need "/" if my nesting is somehow implicit, i.e. because I usually use paste in the way described, I should never have to use "/" even in nested experiments (which seems odd?).)
    (2) Because Country (my fixed predictor) is being used to calculate bnum, I am mixing fixed and random effects inappropriately. (I am less sure about this one: perhaps "random=~Country|bnum" would be more correct, or something else?)
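The claim in (1) that bnum already determines Country and location can at least be checked mechanically. The sketch below does this on a toy design with the same layout as described above (the data frame d is an illustrative assumption, not the real data); it only checks the labelling and says nothing about which model is appropriate.

```r
# Toy design with the same layout as described: 2 x 6 x 20.
d <- expand.grid(Birdnum = 1:20, Location = 1:6, Country = c("A", "B"))
d$locnum <- factor(paste(d$Country, d$Location, sep = "-"))
d$bnum   <- factor(paste(d$Country, d$Location, d$Birdnum, sep = "-"))

# For every level of bnum, count how many distinct countries and
# locations it occurs with; a count of 1 everywhere means bnum
# determines both.
n_distinct <- function(x) length(unique(x))
countries_per_bird <- tapply(as.character(d$Country), d$bnum, n_distinct)
locs_per_bird      <- tapply(as.character(d$locnum),  d$bnum, n_distinct)
bnum_determines_both <- all(countries_per_bird == 1) && all(locs_per_bird == 1)

# The explicit locnum factor defines the same groups as the
# Country-by-Location interaction, just with different labels.
loc2 <- interaction(d$Country, d$Location, drop = TRUE)
same_grouping <- all(rowSums(table(d$locnum, loc2) > 0) == 1)
```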

Even with this very simple experimental setup, a number of possible alternatives have been suggested by my colleagues including
  glm(Birdlength~Country+Country/locnum+Country/locnum/bnum)
  lmer(Birdlength~Country+Country|locnum+Country|(locnum/bnum))
  lmer(Birdlength~Country+Country|locnum+Country|(bnum))
  lmer(Birdlength~Country+1|locnum+1|(locnum/bnum))
with the result that I'm getting very confused!
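Part of the confusion may come from how R parses "|" in a formula. As far as I can tell from ?Syntax, "|" binds more loosely than "+", so an unparenthesised bar captures everything to its left on the right-hand side. The base-R sketch below just inspects the parse tree of two formulas; the object names are illustrative, and nothing is fitted.

```r
# One of the suggested formulas, written without parentheses
# around the random-effects terms.
f_bare <- Birdlength ~ Country + 1 | locnum + 1 | (locnum/bnum)

# The same kind of formula with the bar term explicitly parenthesised.
f_paren <- Birdlength ~ Country + (1 | locnum)

# Element 3 of a two-sided formula is its right-hand side; element 1
# of that call is the operator at the top of the parse tree.
top_bare  <- as.character(f_bare[[3]][[1]])   # "|"
top_paren <- as.character(f_paren[[3]][[1]])  # "+"

# What sits to the left of the outermost bar in the bare formula.
left_of_bar <- deparse(f_bare[[3]][[2]])
```

In the bare version the whole of Country+1|locnum+1 ends up on the left of the outermost bar, which is presumably not what was meant.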

I'm pretty sure I'm making a meal of this little question so I'll stop there, but any comments would be very welcome!

Best,
Toby Marthews



