[R-sig-eco] Question About Syntax For Complex ANOVA Design

Mike Dunbar mdu at ceh.ac.uk
Fri Nov 7 17:27:13 CET 2008


Hi Joe

I think the command you want is probably simpler than you think:

 lme(HSP~coast*MBL, random= ~1|site)     
or
 lme(HSP~coast+MBL, random= ~1|site)     

coast and MBL have distinct levels so are fixed and site is random as you say.
Having site as random will take into account that there are repeated measures through time at each point (MBL).
Each site has two points (MBLs). lme will treat coast and MBL correctly provided that they are coded correctly. You can check this by running the analysis and looking at the degrees of freedom for the fixed effects. It's best not to think too hard about the structure in terms of fixed/random/fixed, even though other stats packages might encourage this. Think of the random effects (including the error term) as providing the structure, and the fixed effects as slotting into that structure accordingly.
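
As a minimal sketch of that degrees-of-freedom check, the block below fits the first model to simulated data. Everything about the data is an assumption for illustration: the site labels S1 to S4, five mussels per sample, and the effect sizes are made up and not taken from the real study. Note that the sites carry unique labels across coasts, which is one part of coding the factors correctly.

```r
library(nlme)

set.seed(1)

# Hypothetical layout: 4 sites x 2 MBLs x 3 sampling times x 5 mussels
d <- expand.grid(mussel = 1:5,
                 time   = factor(1:3),
                 MBL    = factor(c("low", "high")),
                 site   = factor(paste0("S", 1:4)))

# Assumed assignment of sites to coasts: S1, S2 on the East; S3, S4 on the West
d$coast <- factor(ifelse(d$site %in% c("S1", "S2"), "East", "West"))

# Simulated response: a random shift per site, an MBL effect, and noise
site_eff <- rnorm(4)
d$HSP <- 10 + (d$MBL == "high") + site_eff[as.integer(d$site)] + rnorm(nrow(d))

# Random intercept for site; coast and MBL as fixed effects
fit <- lme(HSP ~ coast * MBL, random = ~ 1 | site, data = d)

# The denDF column shows which stratum each fixed effect is tested in
a <- anova(fit)
print(a)
```

If the coding is right, coast should be tested against far fewer denominator degrees of freedom than MBL and the interaction, since coast only varies between sites.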
If you write random = ~time|site then you are specifying random slopes for the time fixed effect, i.e. there is an overall time trend and each site responds differently around that trend. I don't think this is what you want, as you don't specifically mention time trends.

Or if time of year is a factor, something like
lme(HSP~coast+MBL+time, random= ~1|site)    
But the problem here is that you may run out of replication to estimate any of the fixed effects, because each combination of coast, MBL, site and time is unique.
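
To see the replication problem, one can simply count observations per cell. The sketch below assumes a hypothetical layout with a single HSP value per MBL per sampling time (for instance, if the mussels from each sample were averaged); the site labels are made up.

```r
# Hypothetical design grid: 4 sites x 2 MBLs x 3 sampling times, one row per sample
d <- expand.grid(time = factor(1:3),
                 MBL  = factor(c("low", "high")),
                 site = factor(paste0("S", 1:4)))

# Assumed assignment of sites to coasts
d$coast <- factor(ifelse(d$site %in% c("S1", "S2"), "East", "West"))

# Count rows in every coast x site x MBL x time cell
cells <- with(d, table(coast, site, MBL, time))

n_rows       <- nrow(d)     # total number of samples in the design
max_per_cell <- max(cells)  # largest number of rows in any single cell

print(n_rows)
print(max_per_cell)
```

Under that assumption no cell holds more than one row, so there is nothing within a cell to estimate error from once time is in the model as a factor.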

Also BUT BUT, even if you allow the three time samples to be replicates:
You are potentially going to have an issue if you only have four sites. This is not a lot to estimate a random effect. One option is to treat site as fixed. There is an argument that site is indeed random, so it should be treated as random, in which case I'm not sure that lme (or the newer lmer) will handle the full uncertainty for small sample sizes correctly. To do that would need a more fully Bayesian approach. But I'm writing that from memory; I don't have the reference to hand.
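
A rough sketch of the site-as-fixed option, using plain lm on simulated data. The data, site labels and effect sizes are all assumptions for illustration. One thing it shows is that, because each site sits on only one coast, coast and site cannot both be estimated freely as fixed effects.

```r
set.seed(1)

# Hypothetical layout: 4 sites x 2 MBLs x 3 sampling times x 5 mussels
d <- expand.grid(mussel = 1:5,
                 time   = factor(1:3),
                 MBL    = factor(c("low", "high")),
                 site   = factor(paste0("S", 1:4)))
d$coast <- factor(ifelse(d$site %in% c("S1", "S2"), "East", "West"))

# Simulated response with made-up site shifts and an MBL effect
site_eff <- rnorm(4)
d$HSP <- 10 + (d$MBL == "high") + site_eff[as.integer(d$site)] + rnorm(nrow(d))

# Site as a fixed effect: every coefficient is estimable
fit_site <- lm(HSP ~ site + MBL, data = d)

# Adding coast as well makes the design rank deficient, because
# coast is completely determined by site; lm reports an NA coefficient
fit_both <- lm(HSP ~ coast + site + MBL, data = d)

print(coef(fit_site))
print(coef(fit_both))
```

With site fixed, the coast comparison effectively becomes a contrast among the four site means, and any conclusion applies to these particular sites rather than to sites in general.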

Finally, it all depends on what hypothesis you are trying to test: what's the hypothesis?

regards

Mike





>>> Joe Simonis <jls468 at cornell.edu> 07/11/2008 15:53 >>>
Hey Everyone,

    I'm helping a friend out with analyzing some of her data, and I 
haven't run an ANOVA like this in a while, and especially not in R.  I'm 
having a bit of trouble figuring out the correct syntax and so I was 
hoping to get feedback.  Any input would be welcomed.  As of now, I also 
don't have the data, but I've been told that sample size should be equal 
for all of the combinations (although that may not be true).  In any 
case, for now, let's assume all sample sizes are equal.

    The basics of the mensurative experiment are as follows:

    The study was looking at variation in physiological values (HSP) of 
intertidal mussels across a few different sites at three different times 
of year.  The sampling was done in New Zealand, with 2 sites sampled on 
each of the East and West coasts, and within each site, there were two 
sampling points (mussel bed location, MBL), one low in the intertidal and 
one high in the intertidal.  There are two levels of MBL (low and high) 
at each site and there are two sites for each coast.  I see this as MBL 
nested in site, nested in coast.  However, it seems to me that only site 
is a random factor.  Both MBLs were picked specifically at that site and 
were done so in a way to compare high to low locations, so that seems 
fixed to me.  Site was picked more to look at site-to-site variation 
(i.e. random factor).  And coasts were explicitly being compared (i.e. 
fixed factor). 

    So, I see that as a fixed factor nested in a random factor  nested 
in a fixed factor.  Does that make sense?  And then there's the bit 
about repeated measures, since they sampled mussels from each MBL 3 
times.  I don't think that necessarily complicates things too much, but 
maybe it does?  I can put together the ANOVA table on paper in the way 
I'd like to analyze the data (working off of examples in Quinn and 
Keough on pg 314, but with an additional level of nestedness).  However, 
I am pretty lost on how to code the syntax for the analysis in R.  I've 
had a few different ideas, but none of them really seem correct to me.  
I think the biggest problem for me is figuring out how to keep the 
structure of the nestedness intact despite the fact that some factors 
are fixed and some are random. 

    The best I could come up with so far is
     lme(HSP~MBL, random= ~time|coast/site)

    But it doesn't seem really right to me.  I have MBL in the fixed 
part of the model, but when it's like that, I think the nestedness gets 
lost (since there's no coast/site/MBL anywhere). 

    Again, I would appreciate any insights into the syntax or the 
general way I am approaching this analysis.  I've been trying to piece 
the stuff together from nested analyses and time-series analyses in 
Crawley's book, but I'm just not getting it.

    Thanks a bunch in advance!!

--Joe

-- 
Joseph L. Simonis

Cornell University
Department of Ecology & Evolutionary Biology
E231 Corson Hall
Ithaca, NY 14853 USA
email:  jls468 at cornell.edu 

http://www.people.cornell.edu/pages/jls468/ 

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology at r-project.org 
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
