[R] nested anova and multiple comparisons

Kingsford Jones kingsfordjones at gmail.com
Sat Apr 26 23:52:48 CEST 2008

Hi Stephen,

On Sat, Apr 26, 2008 at 10:29 AM, Stephen Cole <swbcole at gmail.com> wrote:
>  I have managed to accomplish my first two goals by analyzing the data
>  as a 3 level nested Anova with the following code
>  mod1 <- aov(Count~Region/Location/Site, data=data)
>  This allows me to get the MS for a Anova table.   However R does not
>  compute the correct F-statistics (because it uses the residual MS for
>  the denominator in all calculations which it shouldnt) so i manually
>  computed the F-stats and variance components by hand.

R's just doing what it was asked it to do.  The fact that it used the
residual MS isn't a surprise given your code.

>  >From reading the help guide i learned about and tried using the
>  Error(Region/Location/Site) command but then i can only get MS and no
>  F-statistics and still hand to compute the rest by hand.

Yes, if you want to use Method-of-Moments estimation to solve for
expected mean squares for the variance components and calculate
F-stats w/ the 'correct' MS, then then specifying error strata is
reasonable for balanced data (assuming you've met model assumptions).
However, I believe most statisticians these days are more inclined to
use (Restricted) Maximum Likelihood estimation (e.g. using lme or
lmer) because of the added fliexibility it provides (also it won't
produce negative variance estimates when the mean squared error is
larger than the mean square between the groups of interest)

As for why you are only getting MS and no F-stats, it's hard to say
without a reproducible example (see the posting guide).  In my
experience this will occur when there are insufficient degrees of
freedom for calculating an F-stat at a given level.

>  My problem now is that i would liek to use TukeyHSD for multiple
>  comparisons.  Howeber since R is unable to compute the correct F
>  statistics in this nested design i also think it is using the wrong MS
>  and df in calculating tukeys q. for example when i use

Again, it's not that R is unable to, it's that you asked R not to.

>  TukeyHSD(mod1, "Region")  i will get values however i do not think
>  they have been calculated correctly.
>  Furthermore when i use the Error(Region/Location/Site) term i can then
>  no longer use TukeyHSD as i get the error message that there is no
>  applicable use of tukey on this object.

Looking at the methods available for TukeyHSD shows that it will work
for an object of class "aov"

> methods(TukeyHSD)
[1] TukeyHSD.aov

But ?aov states:


         An object of class 'c("aov", "lm")' or for multiple responses of
         class 'c("maov", "aov", "mlm", "lm")' or for multiple error strata
         of class '"aovlist"'.  There are 'print' and 'summary' methods
         available for these.

So your model with error strata is of class "aovlist" not "aov".

>  i am just wondering if there is any way to use Multiple comparisons
>  with R in a nested design like mine.

I'm not sure but the package 'multcomp' might be of help.

>  I have thought about using lme or lmer but if i understand them right
>  with a balanced design i should be able to get the same result using
>  aov

Even with balanced data, you don't get the exact same thing with aov.
As mentioned above lme and lmer use different estimation methods.  One
of the advantages of using  (RE)ML is a lot more flexibility in how
you specifiy the structure of random effects covariance matrices, and
(at least with lme) you have a lot of flexibility in how you structure
the residual covariance matrix as well.  This is particularly useful
when you are not meeting assumptions of constant variance and/or
independence of observations (which seems to be the rule rather than
the exception with ecological data with a spatial component, such as
you appear to have).

The theory behind analyses you want to do are quite complex with many
potential pitfalls.  If at all possible, I highly recommend finding
the help of someone who is an expert in these types of models (known
as mixed, hierarchical, multilevel, random effects, variance
components, nested ANOVA, etc....).


Kingsford Jones

>  Thanks
>  ______________________________________________
>  R-help at r-project.org mailing list
>  https://stat.ethz.ch/mailman/listinfo/r-help
>  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>  and provide commented, minimal, self-contained, reproducible code.

More information about the R-help mailing list