[R] Specifying an appropriate error term in a hierarchical regression

Peter Dalgaard p.dalgaard at biostat.ku.dk
Mon Apr 17 23:55:14 CEST 2006


"Chris Bergstresser" <chris at subtlety.com> writes:

> Hi all --
> 
>    So I obtained a copy of _Statistical Models in S_, which I hoped
> would answer my question, but it doesn't quite.  On page 151, it
> discusses nested terms, which is what I'm working with.  So I tried
> using "value ~ a/b" which apparently should fit factor b within factor
> a.  But the summary is the exact same summary that "value ~ a + b" and
> "value ~ a * b" produces, which doesn't seem right.  And the F value
> reported for A is wrong; it should compare the mean square of A with
> the mean square of B(A), not the residual mean square as it does.
>    What am I doing wrong?

Confusing nesting and nesting...

The / operator is designed to handle cases like this

a b
1 1
1 2
2 1
2 2

in which the numbering of b only makes sense within a - no connection
between b=1 when a=1 and when a=2 (think group1, member1). In this
case, the interaction a:b makes sense, but a main effect of b does
not. You have the opposite scenario

a b
1 1
1 2
2 3
2 4

in that case, the interaction a:b is effectively equivalent to b since
the crosstable of a and b has only four non-empty cells.

The problem is that both cases are called "nested".

It is not clear to me what B(A) is supposed to mean in Kirk's
terminology. Sounds like it could be the total sum of squares for the
b stratum, which is the effect of b ignoring a (?).
 
> On 4/11/06, Chris Bergstresser <chris at subtlety.com> wrote:
> > Hi all --
> >
> >    So I'm working through my statistics homework again, and trying to
> > reproduce the examples in the book (Kirk's _Experimental Design_,
> > third edition) in R.  This is a completely randomized hierarchical
> > design (CRH-28(A)).  The B factor is completely nested within the A
> > factor.  Pages 480-482, for those playing along at home.
> >
> >    I can use:
> >
> > summary(aov(value ~ a + Error(b), data = ex));
> >
> >    to get the correct F value for the main effect of A.  I can use
> >
> > summary(aov(value ~ b, data = ex));
> >
> >    to get the correct values for B(A) and the within cell SS.  But I
> > can't find any documentation about constructing the Error term to get
> > this output in a single analysis (except for
> > http://www.psych.upenn.edu/~baron/rpsych/rpsych.html, but Kirk doesn't
> > talk about these tests in terms of Error strata, so it's a little hard
> > to figure out the correspondence).
> >    Also, the documentation on the Error term in ?aov is rather
> > perfunctory.  There's no mention of the "/" operator, for example.
> >
> > ex = scan()
> >  3  6  3  3
> >  1  2  2  2
> >  5  6  5  6
> >  2  3  4  3
> >  7  8  7  6
> >  4  5  4  3
> >  7  8  9  8
> > 10 10  9 11
> >
> > a = factor(rep(paste("a", 1:2, sep = ""), each = 16));
> > b = factor(rep(paste("b", 1:8, sep = ""), each = 4));
> >
> > ex = data.frame(value = ex, a, b);
> >
> > summary(aov(value ~ a + Error(b), data = ex));
> > summary(aov(value ~ b, data = ex));
> >
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> 

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907




More information about the R-help mailing list