[R] Anova - adjusted or sequential sums of squares?
John Fox
jfox at mcmaster.ca
Wed Apr 20 17:15:39 CEST 2005
Dear Mick,
The Anova() function in the car package will compute what are often called
"type-II" and "type-III" sums of squares.
Without wishing to reinvigorate the sums-of-squares controversy, I'll just
add that the various "types" of sums of squares correspond to different
hypotheses. The hypotheses tested by "type-I" sums of squares are rarely
sensible; that the results vary by the order of the factors is a symptom of
this, but sums of squares that are invariant with respect to ordering of the
factors don't necessarily correspond to sensible hypotheses.
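A minimal sketch of the order dependence, using only base R. The data are
simulated and purely hypothetical (cell counts 3, 3, 4 and 5, chosen to give
an unbalanced 2x2 layout), and the variable names are mine. It checks that
each sequential sum of squares for A is the drop in residual sum of squares
between a particular pair of nested models, so the two orderings test
different hypotheses about A.

```r
# Hypothetical unbalanced 2x2 layout: cell counts 3, 3, 4 and 5.
set.seed(1)
d <- data.frame(
  A = factor(rep(c("a1", "a1", "a2", "a2"), times = c(3, 3, 4, 5))),
  B = factor(rep(c("b1", "b2", "b1", "b2"), times = c(3, 3, 4, 5)))
)
# Simulated response (an assumption for illustration only).
d$y <- rnorm(nrow(d)) + (d$A == "a2") + 0.5 * (d$B == "b2")

# Sequential ("type-I") tables: each term is adjusted only for
# the terms that precede it in the formula.
tab_AB <- anova(lm(y ~ A * B, data = d))
tab_BA <- anova(lm(y ~ B * A, data = d))

ss_A_first <- tab_AB["A", "Sum Sq"]  # A ignoring B
ss_A_after <- tab_BA["A", "Sum Sq"]  # A adjusted for B

# The same quantities as drops in residual SS between nested models.
drop_A_first <- deviance(lm(y ~ 1, data = d)) -
  deviance(lm(y ~ A, data = d))
drop_A_after <- deviance(lm(y ~ B, data = d)) -
  deviance(lm(y ~ A + B, data = d))

print(c(first = ss_A_first, after = ss_A_after))
```

With a balanced design the two values would coincide, which is why the
ordering appears not to matter there.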
If you do decide to use "type-III" sums of squares, be careful to use a
contrast type (such as contr.sum or contr.helmert, rather than R's default
contr.treatment) that produces an orthogonal row basis for the terms in
the model.
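A short sketch of how this might look, assuming the car package is
installed. The data are again simulated and hypothetical, and the object
names are my own. The sum-to-zero contrasts are set for each factor in the
call to lm() rather than globally.

```r
library(car)  # assumes the car package is installed

# Hypothetical unbalanced 2x2 data: cell counts 3, 3, 4 and 5.
set.seed(1)
d <- data.frame(
  A = factor(rep(c("a1", "a1", "a2", "a2"), times = c(3, 3, 4, 5))),
  B = factor(rep(c("b1", "b2", "b1", "b2"), times = c(3, 3, 4, 5)))
)
d$y <- rnorm(nrow(d)) + (d$A == "a2") + 0.5 * (d$B == "b2")

# Use sum-to-zero contrasts for both factors instead of the
# default treatment contrasts.
mod <- lm(y ~ A * B, data = d,
          contrasts = list(A = "contr.sum", B = "contr.sum"))

tab2 <- Anova(mod, type = "II")   # each main effect adjusted for the other
tab3 <- Anova(mod, type = "III")  # each term adjusted for all others

print(tab2)
print(tab3)
```

Neither table depends on the order in which A and B appear in the formula,
but that invariance alone does not make the "type-III" main-effect
hypotheses sensible when an interaction is present.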
I hope this helps,
John
--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of
> michael watson (IAH-C)
> Sent: Wednesday, April 20, 2005 9:38 AM
> To: bates at wisc.edu
> Cc: r-help at stat.math.ethz.ch
> Subject: RE: [R] Anova - adjusted or sequential sums of squares?
>
> I guess the real problem is this:
>
> As I have a different number of observations in each of the
> groups, the results *change* depending on which order I
> specify the factors in the model. This unnerves me. With a
> completely balanced design, this doesn't happen - the results
> are the same no matter which order I specify the factors.
>
> It's this reason that I have been given for using the
> so-called type III adjusted sums of squares...
>
> Mick
>
> -----Original Message-----
> From: Douglas Bates [mailto:bates at stat.wisc.edu]
> Sent: 20 April 2005 15:07
> To: michael watson (IAH-C)
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] Anova - adjusted or sequential sums of squares?
>
>
> michael watson (IAH-C) wrote:
> > Hi
> >
> > I am performing an analysis of variance with two factors, each with
> > two levels. I have differing numbers of observations in each of the
> > four combinations, but all four combinations *are* present (2 of the
> > factor combinations have 3 observations, 1 has 4 and 1 has 5).
> >
> > I have used both anova(aov(...)) and anova(lm(...)) in R and it gave
> > the same result - as expected. I then plugged this into minitab,
> > performed what minitab called a General Linear Model (I have to use
> > this in minitab as I have an unbalanced data set) and got a different
> > result. After a little mining this is because minitab, by default,
> > uses the type III adjusted SS. Sure enough, if I changed minitab to
> > use the type I sequential SS, I get exactly the same results as aov()
> > and lm() in R.
> >
> > So which should I use? Type I sequential SS or Type III adjusted SS?
> > Minitab help tells me that I would "usually" want to use type III
> > adjusted SS, as type I sequential "sums of squares can differ when
> > your design is unbalanced" - which mine is. The R functions I am
> > using are clearly using the type I sequential SS.
>
> Install the fortunes package and try
> > fortune("Venables")
>
> I'm really curious to know why the "two types" of sum of squares are
> called "Type I" and "Type III"! This is a very common misconception,
> particularly among SAS users who have been fed this nonsense quite
> often for all their professional lives. Fortunately the reality is much
> simpler. There is, by any sensible reckoning, only ONE type of sum of
> squares, and it always represents an improvement sum of squares of the
> outer (or alternative) model over the inner (or null hypothesis) model.
> What the SAS highly dubious classification of sums of squares does is
> to encourage users to concentrate on the null hypothesis model and to
> forget about the alternative. This is always a very bad idea and not
> surprisingly it can lead to nonsensical tests, as in the test it
> provides for main effects "even in the presence of interactions",
> something which beggars definition, let alone belief.
> -- Bill Venables
> R-help (November 2000)
>
> In the words of the master, "there is ... only one type of sum of
> squares", which is the one that R reports. The others are awkward
> fictions created for times when one could only afford to fit one or
> two linear models per week and therefore wanted the output to give
> results for all possible tests one could conceive, even if the models
> being tested didn't make sense.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html