[R] Anova - adjusted or sequential sums of squares?

John Fox jfox at mcmaster.ca
Wed Apr 20 17:15:39 CEST 2005


Dear Mick,

The Anova() function in the car package will compute what are often called
"type-II" and "-III" sums of squares. 

Without wishing to reinvigorate the sums-of-squares controversy, I'll just
add that the various "types" of sums of squares correspond to different
hypotheses. The hypotheses tested by "type-I" sums of squares are rarely
sensible; that the results vary by the order of the factors is a symptom of
this, but sums of squares that are invariant with respect to ordering of the
factors don't necessarily correspond to sensible hypotheses. 
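
To make the order dependence concrete, here is a minimal sketch in base R. It
assumes a 2x2 layout with the cell counts from the original question (3, 3, 4
and 5); the responses are simulated, and the names d, A, B and y are purely
illustrative.

```r
# Hypothetical unbalanced 2x2 data: cell counts 3, 3, 4, 5 (responses simulated)
set.seed(1)
d <- data.frame(
  A = factor(rep(c("a1", "a1", "a2", "a2"), times = c(3, 3, 4, 5))),
  B = factor(rep(c("b1", "b2", "b1", "b2"), times = c(3, 3, 4, 5)))
)
d$y <- rnorm(nrow(d), mean = 10 + 2 * (d$A == "a2") + 1 * (d$B == "b2"))

# Sequential ("type-I") tables for the two orderings of the factors
tab_AB <- anova(lm(y ~ A * B, data = d))
tab_BA <- anova(lm(y ~ B * A, data = d))

ss_A_first  <- tab_AB["A", "Sum Sq"]  # A entered first: A ignoring B
ss_A_second <- tab_BA["A", "Sum Sq"]  # A entered second: A adjusted for B

# The sequential SS for A entered first is the improvement of y ~ A over y ~ 1
cmp <- anova(lm(y ~ 1, data = d), lm(y ~ A, data = d))
ss_A_vs_null <- cmp[2, "Sum of Sq"]

print(tab_AB)
print(tab_BA)
```

The two tables share the same interaction and residual lines, but the line for
A answers a different question in each: "A ignoring B" in the first and "A
after B" in the second.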

If you do decide to use "type-III" sums of squares, be careful to use a
contrast type (such as contr.sum) that produces an orthogonal row basis for
the terms in the model.
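
As a rough sketch of what that might look like, assuming the car package is
installed and using simulated data with the cell counts from the original
question (the names d, A, B, y and fit are illustrative only):

```r
# Hypothetical unbalanced 2x2 data: cell counts 3, 3, 4, 5 (responses simulated)
set.seed(1)
d <- data.frame(
  A = factor(rep(c("a1", "a1", "a2", "a2"), times = c(3, 3, 4, 5))),
  B = factor(rep(c("b1", "b2", "b1", "b2"), times = c(3, 3, 4, 5)))
)
d$y <- rnorm(nrow(d), mean = 10 + 2 * (d$A == "a2") + 1 * (d$B == "b2"))

# Fit with sum-to-zero contrasts for both factors, set for this model only
fit <- lm(y ~ A * B, data = d,
          contrasts = list(A = "contr.sum", B = "contr.sum"))

# Anova() lives in car, so only call it if the package is available
if (requireNamespace("car", quietly = TRUE)) {
  t2 <- car::Anova(fit, type = 2)  # "type-II" tests
  t3 <- car::Anova(fit, type = 3)  # "type-III" tests
  print(t2)
  print(t3)
}
```

Passing the contrasts in the lm() call keeps the choice local to this fit
rather than changing options("contrasts") for the whole session.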

I hope this helps,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
-------------------------------- 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of 
> michael watson (IAH-C)
> Sent: Wednesday, April 20, 2005 9:38 AM
> To: bates at wisc.edu
> Cc: r-help at stat.math.ethz.ch
> Subject: RE: [R] Anova - adjusted or sequential sums of squares?
> 
> I guess the real problem is this:
> 
> As I have a different number of observations in each of the 
> groups, the results *change* depending on which order I 
> specify the factors in the model.  This unnerves me.  With a 
> completely balanced design, this doesn't happen - the results 
> are the same no matter which order I specify the factors.  
> 
> This is the reason I have been given for using the 
> so-called type III adjusted sums of squares...
> 
> Mick
> 
> -----Original Message-----
> From: Douglas Bates [mailto:bates at stat.wisc.edu]
> Sent: 20 April 2005 15:07
> To: michael watson (IAH-C)
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] Anova - adjusted or sequential sums of squares?
> 
> 
> michael watson (IAH-C) wrote:
> > Hi
> > 
> > I am performing an analysis of variance with two factors, each with 
> > two levels.  I have differing numbers of observations in each of the 
> > four combinations, but all four combinations *are* present (2 of the 
> > factor combinations have 3 observations, 1 has 4 and 1 has 5)
> > 
> > I have used both anova(aov(...)) and anova(lm(...)) in R and it gave 
> > the same result - as expected.  I then plugged this into minitab, 
> > performed what minitab called a General Linear Model (I have to use 
> > this in minitab as I have an unbalanced data set) and got a different 
> > result. After a little mining this is because minitab, by default, 
> > uses the type III adjusted SS.  Sure enough, if I changed minitab to 
> > use the type I sequential SS, I get exactly the same results as aov()
> > and lm() in R.
> > 
> > So which should I use?  Type I sequential SS or Type III adjusted SS? 
> > Minitab help tells me that I would "usually" want to use type III 
> > adjusted SS, as type I sequential "sums of squares can differ when 
> > your design is unbalanced" - which mine is.  The R functions I am 
> > using are clearly using the type I sequential SS.
> 
> Install the fortunes package and try
>  > fortune("Venables")
> 
> I'm really curious to know why the "two types" of sum of squares are
> called "Type I" and "Type III"! This is a very common misconception,
> particularly among SAS users who have been fed this nonsense quite often
> for all their professional lives. Fortunately the reality is much
> simpler. There is, by any sensible reckoning, only ONE type of sum of
> squares, and it always represents an improvement sum of squares of the
> outer (or alternative) model over the inner (or null hypothesis) model.
> What the SAS highly dubious classification of sums of squares does is to
> encourage users to concentrate on the null hypothesis model and to forget
> about the alternative. This is always a very bad idea and not surprisingly
> it can lead to nonsensical tests, as in the test it provides for main
> effects "even in the presence of interactions", something which beggars
> definition, let alone belief.
>     -- Bill Venables
>        R-help (November 2000)
> 
> In the words of the master, "there is ... only one type of sum of 
> squares", which is the one that R reports.  The others are awkward 
> fictions created for times when one could only afford to fit one or two 
> linear models per week and therefore wanted the output to give results 
> for all possible tests one could conceive, even if the models being 
> tested didn't make sense.
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html



