[R] Type I SS and Type III SS problem
Peter Dalgaard
P.Dalgaard at biostat.ku.dk
Fri Sep 19 10:27:02 CEST 2008
leeznar wrote:
> Dear all:
> I am new to R and have a problem with the anova function. I use anova() to get Type I SS results, but I also need Type III SS results. However, with my code the Type I and Type III results differ, and I don’t know why the “seqe” factor disappears from the Type III result. What should I do?
>
>
Well, first make sure that you copy/paste correctly. Your drop1()
results appear not to come from the model M (they are headed
AUCt ~ ..., whereas M has Cmax as the response)!! For M itself:
> drop1(M, test="F")
Single term deletions
Model:
Cmax ~ seqe + subj:seqe + per + drug
          Df Sum of Sq     RSS AIC F value  Pr(F)
<none>                  438395 302
per        1     63175  501570 304  1.7293 0.2131
drug       1     58149  496544 304  1.5917 0.2311
seqe:subj 12    634325 1072719 303  1.4469 0.2660
So things are not massively different, which is because this is a nicely
balanced cross-over trial. In an unbalanced trial, the type I anova
would be order-dependent.
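
A minimal sketch of that order-dependence, assuming the KK data frame from the original post (rebuilt here with rep() so the block stands alone). It uses the subj form of the model and swaps the order of per and drug. The names seq_ss and KK_unbal are made up for illustration, and dropping the first row is just one arbitrary way to unbalance the design.

```r
# Data from the original post, rebuilt compactly
KK <- data.frame(
  subj = factor(rep(1:14, each = 2)),
  drug = factor(rep(1:2, times = 14)),
  per  = factor(rep(c(2, 1, 1, 2), times = 7)),
  seqe = factor(rep(c(2, 2, 1, 1), times = 7)),
  Cmax = c(1739, 1633, 1481, 1837, 1780, 2073, 1374, 1629, 1555, 1385,
           1756, 1522, 1566, 1643, 1939, 1615, 1475, 1759, 1388, 1483,
           1127, 1682, 1542, 1247, 1235, 1605, 1598, 1718)
)

# Sequential (Type I) SS for 'per' when it enters before or after 'drug'
seq_ss <- function(dat) {
  c(per_first = anova(lm(Cmax ~ subj + per + drug, data = dat))["per", "Sum Sq"],
    per_last  = anova(lm(Cmax ~ subj + drug + per, data = dat))["per", "Sum Sq"])
}

ss_bal   <- seq_ss(KK)        # balanced design: the order should not matter
KK_unbal <- KK[-1, ]          # hypothetical: one observation lost
ss_unbal <- seq_ss(KK_unbal)  # unbalanced design: the order now matters

print(rbind(balanced = ss_bal, unbalanced = ss_unbal))
```

In the balanced rows the two columns should agree; in the unbalanced rows they should not, which is the situation where the choice between sequential and drop-one tests starts to matter.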
Now, seqe is nested in subj:seqe, in fact it is even nested in subj
(because each subject gets the two treatments in only one order,
right?), so taking seqe out of the model doesn't change the fit, as long
as, and this is the important bit, subj:seqe stays in the model. The
drop1() function knows this and doesn't do the test. In contrast, anova()
looks at the effect of seqe _before_ subj:seqe was in the model.
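
The nesting can be checked directly. This is a sketch, again assuming the KK data from the original post; the object names with_seqe and without_seqe are invented for the example.

```r
# Data from the original post, rebuilt compactly
KK <- data.frame(
  subj = factor(rep(1:14, each = 2)),
  drug = factor(rep(1:2, times = 14)),
  per  = factor(rep(c(2, 1, 1, 2), times = 7)),
  seqe = factor(rep(c(2, 2, 1, 1), times = 7)),
  Cmax = c(1739, 1633, 1481, 1837, 1780, 2073, 1374, 1629, 1555, 1385,
           1756, 1522, 1566, 1643, 1939, 1615, 1475, 1759, 1388, 1483,
           1127, 1682, 1542, 1247, 1235, 1605, 1598, 1718)
)

# Each subject appears in exactly one sequence, so seqe is a function of subj
seq_by_subj <- xtabs(~ subj + seqe, data = KK)
print(seq_by_subj)

# Fit the model with and without the seqe main effect
with_seqe    <- lm(Cmax ~ seqe + subj:seqe + per + drug, data = KK)
without_seqe <- lm(Cmax ~ subj:seqe + per + drug, data = KK)

# The residual sums of squares should coincide: dropping seqe changes nothing
print(c(with_seqe = deviance(with_seqe), without_seqe = deviance(without_seqe)))
```

Both fits are rank-deficient (lm() reports the redundant coefficients as NA), which is harmless here since only the fit itself is being compared.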
Since subj:seqe is equivalent to subj, it can be illuminating to write
the model in this (equivalent) way and notice the differences:
> M<- lm(Cmax ~ subj + seqe + per + drug , data=KK)
> anova(M)
Analysis of Variance Table
Response: Cmax
          Df Sum Sq Mean Sq F value Pr(>F)
subj      13 634910   48839  1.3369 0.3109
per        1  63175   63175  1.7293 0.2131
drug       1  58149   58149  1.5917 0.2311
Residuals 12 438395   36533
> drop1(M)
Single term deletions
Model:
Cmax ~ subj + seqe + per + drug
       Df  Sum of Sq     RSS AIC
<none>                438395 302
subj   12     634325 1072719 303
seqe    0 -1.630e-09  438395 302
per     1      63175  501570 304
drug    1      58149  496544 304
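
The zero-df seqe line reflects complete aliasing. As a rough check, assuming the KK data from the original post (M_nested and M_subj are names chosen here, not from the thread), one can compare the two parameterizations and look at the seqe coefficient:

```r
# Data from the original post, rebuilt compactly
KK <- data.frame(
  subj = factor(rep(1:14, each = 2)),
  drug = factor(rep(1:2, times = 14)),
  per  = factor(rep(c(2, 1, 1, 2), times = 7)),
  seqe = factor(rep(c(2, 2, 1, 1), times = 7)),
  Cmax = c(1739, 1633, 1481, 1837, 1780, 2073, 1374, 1629, 1555, 1385,
           1756, 1522, 1566, 1643, 1939, 1615, 1475, 1759, 1388, 1483,
           1127, 1682, 1542, 1247, 1235, 1605, 1598, 1718)
)

M_nested <- lm(Cmax ~ seqe + subj:seqe + per + drug, data = KK)
M_subj   <- lm(Cmax ~ subj + seqe + per + drug, data = KK)

# Both formulas describe the same model space, so fitted values should agree
same_fit <- all.equal(fitted(M_nested), fitted(M_subj))
print(same_fit)

# With subj entered first, seqe has nothing left to explain;
# lm() marks its coefficient as not estimable
print(coef(M_subj)["seqe2"])
```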
And (I think I've said this before on the list): When things are nicely
balanced, aov() can be preferable:
> summary(aov(Cmax ~ per*drug + Error(subj) , data=KK))
Error: subj
          Df Sum Sq Mean Sq F value Pr(>F)
per:drug   1    585     585  0.0111  0.918
Residuals 12 634325   52860
Error: Within
          Df Sum Sq Mean Sq F value Pr(>F)
per        1  63175   63175  1.7293 0.2131
drug       1  58149   58149  1.5917 0.2311
Residuals 12 438395   36533
> Here is my example and result.
> a<-c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14)
> b<-c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2)
> c<-c(2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2)
> d<-c(2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1)
> e<-c(1739,1633,1481,1837,1780,2073,1374,1629,1555,1385,1756,1522,1566,1643,
> 1939,1615,1475,1759,1388,1483,1127,1682,1542,1247,1235,1605,1598,1718 )
> KK<-data.frame(subj=as.factor(a), drug=as.factor(b), per=as.factor(c), seqe=as.factor(d), Cmax=e)
> M<- lm(Cmax ~ seqe+ subj:seqe + per + drug , data=KK)
> anova(M)
> drop1(M, test="F")
>
> The result of Type I SS:
> Analysis of Variance Table
> Response: Cmax
>           Df Sum Sq Mean Sq F value Pr(>F)
> seqe       1    585     585  0.0160 0.9014
> per        1  63175   63175  1.7293 0.2131
> drug       1  58149   58149  1.5917 0.2311
> seqe:subj 12 634325   52860  1.4469 0.2660
> Residuals 12 438395   36533
>
> The result of Type III SS:
> Single term deletions
> Model:
> AUCt ~ seqe + subj:seqe + per + drug
>           Df Sum of Sq      RSS AIC F value  Pr(F)
> <none>                 63208187 442
> per        1   2100484 65308672 441  0.3988 0.5396
> drug       1   4714183 67922370 442  0.8950 0.3628
> seqe:subj 12  35813062 99021249 430  0.5666 0.8308
>
> Best regards,
> HY Lee
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907