[R] Reproducing SAS GLM in R

Peter Dalgaard p.dalgaard at biostat.ku.dk
Tue Feb 22 14:25:17 CET 2005


Bela Bauer <bela_b at gmx.net> writes:

> Hi,
> 
> I'm still trying to figure out that GLM procedure in SAS.
> Let's start with the simple example:
> 
> PROC GLM;
> MODEL col1 col3 col5 col7 col9 col11 col13 col15 col17 col19 col21
> col23 =/nouni;
> repeated roi 6, ord 2/nom mean;
> TITLE 'ABDERUS lat ACC 300-500';
> 
> That's the same setup that I had in my last email. I have three
> factors: facSubj,facCond and facRoi. I had this pretty much figured
> out with three lengthy calls to lm(), but I have to extend my code to
> also work on models with four, five or even six factors, so that
> doesn't seem like a practical method any more. I've tried with the
> following code with glm(),anova() and drop1() (I use sum contrasts to
> reproduce those Type III SS values); I've also tried many other
> things, but this is the only somewhat reasonable result I get with glm.
> 
>  > options(contrasts=c("contr.sum","contr.poly"))
>  > test.glm <- glm(vecData ~ (facCond+facSubj+facRoi)^2)
>  > anova(test.glm,test="F")

Why glm() and not lm()??

However, the real crucial thing here is that SAS is de facto fitting a
mixed-effects model, with random effects being everything with Subject
inside. So, except for the GG/HF business, you should get something
similar if you try

summary(aov(vecData ~ facCond*facRoi + Error(facSubj/(facCond*facRoi)) ))

(If you want a closer parallel to the SAS approach, you have to wait
for the mlm extensions that I have planned. I'll get to the GG/HF
issue Real Soon Now.)

> Analysis of Deviance Table
> 
> Model: gaussian, link: identity
> 
> Response: vecData
> 
> Terms added sequentially (first to last)
> 
> 
>                   Df Deviance Resid. Df Resid. Dev       F    Pr(>F)
> NULL                               239    1429.30
> facCond           1     2.21       238    1427.09  3.0764   0.08266 .
> facSubj          19   685.94       219     741.16 50.2463 < 2.2e-16 ***
> facRoi            5   258.77       214     482.38 72.0316 < 2.2e-16 ***
> facCond:facSubj  19   172.70       195     309.68 12.6510 < 2.2e-16 ***
> facCond:facRoi    5    10.37       190     299.31  2.8867   0.01803 *
> facSubj:facRoi   95   231.05        95      68.26  3.3850 4.266e-09 ***
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>  > drop1(test.glm,scope=.~.,test="F")
> Single term deletions
> 
> Model:
> vecData ~ (facCond + facSubj + facRoi)^2
>                  Df Deviance     AIC F value     Pr(F)
> <none>                68.26  671.33
> facCond          1    70.47  676.97  3.0764   0.08266 .
> facSubj         19   754.19 1209.89 50.2463 < 2.2e-16 ***
> facRoi           5   327.03 1037.35 72.0316 < 2.2e-16 ***
> facCond:facSubj 19   240.96  936.05 12.6510 < 2.2e-16 ***
> facCond:facRoi   5    78.63  695.27  2.8867   0.01803 *
> facSubj:facRoi  95   299.31  836.09  3.3850 4.266e-09 ***
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
> 
> Now, unfortunately this just isn't the output of SAS (roi corresponds
> to facRoi, ord corresponds to facCond)
> 
> > Source                    DF   Type III SS   Mean Square  F Value  Pr > F
> >  roi                        5   258.7726806    51.7545361    21.28
> > <.0001
> >  Error(roi)                95   231.0511739     2.4321176
> >                                            Adj Pr > F
> >                   Source                 G - G     H - F
> >                   roi                   <.0001    <.0001
> >                   Error(roi)
> >                    Greenhouse-Geisser Epsilon    0.5367
> >                    Huynh-Feldt Epsilon           0.6333
> >  Source                    DF   Type III SS   Mean Square  F Value
> > Pr > F
> >  ord                        1     2.2104107     2.2104107     0.24
> > 0.6276
> >  Error(ord)                19   172.7047994     9.0897263
> >  Source                    DF   Type III SS   Mean Square  F Value
> > Pr > F
> >  roi*ord                    5   10.37034918    2.07406984     2.89
> > 0.0180
> >  Error(roi*ord)            95   68.25732255    0.71849813
> >                                            Adj Pr > F
> >                   Source                 G - G     H - F
> >                   roi*ord               0.0663    0.0591
> >                   Error(roi*ord)
> >                    Greenhouse-Geisser Epsilon    0.4116
> >                    Huynh-Feldt Epsilon           0.4623
> 
> As you can see, I get a correct p and F values for the facCond:facRoi
> interaction, but everything else doesn't come out right. The drop1()
> call gives me the correct degrees of freedom, but residual degrees of
> freedom seem to be wrong.
> 
> Could you give me any hints how I could get to the SAS results? For
> the lm() calls that work (in special cases), refer to my posting from
> last Friday.
> I also have a 4-factorial example, and I've been told that people
> around here do 5- or 6-factorial multivariant ANOVAs, too, so I need a
> general solution.
> 
> Thanks a lot!
> 
> Bela
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> 

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907




More information about the R-help mailing list