[R] Sums of sq in car package Anova function

Karla Sartor ksartor at montana.edu
Sun Dec 19 17:32:14 CET 2004


John,

Thank you very much for your help.  I think that I have figured out my 
problem.  The levels of one of my factors are "1" and "0".  While this 
didn't matter with the 'anova()' function, it does seem to alter the 
results with the 'Anova' function.  When I changed the levels to 
letters, the tables matched my SPSS output.  As for why the type III 
test in SPSS was nearly identical to the 'anova' function: my unequal 
sample sizes were not drastically different, so perhaps changing to 
type III did not change the results very much?  That was all I could 
come up with at the time.

Here is the code I used:

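# sum-to-zero contrasts must be set before the model is fitted
options(contrasts = c("contr.sum", "contr.poly"))
library(car)

GH = read.table("GH.txt", header = TRUE)
GH.sub = subset(GH, sp == "C")

# log10 biomass, stored in the data frame so that lm() finds it via 'data'
GH.sub$biomass = log10(GH.sub$tot.bio)
GH.sub.fit = lm(biomass ~ am * nbr * barr, data = GH.sub)
print(Anova(GH.sub.fit, type = "III"))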
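
One thing worth checking in code like the above: read.table() reads a column that contains only 0 and 1 as numeric, and lm() then treats it as a covariate rather than as a two-level factor. A minimal sketch with made-up data (the column names echo the model above; the values are invented for illustration only):

```r
# Toy data, invented for illustration; only the column names echo the model above.
toy <- read.table(text = "am nbr tot.bio
0 a 1.2
1 b 2.3
0 c 1.9
1 a 2.8", header = TRUE)

# A 0/1 column comes in as numeric (integer), not as a factor
am.was.numeric <- is.numeric(toy$am)
am.was.numeric

# Convert explicitly so that lm() codes it with the contrasts set in options()
toy$am <- factor(toy$am)
is.factor(toy$am)
levels(toy$am)
```

Running str() on the data frame before fitting is a quick way to see which columns are factors and which are numeric.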

I get this with "1" and "0" factor levels:

Anova Table (Type III tests)

Response: biomass
            Sum Sq  Df   F value    Pr(>F)
(Intercept) 51.943   1 3725.4324 < 2.2e-16 ***
am           2.403   1  172.3630 < 2.2e-16 ***
nbr          0.779   3   18.6347 4.434e-10 ***
barr         0.078   1    5.5803   0.01968 *
am:nbr       0.018   3    0.4284   0.73296
am:barr      0.039   1    2.7826   0.09775 .
nbr:barr     0.044   3    1.0606   0.36834
am:nbr:barr  0.022   3    0.5208   0.66873
Residuals    1.771 127


And this with letter factor levels:

Anova Table (Type III tests)

Response: biomass
            Sum Sq  Df   F value  Pr(>F)
(Intercept) 75.371   1 5405.7202 < 2e-16 ***
am           2.403   1  172.3630 < 2e-16 ***
nbr          1.482   3   35.4357 < 2e-16 ***
barr         0.040   1    2.8410 0.09434 .
am:nbr       0.018   3    0.4284 0.73296
am:barr      0.039   1    2.7826 0.09775 .
nbr:barr     0.051   3    1.2167 0.30643
am:nbr:barr  0.022   3    0.5208 0.66873
Residuals    1.771 127
---
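
In the two tables above, every term that involves am (am, am:nbr, am:barr, am:nbr:barr) is unchanged, while the intercept, nbr, barr and nbr:barr differ. That is the pattern one would expect if am was the variable coded 0/1 and was being treated as a numeric covariate: terms that do not involve it are then tested at am = 0 rather than averaged over its two levels. The sketch below reproduces the same effect on simulated data. The data and the names trt and grp are hypothetical, and base R's drop1() with scope = . ~ . is used to obtain the marginal (type-III-style) tests so that the example does not depend on car:

```r
# Hypothetical simulated data; 'trt' and 'grp' are illustration names only.
options(contrasts = c("contr.sum", "contr.poly"))
set.seed(1)
n   <- 120
trt <- sample(c(0, 1), n, replace = TRUE)   # numeric 0/1 variable
grp <- factor(sample(c("g1", "g2", "g3"), n, replace = TRUE))
g   <- as.numeric(grp)
y   <- 1 + 0.5 * trt + g + 0.5 * trt * g + rnorm(n)

fit.num <- lm(y ~ trt * grp)           # trt used as a numeric covariate
fit.fac <- lm(y ~ factor(trt) * grp)   # trt used as a sum-coded factor

# Marginal tests: each term dropped in turn, all other terms kept
t.num <- drop1(fit.num, scope = . ~ ., test = "F")
t.fac <- drop1(fit.fac, scope = . ~ ., test = "F")

# Sums of squares side by side: main effect of trt, main effect of grp, interaction
ss.num <- t.num[c("trt", "grp", "trt:grp"), "Sum of Sq"]
ss.fac <- t.fac[c("factor(trt)", "grp", "factor(trt):grp"), "Sum of Sq"]
cbind(numeric = ss.num, factor = ss.fac)
```

The rows for trt and trt:grp should agree between the two fits, while the grp row should not, which mirrors the difference between the two tables above.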

SPSS gives: 

Tests of Between-Subjects Effects
Dependent Variable: lot10.tot.bio
Source            Type III Sum of Squares   df  Mean Square         F  Sig.
Corrected Model                  4.002(a)   15         .267    19.133  .000
Intercept                          75.371    1       75.371  5405.720  .000
am                                  2.403    1        2.403   172.363  .000
nbr                                 1.482    3         .494    35.436  .000
barr                                 .040    1         .040     2.841  .094
am * nbr                             .018    3         .006      .428  .733
am * barr                            .039    1         .039     2.783  .098
nbr * barr                           .051    3         .017     1.217  .306
am * nbr * barr                      .022    3         .007      .521  .669
Error                               1.771  127         .014
Total                              80.796  143
Corrected Total                     5.772  142
a    R Squared = .693 (Adjusted R Squared = .657)


Am I missing something else?  I don't know the best way to post the data 
set, so I will send it to John and maybe he can post it if it is of 
interest.

Thanks again!

Karla

Karla Sartor
Montana State University - LRES
ksartor at montana.edu



 


John Fox wrote:

>Dear Karla,
>
>I suggested last night that you send me further information, but decided
>this morning to try out a reproducible example of my own:
>
>  
>
>>set.seed(12345)
>>A <- factor(sample(c("a1", "a2", "a3"), 100, replace=TRUE))
>>B <- factor(sample(c("b1", "b2"), 100, replace=TRUE))
>>C <- factor(sample(c("c1", "c2", "c3"), 100, replace=TRUE))
>>mu <- array(1:18, c(3,2,3))
>>a <- as.numeric(A)
>>b <- as.numeric(B)
>>c <- as.numeric(C)
>>y <- mu[cbind(a,b,c)] + rnorm(100)
>>library(car)
>># set the contrasts before fitting: type-III tests depend on the coding
>>options(contrasts=c("contr.sum", "contr.poly"))
>>mod <- lm(y ~ A*B*C)
>>Anova(mod, type="II")
>>    
>>
>Anova Table (Type II tests)
>
>Response: y
>           Sum Sq Df   F value    Pr(>F)    
>A           65.88  2   38.4098 1.696e-12 ***
>B          196.47  1  229.0775 < 2.2e-16 ***
>C         2441.00  2 1423.0809 < 2.2e-16 ***
>A:B          0.22  2    0.1259    0.8819    
>A:C          6.92  4    2.0174    0.0996 .  
>B:C          0.87  2    0.5095    0.6027    
>A:B:C        2.89  4    0.8432    0.5018    
>Residuals   70.33 82                        
>---
>Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 
>  
>
>>Anova(mod, type="III")
>>    
>>
>Anova Table (Type III tests)
>
>Response: y
>            Sum Sq Df   F value    Pr(>F)    
>(Intercept) 7830.2  1 9129.8959 < 2.2e-16 ***
>A             55.7  2   32.4913 4.059e-11 ***
>B            189.5  1  221.0076 < 2.2e-16 ***
>C           2124.0  2 1238.2549 < 2.2e-16 ***
>A:B            0.2  2    0.0942    0.9102    
>A:C            5.9  4    1.7323    0.1507    
>B:C            0.6  2    0.3417    0.7115    
>A:B:C          2.9  4    0.8432    0.5018    
>Residuals     70.3 82                        
>---
>Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 
>
>
>I don't have a working copy of SPSS anymore, but here's what SAS does with
>this example:
>
>      Source   DF    Type II SS    Mean Square    F Value    Pr > F
>
>      A         2     65.884048      32.942024      38.41    <.0001
>      B         1    196.467384     196.467384     229.08    <.0001
>      A*B       2      0.215883       0.107942       0.13    0.8819
>      C         2   2440.998718    1220.499359    1423.08    <.0001
>      A*C       4      6.920872       1.730218       2.02    0.0996
>      B*C       2      0.873945       0.436973       0.51    0.6027
>      A*B*C     4      2.892820       0.723205       0.84    0.5018
>
>
>      Source   DF   Type III SS    Mean Square    F Value    Pr > F
>
>      A         2     55.732128      27.866064      32.49    <.0001
>      B         1    189.546201     189.546201     221.01    <.0001
>      A*B       2      0.161608       0.080804       0.09    0.9102
>      C         2   2123.968177    1061.984089    1238.25    <.0001
>      A*C       4      5.942845       1.485711       1.73    0.1507
>      B*C       2      0.586168       0.293084       0.34    0.7115
>      A*B*C     4      2.892820       0.723205       0.84    0.5018
>
>So, as you can see, the results check.
>
>It's hard to know what to make of this without more information about what
>you did. Much as I'm not an admirer of SPSS, I doubt whether it computes
>type-III sums of squares incorrectly, so I suspect something wrong with
>either your SPSS commands or your R commands.
>
>I hope this helps,
> John
>
>--------------------------------
>John Fox
>Department of Sociology
>McMaster University
>Hamilton, Ontario
>Canada L8S 4M4
>905-525-9140x23604
>http://socserv.mcmaster.ca/jfox 
>-------------------------------- 
>
>  
>
>>-----Original Message-----
>>From: r-help-bounces at stat.math.ethz.ch 
>>[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Karla Sartor
>>Sent: Saturday, December 18, 2004 6:43 PM
>>To: r-help at stat.math.ethz.ch
>>Subject: [R] Sums of sq in car package Anova function
>>
>>Hello R users,
>>
>>I am trying to run a three factor ANOVA on a data set with 
>>unequal sample sizes.
>>
>>I fit the data to a 'lm' object and used the Anova function 
>>from the 'car' package with the 'type=III' option to get type 
>>III sums of squares.  I also set the contrast coding option 
>>to 'options(contrasts = c("contr.sum", "contr.poly"))' as 
>>cautioned in John Fox's book "An R and S-PLUS Companion to 
>>Applied Regression".
>>
>>Is there anything else that I need to consider when using the 
>>type III option with the Anova function?
>>
>>When I run the same data set in SPSS with General Linear 
>>Model and type III  sums of squares, the sums of squares are 
>>different enough that one of the main effect terms is 
>>significant in the R table and not in the SPSS table.  I 
>>found a similar discrepancy with a different data set, only 
>>SPSS showed a significant interaction effect, while the 
>>'Anova' function did not.
>>
>>I also compared the results from SPSS with those from the 'anova' 
>>function in the base package, and the results are nearly 
>>identical.  I would expect the two methods with type III sums 
>>of squares to be more similar, does anyone have any ideas as 
>>to why that was not the case?  I am hoping to not go back to 
>>SPSS at this point, so am trying to decide which of the two R 
>>functions is most appropriate for me (and defensible, 
>>considering the unequal sample sizes).
>>
>>Thank you in advance for any ideas you may have!
>>
>>Karla
>>
>>Karla Sartor
>>Montana State University - LRES
>>ksartor at montana.edu
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide! 
>>http://www.R-project.org/posting-guide.html
>>    
>>
>
>
>  
>



