[R] Anova and unbalanced designs

Sat Jan 24 12:30:04 CET 2009

Dear John,

Thank you for your answer. You are right, I would not have expected a
divergent result either.
I have double-checked: no, the SPSS tests were type III.
When I switch SPSS to type II, I get the same results as from 'Anova' (also
using type-II tests).
My guess was that the somehow weighted means that SPSS shows could be
responsible for this difference,
or that 'Anova' might not be correct for unequal group n's, which I do not
think is the case.
Do you have any further ideas?

Thank you!
Nils
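
One way to look into the discrepancy is to reproduce the within-factor test by hand. With only two within levels, the test reduces to a test on the difference scores w1 - w2, and the three variants below differ only in which mean of those differences is tested against zero. This is a minimal base-R sketch using the data from the original post; the helper name intercept_F and the other variable names are made up for illustration, and the correspondence to what Anova() or SPSS computes internally is an assumption, not something confirmed in this thread.

```r
# Data from the original post
w1 <- c(1, 2, 3, 4, 5, 6, 4, 5, 3, 2)
w2 <- c(3, 4, 3, 4, 3, 4, 3, 4, 5, 4)
between <- factor(c(rep(1, 4), rep(2, 6)))

# With two within levels, the within effect is a test on the differences
d <- w1 - w2

# Hypothetical helper: squared t statistic of the intercept in lm(d ~ between)
# under a given coding of the between factor (the "type-III"-style test)
intercept_F <- function(contr) {
  fit <- lm(d ~ between, contrasts = list(between = contr))
  unname(coef(summary(fit))["(Intercept)", "t value"]^2)
}

# Treatment coding: the intercept is the mean difference in group 1 only
F_treatment <- intercept_F("contr.treatment")

# Sum-to-zero coding: the intercept is the unweighted mean of the group means
F_sum <- intercept_F("contr.sum")

# "Type-II"-style test: grand mean of d against the error of the full model
fit_full <- lm(d ~ between)
mse <- sum(residuals(fit_full)^2) / df.residual(fit_full)
F_type2 <- length(d) * mean(d)^2 / mse

print(c(treatment = F_treatment, sum = F_sum, type2 = F_type2))
```

F_type2 should agree with the F value reported for "within" in the quoted type-II output. The two type-III variants need not agree with each other: treatment coding tests the mean difference in the first group only, whereas sum-to-zero coding tests the unweighted average of the two group means, which is what the SPSS means described in the original post suggest.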

John Fox wrote:
> Dear Nils,
>
> This is a pretty simple design, and I wouldn't have thought that there was
> much room for getting different results. More generally, but not here (since
> there's only one between-subject factor), one shouldn't use
> contr.treatment() with "type-III" tests, as you did. Is it possible that you
> got "type-II" tests from SPSS:
>
> ------ snip ----------
>
>   
>> summary(Anova(betweenanova, idata=with, idesign= ~within, type = "II" ))
>>     
>
> Type II Repeated Measures MANOVA Tests:
>
> ------------------------------------------
>  
> Term: between 
>
>  Response transformation matrix:
>    (Intercept)
> w1           1
> w2           1
>
> Sum of squares and products for the hypothesis:
>             (Intercept)
> (Intercept)         9.6
>
> Sum of squares and products for error:
>             (Intercept)
> (Intercept)          18
>
> Multivariate Tests: between
>                  Df test stat approx F num Df den Df   Pr(>F)  
> Pillai            1  0.347826 4.266667      1      8 0.072726 .
> Wilks             1  0.652174 4.266667      1      8 0.072726 .
> Hotelling-Lawley  1  0.533333 4.266667      1      8 0.072726 .
> Roy               1  0.533333 4.266667      1      8 0.072726 .
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
>
> ------------------------------------------
>  
> Term: within 
>
>  Response transformation matrix:
>    within1
> w1       1
> w2      -1
>
> Sum of squares and products for the hypothesis:
>         within1
> within1     0.4
>
> Sum of squares and products for error:
>          within1
> within1 21.33333
>
> Multivariate Tests: within
>                  Df test stat  approx F num Df den Df  Pr(>F)
> Pillai            1 0.0184049 0.1500000      1      8 0.70864
> Wilks             1 0.9815951 0.1500000      1      8 0.70864
> Hotelling-Lawley  1 0.0187500 0.1500000      1      8 0.70864
> Roy               1 0.0187500 0.1500000      1      8 0.70864
>
> ------------------------------------------
>  
> Term: between:within 
>
>  Response transformation matrix:
>    within1
> w1       1
> w2      -1
>
> Sum of squares and products for the hypothesis:
>          within1
> within1 4.266667
>
> Sum of squares and products for error:
>          within1
> within1 21.33333
>
> Multivariate Tests: between:within
>                  Df test stat  approx F num Df den Df  Pr(>F)
> Pillai            1 0.1666667 1.6000000      1      8 0.24150
> Wilks             1 0.8333333 1.6000000      1      8 0.24150
> Hotelling-Lawley  1 0.2000000 1.6000000      1      8 0.24150
> Roy               1 0.2000000 1.6000000      1      8 0.24150
>
> Univariate Type II Repeated-Measures ANOVA Assuming Sphericity
>
>                     SS num Df Error SS den Df      F  Pr(>F)  
> between         4.8000      1   9.0000      8 4.2667 0.07273 .
> within          0.2000      1  10.6667      8 0.1500 0.70864  
> between:within  2.1333      1  10.6667      8 1.6000 0.24150  
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
>
> ------ snip ----------
>
> I hope this helps,
>  John
>
> ------------------------------
> John Fox, Professor
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> web: socserv.mcmaster.ca/jfox
>
>
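
The remark about contr.treatment() and "type-III" tests can be tried out directly on the posted data. The sketch below refits the model with sum-to-zero contrasts for the between-subject factor before calling Anova(); the object names fit_sum, idata, and res are illustrative, and whether the result then agrees with SPSS is an assumption that would need to be checked against the SPSS output.

```r
# Data from the original post: two repeated measures on 10 subjects
values <- cbind(w1 = c(1, 2, 3, 4, 5, 6, 4, 5, 3, 2),
                w2 = c(3, 4, 3, 4, 3, 4, 3, 4, 5, 4))
between <- factor(c(rep(1, 4), rep(2, 6)))

# Sum-to-zero coding: the intercept becomes the unweighted mean of the
# two group means instead of the mean of the reference group
fit_sum <- lm(values ~ between, contrasts = list(between = "contr.sum"))

# Within-subject design: one factor with two levels
idata <- data.frame(within = factor(1:2))

# Run the type-III tests only if the car package is installed
if (requireNamespace("car", quietly = TRUE)) {
  res <- car::Anova(fit_sum, idata = idata, idesign = ~within, type = "III")
  print(summary(res, multivariate = FALSE))
}
```

Setting the contrasts in the lm() call keeps the change local to this model; changing options(contrasts = ...) globally would be an alternative but affects every later fit in the session.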
>   
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Skotara
>> Sent: January-23-09 12:16 PM
>> To: r-help at r-project.org
>> Subject: [R] Anova and unbalanced designs
>>
>> Dear R-list!
>>
>> My question is related to an Anova including within and between subject
>> factors and unequal group sizes.
>> Here is a minimal example of what I did:
>>
>> library(car)
>> # two repeated measures (w1, w2) on 10 subjects
>> within1 <- c(1,2,3,4,5,6,4,5,3,2)
>> within2 <- c(3,4,3,4,3,4,3,4,5,4)
>> values <- data.frame(w1 = within1, w2 = within2)
>> values <- as.matrix(values)
>> # between-subject factor with unequal group sizes (4 and 6)
>> between <- factor(c(rep(1,4), rep(2,6)))
>> betweenanova <- lm(values ~ between)
>> # within-subject design: one factor with two levels
>> with <- expand.grid(within = factor(1:2))
>> withinanova <- Anova(betweenanova, idata = with,
>>                      idesign = ~as.factor(within), type = "III")
>>
>> I do not know if this is the appropriate method to deal with unbalanced
>> designs.
>>
>> I observed that SPSS calculates everything identically except the main
>> effect of the within factor: here, the SSQ and F-value are very different.
>> With the option "show means" selected, the means for the levels of the
>> within factor in SPSS are the same as:
>> mean(c(mean(values[1:4, "w1"]), mean(values[5:10, "w1"]))) and
>> mean(c(mean(values[1:4, "w2"]), mean(values[5:10, "w2"]))).
>> In other words, they are calculated as if both groups had the
>> same size.
>>
>> I wonder if this is a good solution and, if so, how I could do the same
>> thing in R.
>> However, if SPSS treats the groups as if their sizes were identical,
>> then why not for the interaction as well, which yields the same result
>> as Anova()?
>>
>> Many thanks in advance for your time and help!
>>
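
The two ways of averaging described in the original post can be compared in a few lines of base R. This sketch assumes the data as posted; group_means, unweighted, and weighted are illustrative names. The unweighted version averages the two group means as if both groups had the same size, while the weighted version is the ordinary mean over all 10 subjects.

```r
# Data from the original post
values <- cbind(w1 = c(1, 2, 3, 4, 5, 6, 4, 5, 3, 2),
                w2 = c(3, 4, 3, 4, 3, 4, 3, 4, 5, 4))
between <- factor(c(rep(1, 4), rep(2, 6)))

# Mean of each measure within each group (rows: groups, columns: w1, w2)
group_means <- apply(values, 2, function(x) tapply(x, between, mean))

# Unweighted marginal means: each group counts equally, regardless of size
unweighted <- colMeans(group_means)

# Weighted marginal means: each subject counts equally
weighted <- colMeans(values)

print(rbind(unweighted, weighted))
```

Indexing by the group factor rather than by row positions such as 1:4 and 5:10 keeps the calculation valid if the rows are reordered, and it works whether values is a matrix or a data frame.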