[R] aov() and anova() making faulty F-tests

peter dalgaard pdalgd at gmail.com
Wed Mar 6 11:16:38 CET 2013


On Mar 6, 2013, at 03:56 , Rolf Turner wrote:

> 
> 
> Your subject line is patent nonsense.  The aov() and anova() functions
> have been around for decades.  If they were doing something wrong
> it would have been noticed long since.
> 
> You should realize that the fault is in your understanding, not in these
> functions.
> 
> I cannot really follow your convoluted and messy code, but it would
> appear that you want to consider "M" and "I" to be random effects.

Only M and M:I, AFAICT.  And, yes, it is messy; in particular, I refuse to believe that y~M*I has generated output with lowercase m and i!

> 
> Where have you informed aov() as to the presence of these
> random effects?

To be specific, try y~I + Error(M + M:I). Without the random effects, aov() is just telling you that there is a highly significant interaction between M and I, and beyond that, no sensible comparisons can be made.
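
A minimal sketch of that suggestion, reusing the data from the original post quoted below. Two things here are assumptions rather than part of the original code: M is wrapped in factor() so that the Error() term is built from a factor, and a cross-check is added that recomputes the same ratio by hand from the fixed-effects ANOVA table. The names fit, tab, F_I, and p_I are illustrative only.

```r
# Data as given in the original post; factor(M) is an added safeguard.
M <- factor(rep(c("M1", "M2"), each = 18))
I <- as.ordered(rep(rep(c(5, 10, 15), each = 6), 2))
y <- c(44, 39, 48, 40, 43, 41, 27, 20, 25, 21, 28, 22,
       35, 30, 29, 34, 31, 38, 12, 7, 6, 11, 7, 12,
       15, 10, 12, 17, 11, 13, 22, 15, 27, 22, 21, 19)
dat <- data.frame(M, I, y)

# Declare M and M:I as error strata, so that I is tested
# against the M:I mean square instead of the residual.
fit <- aov(y ~ I + Error(M + M:I), data = dat)
print(summary(fit))

# Cross-check by hand: take the mean squares from the
# fixed-effects table and form MS(I) / MS(M:I) directly.
tab <- anova(lm(y ~ M * I, data = dat))
F_I <- tab["I", "Mean Sq"] / tab["M:I", "Mean Sq"]
p_I <- pf(F_I, tab["I", "Df"], tab["M:I", "Df"], lower.tail = FALSE)
cat("F =", F_I, " p =", p_I, "\n")
```

The I line in the M:I stratum of summary(fit) should agree with the hand-computed F_I on 2 and 2 degrees of freedom.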

> 
>    cheers,
> 
>        Rolf Turner
> 
> On 03/06/2013 03:36 PM, PatGauthier wrote:
>> Dear useRs,
>> 
>> I've just encountered a serious problem involving the F-test being carried
>> out in aov() and anova(). In the provided example, aov() is not making the
>> correct F-test for a hypothesis involving the expected mean square (EMS) of
>> a factor divided by the EMS of another factor (i.e., instead of the error
>> EMS).
>> 
>> Here is the example:
>> 
>> 
>> Source     Expected Mean Square           df
>> Mi         σ² + 18σ²(M)                    1
>> Ij         σ² + 6σ²(MI) + 12Φ(I)           2
>> MIij       σ² + 6σ²(MI)                    2
>> ε(ijk)l    σ²                             30
>> 
>> The clear test for Ij is EMS(I) / EMS(MI) -  F(2,2)
>> 
>> However, observe the following example carried out in R,
>> 
>> M <- rep(c("M1", "M2"), each = 18)
>> I <- as.ordered(rep(rep(c(5,10,15), each = 6), 2))
>> y <-
>> c(44,39,48,40,43,41,27,20,25,21,28,22,35,30,29,34,31,38,12,7,6,11,7,12,15,10,12,17,11,13,22,15,27,22,21,19)
>> dat <- data.frame(M, I, y)
>> summary(aov(y~M*I, data = dat))
>>             Df Sum Sq Mean Sq F value   Pr(>F)
>> m            1 3136.0  3136.0  295.85  < 2e-16 ***
>> i            2  513.7   256.9   24.23 5.45e-07 ***
>> m:i          2  969.5   484.7   45.73 7.77e-10 ***
>> Residuals   30  318.0    10.6
>> ---
>> 
>> In this example aov has taken the F-ratio of MS(I) / MS(ε) -  F(2,30) =
>> 24.23 with F-crit = qf(0.95,2,30) = 3.32 -- significant
>> 
>> However, as stated above,  the correct F-ratio is MS(I) / MS(MI) -  F(2,2) =
>> 0.53 with F-crit = qf(0.95,2,2) = 19 -- non-significant
>> 
>> Why is aov() miscalculating the F-ratio, and is there a way to fix this
>> without prior knowledge of the appropriate test (e.g., EMS(I)/EMS(MI))?
>> 
>> Thanks for your help,
> 

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


