[R] anova.lm F test confusion
Ben Bolker
bbolker at gmail.com
Wed Mar 21 03:19:20 CET 2012
msteane <michellesteane <at> hotmail.com> writes:
>
> I am using anova.lm to compare 3 linear models. Model 1 has 1 variable,
> model 2 has 2 variables and model 3 has 3 variables. All models are fitted
> to the same data set.
(I assume these are nested models, otherwise the analysis doesn't
make sense ...)
>
> anova.lm(model1,model2) gives me:
>
>   Res.Df    RSS Df Sum of Sq      F    Pr(>F)
> 1    135 245.38
> 2    134 184.36  1    61.022 44.354 6.467e-10 ***
>
> anova.lm(model1,model2,model3) gives me:
>
>   Res.Df    RSS Df Sum of Sq      F    Pr(>F)
> 1    135 245.38
> 2    134 184.36  1    61.022 50.182 7.355e-11 ***
> 3    133 161.73  1    22.628 18.609 3.105e-05 ***
>
> Why aren't the 2nd row F values from each of the anova tables the same??? I
> thought in each case the 2nd row is comparing model 2 to model 1?
From ?anova.lm:
Normally the F statistic is most appropriate, which compares the mean
square for a row to the residual sum of squares for the largest model
considered.
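
As a quick check of that rule, here is a sketch that uses only the numbers
from the two tables quoted above. The variable names are mine. Each row has
Df = 1, so the mean square for the row is just its Sum of Sq, and the
denominator is the residual mean square (RSS/Res.Df) of the largest model
in the call:

```r
# Values copied from the tables quoted above
ss_row2 <- 61.022        # Sum of Sq for row 2 (model 1 -> model 2), Df = 1
mse2 <- 184.36 / 134     # residual mean square of model 2
mse3 <- 161.73 / 133     # residual mean square of model 3

f_two   <- ss_row2 / mse2   # largest model is model 2: about 44.35
f_three <- ss_row2 / mse3   # largest model is model 3: about 50.18
print(c(f_two, f_three))
```

Both values agree with the row 2 F statistics in the respective tables, up
to rounding of the printed RSS.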
>
> I figured out that for anova.lm(model1,model2)
> F(row2)=Sum of Sq(row2)/MSE of Model 2
>
> and for anova.lm(model1,model2,model3)
> F(row2)=Sum of Sq(row 2)/MSE of Model 3 <-- I don't get why the MSE of
> model 3 is being included if we're comparing Model 2 to Model 1
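See above: the F denominator is always the residual mean square of the
largest model in the call (here model 3), so the row 2 F value changes
once model 3 is added.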
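
If it helps, the behaviour can be reproduced with simulated data. This is
only a sketch: the data frame d, the variables x1, x2, x3 and the
coefficients are made up for illustration and have nothing to do with your
data set.

```r
set.seed(1)
n <- 137  # chosen so that the one-variable model has 135 residual df
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 1 + d$x1 + 0.5 * d$x2 + 0.4 * d$x3 + rnorm(n)

# Three nested models
m1 <- lm(y ~ x1, data = d)
m2 <- lm(y ~ x1 + x2, data = d)
m3 <- lm(y ~ x1 + x2 + x3, data = d)

a12  <- anova(m1, m2)
a123 <- anova(m1, m2, m3)

# Residual mean squares of model 2 and model 3
mse2 <- deviance(m2) / df.residual(m2)
mse3 <- deviance(m3) / df.residual(m3)

# Row 2 F statistic: same numerator, different denominator
ok_two   <- isTRUE(all.equal(a12[2, "F"],  a12[2, "Sum of Sq"]  / mse2))
ok_three <- isTRUE(all.equal(a123[2, "F"], a123[2, "Sum of Sq"] / mse3))
print(c(ok_two, ok_three))
```

If you want the test of model 2 against model 1 that uses model 2's own
residual mean square, call anova(m1, m2) on its own rather than reading
row 2 of the three-model table.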