[R] Selecting Best Model in an anova.

Thu Mar 25 08:36:00 CET 2010

Hello,

I have a simple theoretical question about regression...

Let's suppose I have this:

Model 1:
Y = B0 + B1*X1 + B2*X2 + B3*X3
and
Model 2:
Y = B0 + B2*X2 + B3*X3
i.e., in R:
Model1 = lm(Y~X1+X2+X3)
Model2 = lm(Y~X2+X3)

The adjusted R-squared for Model1 is 0.9 and the adjusted R-squared for Model2 is 0.99, among many other significant improvements.
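
For reference, here is a minimal sketch of how the two fits and their adjusted R-squared values could be obtained. The data are simulated purely for illustration (the seed, sample size and coefficients are assumptions, not my real data), so the numbers will not match the 0.9 and 0.99 above.

```r
# Simulated data, for illustration only (not the real data set)
set.seed(1)
n  <- 50
X1 <- rnorm(n)
X2 <- rnorm(n)
X3 <- rnorm(n)
Y  <- 1 + 2 * X2 - 1.5 * X3 + rnorm(n)   # assumption: X1 has no real effect

Model1 <- lm(Y ~ X1 + X2 + X3)   # long model
Model2 <- lm(Y ~ X2 + X3)        # short model, nested in Model1

# Adjusted R-squared of each fit, taken from the lm summary
adjR2 <- c(Model1 = summary(Model1)$adj.r.squared,
           Model2 = summary(Model2)$adj.r.squared)
print(adjR2)
```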

And I want to use an ANOVA test to choose the better one:

H0: B1 = 0
H1: B1 != 0

Test = anova(Model2, Model1)

How do I know which model wins? (I'm using a significance level of 0.1)...

My guess is that:
If the p-value in Test (the Pr(>F) column) is greater than 0.1, then I don't reject H0, so Model2 is better; otherwise I reject H0, so Model1 is better?
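
The rule I have in mind could be sketched as follows. This is only an illustration on simulated data (the seed, sample size and coefficients are made up), with `alpha` standing for whatever significance level is chosen. Base R's anova() stores the p-value of the F test in the Pr(>F) column; the first row of that column is NA, so the second row is the one to read.

```r
# Simulated data, for illustration only (not the real data set)
set.seed(1)
n  <- 50
X1 <- rnorm(n); X2 <- rnorm(n); X3 <- rnorm(n)
Y  <- 1 + 2 * X2 - 1.5 * X3 + rnorm(n)

Model1 <- lm(Y ~ X1 + X2 + X3)   # long model
Model2 <- lm(Y ~ X2 + X3)        # short model (H0: B1 = 0)

# F test comparing the nested models
Test <- anova(Model2, Model1)
p_value <- Test[["Pr(>F)"]][2]   # row 1 is NA, row 2 holds the test

alpha  <- 0.1                    # chosen significance level
chosen <- if (p_value > alpha) "Model2" else "Model1"
print(chosen)

# With a single dropped coefficient, the t-test of X1 in Model1
# gives the same p-value as the F test above
p_t <- summary(Model1)$coefficients["X1", "Pr(>|t|)"]
print(c(F_test = p_value, t_test = p_t))
```

Changing `alpha` changes only the threshold in the comparison; the p-value itself does not depend on it.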

My teacher once said: "If the p-value is greater than 0.05 we choose the short model and otherwise we choose the long model", but she never said how the p-value and the significance level were related in this test... Actually, she never talked about the significance level at all...

In short: Should I consider the significance level or always use 0.05 for this kind of test?

Thanks a lot!

Hector Guilarte