[R] Doing partial-f test for stepwise regression

Sun Apr 1 16:23:18 CEST 2007

Petr Klasterecky wrote:
> What about reading the help page ?anova ...?
> 
>  >>>
> When given a sequence of objects, 'anova' tests the models against
>       one another in the order specified.
> <<<
> 
> Generally you almost never fit a full model (including all possible 
> interactions etc) - no one can interpret such complicated models. Anova 
> gives you a comparison between a broader model (the first argument to 
> anova) and its submodel(s).

True, you might not fit a model with high-order interactions, but the 
full pre-specified model is the only one whose standard errors and test 
statistics work as advertised.

Frank

> 
> Petr
> 
> zhuanyi at zay.name wrote:
>> Hello all,
>> I am trying to find an optimal linear model using stepwise
>> regression, which requires a partial F-test. I did some Googling and
>> found that someone seems to have asked this question before:
>>
>> Jim Milks <jrclmilks at joimail.com> writes: 
>>> Dear all: 
>>>
>>> I have a regression model that has collinearity problems (between 
>>> three regressor variables). I need an F-test that will allow me to 
>>> compare full (with all variables) and partial models (minus 
>>> one or more variables). The general F-test formula I'm using is: 
>>>
>>> F = {[SS(full model) - SS(reduced model)] / (#variables taken out)} / 
>>> MSS(full model) 
>>>
>>> Unfortunately, the ANOVA table parses the SS and MSS between the 
>>> variables and does not give the statistics for the regression model as 
>>> a whole, otherwise I'd do this by hand. 
>>>
>>> So, really, I have two questions: 1) Can I just add up all the SS and 
>>> MSS for all the variables to get the model SS and MSS and 2) Are 
>>> there any functions or packages I can use to calculate the F-statistic? 
>>>
>>> Just use anova(model1, model2). 
>>> (One potential catch: Make sure that both models are fitted to the same
>>> data set. Missing values in predictors may interfere.) 
>> However, in the answer provided by Mr. Peter Dalgaard (use
>> anova(model1, model2)), I could not understand what model1 and model2
>> are supposed to refer to. Which one is supposed to be the full model
>> and which one the partial model? Or does it not matter?
>>
>> Thanks in advance for help from anyone!
>>
>> Regards,
>> Anyi Zhu
> 

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University