[R] can I do this with R?

Frank E Harrell Jr f.harrell at vanderbilt.edu
Thu May 29 14:39:27 CEST 2008


Smita Pakhale wrote:
> Using any 'significance level' is, I think, the main
> problem with the stepwise variable selection method.
> Even in 'normal' circumstances the interpretation of a
> p-value is topsy-turvy. You can only imagine what
> happens to that interpretation in the process of
> variable selection... you no longer know what the
> significance level means, if anything at all.
> smita

True, and AIC/BIC are just translations of P-values.
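
As a rough illustration of that point (a sketch, not something from the original posts): for a single 1-df candidate variable, selecting by an information criterion with penalty multiplier k amounts to entering the variable when its likelihood ratio chi-square exceeds k, which is a test at the level computed below. The helper name implied_alpha and the sample size n = 200 are made up for the example.

```r
# Per-variable significance level implied by an information criterion.
# Adding a term with `df` parameters lowers the criterion when the
# likelihood ratio chi-square exceeds k * df
# (k = 2 for AIC, k = log(n) for BIC).
implied_alpha <- function(k, df = 1) {
  pchisq(k * df, df = df, lower.tail = FALSE)
}

alpha_aic <- implied_alpha(2)        # about 0.157, whatever the sample size
n <- 200                             # hypothetical sample size
alpha_bic <- implied_alpha(log(n))   # about 0.021 when n = 200

print(c(AIC = alpha_aic, BIC = alpha_bic))
```

So for 1-df terms, AIC-based stepwise selection behaves like selection with an entry level near 0.157, and BIC like selection with an entry level that shrinks as n grows.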

Frank

> 
> --- Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote:
> 
>> Xiaohui Chen wrote:
>>> The step or stepAIC functions do the job. You can opt to use BIC by
>>> changing the penalty multiplier.
>>>
>>> I think AIC and BIC are not limited to comparing two pre-defined
>>> models; they can also be used as model search criteria. You could
>>> enumerate the information criteria for all possible models if the
>>> full model is relatively small. But this does not generally scale to
>>> practical high-dimensional applications. Hence, it is often only
>>> possible to find a 'best' model at a local optimum, e.g. as measured
>>> by AIC/BIC.
>>
>> Sure you can use them that way, and they may perform better than other
>> measures, but the resulting model will be highly biased (regression
>> coefficients biased away from zero).  AIC and BIC were not designed to
>> be used in this fashion originally.  Optimizing AIC or BIC will not
>> produce well-calibrated models as does penalizing a large model.
>>
>>> On the other hand, I wouldn't say that BIC over-penalizes. Instead, I
>>> think AIC usually under-penalizes larger models, in the sense that it
>>> has a positive probability of incorporating irrelevant variables in
>>> linear models.
>> If you put some constraints on the process (e.g., if using AIC to find
>> the optimum penalty in penalized maximum likelihood estimation), AIC
>> works very well and BIC results in far too much shrinkage
>> (underfitting).  If using a dangerous process such as stepwise variable
>> selection, the more conservative BIC may be better in some sense, worse
>> in others.  The main problem with stepwise variable selection is the
>> use of significance levels for entry below 1.0 and especially below 0.1.
>>
>> Frank
>>
>>> X
>>>
>>> Frank E Harrell Jr wrote:
>>>> Smita Pakhale wrote:
>>>>> Hi Maria,
>>>>>
>>>>> But why do you want to use forwards or backwards
>>>>> methods? These are all 'backward' methods of modeling.
>>>>> Try using AIC or BIC. BIC is much better than AIC.
>>>>> And, you do not have to believe me or anyone else on
>>>>> this.
>>>> How does that help? BIC gives too much penalization in certain
>>>> contexts; both AIC and BIC were designed to compare two pre-specified
>>>> models. They were not designed to fix problems of stepwise variable
>>>> selection.
>>>>
>>>> Frank
>>>>
>>>>> Just make a small data set with a few variables with
>>>>> known relationships amongst them. With this simulated
>>>>> data set, use all your modeling methods: backwards,
>>>>> forwards, AIC, BIC etc. and then see which one gives
>>>>> you an answer closest to the truth. The beauty of using
>>>>> a simulated dataset is that you 'know' the truth, as
>>>>> you are the 'creator' of it!
>>>>>
>>>>> smita
>>>>>
>>>>> --- Charilaos Skiadas <cskiadas at gmail.com> wrote:
>>>>>> A google search for "logistic regression with
>>>>>> stepwise forward in r" returns the following post:
>>>>>>
>>>>>> https://stat.ethz.ch/pipermail/r-help/2003-December/043645.html
>>>>>> Haris Skiadas
>>>>>> Department of Mathematics and Computer Science
>>>>>> Hanover College
>>>>>>
>>>>>> On May 28, 2008, at 7:01 AM, Maria wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>> I am just about to install R and was wondering about a few things.
>>>>>>> I have only worked in Matlab because I wanted to do a logistic
>>>>>>> regression. However, Matlab does not do logistic regression with
>>>>>>> the stepwise forward method. Therefore I thought about testing R.
>>>>>>> So my question is:
>>>>>>> can I do logistic regression with stepwise forward in R?
>>>>>>> Thanks /M
>>
>> -- 
>> Frank E Harrell Jr   Professor and Chair           School of Medicine
>>                       Department of Biostatistics   Vanderbilt University
>>


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
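
To tie the thread back to Maria's original question, here is a minimal sketch of forward selection for a logistic model with step(), using the default AIC penalty (k = 2) and the BIC penalty (k = log(n)). The data are simulated, and the variable names, coefficients, and sample size are assumptions chosen only for illustration; none of this comes from the original posts. As noted in the discussion above, coefficients of a model selected this way tend to be biased away from zero, so the result should be read with caution.

```r
set.seed(1)
n <- 200                                   # hypothetical sample size
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n),
                x3 = rnorm(n), x4 = rnorm(n))
# In this made-up example only x1 and x2 affect the outcome
d$y <- rbinom(n, 1, plogis(-0.5 + 1.0 * d$x1 + 0.5 * d$x2))

# Start from the intercept-only logistic model
null_fit <- glm(y ~ 1, family = binomial, data = d)
scope <- list(lower = ~ 1, upper = ~ x1 + x2 + x3 + x4)

# Forward selection with the AIC penalty (k = 2 is the default)
fwd_aic <- step(null_fit, scope = scope, direction = "forward", trace = 0)

# Same search with the BIC penalty, k = log(n)
fwd_bic <- step(null_fit, scope = scope, direction = "forward",
                k = log(n), trace = 0)

print(formula(fwd_aic))
print(formula(fwd_bic))
```

stepAIC in the MASS package accepts the same k argument. Because every candidate here is a 1-df term and log(n) > 2, the BIC search follows the same path as the AIC search and simply stops no later, so its selected terms are a subset of the AIC ones.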


