[R] can I do this with R?

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu May 29 08:03:06 CEST 2008


On Wed, 28 May 2008, Xiaohui Chen wrote:

> The step or stepAIC functions do the job. You can opt to use BIC by changing the 
> multiplier of the penalty.
>
> I think AIC and BIC are not limited to comparing two pre-defined models,

Indeed, the original Akaike papers were for a finite nested sequence of 
models.

> they can be used as model search criteria. You could enumerate the 
> information criteria for all possible models if the full model is 
> relatively small. But this does not generally scale to practical 
> high-dimensional applications. Hence, it is often only possible to find a 
> 'best' model that is a local optimum, e.g. as measured by AIC/BIC.
>
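
The enumeration described above can be sketched as follows. This is only an
illustration under stated assumptions: the data are simulated, and the names
dat, vars, subsets, score and results are made up for the example. With four
candidate variables there are 2^4 = 16 models, which is why this does not
scale to large problems.

```r
# Simulated data (an assumption for illustration): only x1 affects y.
set.seed(2)
n <- 150
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.3 + 1.0 * dat$x1))

vars <- c("x1", "x2", "x3", "x4")

# All 16 subsets of the candidates, including the empty (intercept-only) one.
subsets <- c(list(character(0)),
             unlist(lapply(seq_along(vars),
                           function(m) combn(vars, m, simplify = FALSE)),
                    recursive = FALSE))

# Fit one logistic regression per subset and record both criteria.
score <- function(s) {
  f <- reformulate(if (length(s)) s else "1", response = "y")
  fit <- glm(f, family = binomial, data = dat)
  data.frame(model = paste(deparse(f), collapse = ""),
             AIC = AIC(fit), BIC = BIC(fit))
}
results <- do.call(rbind, lapply(subsets, score))

# Models ordered from lowest to highest AIC.
results[order(results$AIC), ]
```

Sorting by results$BIC instead gives the BIC ranking over the same 16 fits.
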
> On the other hand, I would not say that BIC over-penalizes. 
> Instead, I think AIC usually under-penalizes larger models, in that it has a 
> positive probability of incorporating irrelevant variables in linear models.

This depends on the aim, and the aims are different.  AIC is aiming at 
good predictions, for which adding 'irrelevant variables' is a small cost 
but leaving out relevant ones is a large cost.  So it will tend to 
over-fit, that is, to include all the relevant variables and some others.

BIC on the other hand is aimed at choosing the true model (assuming there 
is one).
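
As a rough sketch of how the two penalties enter a forward search for a
logistic regression: step() takes the multiplier as its argument k, with
k = 2 giving AIC and k = log(n) giving BIC. The simulated data and the
names dat, null_fit, upper, fit_aic and fit_bic are assumptions made for
illustration only.

```r
# Simulated data (an assumption for illustration): y depends on x1 and x2;
# x3 to x6 are pure noise.
set.seed(1)
n <- 200
dat <- as.data.frame(matrix(rnorm(n * 6), n, 6))
names(dat) <- paste0("x", 1:6)
dat$y <- rbinom(n, 1, plogis(-0.5 + 1.2 * dat$x1 - 0.8 * dat$x2))

# Start from the intercept-only model and allow any of the six terms.
null_fit <- glm(y ~ 1, family = binomial, data = dat)
upper <- ~ x1 + x2 + x3 + x4 + x5 + x6

# Forward selection with the AIC penalty (k = 2, the default).
fit_aic <- step(null_fit, scope = list(lower = ~ 1, upper = upper),
                direction = "forward", k = 2, trace = 0)

# The same search with the BIC penalty, k = log(n).
fit_bic <- step(null_fit, scope = list(lower = ~ 1, upper = upper),
                direction = "forward", k = log(n), trace = 0)

formula(fit_aic)
formula(fit_bic)
```

Since log(200) is about 5.3, the second run demands a larger drop in
deviance than the first before it adds a term.
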

> X
>
> Frank E Harrell Jr wrote:
>> Smita Pakhale wrote:
>>> Hi Maria,
>>> 
>>> But why do you want to use forwards or backwards
>>> methods? These are all 'backward' methods of modeling.
>>> Try using AIC or BIC. BIC is much better than AIC.
>>> And, you do not have to believe me or anyone else on
>>> this. 
>> 
>> How does that help? BIC gives too much penalization in certain contexts; 
>> both AIC and BIC were designed to compare two pre-specified models. They 
>> were not designed to fix problems of stepwise variable selection.
>> 
>> Frank
>> 
>>> 
>>> Just make a small data set with a few variables with
>>> known relationship amongst them. With this simulated
>>> data set, use all your modeling methods: backwards,
>>> forwards, AIC, BIC etc. and then see which one gives
>>> you an answer closest to the truth. The beauty of using
>>> a simulated dataset is that you 'know' the truth, as
>>> you are the 'creator' of it!
>>> 
>>> smita
>>> 
>>> --- Charilaos Skiadas <cskiadas at gmail.com> wrote:
>>> 
>>>> A google search for "logistic regression with
>>>> stepwise forward in r" returns the following post:
>>>> 
>>>> 
>>>> https://stat.ethz.ch/pipermail/r-help/2003-December/043645.html
>>>> Haris Skiadas
>>>> Department of Mathematics and Computer Science
>>>> Hanover College
>>>> 
>>>> On May 28, 2008, at 7:01 AM, Maria wrote:
>>>> 
>>>>> Hello,
>>>>> I am just about to install R and was wondering about a few things.
>>>>> I have only worked in Matlab because I wanted to do a logistic
>>>>> regression. However Matlab does not do logistic regression with
>>>>> the stepwise forward method. Therefore I thought about testing R. So my
>>>>> question is:
>>>>> can I do logistic regression with stepwise forward in R?
>>>>> Thanks /M

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595