[R] Stepwise GLM selection by LRT?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Thu Jul 12 21:33:00 CEST 2007
On Thu, 12 Jul 2007, Lutz Ph. Breitling wrote:
> Thank you very much for the prompt reply. Seems like I had not fully
> understood what the k-parameter to stepAIC is doing.
> Your suggested approach looks indeed fine to me, actually I do not
> quite understand why you say that it's only an approximation to the
> LRT?
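So this is computing AIC_k = -2L + kp, with L the maximized log-likelihood
and p the number of parameters. If you compare models with p and p+q
parameters, this is equivalent to comparing 2 log LR with kq, and so for
q=1 the Wilks' LRT is found for k = qchisq(1-p, df=1) (the quantile of a
squared Normal), where p now denotes the significance level of the test.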
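
As a rough illustration of the q=1 case described above, here is a minimal
sketch. The significance level alpha, the birthwt data from MASS and the
particular model formula are assumptions chosen only for the example; they
are not part of the original question. All terms used are single
coefficients, so each step corresponds to q=1.

```r
library(MASS)  # provides stepAIC() and the birthwt example data

alpha <- 0.05                    # assumed significance level for the example
k <- qchisq(1 - alpha, df = 1)   # penalty per parameter, about 3.84

# Hypothetical logistic regression in which every term has one coefficient
full <- glm(low ~ age + lwt + smoke + ht + ui,
            family = binomial, data = birthwt)

# Backward elimination using AIC_k = -2L + k*p instead of the default k = 2
sel <- stepAIC(full, direction = "backward", k = k, trace = FALSE)

formula(sel)                     # terms retained by the procedure
drop1(sel, test = "Chisq")       # LRT for each retained term, for comparison
```

With only one-coefficient terms, a term should be dropped exactly when its
likelihood ratio statistic falls below k, so the drop1() table gives a way to
check the selected model against the intended test.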
However, no one said q would always be one, and stepAIC steps in terms,
not individual coefficients. Therein lies one of the approximations
(another is in the asymptotic distribution theory of the test).
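
To make the first approximation concrete, the following sketch compares the
cutoff that the penalty implies for a term with q coefficients (k times q)
with the chi-squared quantile on q degrees of freedom that an LRT would use.
The value of alpha and the range of q are assumptions for illustration.

```r
alpha <- 0.05                    # assumed nominal significance level
k <- qchisq(1 - alpha, df = 1)   # penalty calibrated for q = 1
q <- 1:4                         # number of coefficients in the term

cmp <- data.frame(
  q             = q,
  stepAIC_cut   = k * q,                        # cutoff implied by AIC_k
  lrt_cut       = qchisq(1 - alpha, df = q),    # cutoff of the LRT on q df
  implied_alpha = pchisq(k * q, df = q, lower.tail = FALSE)
)
print(cmp, digits = 3)
```

The two cutoffs agree only for q=1. For q > 1 the penalty-based cutoff is the
larger one, so multi-coefficient terms such as factors are effectively tested
at a stricter level than the nominal alpha.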
> Best wishes-
> Lutz
>
> On 7/11/07, Ravi Varadhan <rvaradhan at jhmi.edu> wrote:
>> Check out the stepAIC function in MASS package. This is a nice tool, where
>> you can actually implement any penalty even though the function's name has
>> "AIC" in it because it is the default. Although this doesn't do an LRT test
>> based variable selection, you can sort of approximate it by using a penalty
>> of k = qchisq(1-p, df=1), where p is the p-value for variable selection.
>> This penalty means that a variable enters (or stays in) an existing model
>> only when twice the change in log-likelihood is greater than qchisq(1-p,
>> df=1). For p = 0.1, k = 2.71, and for p = 0.05, k = 3.84. Is this what
>> you'd like to do?
>>
>> Ravi.
>>
>> ----------------------------------------------------------------------------
>> Ravi Varadhan, Ph.D.
>> Assistant Professor, The Center on Aging and Health
>> Division of Geriatric Medicine and Gerontology
>> Johns Hopkins University
>> Ph: (410) 502-2619
>> Fax: (410) 614-9625
>> Email: rvaradhan at jhmi.edu
>> Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
>> ----------------------------------------------------------------------------
>>
>> -----Original Message-----
>> From: r-help-bounces at stat.math.ethz.ch
>> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Lutz Ph. Breitling
>> Sent: Wednesday, July 11, 2007 3:06 PM
>> To: r-help at stat.math.ethz.ch
>> Subject: [R] Stepwise GLM selection by LRT?
>>
>> Dear List,
>>
>> having searched the help and archives, I have the impression that
>> there is no automatic model selection procedure implemented in R that
>> includes/excludes predictors in logistic regression models based on
>> LRT P-values. Is that true, or is someone aware of an appropriate
>> function somewhere in a custom package?
>>
>> Even if automatic model selection and LRT might not be the most
>> appropriate methods, I actually would like to use these in order to
>> simulate someone else's modeling approach...
>>
>> Many thanks for all comments-
>> Lutz
>> -----
>> Lutz Ph. Breitling
>> German Cancer Research Center
>> Heidelberg/Germany
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595