[R] which alternative tests instead of AIC/BIC for choosing models

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Aug 13 22:29:38 CEST 2008


Cp is either the same thing as AIC, or an approximation to it.  So it is 
not an 'alternative'.

See e.g. the discussion in MASS (Venables & Ripley, Modern Applied 
Statistics with S) or ?add1.
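
As a minimal sketch of this point, the R code below computes Mallows' Cp by 
hand and compares it with extractAIC(), which (per ?extractAIC) returns Cp 
for lm fits when a positive 'scale' is supplied. The data are simulated and 
purely hypothetical (they are not the poster's); the names x1, x2, y, fits, 
cp_manual, cp_extract and aic_extract are assumptions made for illustration.

```r
# Simulated data (hypothetical, not the poster's): x2 carries most of the signal.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.5 * x1 + 3 * x2 + rnorm(n)

full <- lm(y ~ x1 + x2)
s2   <- summary(full)$sigma^2   # error variance estimated from the full model

fits <- list(x1 = lm(y ~ x1), x2 = lm(y ~ x2), both = full)

# Mallows' Cp by hand: RSS / s2 - n + 2 * p, where p counts the intercept
cp_manual <- sapply(fits, function(f) deviance(f) / s2 - n + 2 * length(coef(f)))

# extractAIC() with a known scale computes the same quantity
cp_extract <- sapply(fits, function(f) extractAIC(f, scale = s2)[2])

# AIC with the scale estimated: n * log(RSS / n) + 2 * p
aic_extract <- sapply(fits, function(f) extractAIC(f)[2])

print(rbind(Cp = cp_manual, Cp_extractAIC = cp_extract, AIC = aic_extract))
```

In this sketch the Cp of the full model equals its number of coefficients by 
construction, which is also why the leaps output quoted below shows Cp = 3 
for the two-variable model.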

On Wed, 13 Aug 2008, tolga.i.uzuner at jpmorgan.com wrote:

> By way of partial follow-up to my own question, and on the odd chance
> anyone else wonders about this issue, some alternatives to this appear to
> be in the leaps package, which implements the leaps routine (Mallows Cp)
> and regsubsets. In my case Mallows' Cp does not work either (see below),
> so I have implemented the following.
>
> # regr holds a zoo object whose 1st column is the dependent variable
>
> # NB: the full model is row ncol(regr) of the leaps output only when
> # there are exactly two predictors
> r2test <- (result$lm.r2 > Rsqr) &
>   all(sapply(2:ncol(regr), function(i)
>     summary(lm(regr[, 1] ~ regr[, i]))$adj.r.squared) > 0.1) &
>   which.min(leaps(as.matrix(regr[, -1]), regr[, 1])$Cp) == ncol(regr)
>
> leaps on the same problem below
> ===============================
>
>> leaps(as.matrix(regr3[,-1]),regr3[,1],method=c("adjr2"))
> $which
>      1     2
> 1 FALSE  TRUE
> 1  TRUE FALSE
> 2  TRUE  TRUE
>
> $label
> [1] "(Intercept)" "1"           "2"
>
> $size
> [1] 2 2 3
>
> $adjr2
> [1] 0.950757134 0.001681389 0.954859493
>
>> leaps(as.matrix(regr3[,-1]),regr3[,1],method=c("Cp"))
> $which
>      1     2
> 1 FALSE  TRUE
> 1  TRUE FALSE
> 2  TRUE  TRUE
>
> $label
> [1] "(Intercept)" "1"           "2"
>
> $size
> [1] 2 2 3
>
> $Cp
> [1]   38.53367 8490.55327    3.00000
>
>
> Tolga I Uzuner/JPMCHASE
> 13/08/2008 17:33
>
> To: r-help at r-project.org
> Subject: which alternative tests instead of AIC/BIC for choosing models
>
> Dear R Users,
>
> I am looking for an alternative to AIC or BIC to choose model parameters.
> This is somewhat of a general statistics question, but I ask it in this
> forum as I am looking for a R solution.
>
> Suppose I have one dependent variable, y, and two independent variables,
> x1 and x2.
>
> I can perform three regressions:
> reg1: y~x1
> reg2: y~x2
> reg3: y~x1+x2
>
> The AIC of reg1 is 2000, reg2 is 1000 and reg3 is 950. One would,
> presumably, conclude that one should use both x1 and x2.  However, the
> R^2's are quite different: R^2 of reg1 is 0.5%, reg2 is 95% and reg3 is
> 95.25%. Knowing that, I would actually conclude that x1 adds little and
> should probably not be used.
>
> There is the overall question of what potentially explains this outcome,
> i.e. the reduction in AIC in going from reg2 to reg3 even though R^2 does
> not materially improve with the addition of x1 to reg2 (to get to reg3).
> But that is more of a generic statistics issue and not my question here.
>
> The question I do have is: is there a package in R which implements a test
> and provides some diagnostic information I can use to rule out the use of
> x1 in a systematic way, as its addition to the equation adds little in
> terms of explaining the variability of y?
>
> Thanks in advance,
> Tolga
>
>
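
On the figures in the question quoted above: for Gaussian linear models 
fitted to the same data, the AIC difference between nested models is 
n * log(RSS_small / RSS_big) minus twice the number of extra parameters. 
Under that assumption, a drop of 50 in AIC alongside an R^2 change from 95% 
to 95.25% is what one would expect with roughly a thousand observations. The 
sketch below does that arithmetic and then, on simulated (hypothetical) 
data, uses drop1() with an F test plus a partial R^2 as one possible 
effect-size diagnostic. The coefficients, the seed and the names n_implied, 
r2_gain, partial_r2 and f_table are illustrative assumptions, not taken from 
the post.

```r
# Figures quoted in the post
r2_small <- 0.95      # reg2: y ~ x2
r2_big   <- 0.9525    # reg3: y ~ x1 + x2
aic_drop <- 1000 - 950

# AIC_small - AIC_big = n * log(RSS_small / RSS_big) - 2 * (extra parameters)
rss_ratio <- (1 - r2_small) / (1 - r2_big)
n_implied <- (aic_drop + 2 * 1) / log(rss_ratio)
print(round(n_implied))

# Simulated illustration (hypothetical data): small but real x1 effect, large n
set.seed(42)
n  <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.25 * x1 + 4.5 * x2 + rnorm(n)

reg2 <- lm(y ~ x2)
reg3 <- lm(y ~ x1 + x2)

r2_gain    <- summary(reg3)$r.squared - summary(reg2)$r.squared
partial_r2 <- 1 - deviance(reg3) / deviance(reg2)  # share of reg2's residual variation explained by x1
f_table    <- drop1(reg3, test = "F")              # F test for dropping each term in turn

print(c(r2_gain = r2_gain, partial_r2 = partial_r2,
        AIC_drop = AIC(reg2) - AIC(reg3)))
print(f_table)
```

With a sample this large, both the F test and AIC will tend to favour 
keeping x1 even though the gain in R^2 is small, so excluding it would have 
to rest on an explicit effect-size threshold (for instance on partial_r2) 
rather than on a significance test.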

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


