# [R] What BIC is calculated by 'regsubsets'?

Paul Murtaugh murtaugh at stat.oregonstate.edu
Fri Dec 19 20:39:17 CET 2008

```The function 'regsubsets' appears to calculate a BIC value that is
different from that returned by the function 'BIC'.  The latter is
explained in the documentation, but I can't find an expression for the
statistic returned by 'regsubsets'.

Incidentally, both of these differ from the BIC that is given in Ramsey
and Schafer's, The Statistical Sleuth.  I assume these are all linear
transformations of each other, but I'd like to know the 'regsubsets'
formula (so that I can develop a way to do all-subsets selection based
on the AIC rather than the BIC).

The following code defines a function that illustrates the issue.
Thanks
-Paul

script.ic <- function() {

library(datasets)
print(names(airquality))              # Ozone Solar.R Wind Temp Month Day

# Fit a model with two predictors
mod1 <- lm(Ozone ~ Wind + Temp, data=airquality)
npar <- length(mod1\$coef)+1         # no. parameters in fitted model,
#   including s2, is 4
nobs <- length(mod1\$fitted)         # no. of observations = 116
s2 <- summary(mod1)\$sigma2         # MSE = 477.6371
logL <- as.vector(logLik(mod1))     # log likelihood = -520.8705

# Use the R function BIC, defined as: -2*log-likelihood + npar*log(nobs)
tmp1 <- BIC(mod1)                           # 1060.755
tmp2 <- -2 * (-520.8705) + 4 * log(116)     # 1060.755, agrees
cat(paste("\nFrom R's BIC:",signif(tmp1,5),"(",signif(tmp2,5),
"obtained 'by hand')\n\n"))

# Now see how 'regsubsets' calculates the BIC
tmp3 <- regsubsets(Ozone ~ Solar.R + Wind + Temp, data=airquality)
tmp3.s <- summary(tmp3)

# 'mod1' is the second model in 'tmp3'; what is the formula for this BIC?
cat("\nThe corresponding model from 'regsubsets':\n")
print(tmp3.s\$which[2,])
tmp4 <- tmp3.s\$bic[2]                       # -82.52875
cat(paste("\nBIC =",signif(tmp4,5),"\n"))

# Incidentally, the 'rsq' and 'rss' components of tmp3.s do not agree
# with the values in the 'mod1' lm output object.

# Just for kicks, try the formula for Schwarz's BIC from Ramsey & Schafer,
# Statistical Sleuth:  nobs*log(MSE) + npar*log(nobs))
tmp5 <- 116 * log(477.6371) + 4*log(116)    # 734.6011
cat(paste("\nStat. Sleuth's BIC =",signif(tmp5,5),"\n"))
cat("\nUff da!\n")

browser()

}

```