[R] Obtaining fitted model information

Sun Oct 31 20:48:39 CET 2004

On Sun, 31 Oct 2004, Prof Brian Ripley wrote:

> The harder task is actually to get `n', not `npar'.
> 
> length(resid(x)) may or may not include missing values, depending on the 
> na.action used, and will include observations with weight zero.
> However, logLik's "lm" method returns an attribute "nobs" that is a better 
> choice.

But only better, as it looks like it has fallen into the first trap.
AFAICS, if 'fit' is an lm fit, then

n = df.residual(fit) + fit$rank

Quick check:

x <- rnorm(10); x[2] <- NA
y <- rnorm(10); y[1] <- NA
w <- c(rep(1,9), 0)
fit <- lm(y ~x, weights = w, na.action=na.exclude)
df.residual(fit) + fit$rank  # 7, correct
attributes(logLik(fit)) # nall 9 nobs 9 df 3
# but AIC fails
length(resid(fit)) # 10

fit <- lm(y ~x, weights = w, na.action=na.omit)
df.residual(fit) + fit$rank  # 7, correct
attributes(logLik(fit)) # nall 7 nobs 7 df 3
length(resid(fit)) # 8

> 
> On Sun, 31 Oct 2004, Renaud Lancelot wrote:
> 
> > With models estimated with lm, the number of parameters is obtained 
> > adding 1 to the rank of the fitted model (to account for the residuals 
> > variance). The number of parameters is found in logLik objects:
> > 
> >  > # example from ?lm
> >  > ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
> >  > trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
> >  > group <- gl(2,10,20, labels=c("Ctl","Trt"))
> >  > weight <- c(ctl, trt)
> >  > lm.D9 <- lm(weight ~ group)
> >  >
> >  > # rank of the model
> >  > lm.D9$rank
> > [1] 2
> >  >
> >  > # loglik
> >  > logLik(lm.D9)
> > `log Lik.' -20.08824 (df=3)
> >  >
> >  > # number of parameters in the model
> >  > attr(logLik(lm.D9), "df")
> > [1] 3
> >  >
> >  > # AIC
> >  > AIC(lm.D9)
> > [1] 46.17648
> >  >
> >  > c(- 2 * logLik(lm.D9) + 2 * attr(logLik(lm.D9), "df"))
> > [1] 46.17648
> >  >
> >  > # AICc = AIC + 2 * k * (k + 1)/(n - k - 1)
> >  >
> >  > AICc_lm <- function(x){
> > +   n <- length(resid(x))
> > +   k <- attr(logLik(lm.D9), "df")
> > +   AIC(x) + 2 * k * (k + 1) / (n - k - 1)
> > +   }
> >  >
> >  > AICc_lm(lm.D9)
> > [1] 47.67648
> > 
> > Best regards,
> > 
> > Renaud
> > 
> > 
> > John Fox a Ã©crit :
> > 
> > > Dear Thomas,
> > > 
> > > To get the number of independent parameters in the lm object mod, you can
> > > use mod$rank, sum(!is.na(coef(mod)), or -- if there are no linear
> > > dependencies among the columns of the model matrix -- length(coef(mod)).
> > > 
> > > I hope this helps,
> > >  John
> > >>-----Original Message-----
> > >>From: r-help-bounces at stat.math.ethz.ch 
> > >>[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Thomas 
> > >>W Volscho
> > >>Sent: Sunday, October 31, 2004 12:41 PM
> > >>To: r-help at stat.math.ethz.ch
> > >>Subject: [R] Obtaining fitted model information
> > >>
> > >>Dear list,
> > >>I am brand new to R  and using Dalgaard's (2002) book 
> > >>Introductory Statistics with R (thus, some of my terminology 
> > >>may be incorrect).
> > >>
> > >>I am fitting regression models and I want to use Hurvich and 
> > >>Tsai's AICC statistic to examine my regression models.  This 
> > >>penalty can be expressed as: 2*npar * (n/(n-npar-1)).
> > >>
> > >>While you can obtain AIC, BIC, and logLik, I want to impose 
> > >>the AICC penalty instead.
> > >>
> > >>After fitting a model.  Is there a way of obtaining the 
> > >>"npar" and then assigning it to a variable?
> > >>
> > >>Essentially, I want to then write a simple function to add 
> > >>the AICC penalty to (-2*logLik).
> > >>
> > >>Thank you in advance for any help,
> > >>Tom Volscho
> 
> 

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595