[Rd] request for patch in "drop1" (add.R)

Thu Feb 24 00:12:09 CET 2011

residuals() and $residuals are often very different:  residuals() is 
generic, but even the default method is *not* simple extraction. Their 
values can be of different lengths: think about an lm fit with 
na.action = na.exclude.  That is precisely the sort of thing the tests 
in add.R were designed to detect, hence the use of $residuals.

None of this is used in drop1()!  One of the places $residuals was 
used in that file was the default method for drop1(), others being 
step() and the default method for add1().  As default methods these 
have to continue to work for any class of object that has previously 
been thrown at them over the last 10+ years, and even all CRAN 
packages will not be a good enough test suite.

In any case, the current code in R-devel makes use of the new generic 
nobs(), which is intended to help sort out the many versions of 
functions called 'BIC" in packages, but is also useful here.  (It is 
still under development.)

terms() is also generic so there is also some danger that its 
substitution could give an inappropriate result.  But as it used in 
several other places in add.R the breakage will probably occur 
elsewhere already.

On Wed, 23 Feb 2011, Martin Maechler wrote:

>>>>>> Ben Bolker <bbolker at gmail.com>
>>>>>>     on Wed, 23 Feb 2011 09:14:37 -0500 writes:
>
>    >   By changing three lines in drop1 from access based on $
>    > to access based on standard accessor methods (terms() and
>    > residuals()), it becomes *much* easier to extend drop1 to
>    > work with other model types.  The use of $ rather than
>    > accessors in this context seems to be an oversight rather
>    > than a design decision, but maybe someone knows better ...
>
>    >   In particular, if one makes these changes (which I am
>    > pretty certain will not break anything, as the original
>    > code basically mimicked the default methods anyway), it
>    > becomes possible to make drop1() work with mer objects
>    > (Doug Bates's new mixed model code) merely by defining:
>
>    > terms.mer <- function(x, ...) {
>    > attr(x at frame,"terms")
>    > }
>
>    > extractAIC.default <- function(fit, scale=0, k=2, ...) {
>    > L <- logLik(fit)
>    > edf <- attr(L,"df")
>    > c(edf,-2*L+2*edf)
>    > }
>
>    > Adding this definition of extractAIC.default also makes drop1() work
>    > with lme fits ...
>
>    > Comments?  Should I submit to the bug database as "enhancement
>    > request"?  Are there any hidden downsides to this?
>
> drawback: a possible very small performance cut for the cases
> where the "$" access is ok.  But that should not count.
>
> I like the idea.... {it's a pity that only S3 methods work that way,
> because residuals(), terms(), etc... are unfortunately not
> general (implicit) S4 generics but just S3 ones..
>
> I'm currently testing your change, plus some more in step().
> However, for step() to work automagically there is more needed.
> It currently relies in more places on 'object' being a list to
> which you can append new components, basically by
>   fit <- object
>   fit$new1 <- ...
>   fit$new2 <- ...
>
> That would have to be changed to something like
>   fit <- list(obj = object)
>   fit$new1 <- ...
>   fit$new2 <- ...
> and more changes where 'fit' has to be replaced by 'fit$obj'.
> Would that not be of interest as well?
>
> Martin
>
>
>    > Ben Bolker
>
>    > ----------------------------------------------------------------------
>    > Index: add.R
>    > ===================================================================
>    > --- add.R	(revision 54562)
>    > +++ add.R	(working copy)
>    > @@ -330,7 +330,7 @@
>    > drop1.default <- function(object, scope, scale = 0, test=c("none", "Chisq"),
>    > k = 2, trace = FALSE, ...)
>    > {
>    > -    tl <- attr(object$terms, "term.labels")
>    > +    tl <- attr(terms(object), "term.labels")
>    > if(missing(scope)) scope <- drop.scope(object)
>    > else {
>    > if(!is.character(scope))
>    > @@ -344,7 +344,7 @@
>    > ans <- matrix(nrow = ns + 1L, ncol = 2L,
>    > dimnames =  list(c("<none>", scope), c("df", "AIC")))
>    > ans[1, ] <- extractAIC(object, scale, k = k, ...)
>    > -    n0 <- length(object$residuals)
>    > +    n0 <- length(residuals(object))
>    > env <- environment(formula(object))
>    > for(i in seq(ns)) {
>    > tt <- scope[i]
>    > @@ -356,7 +356,7 @@
>    > evaluate = FALSE)
>    > nfit <- eval(nfit, envir=env) # was  eval.parent(nfit)
>    > ans[i+1, ] <- extractAIC(nfit, scale, k = k, ...)
>    > -        if(length(nfit$residuals) != n0)
>    > +        if(length(residuals(nfit)) != n0)
>    > stop("number of rows in use has changed: remove missing values?")
>    > }
>    > dfs <- ans[1L , 1L] - ans[, 1L]
>
>    > ----------------------------------------------------------------------
>    > ______________________________________________
>    > R-devel at r-project.org mailing list
>    > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595