[R] make methods work in lapply - remove lapply's environment
Duncan Murdoch
murdoch at stats.uwo.ca
Tue Sep 9 04:09:46 CEST 2008
On 08/09/2008 9:37 PM, Tim Hesterberg wrote:
> I've defined my own version of summary.default,
> that gives a better summary for highly skewed vectors.
>
> If I call
> summary(x)
> the method is used.
>
> If I call
> summary(data.frame(x))
> the method is not used.
>
> I've traced this to lapply; this uses the new method:
>     lapply(list(x), function(x) summary(x))
> and this does not:
>     lapply(list(x), summary)
>
> If I make a copy of lapply, WITHOUT the environment,
> then the method is used.
>
> lapply <- function (X, FUN, ...) {
>     FUN <- match.fun(FUN)
>     if (!is.vector(X) || is.object(X))
>         X <- as.list(X)
>     .Internal(lapply(X, FUN))
> }
>
> I'm curious to hear reactions to this.
> There is a March 2006 thread,
> "object size vs. file size",
> in which Duncan Murdoch wrote:
>> Functions in R consist of 3 parts: the formals, the body, and the
>> environment. You can't remove any part, but you can change it.
> That is exactly what I want to do, remove the environment, so that
> when I define a better version of some function that the better
> version is used.
But that's not removing the environment, that's changing it. Your
function has globalenv() as its environment.
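
To illustrate the distinction, here is a minimal sketch showing that a closure keeps all three parts and that swapping the environment changes where its free variables are looked up. The names f and offset are made up for this example.

```r
# A closure whose environment holds its own 'offset'.
f <- local({
    offset <- 10
    function(x) x + offset
})

names(formals(f))   # the formals
body(f)             # the body
environment(f)      # the environment created by local()

before <- f(1)      # uses offset = 10 from the closure's environment

# Put a different 'offset' in the global environment, then re-point f at it.
assign("offset", 100, envir = globalenv())
environment(f) <- globalenv()

after <- f(1)       # now uses offset = 100 from globalenv()
```

The function still has an environment after the assignment; only the place where offset is found has changed.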
>
> Here's a function to automate the process:
> copyFunction <- function(Name){
>     # Copy a function, without its environment.
>     # Name should be quoted
>     # Return the copy
>     file <- tempfile()
>     on.exit(unlink(file))
>     dput(get(Name), file = file)
>     f <- source(file)$value
>     f
> }
> lapply <- copyFunction("lapply")
>
A shorter version is

copyFunction <- function(fn){
    environment(fn) <- globalenv()
    fn
}

(which doesn't require quoting the function name).
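
As a quick check of what the shorter version does, the sketch below applies it to base::lapply and compares environments. The name my_lapply is chosen for this example so that the base function is not masked.

```r
copyFunction <- function(fn) {
    environment(fn) <- globalenv()
    fn
}

# The original lives in the base namespace.
orig_env <- environmentName(environment(base::lapply))

# The copy has the same formals and body, but a different environment.
my_lapply <- copyFunction(base::lapply)
copy_env <- environmentName(environment(my_lapply))

# The copy still works as an lapply.
res <- my_lapply(1:3, function(i) i * 2)
```

Here orig_env is "base" and copy_env is "R_GlobalEnv"; base::lapply itself is left untouched.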

But getting back to your original question: the real problem is with S3
method dispatch and its interaction with lapply. In the bad case,
lapply calls the generic, which calls the dataframe method, which calls
methods (probably via the generic again, but I didn't look) for each of
the columns. This is an ambiguous case: should the dataframe method
act the way its author expected, and call the standard method, or should
it follow the search order you want?

I'd tend to think authors of functions should be able to depend on their
behaviour not changing based on what's happening in the global
environment. That's not an absolute rule: you should be able to define
a new class and a method for it and have things work, but I don't think
the fact that you have a summary.default should affect functions in
namespaces that were written for the standard summary.default.
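
The difference reported at the top of the thread can be reproduced with a small sketch along the following lines. The replacement summary.default here is a stand-in that just returns a marker string, and the exact lookup rules may differ between R versions, so treat the result as something to check rather than a guarantee.

```r
# Stand-in replacement method, placed in the global environment.
assign("summary.default",
       function(object, ...) "custom summary.default",
       envir = globalenv())

x <- c(1, 2, 300)

# summary() is called from a function whose environment chain reaches globalenv().
via_wrapper <- base::lapply(list(x), function(v) summary(v))[[1]]

# summary() is called from inside base::lapply, whose environment is the base namespace.
via_generic <- base::lapply(list(x), summary)[[1]]

uses_custom_wrapper <- identical(via_wrapper, "custom summary.default")
uses_custom_generic <- identical(via_generic, "custom summary.default")

# Clean up so the stand-in does not affect later calls.
rm("summary.default", envir = globalenv())
```

With the behaviour described in this thread, uses_custom_wrapper should be TRUE and uses_custom_generic FALSE: dispatch starts looking from the environment the generic was called from, so code in a namespace keeps seeing the method it was written for.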
Duncan Murdoch