[Rd] often unnecessary duplicate in sapply / as.vector
Thomas Lumley
tlumley at u.washington.edu
Tue Jul 11 17:02:55 CEST 2006
On Tue, 11 Jul 2006, Prof Brian Ripley wrote:
> On Fri, 7 Jul 2006, Thomas Lumley wrote:
>
>> On Fri, 7 Jul 2006, Martin Morgan wrote:
>>
>>> sapply calls lapply as
>>>
>>> answer <- lapply(as.list(X), FUN, ...)
>>>
>>> which, when X is a list, causes X to be duplicated unnecessarily. The
>>> coercion is unnecessary for other mode(X) because in lapply we have
>>>
>>> if (!is.list(X)) X <- as.list(X)
>>
>> That looks reasonable.
>
> And you have made the change. Unfortunately it is not really reasonable,
> as is.list(X) does not test that X is a list (see its documentation) in
> the same sense as as.list, so pairlists are now passed to the internal
> code.
Where do we still get pairlists in interpreted code? I thought they had
all been hidden.
> There's something rather undesirable going on here. The internal code for
> lapply (in its current version, not the one I wrote) does the internal
> equivalent of
>
> rval <- vector("list", length(X))
> for(i in seq(along = X))
> rval[i] <- list(FUN(X[[i]], ...))
>
> from the earlier
>
> lapply <- function(X, FUN, ...) {
> FUN <- match.fun(FUN)
> if (!is.list(X))
> X <- as.list(X)
> rval <- vector("list", length(X))
> for(i in seq(along = X))
> rval[i] <- list(FUN(X[[i]], ...))
> names(rval) <- names(X) # keep `names' !
> return(rval)
> }
>
> so all that is needed is that X[[i]] work.
>
> For a pairlist [[i]] done repeatedly is very inefficient (since it starts
> at the beginning each time), so we *do* want to coerce pairlists here.
Or have a separate loop using CDR and CAR rather than [[, which would mean
not having to copy X.
> On the other hand, we do not need to coerce expressions or atomic vectors
> for which [[]] works just fine.
Indeed.
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle
More information about the R-devel
mailing list