[Rd] sapply improvements
William Dunlap
wdunlap at tibco.com
Wed Nov 4 21:53:29 CET 2009
It looks good on the following examples:
> z <- split(log(1:10), rep(letters[1:2],c(3,7)))
> sapply(z, length, FUN.VALUE=numeric(1))
Error in sapply(z, length, FUN.VALUE = numeric(1)) :
FUN values must be of type 'double'
(I'd like the error to say "... must be of type 'double',
not 'integer'", to give the user a fuller diagnosis of
the problem.)
> sapply(z, range, FUN.VALUE=c(Min=0,Max=0))
a b
Min 0.000000 1.386294
Max 1.098612 2.302585
Requiring an exact typeof() match and using the names of
FUN.VALUE as the row names of matrix output both seem good to me.
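
The checks exercised above can be sketched in plain R. This is only an illustration under stated assumptions: sapply_checked is a hypothetical name, it is not the function under discussion in this thread, and it handles atomic FUN.VALUE templates only. It requires every result to match the template's length() and typeof(), takes row names from FUN.VALUE, and returns a zero-length vector of the template's type when X is empty.

```r
# Hypothetical helper (not the function from this thread): sapply() with an
# expected-value template, written in plain R for illustration.
sapply_checked <- function(X, FUN, ..., FUN.VALUE) {
  stopifnot(is.atomic(FUN.VALUE))
  FUN <- match.fun(FUN)
  n <- length(FUN.VALUE)
  type <- typeof(FUN.VALUE)
  vals <- lapply(X, FUN, ...)
  # Every result must match the template's length and type exactly.
  for (v in vals) {
    if (length(v) != n)
      stop("FUN values must be of length ", n, ", not ", length(v))
    if (typeof(v) != type)
      stop("FUN values must be of type '", type, "', not '", typeof(v), "'")
  }
  out <- unlist(vals, use.names = FALSE)
  # unlist() gives NULL for an empty list; fall back to the template's type.
  if (is.null(out)) out <- vector(type, 0L)
  if (n == 1L) {
    names(out) <- names(X)
  } else {
    # Row names come from FUN.VALUE; names returned by FUN are ignored.
    out <- array(out, dim = c(n, length(X)),
                 dimnames = list(names(FUN.VALUE), names(X)))
  }
  out
}

z <- split(log(1:10), rep(letters[1:2], c(3, 7)))
sapply_checked(z, range, FUN.VALUE = c(Min = 0, Max = 0))
sapply_checked(integer(0), function(i) i > 0, FUN.VALUE = logical(1))
```

With this sketch, sapply_checked(z, length, FUN.VALUE = numeric(1)) stops with a message that names both the expected and the actual type.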
Bill Dunlap
Spotfire, TIBCO Software
wdunlap at tibco.com
> -----Original Message-----
> From: Duncan Murdoch [mailto:murdoch at stats.uwo.ca]
> Sent: Wednesday, November 04, 2009 12:24 PM
> To: William Dunlap
> Cc: michael.m.spiegel at gmail.com; r-devel at stat.math.ethz.ch
> Subject: sapply improvements
>
> On 11/4/2009 12:15 PM, William Dunlap wrote:
> >> -----Original Message-----
> >> From: r-devel-bounces at r-project.org
> >> [mailto:r-devel-bounces at r-project.org] On Behalf Of Duncan Murdoch
> >> Sent: Wednesday, November 04, 2009 8:47 AM
> >> To: michael.m.spiegel at gmail.com
> >> Cc: R-bugs at r-project.org; r-devel at stat.math.ethz.ch
> >> Subject: Re: [Rd] error in install.packages() (PR#14042)
> >>
...
> >> For future reference: the problem was that it assigned the result of
> >> sapply() to a subset of a vector. Normally sapply() simplifies its
> >> result to a vector, but in this case the result was empty, so sapply()
> >> returned an empty list; assigning a list to a vector coerced the vector
> >> to a list, and then the "invalid subscript type 'list'" came soon after.
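
The failure mode described in the quoted paragraph can be reproduced in a few lines of base R. This is a minimal sketch with made-up data, not the install.packages() code from the bug report; the variable names are illustrative.

```r
# With no elements to process, sapply() has nothing to simplify
# and hands back an empty list instead of an empty vector.
empty <- sapply(integer(0), function(i) i > 0)
identical(empty, list())   # TRUE

# Assign that result to an (empty) subset of an atomic vector and
# inspect the vector's type afterwards; the quoted message reports
# that the vector is coerced to a list at this point.
x <- c(1.5, 2.5)
x[integer(0)] <- empty
typeof(x)
```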
> >
> > I've run into this sort of problem a lot (0-long input to sapply
> > causes it to return list()). A related problem is that when sapply's
> > FUN doesn't always return the type of value you expect for some
> > corner case then sapply won't do the expected simplification. If
> > sapply had an argument that gave the expected form of FUN's output
> > then sapply could (a) die if some call to FUN didn't return something
> > of that form and (b) return a 0-long object of the correct form
> > if sapply's X has length zero so FUN is never called. E.g.,
> >   sapply(2:0, function(i)(11:20)[i], FUN.VALUE=integer(1)) # die on third iteration
> >   sapply(integer(0), function(i)i>0, FUN.VALUE=logical(1)) # return logical(0)
> >
> > Another benefit of sapply knowing the type of FUN's return value is
> > that it wouldn't have to waste space creating a list of FUN's return
> > values but could stuff them directly into the final output structure.
> > A list of n scalar doubles is 4.5 times bigger than double(n) and the
> > factor is 9.0 for integers and logicals.
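
The size comparison can be checked on a given installation with object.size(). The ratios quoted above are the author's figures; the results depend on platform and R version, so no expected values are shown here. size_ratio is a hypothetical helper name.

```r
# Ratio of the memory used by a list of n scalars to that of the
# equivalent atomic vector of length n.
size_ratio <- function(x) {
  as.numeric(utils::object.size(as.list(x))) /
    as.numeric(utils::object.size(x))
}

n <- 1000L
size_ratio(numeric(n))   # list of scalar doubles vs double(n)
size_ratio(integer(n))   # list of scalar integers vs integer(n)
size_ratio(logical(n))   # list of scalar logicals vs logical(n)
```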
>
>
> What do you think of the behaviour of the sapply function below? (I
> wouldn't put it into R as it is, I'd translate it to C code to avoid the
> lapply call; but I'd like to get the behaviour right before doing that.)
>
> This one checks that the length() and typeof() results are consistent.
> If the FUN.VALUE has names, those are used (but it doesn't require the
> names from FUN to match).
...