[Rd] sapply improvements
Duncan Murdoch
murdoch at stats.uwo.ca
Thu Nov 5 16:06:55 CET 2009
On 11/5/2009 4:05 AM, Martin Maechler wrote:
>>>>>> "PD" == Peter Dalgaard <p.dalgaard at biostat.ku.dk>
>>>>>> on Thu, 05 Nov 2009 00:28:51 +0100 writes:
>
> PD> William Dunlap wrote: ...
> >>>
> >>> if (x <= 0) NA else log(x)
> >>>
> >>> variety otherwise.
> >>
> >> Would you only want it to coerce upwards to FUN.VALUE's
> >> type? E.g., allow sapply(z, length,
> >> FUN.VALUE=numeric(1)) to return a numeric vector but die
> >> on sapply(z, function(zi)as.complex(zi[1]),
> >> FUN.VALUE=numeric(1)) If the latter doesn't die should it
> >> return a complex or a numeric vector? (I'd say it needs
> >> to be numeric, but I'd prefer that it died.)
>
> PD> I'd say that it should probably die on downwards
> PD> coercion. Getting a double when an integer is expected,
> PD> or complex instead of double as you indicate, is a
> PD> likely user error. If not, then the user can always
> PD> coerce explicitly inside FUN.
>
> I agree with Peter: Do not allow coercion downwards
>
> PD> Another issue is whether one would want to go beyond the
> PD> base classes of S (logical, integer, double, complex,
> PD> character). For other classes, there may be no notion of
> PD> "up" and "down" in coercion. Then again, sapply was
> PD> always limited to what unlist() will handle, so e.g.
>
> >> sapply(1:10,FUN=function(i)Sys.Date())
> PD> [1] 14553 14553 14553 14553 14553 14553 14553 14553 14553 14553
>
> PD> as opposed to
>
> >> structure(rep(14553,10), class="Date")
> PD> [1] "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05"
> PD> [6] "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05"
>
> Well, using
> as(<prelim_result>, class(<prototype>) )
>
> would be really nice here....
> but alas, we are still not allowed to use as(.,.) in base
> code which I'd tend to call a "design bug" nowadays..
Part of the difficulty here is that we have too many concepts of "class"
and "type" in R. For example, as() is not consistent with as.vector()
in the following sense:
If neither input is an S4 object, we should have
as(<prelim_result>, class(<prototype>) )
be the same as
as.vector(<prelim_result>, typeof(<prototype>))
and
as.vector(<prelim_result>, class(<prototype>))
but currently as() gives a different result. For example,
> str(as(1:10, class(double(1))))
int [1:10] 1 2 3 4 5 6 7 8 9 10
> str(as.vector(1:10, typeof(double(1))))
num [1:10] 1 2 3 4 5 6 7 8 9 10
> str(as.vector(1:10, class(double(1))))
num [1:10] 1 2 3 4 5 6 7 8 9 10
So if the coercion were to support as(), we'd need to decide when to
follow its rules, and when to follow the existing as.vector() rules
(which I think we're more or less following in the current sapply()).
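To make the "coerce upwards only" idea concrete, here is a rough sketch. It assumes the ordering logical < integer < double < complex < character and uses the as.vector() rules for the coercion itself. The names type_rank and coerce_upwards are hypothetical, made up for illustration; nothing like them exists in base R.

```r
# Hypothetical ordering of the basic vector types, lowest to highest.
type_rank <- c(logical = 1L, integer = 2L, double = 3L,
               complex = 4L, character = 5L)

# Hypothetical helper: coerce `result` to the type of `prototype`,
# but signal an error if that would be a downward coercion.
coerce_upwards <- function(result, prototype) {
  from <- typeof(result)
  to <- typeof(prototype)
  if (!(from %in% names(type_rank)) || !(to %in% names(type_rank)))
    stop("unsupported type: ", from, " -> ", to)
  if (type_rank[[from]] > type_rank[[to]])
    stop("downward coercion from ", from, " to ", to, " is not allowed")
  as.vector(result, to)
}
```

With this, coerce_upwards(1:3, numeric(1)) gives a double vector, while coerce_upwards(1+0i, numeric(1)) stops with an error.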
We'd also need to handle the cases involving S4 objects:
I'd say if the prototype is not S4 but the result is, we should die with
an error.
If the prototype is S4, then we should use as(). We have fast C code to
detect S4 objects, but do we have C code to do the coercion? I'd rather
not write it, but I wouldn't object if someone else did/already has.
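At R level, the three cases above could be sketched roughly like this. It is only an illustration of the rule, not a proposal for the actual implementation (which would presumably be in C), and coerce_to_prototype is a hypothetical name.

```r
library(methods)

# Hypothetical helper illustrating the dispatch rule:
#  - S4 prototype: follow the as() rules
#  - S4 result but non-S4 prototype: error
#  - neither is S4: follow the existing as.vector() rules
coerce_to_prototype <- function(result, prototype) {
  if (isS4(prototype)) {
    as(result, class(prototype))
  } else if (isS4(result)) {
    stop("result is an S4 object but the prototype is not")
  } else {
    as.vector(result, typeof(prototype))
  }
}
```

For two non-S4 inputs this reduces to the as.vector() call shown earlier, e.g. coerce_to_prototype(1:10, double(1)).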
Duncan Murdoch