[Rd] data.matrix (was sapply(Date, is.numeric))
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Aug 1 23:45:25 CEST 2008
I've committed a more liberal version to R-devel. (It even handles S4
classes with an as() method.)
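
As a rough sketch of the intended behaviour (the data frame and its column names are made up for illustration, and exact results may depend on the R version): a Date column is no longer reported as numeric by sapply(), yet data.matrix() is still assumed to convert it via as.numeric(), alongside the existing factor exception.

```r
# Hypothetical example data; the column names are illustrative only.
df <- data.frame(
  d = as.Date("2008-08-01") + 0:2,   # Date column
  f = factor(c("a", "b", "a")),      # factor column
  x = c(1.5, 2.5, 3.5)               # plain numeric column
)

# With the lapply() dispatch fix, only the plain numeric column
# should be reported as numeric.
is_num <- sapply(df, is.numeric)
print(is_num)

# data.matrix() converts every column to numeric mode:
# dates become days since 1970-01-01, factors become their internal codes.
m <- data.matrix(df)
print(m)
```

Comparing m[, "d"] with as.numeric(df$d) is a quick way to check that no information beyond the class attribute has been lost.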
On Thu, 31 Jul 2008, Martin Maechler wrote:
>>>>>> "PBR" == Prof Brian Ripley <ripley at stats.ox.ac.uk>
>>>>>> on Thu, 31 Jul 2008 08:36:22 +0100 (BST) writes:
>
> PBR> I've now committed fixes in R-patched and R-devel.
> PBR> There is one consequence: data.matrix() was testing for numeric columns by
> PBR> unlist(lapply(x, is.numeric)) and so incorrectly treating Date and POSIXct
> PBR> columns as numeric (which we had decided they were not). This affects
> PBR> package gvlma.
>
> PBR> data.matrix() is now working as documented, but as we have an exception
> PBR> for factors, do we also want exceptions for Date and POSIXct?
>
> Yes, that's a good idea, and much in the spirit of
> data.matrix()
> as I have understood it.
>
> Note the following from help(data.matrix)
>
> where I think the 'Title' and 'Description' are more liberal
> (rightly so) than 'Details' :
>
> >> Convert a Data Frame to a Numeric Matrix
> >>
> >> Description:
> >>
> >> Return the matrix obtained by converting all the variables in a
> >> data frame to numeric mode and then binding them together as the
> >> columns of a matrix. Factors and ordered factors are replaced by
> >> their internal codes.
>
> [...........]
>
> >> Details:
> >>
> >> Supplying a data frame with columns which are not numeric, factor
> >> or logical is an error. A warning is given if any non-factor
> >> column has a class, as then information can be lost.
>
>
> Do we really have good reasons to give an error if a column is
> not numeric (nor of the "exception class")?
>
> Couldn't we just lapply(., as.numeric)
> and if that doesn't give errors
> just "be happy" ?
>
> Martin
>
>
> PBR> On Wed, 30 Jul 2008, Martin Maechler wrote:
>
> >>>>>>> "BDR" == Prof Brian Ripley <ripley at stats.ox.ac.uk>
> >>>>>>> on Wed, 30 Jul 2008 13:29:38 +0100 (BST) writes:
> >>
> BDR> On Wed, 30 Jul 2008, Martin Maechler wrote:
> >> >>>>>>> "RobMcG" == McGehee, Robert <Robert.McGehee at geodecapital.com>
> >> >>>>>>> on Tue, 29 Jul 2008 15:40:37 -0400 writes:
> >> >>
> RobMcG> FYI,
> RobMcG> I've tried posting the below message twice to the bug tracking system,
> >> >>
> >> >> [....... r-bugs problems discussed in a separate thread ....]
> >> >>
> >> >>
> >> >>
> RobMcG> R-developers,
> RobMcG> The results below are inconsistent. From the documentation for
> RobMcG> is.numeric, I expect FALSE in both cases.
> >> >>
> >> >> >> x <- data.frame(dt=Sys.Date())
> >> >> >> is.numeric(x$dt)
> RobMcG> [1] FALSE
> >> >> >> sapply(x, is.numeric)
> RobMcG> dt
> RobMcG> TRUE
> >> >>
> RobMcG> ## Yet, sapply seems aware of the Date class
> >> >> >> sapply(x, class)
> RobMcG> dt
> RobMcG> "Date"
> >> >>
> >> >> Yes, thanks a lot, Robert, for the report.
> >> >>
> >> >> That *is* a bug somewhere in the .Internal(lapply(...)) C code,
> >> >> when S3 dispatch of primitive functions should happen.
> >>
> BDR> The bug is in do_is, which uses CHAR(PRINTNAME(CAR(call))), and when
> BDR> called from lapply that gives "FUN" not "is.numeric". The root cause is
> BDR> the following comment
> >>
> BDR> FUN = CADR(args); /* must be unevaluated for use in e.g. bquote */
> >>
> BDR> and hence that the function in the *call* passed to do_is can be
> BDR> unevaluated.
> >>
> >> aah! I see.
> >>
> >> >> Here's an R scriptlet exposing a 2nd example
> >> >>
> >> >> ### lapply(list, FUN)
> >> >> ### ------------------ seems to sometimes fail for
> >> >> ### .Primitive S3-generic functions
> >> >>
> >> >> (ds <- seq(from=Sys.Date(), by=1, length=4))
> >> >> ##[1] "2008-07-30" "2008-07-31" "2008-08-01" "2008-08-02"
> >> >> ll <- list(d=ds)
> >> >> lapply(list(d=ds), round)
> >> >> ## -> Error in lapply(list(d = ds), round) : dispatch error
> >>
> >>
> BDR> And that's a separate issue, in DispatchGroup which states that arguments
> BDR> have been evaluated (true) but the 'call' from lapply gives the
> BDR> unevaluated arguments and so there is a mismatch.
> >>
> >> yes, I too found that this was a separate issue, the latter
> >> one being new since version 2.7.0
> >>
> BDR> I'm testing fixes for both.
> >>
> >> Excellent!
> >> Martin
> >>
> >>
> >> >> ## or -- related to bug report by Robert McGehee on R-devel, on 2008-07-29:
> >> >> sapply(list(d=ds), is.numeric)
> >> >> ## TRUE
> >> >>
> >> >> ## in spite of
> >> >> is.numeric(`[[`(ll,1)) ## FALSE, because of is.numeric.Date
> >> >>
> >> >> ## or
> >> >> round(`[[`(ll,1))
> >> >> ## [1] "2008-07-30" "2008-07-31" "2008-08-01" "2008-08-02"
> >> >>
> >> >> ##-----------------------------
> >> >>
> >> >> But I'm currently too much tied up with other duties,
> >> >> to find and test bug-fix.
> >> >>
> >> >> Martin Maechler, ETH Zurich and R-Core Team
> >>
> >>
>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595