[Rd] degraded performance with rank()
Duncan Murdoch
murdoch at stats.uwo.ca
Sat May 30 23:11:59 CEST 2009
Tim Bergsma wrote:
> Hi.
>
> I'm maintaining a package that creates an object that is essentially a
> classed version of numeric. I updated recently from 2.7.1 to 2.9.0,
> and merges involving my class suddenly took a huge performance hit.
> I've traced the problem to something near rank(). From NEWS, it seems
> rank() etc. changed in 2.8.0. Methods for xtfrm() are supposed to
> help, but I've had no success. There was some chatter about this in
> the archives back in Sept 08 (though apparently regarding S4), with a
> suggestion that it is related to `[.` methods. That has been my
> experience. In the toy example below, the problem disappears if
> `[.my` is not defined. Under R 2.7.1 on Mac, both cat() statements
> take the same amount of time, and that time depends very little on the
> length of x. Under 2.9.0, the classed version takes much longer, and
> the time grows (more than?) exponentially with length(x).
>
> Is there something I can do to xtfrm.my() or [.my(), etc. to restore
> the performance?
>
If is.object(x) is FALSE for the object x you pass to xtfrm(), you'll
get fast behaviour, because the internal methods are used. If x has a
class attribute, every comparison is dispatched to your methods, which
will be slow.

So if speed matters, I would not define an xtfrm.my(); I would unclass
things before sorting.
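
As a rough illustration of the unclass-before-sorting suggestion, here is a minimal sketch. It reuses the as.my() and `[.my` definitions from the toy example quoted below; rank_my() is a hypothetical helper name introduced only for this example and is not part of R or of the poster's package.

```r
# Sketch only: the class definitions mirror the quoted toy example.
as.my <- function(x, ...) UseMethod("as.my")
as.my.default <- function(x, ...) structure(x, class = c("my", class(x)))
`[.my` <- function(x, ...) structure(NextMethod("["), class = class(x))

x <- as.my(c(3, 1, 2, 1))

# A class attribute makes is.object() TRUE; unclass() removes it.
obj_classed <- is.object(x)
obj_plain   <- is.object(unclass(x))

# Hypothetical helper: rank the underlying numbers, bypassing dispatch
# on class "my" altogether.
rank_my <- function(x, ...) rank(unclass(x), ...)

r <- rank_my(x)
```

The same pattern applies to sort() or order(): strip the class first, then reapply it to the result afterwards if the class is still needed.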
Until a day or so ago, dispatch for comparison methods was broken, and
you would get a mix of internal and dispatched comparisons. I suspect
that fixing that bug has made it even slower if you choose to sort a
classed object.
Duncan Murdoch
> Thanks in advance,
>
> Tim.
>
> rm(list=ls())
> as.my <- function(x,...)UseMethod('as.my')
> as.my.default <- function(x,...)structure(x, class=c('my',class(x)))
> `[.my` <- function (x, ...) structure(NextMethod("["), class = class(x))
> xtfrm.my <- function(x)as.numeric(x)
> x <- 1:10000
> cat(system.time(rank(x))[3]);cat(' ')
> cat(system.time(rank(as.my(x)))[3]);cat('\n')
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>