[Rd] degraded performance with rank()

Duncan Murdoch murdoch at stats.uwo.ca
Sat May 30 23:11:59 CEST 2009


Tim Bergsma wrote:
> Hi.
>
> I'm maintaining a package that creates an object that is essentially a
> classed version of numeric.  I updated recently from 2.7.1 to 2.9.0,
> and merges involving my class suddenly took a huge performance hit.
> I've traced the problem to something near rank().  From NEWS, it seems
> rank() etc. changed in 2.8.0.  Methods for xtfrm() are supposed to
> help, but I've had no success.  There was some chatter about this in
> the archives back in Sept 08 (though apparently regarding S4), with a
> suggestion that it is related to `[.` methods.  That has been my
> experience.  In the toy example below, the problem disappears if
> `[.my` is not defined.  Under R 2.7.1 on Mac, both cat() statements
> take the same amount of time, and that time depends very little on the
> length of x.  Under 2.9.0, the classed version takes much longer, and
> the time grows (more than?) exponentially with length(x).
>
> Is there something I can do to xtfrm.my() or [.my(), etc. to restore
> the performance?
>   

If is.object(x) is FALSE for the object x you pass to xtfrm, you'll get 
fast behaviour, because it will use internal methods.  If you have a 
class on it, you'll get dispatch to your method for every comparison, 
which will be slow.
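
As a rough illustration of that check (a sketch only; the class name 
'my' is taken from the toy example, the variable names are made up 
here):

```r
# A plain integer vector carries no class attribute.
x <- 1:10
plain_is_object <- is.object(x)

# Adding a class attribute is what switches on method dispatch.
y <- structure(x, class = c("my", class(x)))
classed_is_object <- is.object(y)

# unclass() drops the class attribute again.
unclassed_is_object <- is.object(unclass(y))

print(c(plain_is_object, classed_is_object, unclassed_is_object))
```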

So if speed matters, I would not define an xtfrm.my; I would unclass 
things before sorting.
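
Something along these lines might do it.  This is only a sketch: 
rank_my() is a hypothetical helper name, not anything in base R, and 
the class definitions are copied from the toy example.

```r
# Same toy class as in the quoted example.
as.my <- function(x, ...) UseMethod("as.my")
as.my.default <- function(x, ...) structure(x, class = c("my", class(x)))
`[.my` <- function(x, ...) structure(NextMethod("["), class = class(x))

# Hypothetical helper: rank the underlying values only, so that
# rank() sees an unclassed vector and can use its internal code.
rank_my <- function(x, ...) rank(unclass(x), ...)

x <- as.my(c(3, 1, 2))
r <- rank_my(x)

# If the sorted result should keep its class, compute the ordering on
# the unclassed values and subset the classed object once.
sorted <- x[order(unclass(x))]
```

The subsetting method is then called once on the whole vector rather 
than during the sort itself.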

Prior to a day or so ago, the dispatch for comparison methods was 
broken, and you would get a mix of internal and external comparisons.  I 
suspect fixing that bug has made it even slower if you choose to try to 
sort a classed object.

Duncan Murdoch
> Thanks in advance,
>
> Tim.
>
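> rm(list=ls())
> # Generic and default method that prepend class 'my' to x
> as.my <- function(x, ...) UseMethod('as.my')
> as.my.default <- function(x, ...) structure(x, class = c('my', class(x)))
> # Subsetting method that preserves the class
> `[.my` <- function(x, ...) structure(NextMethod("["), class = class(x))
> # xtfrm method returning the underlying numeric values
> xtfrm.my <- function(x) as.numeric(x)
> x <- 1:10000
> # Elapsed time for ranking the plain vector, then the classed one
> cat(system.time(rank(x))[3]); cat(' ')
> cat(system.time(rank(as.my(x)))[3]); cat('\n')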
>


