[Rd] xtfrm is more than 100x slower for AsIs than for character vectors

Ivan Krylov
Sun Jul 14 11:24:29 CEST 2024


On Fri, 12 Jul 2024 17:35:19 +0200
Hilmar Berger via R-devel <r-devel using r-project.org> wrote:

> This can be finally traced to base::rank() (called from
> xtfrm.default), where I found that
> 
> "NB: rank is not itself generic but xtfrm is, and rank(xtfrm(x), ....)
> will have the desired result if there is a xtfrm method. Otherwise,
> rank will make use of ==, >, is.na and extraction methods for classed
> objects, possibly rather slowly. "

Indeed, the vector reaches base::rank in both cases, but because the
AsIs vector has a class attribute, rank has to construct and evaluate a
call to .gt every time it compares two elements.
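
As a rough illustration of the difference (a sketch, not a rigorous
benchmark; the vector size and the names x, t_plain and t_asis are
arbitrary choices made here), one can time both calls and check that
they return the same ranks:

```r
set.seed(1)
# Unique strings, so the ranks contain no ties
x <- paste0("k", sample.int(5000L))

# Plain character vector: no class attribute
t_plain <- system.time(r_plain <- xtfrm(x))

# Same data wrapped in I(): the vector carries class "AsIs"
t_asis <- system.time(r_asis <- xtfrm(I(x)))

# Both paths should produce the same ranks
stopifnot(identical(r_plain, r_asis))

# Elapsed times depend on the machine and the R version
print(c(plain = t_plain[["elapsed"]], AsIs = t_asis[["elapsed"]]))
```

The absolute numbers will vary, and an R build that already strips the
class in xtfrm.AsIs may show little or no gap.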

xtfrm.AsIs even tries to remove the 'AsIs' class before continuing the
method dispatch process:

    if (length(cl <- class(x)) > 1) oldClass(x) <- cl[-1L]

This does not work in the (very contrived) case where 'AsIs' is not the
first class, and it does not remove 'AsIs' when it is the only class
(which makes static int equal(...) take the slower branch). What would
break if we allowed removing the class attribute altogether? The
following patch seems to speed up xtfrm(I(x)) and survives
LC_ALL=C.UTF-8 make check-devel:

Index: src/library/base/R/sort.R
===================================================================
--- src/library/base/R/sort.R	(revision 86895)
+++ src/library/base/R/sort.R	(working copy)
@@ -297,7 +297,8 @@
 
 xtfrm.AsIs <- function(x)
 {
-    if(length(cl <- class(x)) > 1) oldClass(x) <- cl[-1L]
+    cl <- oldClass(x)
+    oldClass(x) <- cl[cl != 'AsIs']
     NextMethod("xtfrm")
 }
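
To make the two cases concrete, here is a small sketch that applies only
the class-stripping step of each version outside of method dispatch.
The helper names strip_first_class and strip_asis and the class "foo"
are hypothetical, introduced purely for illustration:

```r
# Class handling of the current xtfrm.AsIs (hypothetical helper name)
strip_first_class <- function(x) {
    if (length(cl <- class(x)) > 1) oldClass(x) <- cl[-1L]
    x
}

# Class handling of the proposed xtfrm.AsIs (hypothetical helper name)
strip_asis <- function(x) {
    cl <- oldClass(x)
    oldClass(x) <- cl[cl != "AsIs"]
    x
}

a <- I(letters)                                     # 'AsIs' is the only class
b <- structure(letters, class = c("foo", "AsIs"))   # 'AsIs' is not first

oldClass(strip_first_class(a))  # 'AsIs' is kept
oldClass(strip_first_class(b))  # 'foo' is dropped, 'AsIs' is kept
is.object(strip_asis(a))        # class attribute removed altogether
oldClass(strip_asis(b))         # only 'AsIs' is dropped
```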
 

-- 
Best regards,
Ivan


