[Rd] table() and as.character() performance for logical values

Karolis Koncevičius k@ro||@@koncev|c|u@ @end|ng |rom gm@||@com
Fri Mar 21 13:26:44 CET 2025


I was calling table() on some long logical vectors and noticed that it took a long time.

Out of curiosity I checked the performance of table() on different types, and got some unexpected results:

    C <- sample(c("yes", "no"), 10^7, replace = TRUE)
    F <- factor(sample(c("yes", "no"), 10^7, replace = TRUE))
    N <- sample(c(1,0), 10^7, replace = TRUE)
    I <- sample(c(1L,0L), 10^7, replace = TRUE)
    L <- sample(c(TRUE, FALSE), 10^7, replace = TRUE)

                           # ordered by execution time
                           #   user  system elapsed
    system.time(table(F))  #  0.088   0.006   0.093
    system.time(table(C))  #  0.208   0.017   0.224
    system.time(table(I))  #  0.242   0.019   0.261
    system.time(table(L))  #  0.665   0.015   0.680
    system.time(table(N))  #  1.771   0.019   1.791


The performance for integers, and especially for logicals, is quite surprising.
After looking through the source of table(), I traced the cause to as.character():

    system.time(as.character(L))
     user  system elapsed       
    0.461   0.002   0.462       
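
One rough way to check this attribution is to time the conversion and the tabulation of the already converted vector separately. This is only a sketch: the smaller vector length, the seed, and the names Lc, t_table, t_conv and t_rest are choices made here for illustration, and the timings will vary by machine.

```r
# Split the cost of table() on a logical vector into two parts:
# the as.character() conversion and the tabulation of the resulting strings.
set.seed(1)                                  # arbitrary seed, for repeatability
L <- sample(c(TRUE, FALSE), 10^6, replace = TRUE)

t_table <- system.time(tab_L <- table(L))    # table() directly on logicals
t_conv  <- system.time(Lc <- as.character(L))# conversion alone
t_rest  <- system.time(tab_Lc <- table(Lc))  # table() on the converted vector

# Show user / system / elapsed for the three steps side by side.
print(rbind(table_L = t_table, as_character = t_conv, table_chr = t_rest)[, 1:3])

# Both routes must give the same counts.
stopifnot(identical(as.vector(tab_L), as.vector(tab_Lc)))
```

With the numbers reported above, as.character(L) at 0.462 plus table(C) at 0.224 comes to about 0.69 elapsed, close to the 0.680 measured for table(L).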

Even a manual conversion can achieve a speed-up by a factor of ~7:

    system.time(c("FALSE", "TRUE")[L+1])
     user  system elapsed               
    0.061   0.006   0.067               
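
A small check that the manual conversion gives the same result as as.character(), including for NA. The helper name manual_chr is hypothetical and only wraps the expression above; it is meant for logical input only.

```r
# Index-based conversion: FALSE -> 1 -> "FALSE", TRUE -> 2 -> "TRUE".
# NA + 1L is NA_integer_, and indexing with NA yields NA_character_.
manual_chr <- function(x) {
  stopifnot(is.logical(x))
  c("FALSE", "TRUE")[x + 1L]
}

x <- c(TRUE, FALSE, NA, TRUE)
res <- manual_chr(x)
print(res)

# Same values and same type as the built-in conversion.
stopifnot(identical(res, as.character(x)))
```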
   

Tested on 4.4.3 as well as devel trunk.

Just reporting for comments and attention.
Karolis K.

