[Rd] speed of cor(..., use="pairwise.complete.obs") when no NAs are present.

Karolis K k@ro||@@koncev|c|u@ @end|ng |rom gm@||@com
Mon Jul 5 18:17:37 CEST 2021


Hello,

I was iterating over some matrices with cor(x, use="pairwise.complete.obs") to handle cases with NA values, and noticed that this "use" setting has a big influence on speed even for matrices that have no NAs. Given that anyNA(x) is so quick, maybe in the case of use="pairwise.complete.obs" it would make sense to first check whether the arguments have any NAs, and switch to use="everything" if they have none?

Below are my quick benchmarks.

x <- matrix(rnorm(1e6), ncol=1e3)

system.time(cor(x))
#    user  system elapsed
#   0.636   0.003   0.641

system.time(cor(x, use="pair"))
#    user  system elapsed
#   3.509   0.013   3.538

system.time(anyNA(x))
#    user  system elapsed
#   0.001   0.000   0.001
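
To illustrate the idea, here is a rough user-level sketch of the check. It is not a patch to cor() itself; the wrapper name cor_pairwise is made up for this example, and it assumes that with no NAs present both settings give the same result up to floating-point rounding.

```r
# Hypothetical user-level wrapper (not part of base R): use the faster
# use = "everything" path when neither argument contains any NAs.
cor_pairwise <- function(x, y = NULL, ...) {
  has_na <- anyNA(x) || (!is.null(y) && anyNA(y))
  use <- if (has_na) "pairwise.complete.obs" else "everything"
  cor(x, y, use = use, ...)
}

# Compare against the direct call on a small matrix without NAs.
x <- matrix(rnorm(1e4), ncol = 10)
all.equal(cor_pairwise(x), cor(x, use = "pairwise.complete.obs"))
```

When NAs are present the wrapper makes exactly the same call as before, so the only extra cost is the anyNA() check.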


Warm regards,
Karolis K.

