[R] is match slow?
Thomas Lumley
tlumley at u.washington.edu
Tue Nov 20 18:35:38 CET 2001
On Tue, 20 Nov 2001, Agustin Lobo wrote:
>
> I'm doing
>
> m <- match(matriz, origen, 0)
>
> where matriz is a 270x900 matrix and
> origen is an 11675-element vector, and it is taking
> a very long time.
>
> Is match a function
> implemented in C? If not, would
> C code be faster?
Well, typing the function name at the R prompt gives
R> match
function (x, table, nomatch = NA, incomparables = FALSE)
{
    if (!is.logical(incomparables) || incomparables)
        .NotYetUsed("incomparables != FALSE")
    .Internal(match(if (is.factor(x)) as.character(x) else x,
        if (is.factor(table)) as.character(table) else table,
        nomatch))
}
showing that it is .Internal and thus in compiled C code. Looking at
src/main/unique.c reveals that it is implemented by putting `table' into a
hash table and then looking up each element of x, which is a pretty good
algorithm for this problem. If the hash function is good, it will take
about length(table)+length(x) hash computations, and you won't be able to
beat that easily.
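To make the hash-table idea concrete, here is a minimal pure-R sketch. It is only an illustration, not the C code in unique.c: `hash_match` is a hypothetical helper name, it uses an R environment as a stand-in hash table, and it compares elements by their character representation, which is an assumption that need not agree with match() for doubles. It also assumes no key is the empty string.

```r
# Hypothetical helper: illustrates hash-based matching, not R's actual C code.
hash_match <- function(x, table, nomatch = NA_integer_) {
  nomatch <- as.integer(nomatch)
  keys <- as.character(table)  # assumption: compare elements as strings
  h <- new.env(hash = TRUE, size = max(29L, length(keys)))
  # Build phase: one insertion per element of table.
  # Insert in reverse so the first occurrence of a duplicated key wins.
  for (i in rev(seq_along(keys))) assign(keys[i], i, envir = h)
  # Lookup phase: one hash lookup per element of x.
  vapply(as.character(x), function(k) {
    if (exists(k, envir = h, inherits = FALSE)) {
      get(k, envir = h, inherits = FALSE)
    } else {
      nomatch
    }
  }, integer(1), USE.NAMES = FALSE)
}

x <- c("b", "z", "a", "b")
table <- c("a", "b", "a", "c")
res <- hash_match(x, table, 0L)
```

The build loop runs length(table) times and the lookup runs length(x) times, which is where the length(table)+length(x) count comes from. This sketch will be far slower than the built-in match() because the loops run in interpreted R.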
I don't even find it that slow:
> matriz<-matrix(rnorm(270*900),ncol=900)
> origen<-rnorm(11675)
> system.time(match(matriz,origen,0))
[1] 0.27 0.01 0.33 0.00 0.00
or with a lot of matches:
> matriz<-matrix(sample(1:20,270*900,TRUE),ncol=900)
> origen<-1:11675
> system.time(match(matriz,origen,0))
[1] 0.01 0.00 0.01 0.00 0.00
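One practical point when matching a matrix: match() returns a plain vector, so the matrix shape has to be restored by hand if it is needed. The sketch below assumes the "lot of matches" setup with values drawn from 1:20; the seed and the names `m` and `found` are only for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
matriz <- matrix(sample(1:20, 270 * 900, TRUE), ncol = 900)
origen <- 1:11675

m <- match(matriz, origen, 0)  # positions in origen, 0 where there is no match
dim(m) <- dim(matriz)          # match() drops dimensions, so restore them
found <- m > 0                 # TRUE where an element of matriz occurs in origen
```

With nomatch = 0, a test such as `m > 0` gives a logical mask with the same shape as the original matrix.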
-thomas
Thomas Lumley Asst. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle