[R] Matching rows?

gb gb at stat.umu.se
Mon Jul 3 13:15:22 CEST 2000


I have a matrix X with many rows and a vector y, which I want
to match against the rows of the matrix. I especially want
to know if there is a match or not. I can do this with 'all'
and a 'while' construct:

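row.match <- function(y, X)
{
  ## Scan the rows of X in order and stop at the first row equal to y.
  found <- FALSE
  j <- 0
  while ( (!found) && (j < nrow(X)) )
    {
      j <- j + 1
      found <- all(y == X[j, ])
    }
  return ( found )   # TRUE if some row of X equals y, FALSE otherwise
}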

Two alternatives:

any( apply(X, 1, all.equal, y) == "TRUE")

any(apply(apply(X, 1, "==", y), 1, all))


When X is 53000x11 and y == X[53000, ]:

> unix.time(row.match(y, X))
[1] 5.18 0.07 5.34 0.00 0.00

> unix.time(any( apply(X, 1, all.equal, y) == "TRUE"))
[1] 122.08   2.34 126.28   0.00   0.00

> unix.time(any(apply(apply(X, 1, "==", y), 1, all)))
[1] 4.71 0.00 4.80 0.00 0.00

The last attempt is faster than row.match, but only in this 
"worst" case (and the double 'apply' looks ugly, but how can 
I avoid that?).
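
One possible way to avoid the nested 'apply' is to rely on recycling: t(X) has one column per row of X, so t(X) == y compares y against every row of X in a single vectorized step. The following is only a minimal sketch, assuming length(y) == ncol(X) and no NA values in X or y; the name row.match.vec and the small test matrix are made up for illustration. Note also that apply(X, 1, "==", y) returns its result with one column per row of X, so a per-row check on that result has to reduce over columns (MARGIN = 2).

```r
## Hypothetical helper: vectorized row match via recycling.
## Assumes length(y) == ncol(X) and no NA values in X or y.
row.match.vec <- function(y, X)
{
  eq <- t(X) == y              # ncol(X) x nrow(X); column j holds X[j, ] == y
  any(colSums(eq) == ncol(X))  # TRUE if some row agrees with y in every position
}

## Made-up example data
X <- matrix(c(1, 2, 3,
              4, 5, 6,
              1, 5, 3), nrow = 3, byrow = TRUE)

hit  <- row.match.vec(c(4, 5, 6), X)   # second row of X
miss <- row.match.vec(c(1, 5, 6), X)   # no row matches in all three positions

## The double-'apply' form, reducing over columns of the inner result
hit.apply <- any(apply(apply(X, 1, "==", c(4, 5, 6)), 2, all))
```

Like the 'apply' versions, this still performs every comparison, so it does not stop early when a match turns up near the top of X.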

The problem with the apply approaches is apparently that all
comparisons must be made, while row.match quits as soon as
one match is found.
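
A possible middle ground, sketched here under the assumption that X and y contain no NA values: loop over the columns instead of the rows, keep only the rows that still agree with y, and stop as soon as no candidate row is left. For an 11-column matrix this needs at most 11 vectorized comparisons. The name row.match.cols is hypothetical, and no timings were taken for this sketch.

```r
## Hypothetical helper: narrow down candidate rows one column at a time.
## Assumes length(y) == ncol(X) and no NA values in X or y.
row.match.cols <- function(y, X)
{
  candidates <- seq_len(nrow(X))        # rows that may still match y
  for (k in seq_len(ncol(X)))
    {
      ## keep only the candidates that agree with y in column k
      candidates <- candidates[X[candidates, k] == y[k]]
      if (length(candidates) == 0)
        return(FALSE)                   # no row can match any more
    }
  TRUE                                  # at least one row agreed in every column
}

## Made-up example data
X <- matrix(c(1, 2, 3,
              4, 5, 6,
              1, 5, 3), nrow = 3, byrow = TRUE)

found.last <- row.match.cols(c(1, 5, 3), X)   # third row of X
found.none <- row.match.cols(c(1, 2, 6), X)   # agrees with row 1 except in column 3
```

How much this saves depends on the data: if the first column already rules out most rows, the later comparisons touch only a few elements.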

Question: What simple solution have I overlooked? 

A related problem: How do I find the unique rows of X?
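
For the unique rows, current versions of R provide matrix methods for 'unique' and 'duplicated' that work row-wise by default; whether they are available depends on the R version in use. A small sketch with made-up data:

```r
## Made-up example data: rows 3 and 5 repeat rows 1 and 2
X <- matrix(c(1, 2,
              3, 4,
              1, 2,
              5, 6,
              3, 4), ncol = 2, byrow = TRUE)

U   <- unique(X)       # drops repeated rows, keeping the first occurrence of each
dup <- duplicated(X)   # logical vector, TRUE for rows that repeat an earlier row
V   <- X[!dup, , drop = FALSE]   # the same rows as U, selected by hand
```

'duplicated' is the more flexible of the two when the positions of the repeated rows are needed as well.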

Göran
-- 
 Göran Broström                      tel: +46 90 786 5223
 Department of Statistics            fax: +46 90 786 6614
 Umeå University
 SE-90187 Umeå, Sweden             e-mail: gb at stat.umu.se
