[R] Tagging identical rows of a matrix
Waichler, Scott R
Scott.Waichler at pnl.gov
Fri May 14 22:12:08 CEST 2004
Thanks to all of you who responded to my help request.
Here is the very efficient upshot of your advice:
> mat2 <- apply(mat, 1, paste, collapse=":")
> vec <- match(mat2, unique(mat2))
> vec
[1] 1 2 1 1 2 3
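For anyone who wants to run this end to end, here is a minimal self-contained sketch using the example matrix from the original question (quoted below). The function name tag_rows and its sep argument are hypothetical and used only for illustration. Note that rows are compared through their pasted character representation, so this assumes the separator cannot make two different rows produce the same string.

```r
# Example matrix from the original question
mat <- rbind(a = c(1, 2), b = c(1, 3), c = c(1, 2),
             d = c(1, 2), e = c(1, 3), f = c(2, 1))

# tag_rows is a hypothetical helper name, not something from the thread
tag_rows <- function(m, sep = ":") {
  # Collapse each row into a single string key, e.g. "1:2"
  keys <- apply(m, 1, paste, collapse = sep)
  # match() returns, for each key, its position among the unique keys,
  # which are kept in order of first appearance
  match(keys, unique(keys))
}

vec <- tag_rows(mat)
print(vec)
# [1] 1 2 1 1 2 3
```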
P.S. I found that Andy Liaw's method didn't preserve the
index order that I wanted; it yields
2 3 2 2 3 1
To get the order of integers I was looking for required an
invocation of unique:
as.numeric(factor(apply(mat, 1, paste, collapse=":"),
levels=unique(apply(mat, 1, paste, collapse=":"))))
But the first method above is obviously cleaner and is twice
as fast, only 9 seconds for a 100000 row matrix on an ordinary PC.
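A small sketch of why the levels=unique(...) argument matters, assuming the ordering difference comes from factor() sorting its levels by default. Andy Liaw's code is not shown in this message, so the keys below are made-up illustrative values rather than a reproduction of his method.

```r
# Illustrative keys (not from the thread): the first-seen key sorts last
keys <- c("2:1", "1:2", "2:1")

# Default factor(): levels are sorted, so "1:2" becomes level 1
sorted_codes <- as.integer(factor(keys))
print(sorted_codes)
# [1] 2 1 2

# Pinning the levels to unique(keys) numbers groups by first appearance
first_seen_codes <- as.integer(factor(keys, levels = unique(keys)))
print(first_seen_codes)
# [1] 1 2 1

# match() against unique(keys) gives the same first-appearance numbering
stopifnot(identical(first_seen_codes, match(keys, unique(keys))))
```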
Regards,
Scott Waichler
> > I would like to generate a vector having the same length
> > as the number of rows in a matrix. The vector should contain an
> > integer indicating the "group" of the row, where identical matrix rows
> > are in a group, and a unique row has a unique integer. Thus, for
> >
> > a <- c(1,2)
> > b <- c(1,3)
> > c <- c(1,2)
> > d <- c(1,2)
> > e <- c(1,3)
> > f <- c(2,1)
> > mat <- rbind(a,b,c,d,e,f)
> >
> > I would like to get the vector c(1,2,1,1,2,3). I know dist() gives
> > part of the answer, but I can't figure out how to use it for this
> > purpose without doing a lot of looping. I need to apply this to
> > matrices up to ~100000 rows.