[R] Count unique rows/columns in a matrix
Gabor Csardi
csardi at rmki.kfki.hu
Sat Jan 12 20:15:31 CET 2008
On Sat, Jan 12, 2008 at 12:35:47PM -0500, John Kane wrote:
> I definitely did not read it that way but that may
> have been my fault. That table approach is quite
> nice!
>
> Using it, you could just rebuild the vectors from the
> names. Does this do more or less what you want?
John, thanks. Still not good enough. :( The problem is not that the
result is in string format, but that the real values are not
compared, only the values rounded to six (?) decimals. I know this is
only the default and more could be done by setting some parameters
(probably options(digits) is enough), but then it is not very
efficient, since instead of comparing 8-byte doubles I'll be comparing
quite long strings for every single number in the matrix. This seems
quite a hack to me.
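
A small illustration of the precision concern, as a sketch only. As far as I can tell, paste() converts numbers via as.character(), which keeps up to 15 significant digits rather than six, so the effect only shows up for values that differ beyond that; the variable names below are made up for the example.

```r
# Two doubles that differ in the last bits but print identically.
a <- 0.3
b <- 0.1 + 0.2

same_as_double <- a == b                 # exact comparison of the doubles
same_as_string <- paste(a) == paste(b)   # comparison after conversion to text

# The same effect on whole rows, using the paste/table approach.
X2 <- rbind(c(1, a), c(1, b))
n_groups <- length(table(apply(X2, 1, paste, collapse = ",")))

print(c(same_as_double = same_as_double, same_as_string = same_as_string))
print(n_groups)
```

Here the two rows differ as doubles but fall into a single group once they are turned into strings.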
I'm thinking about the following solution. We hash every row/column
of the matrix, then sort the hashed values, and compare only those
rows/columns for which the hash values are the same. (With a proper
comparison, i.e. via "==" or all.equal.)
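
The hash-then-compare idea could be sketched roughly as follows. This is only one possible reading of it, not a tested or tuned implementation: the function name count_unique_rows and the hash itself (a weighted row sum with weights sqrt(2), sqrt(3), ...) are assumptions made for the example, and the matrix is assumed to hold finite numeric values. Rows whose hashes collide are still separated by the exact "==" comparison, so a weak hash costs speed, not correctness.

```r
# Sketch: count unique rows by bucketing on a cheap numeric hash and
# confirming candidates with an exact "==" comparison.
count_unique_rows <- function(X) {
  stopifnot(is.matrix(X), is.numeric(X), nrow(X) >= 1, all(is.finite(X)))
  n <- nrow(X)
  # Hypothetical hash: weighted row sum. Column j is scaled by w[j].
  w <- sqrt(seq_len(ncol(X)) + 1)
  h <- rowSums(X * rep(w, each = n))
  # Sort the hash values; equal neighbours form one bucket.
  ord <- order(h)
  hs <- h[ord]
  bucket <- cumsum(c(TRUE, hs[-1] != hs[-n]))
  rep_id <- integer(n)  # for each row, index of its representative row
  for (idx in split(ord, bucket)) {
    reps <- integer(0)  # distinct rows found so far in this bucket
    for (i in idx) {
      hit <- 0L
      for (r in reps) {
        if (all(X[i, ] == X[r, ])) { hit <- r; break }
      }
      if (hit == 0L) { reps <- c(reps, i); hit <- i }
      rep_id[i] <- hit
    }
  }
  u <- sort(unique(rep_id))
  list(rows = X[u, , drop = FALSE],
       count = tabulate(match(rep_id, u), nbins = length(u)))
}

X <- matrix(c(1,2,3,1,2,3,4,5,6,1,3,2,4,5,6,1,1,1), 6, 3, byrow = TRUE)
res <- count_unique_rows(X)
print(cbind(res$rows, count = res$count))
```

For unique columns the same function could be applied to t(X). Whether this beats the string approach on large matrices would have to be measured; the inner loops are plain R and not optimised.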
Of course I'm not completely sure that this is faster than comparing
long strings, but I'll give it a try. I have quite big matrices,
that's why I need an efficient solution.
(I'm sending this to the list, because someone else was also
interested, but I lost his email address.)
Gabor
> X <- matrix(c(1,2,3,1,2,3,4,5,6,1,3,2,4,5,6,1,1,1), 6, 3, byrow=TRUE)
> xx <- table(apply(X, 1, paste, collapse=","))
> hh <- names(xx)
> nnk <- strsplit(hh, ",")
> kkn <- lapply(nnk, as.numeric)
> df1 <- t(as.data.frame(kkn))
> cbind(df1, xx)
>
[...]
--
Csardi Gabor <csardi at rmki.kfki.hu> UNIL DGM