[R] finding the most frequent row
Gabor Grothendieck
ggrothendieck at myway.com
Sat Dec 11 01:42:48 CET 2004
Jean Eid <jeaneid <at> chass.utoronto.ca> writes:
:
: You can do the following also
:
:
: X <- matrix(c(1,2,1,2,1,3,1,4), ncol=2)
: Y<-unique(X)
: Y[which.max(apply(Y, 1,function(i) sum(apply(matrix(1:nrow(X)),
: 1,function(x) identical(X[x,], i[1:2]))))),]
:
: I do not know what your strategy is when there are multiple maxima, i.e. two
: different rows appear at the same frequency. The above will give the first
: row that attains the maximum. As an example suppose that we have
The two calls to apply in this last statement can be simplified
by eliminating the indices:
Y[which.max(apply(Y, 1, function(y) sum( apply(X, 1, identical, y)))),]
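As a quick check, the simplified expression can be run on the example
matrix from the quoted message. This is only a sketch; the names counts
and result are introduced here for illustration and are not part of the
original code.

```r
X <- matrix(c(1,2,1,2,1,3,1,4), ncol=2)
Y <- unique(X)

# for each unique row y of Y, count the rows of X that are identical to it
counts <- apply(Y, 1, function(y) sum(apply(X, 1, identical, y)))

# which.max returns the position of the first maximum, so in case of a tie
# the earliest row of Y wins
result <- Y[which.max(counts), ]
```

Inspecting counts before taking which.max is one way to see whether
several rows share the top frequency.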
Also, although less efficient, it's interesting to note that
the two apply calls without the sum can be regarded as multiplying
Y times X transpose under the identical() inner product:
# multiply a times b' under the inner product f.
# With f <- function(x,y) sum(x*y) it corresponds to ordinary
# matrix multiplication of a times b transpose.
xyt <- function(a,b,f) apply(b,1,function(x)apply(a,1,function(y)f(x,y)))
# factoring out sum this generalized multiplication gives us
# (the rows of xyt(Y,X,f) correspond to the rows of Y, hence rowSums):
Y[which.max(rowSums( xyt(Y,X,identical) )),]
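The matrix-product reading can be checked numerically. The sketch below
assumes the definition of xyt given above and repeats it so that it runs
on its own; dot, same, M and counts are illustrative names only. With
this argument order the result has one row per row of Y and one column
per row of X, so the frequency of each row of Y is a row sum.

```r
# entry [i, j] of the result is f(b[j, ], a[i, ])
xyt <- function(a, b, f) apply(b, 1, function(x) apply(a, 1, function(y) f(x, y)))

X <- matrix(c(1,2,1,2,1,3,1,4), ncol=2)
Y <- unique(X)

# with the ordinary inner product, xyt should reproduce Y %*% t(X)
dot <- function(x, y) sum(x * y)
same <- all.equal(xyt(Y, X, dot), Y %*% t(X))

# with identical(), M[i, j] is TRUE when row i of Y equals row j of X
M <- xyt(Y, X, identical)

# number of rows of X matching each row of Y
counts <- rowSums(M)
```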