[R] Efficient computation of trimmed stats?
Benilton Carvalho
bcarvalh at jhsph.edu
Mon May 14 18:58:42 CEST 2007
Hi everyone,
I was wondering if there is anything already implemented for
efficient ("row-wise") computation of group-specific trimmed stats
(mean and sd on the trimmed vector) on large matrices.
For example:
set.seed(1)
nc = 300
nr = 250000
x = matrix(rnorm(nc*nr), ncol=nc)
g = matrix(sample(1:3, nr*nc, replace=TRUE), ncol=nc)
trimmedMeanByGroup <- function(y, grp, trim=.05)
  tapply(y, factor(grp, levels=1:3), mean, trim=trim)
sapply(1:10, function(i) trimmedMeanByGroup(x[i,], g[i,]))
works fine... but:
> system.time(sapply(1:nr, function(i) trimmedMeanByGroup(x[i,], g[i,])))
   user  system elapsed
399.928   0.019 399.988
is too slow to be practical for me.
Maybe some package has some implementation of the above?
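To make the target concrete, here is a minimal sketch of the per-row computation (trimmed mean and sd per group) written without tapply(). The names trimmedStats and trimmedStatsByGroup are hypothetical helpers, not from any package. The sketch assumes every group is non-empty, there are no NAs, and trim < 0.5. It has not been benchmarked on the full matrix, so whether it is actually faster is untested.

```r
# Hypothetical helper: mean and sd of a vector after dropping
# floor(n * trim) observations from each end (same trimming rule
# as mean(y, trim = trim) for trim < 0.5).
trimmedStats <- function(y, trim = 0.05) {
  n  <- length(y)
  lo <- floor(n * trim) + 1
  hi <- n + 1 - lo
  z  <- sort(y)[lo:hi]
  c(mean = mean(z), sd = sd(z))
}

# Hypothetical helper: apply trimmedStats() to each group of one row.
# Returns a 2 x length(levels) matrix (row 1: mean, row 2: sd).
trimmedStatsByGroup <- function(y, grp, levels = 1:3, trim = 0.05) {
  vapply(levels,
         function(k) trimmedStats(y[grp == k], trim),
         c(mean = 0, sd = 0))
}

# Usage on a single row of the example data:
# trimmedStatsByGroup(x[1, ], g[1, ])
```

The first row of the result should agree with trimmedMeanByGroup() on the same input; the second row is the sd of the same trimmed values.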
Thank you very much,
-b
--
Benilton Carvalho
PhD Candidate
Department of Biostatistics
Bloomberg School of Public Health
Johns Hopkins University
bcarvalh at jhsph.edu