[R] data filtering
HENRIKSON, JEFFREY
JEFHEN at SAFECO.com
Wed Jun 2 21:19:30 CEST 2004
I would like to know if there is a way to do the following command in
one step, primarily for speed on large data (5 million elements), and
secondarily for readability.
mean(delta[(intersect(which(x[['class']]==0),which(delta<1)))])
Do I really have to rely on an intersect operator? Isn't that
O(n log n)? Can't I just filter in one step? As an R newbie, I would
have guessed I could write
mean(delta[which((x[['class']]==0) && (delta<1))])
But I guess no such luck, since (delta<1), etc. are vectors. Are they
really implemented as vectors? I.e., if I take 5M data points, does it
allocate 20MB of RAM to make a test that passes most of the elements?
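For reference, a minimal sketch of what a one-step form might look like, assuming the elementwise `&` operator in place of the scalar `&&`. The toy data below is hypothetical and only stands in for the real 5-million-element vectors; `keep`, `one_step`, and `two_step` are illustrative names.

```r
# Hypothetical toy data standing in for the real vectors.
set.seed(1)
n <- 1000
x <- list(class = sample(0:1, n, replace = TRUE))
delta <- runif(n, min = 0, max = 2)

# Elementwise `&` combines two equal-length logical vectors;
# `&&` is the scalar form intended for `if` conditions.
keep <- (x[['class']] == 0) & (delta < 1)

# A logical vector can be used as an index directly,
# so neither which() nor intersect() is needed.
one_step <- mean(delta[keep])

# The original two-step form, kept for comparison.
two_step <- mean(delta[intersect(which(x[['class']] == 0), which(delta < 1))])
```

On the memory question: each comparison in this sketch does produce a full-length logical vector, and R stores logicals at 4 bytes per element, so the 20MB estimate for 5M points appears to be about right per intermediate vector.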
The only thing I can think of is to use closures to write something like
a Lisp list "filter". I am not sure about the readability merits, especially
if there is a direct way to do it. Even if Matlab had closures, I know that
running them in a loop would be a bear on runtime anyway.
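To make the closure idea concrete, here is a rough sketch using `Filter()` from base R in current versions. The predicate `passes` and the toy data are hypothetical, and this per-element approach is likely to be slower than a vectorized comparison on large inputs.

```r
# Hypothetical toy data standing in for the real vectors.
set.seed(2)
n <- 1000
x <- list(class = sample(0:1, n, replace = TRUE))
delta <- runif(n, min = 0, max = 2)

# The predicate is a closure: it captures x and delta from the
# enclosing environment and tests a single index at a time.
# Scalar `&&` is appropriate here because each test is length one.
passes <- function(i) x[['class']][i] == 0 && delta[i] < 1

# Lisp-style filter over the index positions.
idx <- Filter(passes, seq_along(delta))
filtered_mean <- mean(delta[idx])

# Vectorized equivalent, for comparison.
vectorized_mean <- mean(delta[(x[['class']] == 0) & (delta < 1)])
```

Both forms select the same elements here; the closure version calls the predicate once per element, which is the runtime cost I was worried about.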
Jeff Henrikson