[R] Efficient subsetting

R A F raf1729 at hotmail.com
Fri May 16 20:16:04 CEST 2003


Hi, I run into this problem quite often, so it seems worthwhile
to ask what the most efficient solution is.

I have two vectors, x (values sorted) and y.  I have ranges
x < x0, x0 <= x < x1, x1 <= x < x2, ..., x >= xn
and want to construct a subvector yprime of y which contains,
for each range, the first (or last) value of y whose x value
falls in that range.

For example,

x   y
1   2
1   3
2   3
3   4
4   5
5   6

and let's say the ranges are 1 <= x < 3 and 3 <= x < 5.  The
result should be yprime = c( 2, 4 ) (if I ask for the first value
of y whose x is in each range).  [If there are no x values within
a given range, the output for that range should be NA.]
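
One possible loop-free approach to the example above, offered as a sketch rather than a definitive answer: label each x with its range number using findInterval(), then use match() to find the first position carrying each label. The names breaks, nbins, bin, first_idx and last_idx are made up for illustration, and the sketch assumes the ranges are given as a sorted vector of boundaries.

```r
# Data from the example above
x <- c(1, 1, 2, 3, 4, 5)
y <- c(2, 3, 3, 4, 5, 6)

# Assumed representation of the ranges: boundaries of [1, 3) and [3, 5)
breaks <- c(1, 3, 5)
nbins  <- length(breaks) - 1

# findInterval() returns 0 for x < breaks[1], i for
# breaks[i] <= x < breaks[i + 1], and length(breaks) for x >= the last break
bin <- findInterval(x, breaks)

# match() gives the position of the first occurrence of each range number,
# or NA when no x falls in that range; indexing y by NA yields NA
first_idx    <- match(seq_len(nbins), bin)
yprime_first <- y[first_idx]

# For the last value, match against the reversed labels and map the
# positions back to the original order
last_idx    <- length(bin) + 1 - match(seq_len(nbins), rev(bin))
yprime_last <- y[last_idx]
```

To include the two open-ended ranges (x below the first boundary, x at or above the last one), the same idea should work by matching 0:length(breaks) instead of seq_len(nbins).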

Obviously I can do a loop and use which, etc., but it seems
like there should be a better way.

Thanks very much.

A general solution would be nice, but if it helps to make the
algorithm efficient, I'm happy to assume

(a) x values are ordered
(b) the ranges are always evenly spaced:  for example, x in
0 to 10, 10 to 20, 20 to 30, etc.
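
Under assumption (b), the range number can be computed arithmetically instead of being looked up. The following is a minimal sketch with made-up data; lo, width and nbins are hypothetical names for the start of the first range, the common range width and the number of ranges.

```r
# Made-up data for illustration; x is sorted as in assumption (a)
x <- c(3, 7, 12, 18, 35)
y <- c(10, 20, 30, 40, 50)

# Assumed layout: ranges [0, 10), [10, 20), [20, 30), [30, 40)
lo    <- 0
width <- 10
nbins <- 4

# Range number of each x, starting at 1 for the first range
bin <- floor((x - lo) / width) + 1

# First and last position of each range number; NA for empty ranges
first_idx <- match(seq_len(nbins), bin)
last_idx  <- length(bin) + 1 - match(seq_len(nbins), rev(bin))

yprime_first <- y[first_idx]
yprime_last  <- y[last_idx]
```

Here the third range contains no x values, so both results have NA in the third position. Values of x outside lo to lo + nbins * width get range numbers outside 1:nbins and are simply never matched.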



