[R] efficiency when processing ordered data frames

Brigid Mooney bkmooney at gmail.com
Wed May 20 14:54:28 CEST 2009


Hoping for a little insight into how to make sure I have R running as
efficiently as possible.

Suppose I have a data frame, A, with n rows and m columns, where col1
is a date-time stamp.  Also suppose that when this data is imported
(from a CSV file or via SQL), it is already sorted such that the
time stamp in col1 is in ascending (or descending) order.

If I then wanted to select only the rows of A where col1 <= a certain
time, I am wondering whether R has to read through the entirety of col1
(all n entries) to select those rows.  Is it possible for R to recognize
(or somehow be told) that these rows are already in order, so that the
computation could be completed in ~log(n) row reads instead?
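
To make the question concrete, here is a minimal sketch of the two
approaches I have in mind.  The data, the column col2, and the cutoff
are made up for illustration, and col1 is assumed to be in ascending
order.  findInterval() is a base R function that does a binary search
on a sorted vector; as far as I can tell it still checks that the
vector is sorted before searching, so the call as a whole may not be
strictly ~log(n).

```r
# Hypothetical example data: A, col2 and cutoff are invented for illustration.
set.seed(1)
n <- 1000
A <- data.frame(
  col1 = as.POSIXct("2009-01-01", tz = "UTC") + sort(sample(1e6, n)),
  col2 = rnorm(n)
)
cutoff <- as.POSIXct("2009-01-05", tz = "UTC")

# Full scan: every element of col1 is compared against the cutoff.
scan_rows <- A[A$col1 <= cutoff, ]

# Binary search: findInterval() requires its second argument to be sorted
# in non-decreasing order and returns how many elements are <= cutoff.
k <- findInterval(as.numeric(cutoff), as.numeric(A$col1))

# Because col1 is ascending, the rows wanted are exactly the first k.
bsearch_rows <- A[seq_len(k), ]

# Both approaches should select the same rows.
isTRUE(all.equal(scan_rows, bsearch_rows))
```

For data sorted in descending order, the same idea should work on the
reversed (or negated) time stamps, with the selected rows taken from the
end of A instead of the start.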

Thanks!



