How much time is the selection currently costing you? Is it having a
large impact on your program, and is it really the part that dominates
the overall run time? What is your concern in this area? Here is the
time it takes to select, from 10M values, those that are less than a
specific value. This takes 0.2 seconds or less:
> x <- runif(1e7)
> system.time(y <- x < .5)
user system elapsed
0.15 0.05 0.20
> x <- sort(x)
> system.time(y <- x < .5)
user system elapsed
0.11 0.03 0.14
>
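If the full scan ever did become the bottleneck, the ~log(n) lookup asked about in the question below can be sketched as a plain bisection over the sorted column. This is only a minimal sketch: `count_le` is a hypothetical helper, not a base R function, and it assumes the vector is sorted ascending and contains no NAs.

```r
# Hypothetical helper (not part of base R): returns how many elements of
# the ascending-sorted vector v are <= cutoff, using bisection.
# Assumes v is already sorted ascending and has no NAs; this is not checked.
count_le <- function(v, cutoff) {
  lo <- 0L          # invariant: every element in v[1..lo] is <= cutoff
  hi <- length(v)   # invariant: every element after v[hi] is > cutoff
  while (lo < hi) {
    mid <- lo + (hi - lo + 1L) %/% 2L   # upper midpoint, so lo always advances
    if (v[mid] <= cutoff) {
      lo <- mid
    } else {
      hi <- mid - 1L
    }
  }
  lo
}

set.seed(42)
x <- sort(runif(1e6))     # stands in for the already-sorted time stamp column
k <- count_le(x, 0.5)     # inspects roughly log2(1e6), i.e. about 20, elements
y <- x[seq_len(k)]        # the selected values are simply the first k
```

For a data frame A sorted on col1, the same count would give the rows as A[seq_len(k), ]. Base R's findInterval(0.5, x) should return the same count; as far as I know it also verifies that its input is sorted, which is itself a pass over the whole vector.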
On Wed, May 20, 2009 at 8:54 AM, Brigid Mooney wrote:
> Hoping for a little insight into how to make sure I have R running as
> efficiently as possible.
>
> Suppose I have a data frame, A, with n rows and m columns, where col1
> is a date time stamp. Also suppose that when this data is imported
> (from a csv or SQL), that the data is already sorted such that the
> time stamp in col1 is in ascending (or descending) order.
>
> If I then wanted to select only the rows of A where col1 <= a certain
> time, I am wondering if R has to read through the entirety of col1 to
> select those rows (all n of them). Is it possible for R to recognize
> (or somehow be told) that these rows are already in order, thus
> allowing the computation to be completed in ~log(n) row reads
> instead?
>
> Thanks!
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?