[Rd] cut takes long time
Deepayan Sarkar
deepayan.sarkar at gmail.com
Thu Jun 17 07:50:02 CEST 2010
On Wed, Jun 16, 2010 at 3:56 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> The following cut command takes nearly 10 seconds on my machine even
> though the length of input vector is only 6. I am running on Windows
> Vista with C2D BLAS using R 2.11.1. Using the default BLAS and either
> R 2.10.1 or "R version 2.12.0 Under development (unstable) (2010-05-31
> r52164)" also gives me results in the 9-11 second range.
> I would have expected it to take much less time.
>
>
> tt <- structure(c(631206000, 631206060, 631206180, 631206240, 631206300,
> 978224400), class = c("POSIXt", "POSIXct"), tzone = "")
>
> system.time(cut(tt, "2 hours", include = TRUE)) # 9.45 0.01 9.58
The POSIXt aspect is not relevant here; what matters is the number of breakpoints.
> system.time(cut(tt, "2 hours", include = TRUE))
user system elapsed
5.884 0.108 6.033
> system.time(cut(rnorm(6), breaks = 50000))
user system elapsed
5.200 0.000 5.558
And the time seems linear in the number of breakpoints, which is not
surprising. The "Note" section in ?cut does mention more efficient
alternatives.
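
As a rough sketch of one such alternative, findInterval() returns integer bin codes directly and never builds factor labels. The variable names and the equal-width breaks below are my own illustration, not taken from the original report, and the intervals are left-closed rather than right-closed as in the default cut().

```r
set.seed(1)
x <- rnorm(6)
n_bins <- 50000

# Equal-width breaks spanning the range of x. (cut() with a single number
# also pads the range slightly; that padding is not reproduced here.)
breaks <- seq(min(x), max(x), length.out = n_bins + 1)

# Integer bin codes in 1..n_bins; no labels are generated.
# rightmost.closed = TRUE places max(x) in the last bin instead of past it.
codes <- findInterval(x, breaks, rightmost.closed = TRUE)

system.time(findInterval(x, breaks, rightmost.closed = TRUE))
```

If the codes are only needed for grouping or counting, they can be used as they are, without converting to a factor at all.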
Note that
> system.time(cut(tt, "2 hours", include = TRUE, labels = FALSE))
user system elapsed
0.02 0.00 0.02
so it's the conversion to factors that seems to take most of the time.
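
If a factor is still wanted, one possible workaround (a sketch only, not what cut() itself does) is to take the integer codes and label just the bins that actually occur. It assumes that the bins start at the first observation truncated to the hour and are exactly 7200 seconds wide; the names start, used and bin_start are made up for the illustration.

```r
tt <- structure(c(631206000, 631206060, 631206180, 631206240, 631206300,
                  978224400), class = c("POSIXt", "POSIXct"), tzone = "")

# Integer bin codes only; no label is built for each breakpoint.
codes <- cut(tt, "2 hours", include.lowest = TRUE, labels = FALSE)

# Assumption: the first bin starts at min(tt) truncated to the hour,
# and every bin is 2 hours (7200 seconds) wide.
start <- as.POSIXct(trunc(min(tt), "hours"))

# Build labels only for the bins that are actually occupied.
used <- sort(unique(codes))
bin_start <- start + (used - 1) * 7200
f <- factor(codes, levels = used,
            labels = format(bin_start, "%Y-%m-%d %H:%M:%S"))
```

Unlike the result of the plain cut() call, f has levels only for the occupied bins, so this is not a drop-in replacement when the empty levels matter.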
-Deepayan