[R] data frame vs. matrix
Jeff Newmiller
jdnewmil at dcn.davis.CA.us
Mon Mar 17 01:31:31 CET 2014
Did you really intend to make all of the x values the same? If so, try one line instead of the for loop:
dfr$x[ 2:n ] <- dfr$x[ 1 ]
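As a quick sanity check, here is a sketch showing that this one-liner gives the same column as the loop. The seed, the small n, and the names loop_version and vec_version are illustrative assumptions, not part of the original example.

```r
set.seed(42)   # arbitrary seed, only for reproducibility
n <- 10        # small n, chosen for illustration
dfr <- data.frame(x = rnorm(n), y = rnorm(n))

# Loop from the original post: each element takes the value just before it,
# so x[1] ends up propagated down the whole column.
loop_version <- dfr
for (i in 2:n) {
  loop_version$x[i] <- loop_version$x[i - 1]
}

# One-line replacement: the single value x[1] is recycled over rows 2..n.
vec_version <- dfr
vec_version$x[2:n] <- vec_version$x[1]

# Compare the resulting columns.
identical(loop_version$x, vec_version$x)
```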
If that was merely an error in your example, then you could use a different one-liner:
dfr$x[ 2:n ] <- dfr$x[ seq.int( n-1 ) ]
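The difference between this shift and the sequential loop is easy to miss, so here is a small sketch with made-up values. The right-hand side is evaluated in full before anything is assigned, which is why the result is a shift rather than a fill.

```r
dfr <- data.frame(x = c(10, 20, 30, 40, 50))   # made-up values
n <- nrow(dfr)

# Vectorized shift: every row receives its *original* predecessor.
shifted <- dfr
shifted$x[2:n] <- shifted$x[seq.int(n - 1)]
shifted$x   # 10 10 20 30 40

# Sequential loop: each step reads the value it has just overwritten.
looped <- dfr
for (i in 2:n) looped$x[i] <- looped$x[i - 1]
looped$x    # 10 10 10 10 10
```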
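In either case, the speedup is considerable: one vectorized assignment replaces n-1 separate replacement calls on the data frame, each of which goes through the data frame assignment method and may copy the column.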
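To measure this on your own machine, something along these lines should do. It is only a sketch: n = 20000 is an arbitrary size picked so the loop finishes quickly, and the timings will depend on the R version and hardware, so no numbers are claimed here.

```r
n <- 20000   # arbitrary: large enough to time, small enough to finish quickly
dfr <- data.frame(x = rnorm(n), y = rnorm(n))

# Element-by-element assignment into the data frame column.
loop_time <- system.time({
  d1 <- dfr
  for (i in 2:n) d1$x[i] <- d1$x[i - 1]
})

# Single vectorized assignment.
vec_time <- system.time({
  d2 <- dfr
  d2$x[2:n] <- d2$x[1]
})

stopifnot(identical(d1$x, d2$x))   # both approaches produce the same column

# Show user, system and elapsed seconds side by side.
print(rbind(loop = loop_time, vectorized = vec_time)[, 1:3])
```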
I use data frames far more than matrices and don't feel I am suffering for it, but then I also use creative indexing way more than for loops.
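If each value really does depend on the one before it and a loop cannot be avoided, one option (my assumption about a reasonable workaround, not something spelled out above) is to pull the column out as a plain vector, loop over that, and assign it back once. The recurrence below is hypothetical and only there to give the loop something to do.

```r
set.seed(1)   # arbitrary seed
n <- 1000
dfr <- data.frame(x = rnorm(n), y = rnorm(n))

x <- dfr$x                         # plain numeric vector, no data frame involved
for (i in 2:n) {
  x[i] <- 0.5 * x[i - 1] + x[i]    # hypothetical recurrence, for illustration only
}
dfr$x <- x                         # a single assignment back into the data frame
```

This keeps the data frame as the container while the per-element work happens on an ordinary vector.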
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>     Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
On March 16, 2014 11:57:33 AM PDT, "Göran Broström" <goran.brostrom at umu.se> wrote:
>I have always known that "matrices are faster than data frames", for
>instance this function:
>
>
>dumkoll <- function(n = 1000, df = TRUE){
>    dfr <- data.frame(x = rnorm(n), y = rnorm(n))
>    if (df){
>        for (i in 2:NROW(dfr)){
>            if (!(i %% 100)) cat("i = ", i, "\n")
>            dfr$x[i] <- dfr$x[i-1]
>        }
>    }else{
>        dm <- as.matrix(dfr)
>        for (i in 2:NROW(dm)){
>            if (!(i %% 100)) cat("i = ", i, "\n")
>            dm[i, 1] <- dm[i-1, 1]
>        }
>        dfr$x <- dm[, 1]
>    }
>}
>
>--------------------
> > system.time(dumkoll())
>
>    user  system elapsed
>   0.046   0.000   0.045
>
> > system.time(dumkoll(df = FALSE))
>
>    user  system elapsed
>   0.007   0.000   0.008
>----------------------
>
>OK, no big deal, but I stumbled over a data frame with one million
>records. Then, with df = TRUE,
>----------------------------
>     user    system   elapsed
>44677.141  1271.544 46016.754
>----------------------------
>This is around 12 hours.
>
>With df = FALSE, it took only six seconds! About 7500 times faster.
>
>I was really surprised by the huge difference, and I wonder if this is
>to be expected, or if it is some peculiarity with my installation: I'm
>running Ubuntu 13.10 on a MacBook Pro with 8 GB memory, R-3.0.3.
>
>Göran B.