[R] big speed difference in source btw. R 2.15.2 and R 3.0.2 ?
Heinz Tuechler
tuechler at gmx.at
Wed Oct 30 13:49:02 CET 2013
Everything was run on the same machine, in independent sessions. I did not
restart Windows. I also tried 32-bit R 3.0.2, and it seemed slightly
faster than the 64-bit build.
Using Process Explorer v15.23
(http://technet.microsoft.com/de-de/sysinternals/bb896653), my impression
was that R 3.0.2 manages memory differently from R 2.15.2. In R 2.15.2
the physical memory in use grows steadily while a big file is sourced,
whereas in R 3.0.2 it cycles between growing and shrinking.
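
The same observation could perhaps be checked from inside R rather than with Process Explorer. What follows is only a minimal sketch, assuming a much smaller data frame than in the original test and a temporary dump file; the names n, dump.file, timing and mem are illustrative and not part of the original example.

```r
# Sketch: time source() on a dumped data frame and report memory use via gc().
# n and dump.file are illustrative assumptions, not from the original test.
sample.vec <- c('source', 'causes', 'R', 'to', 'accept', 'its', 'input')
set.seed(37)
n <- 1e4
df0 <- data.frame(x = sample(sample.vec, n, replace = TRUE))

dump.file <- tempfile('testdump')
dump('df0', file = dump.file)

# Reset the "max used" statistics so they cover only the source() call.
invisible(gc(reset = TRUE))
timing <- system.time(source(dump.file, keep.source = FALSE))

# gc() returns a matrix; the "max used" column shows the peak since the reset.
mem <- gc()
print(timing)
print(mem)

unlink(dump.file)
```

Running this in each R version and comparing the "max used" column would show the peak memory, not the growth and shrinking over time; calling gcinfo(TRUE) before source() prints a line per garbage collection and may give a rough picture of that cycle.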
best,
Heinz
On 30.10.2013 13:28, Carl Witthoft wrote:
> Did you run the identical code on the identical machine, and did you verify
> there were no other tasks running which might have limited the RAM available
> to R? And equally important, did you run these tests in the reverse order
> (in case R was storing large objects from the first run, thus chewing up
> RAM)?
>
>
>
> Dear All,
>
> is it known that source works much faster in R 2.15.2 than in R 3.0.2 ?
> In the example below I observe e.g. for a data.frame with 10^7 rows the
> following timings:
>
> R version 2.15.2 Patched (2012-11-29 r61184)
> length: 1e+07
> user system elapsed
> 62.04 0.22 62.26
>
> R version 3.0.2 Patched (2013-10-27 r64116)
> length: 1e+07
> user system elapsed
> 388.63 176.42 566.41
>
> Is there a way to speed R version 3.0.2 up to the performance of R
> version 2.15.2?
>
> best regards,
>
> Heinz Tüchler
>
>
> example:
> sessionInfo()
> sample.vec <-
> c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
> 'named', 'file', 'or', 'URL', 'or', 'connection')
> dmp.size <- c(10^(1:7))
> set.seed(37)
>
> for(i in dmp.size) {
> df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
> dump('df0', file='testdump')
> cat('length:', i, '\n')
> print(system.time(source('testdump', keep.source = FALSE,
> encoding='')))
> }
>
> output for R version 2.15.2 Patched (2012-11-29 r61184):
>> sessionInfo()
> R version 2.15.2 Patched (2012-11-29 r61184)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252
> [3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
> [5] LC_TIME=German_Switzerland.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>> sample.vec <-
> + c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
> + 'named', 'file', 'or', 'URL', 'or', 'connection')
>> dmp.size <- c(10^(1:7))
>> set.seed(37)
>>
>> for(i in dmp.size) {
> + df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
> + dump('df0', file='testdump')
> + cat('length:', i, '\n')
> + print(system.time(source('testdump', keep.source = FALSE,
> + encoding='')))
> + }
> length: 10
> user system elapsed
> 0 0 0
> length: 100
> user system elapsed
> 0 0 0
> length: 1000
> user system elapsed
> 0 0 0
> length: 10000
> user system elapsed
> 0.02 0.00 0.01
> length: 1e+05
> user system elapsed
> 0.21 0.00 0.20
> length: 1e+06
> user system elapsed
> 4.47 0.04 4.51
> length: 1e+07
> user system elapsed
> 62.04 0.22 62.26
>>
>
>
> output for R version 3.0.2 Patched (2013-10-27 r64116):
>> sessionInfo()
> R version 3.0.2 Patched (2013-10-27 r64116)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252
> [3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
> [5] LC_TIME=German_Switzerland.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>> sample.vec <-
> + c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
> + 'named', 'file', 'or', 'URL', 'or', 'connection')
>> dmp.size <- c(10^(1:7))
>> set.seed(37)
>>
>> for(i in dmp.size) {
> + df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
> + dump('df0', file='testdump')
> + cat('length:', i, '\n')
> + print(system.time(source('testdump', keep.source = FALSE,
> + encoding='')))
> + }
> length: 10
> user system elapsed
> 0 0 0
> length: 100
> user system elapsed
> 0 0 0
> length: 1000
> user system elapsed
> 0 0 0
> length: 10000
> user system elapsed
> 0.01 0.00 0.01
> length: 1e+05
> user system elapsed
> 0.36 0.06 0.42
> length: 1e+06
> user system elapsed
> 6.02 1.86 7.88
> length: 1e+07
> user system elapsed
> 388.63 176.42 566.41
>>
>
>