[R] big speed difference in source() between R 2.15.2 and R 3.0.2?
Carl Witthoft
carl at witthoft.com
Wed Oct 30 13:28:47 CET 2013
Did you run the identical code on the identical machine, and did you verify
there were no other tasks running which might have limited the RAM available
to R? And equally important, did you run these tests in the reverse order
(in case R was storing large objects from the first run, thus chewing up
RAM)?
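A minimal sketch of how such a controlled run could look, offered as an illustration rather than as the procedure anyone in this thread actually used. The helper name time_source is hypothetical, and a small stand-in data frame is dumped to a temporary file so the snippet is self-contained. It forces a garbage collection before timing, evaluates into a scratch environment so that nothing from an earlier run is kept, and reports the peak memory figure that gc() tracks.

```r
# Hypothetical helper (not part of base R): time source() on a dump file
# under more controlled conditions.
time_source <- function(file) {
  env <- new.env()                 # scratch environment for the sourced object
  invisible(gc(reset = TRUE))      # collect garbage and reset the "max used" statistics
  timing <- system.time(
    source(file, local = env, keep.source = FALSE)
  )
  mem <- gc()                      # memory statistics after the run
  # The last column of the gc() matrix is "max used" in Mb (Ncells and Vcells rows).
  list(timing = timing, max_used_mb = sum(mem[, ncol(mem)]))
}

# Small stand-in for the 'testdump' file from the original example.
df0 <- data.frame(x = sample(c("a", "b", "c"), 1000, replace = TRUE))
tmp <- tempfile("testdump")
dump("df0", file = tmp)

res <- time_source(tmp)
print(res$timing)
print(res$max_used_mb)
```

For the cleanest comparison, each size could be timed in a fresh session (for example via Rscript --vanilla) under both R versions, once in each order.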
Dear All,
Is it known that source() works much faster in R 2.15.2 than in R 3.0.2?
In the example below I observe, e.g., for a data.frame with 10^7 rows, the
following timings:
R version 2.15.2 Patched (2012-11-29 r61184)
length: 1e+07
user system elapsed
62.04 0.22 62.26
R version 3.0.2 Patched (2013-10-27 r64116)
length: 1e+07
user system elapsed
388.63 176.42 566.41
Is there a way to speed R version 3.0.2 up to the performance of R
version 2.15.2?
best regards,
Heinz Tüchler
example:
sessionInfo()
sample.vec <-
  c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
    'named', 'file', 'or', 'URL', 'or', 'connection')
dmp.size <- c(10^(1:7))
set.seed(37)
for(i in dmp.size) {
  df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
  dump('df0', file='testdump')
  cat('length:', i, '\n')
  print(system.time(source('testdump', keep.source = FALSE,
                           encoding='')))
}
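Since source() both parses the file and evaluates the resulting expressions, it may help to time the two steps separately to see where the extra time goes. The following is only a sketch under that assumption, not a diagnosis. It uses a smaller data frame and a temporary file (the names dump_file, exprs and env are illustrative), and it sets options(keep.source = FALSE) instead of passing an argument to parse() so that the same code can run under both R versions.

```r
# Mirror keep.source = FALSE from the original source() call.
options(keep.source = FALSE)

set.seed(37)
df0 <- data.frame(x = sample(c("source", "causes", "R"), 1e4, replace = TRUE))
dump_file <- tempfile("testdump")   # stand-in for 'testdump'
dump("df0", file = dump_file)

# Step 1: parse the dump file without evaluating anything.
t_parse <- system.time(exprs <- parse(file = dump_file))

# Step 2: evaluate the parsed expressions in a scratch environment.
env <- new.env()
t_eval <- system.time(for (e in exprs) eval(e, envir = env))

cat("parse:\n"); print(t_parse)
cat("eval:\n");  print(t_eval)
```

If most of the time at 10^7 rows falls into the parse step under one version but not the other, that would point at the parser rather than at evaluation.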
output for R version 2.15.2 Patched (2012-11-29 r61184):
> sessionInfo()
R version 2.15.2 Patched (2012-11-29 r61184)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
> sample.vec <-
+ c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
+ 'named', 'file', 'or', 'URL', 'or', 'connection')
> dmp.size <- c(10^(1:7))
> set.seed(37)
>
> for(i in dmp.size) {
+ df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
+ dump('df0', file='testdump')
+ cat('length:', i, '\n')
+ print(system.time(source('testdump', keep.source = FALSE,
+ encoding='')))
+ }
length: 10
user system elapsed
0 0 0
length: 100
user system elapsed
0 0 0
length: 1000
user system elapsed
0 0 0
length: 10000
user system elapsed
0.02 0.00 0.01
length: 1e+05
user system elapsed
0.21 0.00 0.20
length: 1e+06
user system elapsed
4.47 0.04 4.51
length: 1e+07
user system elapsed
62.04 0.22 62.26
>
output for R version 3.0.2 Patched (2013-10-27 r64116):
> sessionInfo()
R version 3.0.2 Patched (2013-10-27 r64116)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
> sample.vec <-
+ c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
+ 'named', 'file', 'or', 'URL', 'or', 'connection')
> dmp.size <- c(10^(1:7))
> set.seed(37)
>
> for(i in dmp.size) {
+ df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
+ dump('df0', file='testdump')
+ cat('length:', i, '\n')
+ print(system.time(source('testdump', keep.source = FALSE,
+ encoding='')))
+ }
length: 10
user system elapsed
0 0 0
length: 100
user system elapsed
0 0 0
length: 1000
user system elapsed
0 0 0
length: 10000
user system elapsed
0.01 0.00 0.01
length: 1e+05
user system elapsed
0.36 0.06 0.42
length: 1e+06
user system elapsed
6.02 1.86 7.88
length: 1e+07
user system elapsed
388.63 176.42 566.41
>
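To put the two transcripts side by side, the reported timings for the three largest sizes can be turned into ratios. The numbers below are copied from the output above; the column names are illustrative. The snippet computes the elapsed-time slowdown of R 3.0.2 relative to R 2.15.2 and the share of elapsed time that R 3.0.2 reports as system time.

```r
# Timings copied from the two transcripts (seconds).
timings <- data.frame(
  rows        = c(1e5, 1e6, 1e7),
  elapsed_215 = c(0.20, 4.51, 62.26),
  elapsed_302 = c(0.42, 7.88, 566.41),
  system_215  = c(0.00, 0.04, 0.22),
  system_302  = c(0.06, 1.86, 176.42)
)

# Elapsed-time ratio, R 3.0.2 relative to R 2.15.2.
timings$slowdown <- round(timings$elapsed_302 / timings$elapsed_215, 1)

# Fraction of elapsed time spent in system calls under R 3.0.2.
timings$system_share_302 <- round(timings$system_302 / timings$elapsed_302, 2)

print(timings)
```

The slowdown is roughly a factor of 2 at 10^5 and 10^6 rows but about 9 at 10^7 rows, and system time accounts for about 31% of the elapsed time there. That pattern is at least consistent with the memory concern raised in the reply at the top, although the timings alone do not establish the cause.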