[R] big speed difference in source btw. R 2.15.2 and R 3.0.2 ?
Heinz Tuechler
tuechler at gmx.at
Wed Oct 30 01:00:24 CET 2013
Dear All,
is it known that source() runs much faster in R 2.15.2 than in R 3.0.2?
With the example below I observe, e.g. for a data.frame with 10^7 rows, the
following timings:
R version 2.15.2 Patched (2012-11-29 r61184)
length: 1e+07
   user  system elapsed
  62.04    0.22   62.26
R version 3.0.2 Patched (2013-10-27 r64116)
length: 1e+07
   user  system elapsed
 388.63  176.42  566.41
Is there a way to bring R version 3.0.2 up to the speed of R version
2.15.2?
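If the aim is only to move a data frame from one session to another, one possible workaround is a binary round trip with saveRDS() and readRDS(), which unserializes the object instead of parsing R source text. That aim is an assumption about the use case, and this has not been timed on either of the versions above. A minimal sketch, in which the small stand-in data frame and the temporary file name are illustrative only:

```r
# Small stand-in for the df0 built in the example (illustrative size only).
set.seed(37)
df0 <- data.frame(x = sample(c('source', 'causes', 'R'), 1e4, replace = TRUE))

rds.file <- tempfile()          # illustrative file name
saveRDS(df0, file = rds.file)   # binary serialization, no R source text written

# readRDS() unserializes the object directly, so the parser is not involved.
print(system.time(df1 <- readRDS(rds.file)))

# The object read back should be the same as the one written.
stopifnot(identical(df0, df1))
```

This does not explain the difference in source() itself; it only sidesteps it where a text dump is not required.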
best regards,
Heinz Tüchler
example:
sessionInfo()
sample.vec <-
    c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
      'named', 'file', 'or', 'URL', 'or', 'connection')
dmp.size <- c(10^(1:7))
set.seed(37)
for(i in dmp.size) {
    df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
    dump('df0', file='testdump')
    cat('length:', i, '\n')
    print(system.time(source('testdump', keep.source = FALSE,
                             encoding='')))
}
output for R version 2.15.2 Patched (2012-11-29 r61184):
> sessionInfo()
R version 2.15.2 Patched (2012-11-29 r61184)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
> sample.vec <-
+ c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
+ 'named', 'file', 'or', 'URL', 'or', 'connection')
> dmp.size <- c(10^(1:7))
> set.seed(37)
>
> for(i in dmp.size) {
+ df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
+ dump('df0', file='testdump')
+ cat('length:', i, '\n')
+ print(system.time(source('testdump', keep.source = FALSE,
+ encoding='')))
+ }
length: 10
   user  system elapsed
      0       0       0
length: 100
   user  system elapsed
      0       0       0
length: 1000
   user  system elapsed
      0       0       0
length: 10000
   user  system elapsed
   0.02    0.00    0.01
length: 1e+05
   user  system elapsed
   0.21    0.00    0.20
length: 1e+06
   user  system elapsed
   4.47    0.04    4.51
length: 1e+07
   user  system elapsed
  62.04    0.22   62.26
>
output for R version 3.0.2 Patched (2013-10-27 r64116):
> sessionInfo()
R version 3.0.2 Patched (2013-10-27 r64116)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
> sample.vec <-
+ c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
+ 'named', 'file', 'or', 'URL', 'or', 'connection')
> dmp.size <- c(10^(1:7))
> set.seed(37)
>
> for(i in dmp.size) {
+ df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
+ dump('df0', file='testdump')
+ cat('length:', i, '\n')
+ print(system.time(source('testdump', keep.source = FALSE,
+ encoding='')))
+ }
length: 10
   user  system elapsed
      0       0       0
length: 100
   user  system elapsed
      0       0       0
length: 1000
   user  system elapsed
      0       0       0
length: 10000
   user  system elapsed
   0.01    0.00    0.01
length: 1e+05
   user  system elapsed
   0.36    0.06    0.42
length: 1e+06
   user  system elapsed
   6.02    1.86    7.88
length: 1e+07
   user  system elapsed
 388.63  176.42  566.41
>
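Since source() first parses the file and then evaluates the parsed expressions, it might help to time the two steps separately to see which one accounts for the difference. The following is only a sketch of how that could be done and was not part of the runs above; the reduced data size, the temporary file and the separate environment are assumptions made for illustration:

```r
# Turn off source references globally, as keep.source = FALSE does in source().
old.opt <- options(keep.source = FALSE)

sample.vec <- c('source', 'causes', 'R', 'to', 'accept')
set.seed(37)
df0 <- data.frame(x = sample(sample.vec, 1e4, replace = TRUE))

dump.file <- tempfile()          # illustrative file name
dump('df0', file = dump.file)

# Step 1: parse only; builds the expression vector without evaluating it.
parse.time <- system.time(exprs <- parse(file = dump.file))

# Step 2: evaluate the parsed expressions in a separate environment,
# so the original df0 is not overwritten.
env <- new.env()
eval.time <- system.time(for (e in exprs) eval(e, envir = env))

cat('parse:\n'); print(parse.time)
cat('eval:\n');  print(eval.time)

options(old.opt)                 # restore the previous keep.source setting
```

Running this in both versions with larger sizes would show whether the extra time is spent in parse() or in evaluating the large structure() call written by dump().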
--
Heinz Tüchler +4317146261 / +436605653878