[Rd] 10x slower merge in mac 2.9.1 vs. 2.9.0 (PR#13890)
Simon Urbanek
simon.urbanek at r-project.org
Thu Aug 13 17:52:09 CEST 2009
Rick,
I'm sorry, but I cannot reproduce it. You didn't supply sessionInfo()
and the actual data, so all I can do is guess, but according to your
description this test case shows no difference:
set.seed(1)
n=10000
d1=data.frame(seqn=as.integer(runif(n)*n),a=rnorm(n),b=rnorm(n),c=rnorm(n),
              d=rnorm(n),e=rnorm(n),f=rnorm(n),g=rnorm(n),h=rnorm(n),i=rnorm(n))
d2=data.frame(seqn=as.integer(runif(n)*n),a=rnorm(n),b=rnorm(n),c=rnorm(n),
              d=rnorm(n),e=rnorm(n),f=rnorm(n),g=rnorm(n),h=rnorm(n),i=rnorm(n))
system.time(merge(d1,d2,by="seqn",all.x=TRUE))
R 2.9.1:
> system.time(merge(d1,d2,by="seqn",all.x=TRUE))
   user  system elapsed
  0.150   0.067   0.217
R 2.9.0:
> system.time(merge(d1,d2,by="seqn",all.x=TRUE))
   user  system elapsed
  0.148   0.068   0.216
To substantiate your claim, please provide a reproducible example as
well as sessionInfo() [and details on how you run it - GUI, CLI, ...],
but I suspect the difference may be in your data, not R.
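
One way to collect that information is sketched below. The helper name
diagnose_merge is made up for illustration and is not part of R; it assumes
the key column is called "seqn" as in the report, and the two small data
frames only stand in for the real data. It reports the class of the key in
each data frame, the number of duplicated keys, the row counts before and
after the merge, and the elapsed time.

```r
# Hypothetical helper (not part of base R): summarise the merge key in both
# data frames and time the same kind of merge as in the report.
diagnose_merge <- function(x, y, key = "seqn") {
  kx <- x[[key]]
  ky <- y[[key]]
  timing <- system.time(res <- merge(x, y, by = key, all.x = TRUE))
  list(
    key_class = c(x = class(kx)[1], y = class(ky)[1]),          # e.g. integer vs. factor
    dup_keys  = c(x = sum(duplicated(kx)), y = sum(duplicated(ky))),
    rows_in   = c(x = nrow(x), y = nrow(y)),
    rows_out  = nrow(res),                                      # grows when keys repeat
    elapsed   = timing[["elapsed"]]
  )
}

# Stand-in data (an assumption; substitute the real data frames here).
set.seed(1)
n <- 1000
d1 <- data.frame(seqn = seq_len(n), a = rnorm(n))
d2 <- data.frame(seqn = seq_len(n), b = rnorm(n))

info <- diagnose_merge(d1, d2)
print(info)

# Version, platform and attached packages, as requested above.
print(sessionInfo())
```

If the key columns differ in class between the two data frames, or rows_out
is much larger than rows_in, that would point at the data rather than at the
R version.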
Thanks,
Simon
On Aug 12, 2009, at 12:25 , richard_stahlhut at urmc.rochester.edu wrote:
> Full_Name: Rick Stahlhut
> Version: 2.9.1
> OS: os x 10.5.7
> Submission from: (NULL) (128.151.71.23)
>
>
> I upgraded to 2.9.1 today from 2.9.0. I work with large CDC (center for
> disease control) datasets and start, frequently, with a series of 23
> large-ish merges to create the final dataset I work on. I do this each
> time because (a) R is fast. why not? and b) the datasets occasionally
> get updated by CDC and it's easier to swap in new files that way.
>
> One such merge is two data.frames with 10 variables and 10,000 rows each.
> The command in question is:
>
> temp = merge (demo.2,ph,by="seqn",all.x=TRUE)
>
> in 2.9.0, this command took 3.3 seconds.
> in 2.9.1, it took 35.8 seconds.
>
> I have reverted back to 2.9.0.
>
> Additional packages loaded are:
>
> library(Hmisc)
> library(alr3)
> library(epicalc)
> library(ggplot2)
> library(lattice)
> library(reshape)
> library(survey)
> library(car)
>
> thanks very much for all the effort. R is wonderful.
>