[R] scatterplot of 100000 points and pdf file format
Liaw, Andy
andy_liaw at merck.com
Thu Nov 25 03:45:27 CET 2004
> From: Ted.Harding at nessie.mcc.ac.uk
>
> On 25-Nov-04 Ted Harding wrote:
> > 'unique' will eat x for breakfast, indeed, but will have some
> > trouble chewing (x,y).
> >
> > I still can't think of a neat way of doing that.
> >
> > Best wishes,
> > Ted.
>
> Sorry, I don't want to be misunderstood.
> I didn't mean that 'unique' won't work for arrays.
> What I meant was:
>
> > X<-round(rnorm(1e6),3);Y<-round(rnorm(1e6),3)
> > system.time(unique(X))
> [1] 0.74 0.07 0.81 0.00 0.00
> > system.time(unique(cbind(X,Y)))
> [1] 350.81 4.56 356.54 0.00 0.00
Do you know if the majority of that time is spent in unique() itself? If so,
which method? What I see is:
> X<-round(rnorm(1e6),3);Y<-round(rnorm(1e6),3)
> system.time(unique(X), gcFirst=TRUE)
[1] 0.25 0.01 0.26 NA NA
> system.time(unique(cbind(X,Y)), gcFirst=TRUE)
[1] 101.80 0.34 104.61 NA NA
> system.time(dat <- data.frame(x=X, y=Y), gcFirst=TRUE)
[1] 10.17 0.00 10.24 NA NA
> system.time(unique(dat), gcFirst=TRUE)
[1] 23.94 0.11 24.15 NA NA
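
For what it's worth, the data-frame route can be sketched on a smaller sample so that it runs quickly. This is only an illustration: the sample size `n`, the seed and the rounding to 1 d.p. are arbitrary choices of mine, and since timings depend on the machine and the R version, none are shown.

```r
# Minimal sketch: de-duplicate (x, y) pairs via a data frame rather than a matrix.
# n, the seed and the rounding are arbitrary illustrative choices.
set.seed(1)
n <- 1e4
X <- round(rnorm(n), 1)
Y <- round(rnorm(n), 1)

dat  <- data.frame(x = X, y = Y)
keep <- !duplicated(dat)   # TRUE at the first occurrence of each (x, y) row
udat <- dat[keep, ]        # the same rows that unique(dat) keeps

# The matrix method should find the same number of distinct pairs.
n_df  <- nrow(udat)
n_mat <- nrow(unique(cbind(X, Y)))
```

Using `duplicated()` directly also gives a logical index, which can be applied to any other columns that go with the points.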
Andy
> However, still rounding to 3 d.p. we can try packing:
>
> > Z<-100000000*X + 1000*Y
> > system.time(W<-unique(Z))
> [1] 0.83 0.05 0.88 0.00 0.00
> > length(W)
> [1] 961523
>
> Though the runtime is small we don't get much reduction
> and still W has to be unpacked.
>
> With rounding to 2 d.p.
>
> > X<-round(rnorm(1e6),2);Y<-round(rnorm(1e6),2)
> > Z<-100000000*X + 1000*Y
> > system.time(W<-unique(Z))
> [1] 1.31 0.01 1.32 0.00 0.00
> > length(W)
> [1] 209882
>
> so now it's about 1/5, but visible discretisation must be
> getting close.
>
> With 1 d.p.
>
> > X<-round(rnorm(1e6),1);Y<-round(rnorm(1e6),1)
> > Z<-100000000*X + 1000*Y
> > system.time(W<-unique(Z))
> [1] 0.92 0.01 0.93 0.00 0.00
> > length(W)
> [1] 4953
>
> there's a good reduction (about 1/200) but the discretisation
> would definitely now be visible. However, as I suggested before,
> there's an issue of choice of constant (i.e. of the resolution
> of the discretisation so that there's a useful reduction and
> also the plot is acceptable).
>
> I'd still like to learn of a method which avoids the
> above method of packing, which strikes me as clumsy
> (but maybe it's the best way after all).
>
> Ted.
>
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
> Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
> Date: 25-Nov-04 Time: 01:45:48
> ------------------------------ XFMail ------------------------------
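
One way to sidestep the unpacking step, offered as a rough sketch rather than a definitive answer: build the packed key only to find duplicates, then index the original grid values with `!duplicated(key)`. The function name `thin_points` and its arguments are hypothetical, and the sketch assumes finite inputs with no NAs.

```r
# Sketch: keep one point per grid cell without ever unpacking a key.
# thin_points() and its arguments are hypothetical names for illustration.
thin_points <- function(x, y, digits = 2) {
  scale <- 10^digits
  ix <- round(x * scale)     # whole-number grid index for x
  iy <- round(y * scale)     # whole-number grid index for y
  # Shift iy to start at 0 and give ix a stride as wide as iy's span,
  # so each (ix, iy) pair maps to exactly one key.
  off    <- min(iy)
  stride <- max(iy) - off + 1
  key  <- ix * stride + (iy - off)
  keep <- !duplicated(key)   # first occurrence of each key
  data.frame(x = ix[keep] / scale, y = iy[keep] / scale)
}

set.seed(1)                  # arbitrary seed
pts <- thin_points(rnorm(1e5), rnorm(1e5), digits = 1)
# plot(pts$x, pts$y, pch = ".")   # far fewer points are sent to the pdf device
```

Computing the stride from the data, instead of hard-coding constants such as 100000000 and 1000, keeps the key unambiguous whatever the range of y. The choice of `digits` remains the trade-off discussed above between reduction and visible discretisation.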