set functions
Prof Brian D Ripley
ripley@stats.ox.ac.uk
Wed, 5 Jan 2000 10:21:24 +0000 (GMT)
On Wed, 5 Jan 2000, Martin Maechler wrote:
> On 4 Jan 2000, Peter Dalgaard BSA wrote:
> > Watch:
> >
> > > x<-1:50000
> > > y<-x[order(runif(50000))]
> > > "equiv2" <- function(x, y) all(c(match(x, y, 0)>0, match(y, x, 0)>0))
> > > equiv<-function(x,y)
> > + length(x<-unique(x))==length(y<-unique(y)) &&
> > + all(sort(x)==sort(y))
> > > system.time(equiv2(x,y))
> > [1] 3.10 0.02 3.00 0.00 0.00
> > > system.time(equiv(x,y))
> > [1] 0.77 0.00 1.00 0.00 0.00
>
> JonR> Yup -- that's much quicker! To re-ask the original question,
> JonR> would it be reasonable to include such a function along with the
> JonR> other set functions? Cheers, Jonathan.
>
> Quite a good idea, particularly since we have all now learned that it is
> non-trivial to write one really efficiently.
Some of us knew that. What worries me a bit is that optimizing code for the
current R may not be a good idea. R currently spends a lot of its
time on garbage collection (30 to 50% in my profiling), and it is planned to
alter the memory allocator real soon now. When hashing of environments
was introduced, it made a lot of difference to some code and little to
other code. That's not to say that we should not optimize, but
trying hard may be a waste of time. (Says he, having learnt the hard way
across S-PLUS versions.)
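
One rough way to see how much of a timing is garbage collection rather than
the algorithm itself is sketched below. The two set-equality functions are
the ones from the quoted session, lightly reformatted. time_with_gc is a
hypothetical helper, not part of R or of this thread, and it assumes that
gc.time() reports cumulative collector time, as it does in current R. The
proportions will differ between R versions and machines, so treat any
numbers it gives as indicative only.

```r
# Two candidate set-equality tests, taken from the quoted session.
equiv2 <- function(x, y) all(c(match(x, y, 0) > 0, match(y, x, 0) > 0))

equiv <- function(x, y) {
  x <- unique(x)
  y <- unique(y)
  length(x) == length(y) && all(sort(x) == sort(y))
}

# Hypothetical helper (an assumption, not from the thread): elapsed time
# for an expression and the elapsed time the garbage collector used
# while that expression was being evaluated.
time_with_gc <- function(expr) {
  gc.time(TRUE)                 # make sure collector timing is recorded
  invisible(gc())               # start from a freshly collected heap
  gc_before <- gc.time()
  # expr is still an unevaluated promise here; it is forced inside
  # system.time(), so the work is what gets timed.
  total <- system.time(expr, gcFirst = FALSE)
  gc_spent <- gc.time() - gc_before
  c(elapsed = total[["elapsed"]], gc_elapsed = gc_spent[[3]])
}

# Usage, with the same data as in the quoted session.
x <- 1:50000
y <- x[order(runif(50000))]
print(time_with_gc(equiv2(x, y)))
print(time_with_gc(equiv(x, y)))
```

If gc_elapsed is a large share of elapsed, a change to the allocator could
move the result by more than any rewrite of the function would.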
Brian
--
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595