[R] effective way to return only the first argument of "which()"
Milan Bouchet-Valat
nalimilan at club.fr
Wed Sep 19 17:55:11 CEST 2012
On Wednesday, 19 September 2012 at 15:23 +0000, William Dunlap wrote:
> The original method is faster than which.max for longish numeric vectors
> (in R-2.15.1), but you should check time and memory usage on your
> own machine:
>
> > x <- runif(18e6)
> > system.time(for(i in 1:100)which(x>0.99)[1])
>    user  system elapsed
>   11.64    1.05   12.70
> > system.time(for(i in 1:100)which.max(x>0.99))
>    user  system elapsed
>   16.38    2.94   19.35
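One caveat before comparing the two: they are not interchangeable when nothing matches. A minimal sketch (the small vector v is made up purely for illustration):

```r
# Hypothetical small vector, used only for illustration
v <- c(0.2, 0.5, 0.1)

# A match exists: both approaches return the same index
a <- which(v > 0.4)[1]
b <- which.max(v > 0.4)

# No match: which() returns an empty vector, so [1] yields NA,
# while which.max() returns the position of the first FALSE, i.e. 1
c1 <- which(v > 0.9)[1]
c2 <- which.max(v > 0.9)

print(c(a, b, c1, c2))
```

So a result of 1 from which.max() has to be checked against the condition before it can be trusted.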
If you know that such an element is likely to appear near the beginning of
the vector, a custom hack might well be more efficient. The problem with
">", which() and which.max() is that they go over all the elements
of the vector even when that is not needed. So you can start with a
small subset of the vector, and increase its size in a few steps until
you find the value you're looking for.
Proof of concept (the values of n obviously need to be adapted):
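x <- runif(1e7)
# Return the index of the first element of x greater than lim, or NULL.
find <- function(x, lim) {
    len <- length(x)
    # Search prefixes of growing size: len/2^14, len/2^13, ..., len
    for(n in 2^(14:0)) {
        val <- which(x[seq_len(ceiling(len/n))] > lim)
        if(length(val) > 0) return(val[1])
    }
    return(NULL)
}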
> system.time(for(i in 1:100)which(x>0.999)[1])
   user  system elapsed
  9.740   5.795  15.890
> system.time(for(i in 1:100)which.max(x>0.999))
   user  system elapsed
 14.288   9.510  24.562
> system.time(for(i in 1:100)find(x, .999))
   user  system elapsed
  0.017   0.002   0.019
> find(x, .999)
[1] 1376
(Looks almost like cheating... ;-)
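To check that the shortcut really gives the same answer, here is a minimal sketch of the same early-exit idea. The name first_above and the starting prefix length of 1024 are my own choices for illustration, not anything provided by base R. Unlike the proof of concept above, it doubles the prefix length at each step and returns NA when nothing matches, so that it mirrors which(x > lim)[1].

```r
# Hypothetical helper: index of the first element of x greater than lim,
# scanning prefixes of doubling length so that an early match is cheap.
first_above <- function(x, lim, start = 1024L) {
  len <- length(x)
  n <- min(start, len)
  while (n > 0L) {
    hit <- which(x[seq_len(n)] > lim)
    if (length(hit) > 0L) return(hit[1L])
    if (n >= len) break          # whole vector scanned, nothing found
    n <- min(2 * n, len)         # double the prefix, capped at len
  }
  NA_integer_                    # same as which(x > lim)[1] with no match
}

set.seed(1)                      # arbitrary seed, for reproducibility only
xs <- runif(1e6)
stopifnot(identical(first_above(xs, 0.999), which(xs > 0.999)[1]))
stopifnot(is.na(first_above(xs, 2)))   # runif() never exceeds 1
```

Because each prefix is twice as long as the previous one, the worst case (no match at all) scans roughly twice the length of the vector in total, while an early match costs only a small prefix.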
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> > Of Jeff Newmiller
> > Sent: Wednesday, September 19, 2012 8:06 AM
> > To: Mike Spam; r-help at r-project.org
> > Subject: Re: [R] effective way to return only the first argument of "which()"
> >
> > ?which.max
> > ---------------------------------------------------------------------------
> > Jeff Newmiller The ..... ..... Go Live...
> > DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
> > Live: OO#.. Dead: OO#.. Playing
> > Research Engineer (Solar/Batteries O.O#. #.O#. with
> > /Software/Embedded Controllers) .OO#. .OO#. rocks...1k
> > ---------------------------------------------------------------------------
> > Sent from my phone. Please excuse my brevity.
> >
> > Mike Spam <ichmagspam at googlemail.com> wrote:
> >
> > >Hi,
> > >
> > >I was looking for a function like "which()" but only returns the first
> > >argument.
> > >Compare:
> > >
> > >x <- c(1,2,3,4,5,6)
> > >y <- 4
> > >which(x>y)
> > >
> > >returns:
> > >5,6
> > >
> > >which(x>y)[1]
> > >returns:
> > >5
> > >
> > >which(x>y)[1] is exactly what i need. I did use this but the dataset
> > >is too big (~18 mio. Points).
> > >That's why i need a more effective way to get the first element of a
> > >vector which is bigger/smaller than a specific number.
> > >
> > >I found "match()" but this function only works for equal numbers.
> > >
> > >
> > >
> > >Thanks,
> > >Nico
> > >
> > >______________________________________________
> > >R-help at r-project.org mailing list
> > >https://stat.ethz.ch/mailman/listinfo/r-help
> > >PLEASE do read the posting guide
> > >http://www.R-project.org/posting-guide.html
> > >and provide commented, minimal, self-contained, reproducible code.