[Rd] Match .3 in a sequence
William Dunlap
wdunlap at tibco.com
Wed Mar 18 19:49:59 CET 2009
> -----Original Message-----
> From: r-devel-bounces at r-project.org
> [mailto:r-devel-bounces at r-project.org] On Behalf Of Duncan Murdoch
> Sent: Tuesday, March 17, 2009 12:15 PM
> To: Daniel Murphy
> Cc: r-devel at r-project.org
> Subject: Re: [Rd] Match .3 in a sequence
>
> On 3/17/2009 11:26 AM, Daniel Murphy wrote:
> > Is this a reasonably fast way to do an approximate match of
> a vector x to
> > values in a list?
> >
> > match.approx <- function(x,list,tol=.0001)
> > sapply(apply(abs(outer(list,x,"-"))<tol,2,which),"[",1)
>
> If you are willing to assume that the list values are all
> multiples of
> 2*tol, then it's easy: just divide both x and list by 2*tol,
> round to
> nearest integer, and use the regular match function.
>
> If not, it becomes harder; I'd probably use a solution like yours.
>
> Duncan Murdoch
Here are 2 other implentations of that match.approx function
which use much less memory (and are faster) when the length
of 'x' and 'list' are long (>100, say). The first uses
approx(method="const") to figure out which entries in the
list are just below and above each entry in x and the second
uses sorting tricks to do the same thing. Then you only have
to figure out if the closest of those 2 entries is close enough.
The original one above fails when tol>min(diff(sort(list))).
match.approx2 <-
function(x,list,tol=.0001) {
o1 <- rep.int(c(FALSE,TRUE),
c(length(x),length(list)))[order(c(x,list))]
o2 <- rep.int(c(FALSE,TRUE),
c(length(x),length(list)))[order(c(x,list))]
below <- approx(list, list, xout=x, method="constant", f=0)$y
above <- approx(list, list, xout=x, method="constant", f=1)$y
stopifnot(all(below<=x, na.rm=TRUE), all(above>=x, na.rm=TRUE))
closestInList <- ifelse(x-below < above-x, below, above)
closestInList[x<min(list)] <- min(list)
closestInList[x>max(list)] <- max(list)
closestInList[abs(x-closestInList)>tol] <- NA
match(closestInList, list)
}
match.approx3 <-
function(x, list, tol=.0001){
stopifnot(length(list)>0, !any(is.na(x)), !any(is.na(list)))
oox <- order(order(x)) # essentially rank(x)
i <- rep(c(FALSE,TRUE), c(length(x),length(list)))[order(c(x,
list))]
i <- cumsum(i)[!i] + 1L
i[i > length(list)] <- NA
i <- order(list)[i]
leastUpperBound <- i[oox]
i <- rep(c(TRUE,FALSE), c(length(list),length(x)))[order(c(list,
x))]
i <- cumsum(i)[!i]
i[i < 1L] <- NA
i <- order(list)[i]
greatestLowerBound <- i[oox]
closestInList <-
ifelse(is.na(greatestLowerBound),
leastUpperBound, # above max(list)
ifelse(is.na(leastUpperBound),
greatestLowerBound, # below min(list)
ifelse(x-list[greatestLowerBound]<list[leastUpperBound]-x,
greatestLowerBound,
leastUpperBound)))
if (tol<Inf)
closestInList[abs(x - list[closestInList])>tol] <- NA
closestInList
}
> >
> > Thanks.
> > -Dan
> >
> > On Mon, Mar 16, 2009 at 8:24 AM, Stavros Macrakis
> <macrakis at alum.mit.edu>wrote:
> >
> >> Well, first of all, seq(from=.2,to=.3) gives c(0.2), so I
> assume you
> >> really mean something like seq(from=.2,to=.3,by=.1), which gives
> >> c(0.2, 0.3).
> >>
> >> %in% tests for exact equality, which is almost never a
> good idea with
> >> floating-point numbers.
> >>
> >> You need to define what exactly you mean by "in" for floating-point
> >> numbers. What sort of tolerance are you willing to allow?
> >>
> >> Some possibilities would be for example:
> >>
> >> approxin <- function(x,list,tol) any(abs(list-x)<tol) # absolute
> >> tolerance
> >>
> >> rapproxin <- function(x,list,tol) (x==0 && 0 %in% list) ||
> >> any(abs((list-x)/x)<=tol,na.rm=TRUE)
> >> # relative tolerance; only exact 0 will match 0
> >>
> >> Hope this helps,
> >>
> >> -s
> >>
> >> On Mon, Mar 16, 2009 at 9:36 AM, Daniel Murphy
> <chiefmurphy at gmail.com>
> >> wrote:
> >> > Hello:I am trying to match the value 0.3 in the sequence
> seq(.2,.3). I
> >> get
> >> >> 0.3 %in% seq(from=.2,to=.3)
> >> > [1] FALSE
> >> > Yet
> >> >> 0.3 %in% c(.2,.3)
> >> > [1] TRUE
> >> > For arbitrary sequences, this "invisible .3" has been
> problematic. What
> >> is
> >> > the best way to work around this?
> >>
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
More information about the R-devel
mailing list