[R] cut2 not binning interval endpoints correctly

S Ellison S.Ellison at lgcgroup.com
Tue Nov 26 14:46:20 CET 2013


> -----Original Message-----
> > I am attempting to bin a vector of numbers between 0 and 1 into
> > intervals of 0.001 but many values at the endpoints of the intervals
> > are getting binned into the wrong interval. For example, the first 3
> > rows are binned incorrectly here:
>
> From: Jim Holtman
> FAQ 7.31
> 
Maybe. But 

#and
0.308 == seq(0, 0.310, 0.001)[309]
# [1] TRUE

seems to suggest that, while some oddities may be explained by finite precision, 0.308 compares exactly equal to the corresponding element of the cut sequence here, so 0.308 should be OK.
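A rough way to look at this for all of the example values, not just 0.308, is to print each value next to the break it is nearest to and test for exact equality. This is only a sketch in base R: the names x, breaks, idx and res are my own, and the step of 0.001 is taken from the OP's call. I have not listed the printed digits because they may differ by platform.

```r
# Values from the extended example and the break sequence used for binning
x      <- c(0.308, 0.422, 0.174, 0.04709)
breaks <- seq(0, 1, 0.001)

# Index of the break nearest to each value (assumes a step of 0.001)
idx <- round(x / 0.001) + 1

res <- data.frame(
  x       = sprintf("%.20f", x),            # value as stored, 20 decimals
  nearest = sprintf("%.20f", breaks[idx]),  # nearest break as stored
  equal   = x == breaks[idx],               # exact floating-point equality
  bin     = findInterval(x, breaks)         # left-closed bin index, base R
)
res
```

Where equal is FALSE, the value and the break differ in the last few binary digits, which is the FAQ 7.31 situation; where it is TRUE, finite precision alone does not explain a value landing in the interval below.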

# In addition, extending the OP's example
df <- data.frame(x=c(0.308,0.422,0.174,0.04709))
df$bucket <- cut2(df$x,seq(0,1,0.001),oneval=FALSE)
df$cutR <- cut(df$x,seq(0,1,0.001),right=FALSE)
df

#         x        bucket          cutR
# 1 0.30800 [0.307,0.308) [0.308,0.309)
# 2 0.42200 [0.421,0.422) [0.422,0.423)
# 3 0.17400 [0.173,0.174) [0.173,0.174)
# 4 0.04709 [0.047,0.048) [0.047,0.048)

implies that cut2 is not doing the same thing as cut: rows 1 and 2 land in different intervals even though both calls ask for the same left-closed bins (at least on R 3.0.1, my present version at work).
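As a cross-check that does not depend on Hmisc, the bin index from cut(..., right=FALSE) can be compared with findInterval(), which also uses left-closed intervals by default. The following is a sketch only; cut_idx, fi_idx and cut2_idx are names I have made up, and the cut2 part runs only if Hmisc happens to be installed.

```r
x      <- c(0.308, 0.422, 0.174, 0.04709)
breaks <- seq(0, 1, 0.001)

# Bin index according to base cut() with left-closed intervals
cut_idx <- as.integer(cut(x, breaks, right = FALSE))

# Bin index according to findInterval(), also left-closed by default
fi_idx <- findInterval(x, breaks)

# The two base R routes should agree for values inside [0, 1)
identical(cut_idx, fi_idx)

# Optional: compare against cut2 if Hmisc is available
if (requireNamespace("Hmisc", quietly = TRUE)) {
  cut2_idx <- as.integer(Hmisc::cut2(x, breaks, oneval = FALSE))
  print(data.frame(x = x, cut = cut_idx, findInterval = fi_idx, cut2 = cut2_idx))
}
```

If the two base R indices agree with each other but differ from the cut2 index, that would point at something cut2 does to the values or the cut points before binning rather than at the break sequence itself.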

This may be one for Frank Harrell ...

S Ellison


