[R] cut with floating point, a bug?
Duncan Murdoch
murdoch at stats.uwo.ca
Fri Jun 19 09:09:49 CEST 2009
Shawn Rutledge wrote:
> With floating point numbers I'm seeing 'cut' putting values in the wrong
> bands. An example below places 0.3 in (0.3,0.6] i.e. 0.3 > 0.3.
>
>
> > x = 1:5*.1
> > x
> [1] 0.1 0.2 0.3 0.4 0.5
>
> > cut(x, br=c(0,.3,.6))
> [1] (0,0.3]   (0,0.3]   (0.3,0.6] (0.3,0.6] (0.3,0.6]
> Levels: (0,0.3] (0.3,0.6]
>
> I'm sure this is probably the same issue documented in the FAQ (7.31 Why
> doesn't R think these numbers are equal?)
> http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
>
> [1] Is there a way to make cut work correctly (a code fix)?
>
It is working correctly. The third element of x is 3*0.1, which in double
precision is slightly bigger than 0.3, so it belongs in (0.3,0.6].
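A quick way to check this yourself, as a minimal sketch (the variable names are only illustrative): compare x[3] with the literal 0.3 directly, and print both with 17 significant digits so the difference becomes visible.

```r
# 1:5 * 0.1 multiplies each integer by the double nearest to 0.1,
# so x[3] is 3 * 0.1, not the double nearest to 0.3.
x <- 1:5 * 0.1

# Direct comparison against the break value used in cut()
third_vs_break <- x[3] > 0.3

# 17 significant digits are enough to tell two distinct doubles apart
digits17 <- sprintf("%.17g", c(x[3], 0.3))

# all.equal() compares with a tolerance, so it treats the two as equal
nearly_equal <- isTRUE(all.equal(x[3], 0.3))

print(third_vs_break)
print(digits17)
print(nearly_equal)
```

The default printing of x shows only 7 significant digits, which is why both values display as 0.3.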
> [2] Is there a workaround for using the current cut?
>
You could round both the values and the breaks to the same number of
decimal places (e.g. with round()) before calling cut.
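A minimal sketch of that workaround, assuming 8 decimal places is more precision than the data really carries (the value of `digits` and the variable names are my choice, not something cut requires):

```r
x <- 1:5 * 0.1
breaks <- c(0, 0.3, 0.6)

# Assumed precision: anything coarser than the rounding error and finer
# than the real resolution of the data will do.
digits <- 8

# Round data and breaks the same way so values that are "meant" to sit
# on a break compare as equal to it.
binned <- cut(round(x, digits), breaks = round(breaks, digits))

counts <- table(binned)
print(counts)
```

With the rounding in place, the third element falls on the break and is counted in (0,0.3], giving counts of 3 and 2 for this example.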
> [3] Why does 'hist' work correctly on the same data?
>
See ?hist. It applies a numerical tolerance when working on the edges
of bins.
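If you want cut to behave similarly, one option is to nudge the breaks by a small tolerance before binning. The sketch below is only an illustration of the idea, not hist's actual code; the tolerance `tol` and the helper names are assumptions of mine.

```r
x <- 1:5 * 0.1
breaks <- c(0, 0.3, 0.6)

# Assumed tolerance: tiny relative to the bin width, large relative to
# floating point rounding error.
tol <- 1e-7 * median(diff(breaks))

# Shift every break up slightly so values a hair above a break still
# land in the lower, right-closed bin.
fuzzy_breaks <- breaks + tol

# labels = FALSE returns integer bin codes; tabulate() keeps empty bins
codes <- cut(x, breaks = fuzzy_breaks, labels = FALSE)
fuzzy_counts <- tabulate(codes, nbins = length(breaks) - 1)

# For comparison: hist on the original breaks
hist_counts <- hist(x, breaks = breaks, plot = FALSE)$counts

print(fuzzy_counts)
print(hist_counts)
```

Note that shifting the lowest break up would exclude a value exactly equal to it; that does not matter here because the smallest value is 0.1.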
Duncan Murdoch
>
> > table(cut(x, br=c(0,.3,.6)))
>
>   (0,0.3] (0.3,0.6]
>         2         3
>
> > hist(x, br=c(0,.3,.6), plot=F)$counts
> [1] 3 2
>
>
> > sessionInfo()
> R version 2.9.0 (2009-04-17)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>