[Rd] Bug in %in% (match)
Tony Plate
tplate at blackmesacapital.com
Fri Apr 4 13:18:55 MEST 2003
You're hitting a limit of floating point number representation, not of
match. In floating point arithmetic you can sometimes get results like
x + 0.1 + 0.1 != x + 0.2, partly because most decimal fractions cannot be
represented exactly in binary floating point.
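As a small illustration of the same effect (my own sketch, not taken from
the examples in this thread; the variable names are arbitrary), the classic
case is 0.1 + 0.2 versus 0.3. Printing with 17 significant digits shows the
values that are actually stored, and all.equal() compares with a tolerance
instead of exactly:

```r
a <- 0.1 + 0.2
b <- 0.3

# Exact comparison fails because the two doubles differ in the last bit.
exact_equal <- (a == b)
print(exact_equal)                  # FALSE

# 17 significant digits are enough to distinguish any two doubles.
print(sprintf("%.17g", c(a, b)))    # "0.30000000000000004" "0.29999999999999999"

# all.equal() compares within a tolerance (about 1.5e-8 by default).
tol_equal <- isTRUE(all.equal(a, b))
print(tol_equal)                    # TRUE
```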
Consider:
> # start the superset at 93 -- works OK
> table(seq(100,125,by=0.2) %in% seq(93,125,by=0.1))
TRUE
126
>
> # starting the superset at 92 instead of 93
> table(seq(100,125,by=0.2) %in% seq(92,125,by=0.1))
FALSE  TRUE
    1   125
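To see which element fails to match and by how much, something along these
lines may help. This is only a sketch: the names sub, sup, hit, missed and
nearest are mine, and which elements (if any) turn up as unmatched can
depend on the platform and R version, so I give no expected output.

```r
sub <- seq(100, 125, by = 0.2)
sup <- seq(92, 125, by = 0.1)

hit    <- sub %in% sup     # exact matching, as in the example above
missed <- sub[!hit]        # subset values with no exact match

# For each unmatched value, find the closest element of the superset.
nearest <- vapply(missed,
                  function(v) sup[which.min(abs(sup - v))],
                  numeric(1))

# Print enough digits to show the stored values, then the gaps.
print(sprintf("%.17g", missed))
print(sprintf("%.17g", nearest))
print(missed - nearest)
```

The gaps should be many orders of magnitude smaller than the 0.1 step,
which is what makes this a representation issue and not a missing value.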
By way of further explanation:
> # round both sets to 1 decimal digit -- works OK
> table(round(seq(100,125,by=0.2),digits=1) %in%
+       round(seq(92,125,by=0.1),digits=1))
TRUE
126
> # see how many numbers in the by=0.1 sequence are not equal to their
> # rounded versions
> table(seq(92,125,by=0.1) == round(seq(92,125,by=0.1),digits=1))
FALSE  TRUE
    2   329
>
> # the guilty parties
> seq(92,125,by=0.1)[seq(92,125,by=0.1) !=
+     round(seq(92,125,by=0.1),digits=1)]
[1] 124.3 124.8
>
(Also, note that the differences between the rounded and non-rounded
sequences are far larger for sequences starting at 0).
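Besides rounding, two other ways around the problem are to match within a
tolerance or to work on an integer grid (here, in tenths) and do the
matching on integers, which are represented exactly. The following is a
minimal sketch under those assumptions; near_in is a hypothetical helper of
my own, not a base R function, and the tolerance of 1e-8 is an arbitrary
choice that is much larger than the rounding error and much smaller than
the 0.1 step.

```r
# Hypothetical helper (not part of base R): TRUE where x[i] lies within
# `tol` of at least one element of `table`.
near_in <- function(x, table, tol = 1e-8) {
  vapply(x, function(xi) any(abs(table - xi) <= tol), logical(1))
}

sub <- seq(100, 125, by = 0.2)
sup <- seq(0, 800, by = 0.1)

# Approach 1: tolerance-based membership on the floating point values.
tol_hits <- near_in(sub, sup)
print(table(tol_hits))

# Approach 2: represent both sets in integer tenths and match exactly.
sub_tenths <- seq(1000L, 1250L, by = 2L)   # 100.0, 100.2, ..., 125.0
sup_tenths <- 0L:8000L                     # 0.0, 0.1, ..., 800.0
int_hits   <- sub_tenths %in% sup_tenths
print(table(int_hits))
```

The tolerance version compares every element of x against the whole table,
so it is slow for long vectors; the integer version keeps the speed of
match and is probably the better choice when the data really do live on a
fixed decimal grid.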
At Friday 05:07 PM 4/4/2003 +0800, Nicholas Lewin-Koh wrote:
>Hi,
>Am I hitting some limit in match? Consider the following example:
>
> > tst<-seq(100,125,by=.2)%in%seq(0,800,by=.1)
> > sum(tst)
>[1] 76
> > seq(100,125,by=.2)
> [1] 100.0 100.2 100.4 100.6 100.8 101.0 101.2 101.4 101.6 101.8 102.0 102.2
> [13] 102.4 102.6 102.8 103.0 103.2 103.4 103.6 103.8 104.0 104.2 104.4 104.6
> [25] 104.8 105.0 105.2 105.4 105.6 105.8 106.0 106.2 106.4 106.6 106.8 107.0
> [37] 107.2 107.4 107.6 107.8 108.0 108.2 108.4 108.6 108.8 109.0 109.2 109.4
> [49] 109.6 109.8 110.0 110.2 110.4 110.6 110.8 111.0 111.2 111.4 111.6 111.8
> [61] 112.0 112.2 112.4 112.6 112.8 113.0 113.2 113.4 113.6 113.8 114.0 114.2
> [73] 114.4 114.6 114.8 115.0 115.2 115.4 115.6 115.8 116.0 116.2 116.4 116.6
> [85] 116.8 117.0 117.2 117.4 117.6 117.8 118.0 118.2 118.4 118.6 118.8 119.0
> [97] 119.2 119.4 119.6 119.8 120.0 120.2 120.4 120.6 120.8 121.0 121.2 121.4
>[109] 121.6 121.8 122.0 122.2 122.4 122.6 122.8 123.0 123.2 123.4 123.6 123.8
>[121] 124.0 124.2 124.4 124.6 124.8 125.0
> > seq(100,125,by=.2)[tst]
> [1] 100.0 100.2 100.4 101.0 101.2 101.4 102.0 102.2 102.4 103.0 103.2 103.4
>[13] 104.0 104.2 104.4 105.0 105.2 105.4 106.0 106.2 106.4 107.0 107.2 107.4
>[25] 108.0 108.2 108.4 109.0 109.2 109.4 110.0 110.2 110.4 111.0 111.2 111.4
>[37] 112.0 112.2 112.4 113.0 113.2 113.4 114.0 114.2 114.4 115.0 115.2 115.4
>[49] 116.0 116.2 116.4 117.0 117.2 117.4 118.0 118.2 118.4 119.0 119.2 119.4
>[61] 120.0 120.2 120.4 121.0 121.2 121.4 122.0 122.2 122.4 123.0 123.2 123.4
>[73] 124.0 124.2 124.4 125.0
>
>But
>tst<-seq(100,125,by=.2)%in%seq(100,125,by=.1)
> > sum(tst)
>[1] 126
> > seq(100,125,by=.2)[tst]
> [1] 100.0 100.2 100.4 100.6 100.8 101.0 101.2 101.4 101.6 101.8 102.0 102.2
> [13] 102.4 102.6 102.8 103.0 103.2 103.4 103.6 103.8 104.0 104.2 104.4 104.6
> [25] 104.8 105.0 105.2 105.4 105.6 105.8 106.0 106.2 106.4 106.6 106.8 107.0
> [37] 107.2 107.4 107.6 107.8 108.0 108.2 108.4 108.6 108.8 109.0 109.2 109.4
> [49] 109.6 109.8 110.0 110.2 110.4 110.6 110.8 111.0 111.2 111.4 111.6 111.8
> [61] 112.0 112.2 112.4 112.6 112.8 113.0 113.2 113.4 113.6 113.8 114.0 114.2
> [73] 114.4 114.6 114.8 115.0 115.2 115.4 115.6 115.8 116.0 116.2 116.4 116.6
> [85] 116.8 117.0 117.2 117.4 117.6 117.8 118.0 118.2 118.4 118.6 118.8 119.0
> [97] 119.2 119.4 119.6 119.8 120.0 120.2 120.4 120.6 120.8 121.0 121.2 121.4
>[109] 121.6 121.8 122.0 122.2 122.4 122.6 122.8 123.0 123.2 123.4 123.6 123.8
>[121] 124.0 124.2 124.4 124.6 124.8 125.0
>
>Gives the correct answer. Did I miss something?
>
>Nicholas
>---------------------------------
>pertinent R info
>
>platform i686-pc-linux-gnu
>arch i686
>os linux-gnu
>system i686, linux-gnu
>status
>major 1
>minor 6.2
>year 2003
>month 01
>day 10
>language R
>