[Rd] Bug in %in% (match)

Tony Plate tplate at blackmesacapital.com
Fri Apr 4 13:18:55 MEST 2003


You're hitting a limit of floating point number representation, not a limit 
in match. In floating point arithmetic you can get results like 
x + 0.1 + 0.1 != x + 0.2, partly because most decimal fractions (0.1 and 
0.2 among them) cannot be represented exactly as binary floating point 
numbers, so each addition can introduce a small rounding error.
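
A minimal sketch of the effect, assuming ordinary IEEE 754 double 
arithmetic (the variable name x is only for illustration). It uses the 
familiar 0.1 + 0.2 case rather than the sequences from the original post:

```r
# 0.1, 0.2 and 0.3 are all stored as nearby binary fractions,
# so the sum of the first two is not the same double as 0.3.
x <- 0.1 + 0.2

x == 0.3                      # exact comparison: FALSE with IEEE 754 doubles
isTRUE(all.equal(x, 0.3))     # comparison with a tolerance: TRUE

# Show the stored values to 17 significant digits
sprintf("%.17g", c(x, 0.3))
```

%in% (and match) compare numbers exactly, like ==, not with a tolerance 
like all.equal.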

Consider:

 > # start the superset at 93 -- works OK
 > table(seq(100,125,by=0.2) %in% seq(93,125,by=0.1))

TRUE
  126
 >
 > # starting the superset at 92 instead of 93
 > table(seq(100,125,by=0.2) %in% seq(92,125,by=0.1))

FALSE  TRUE
     1   125
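
One way to look at such a miss more closely is sketched below. The names 
sub, sup, miss and nearest are made up for this illustration, and the 
number of misses you see may depend on platform and R version:

```r
sub <- seq(100, 125, by = 0.2)
sup <- seq(92, 125, by = 0.1)

# Elements of the by=0.2 sequence with no exact match in the superset
miss <- sub[!(sub %in% sup)]

# For each miss, the closest element of the superset
nearest <- vapply(miss, function(v) sup[which.min(abs(sup - v))], numeric(1))

# Print both to 17 significant digits, plus the (tiny) difference
data.frame(value   = sprintf("%.17g", miss),
           nearest = sprintf("%.17g", nearest),
           diff    = miss - nearest)
```

Any row shown here prints as the same number at the default 7 digits, 
but the two values differ in the last few bits.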

By way of further explanation:

 > # round both sets to 1 decimal digit -- works OK
 > table(round(seq(100,125,by=0.2),digits=1) %in% round(seq(92,125,by=0.1),digits=1))

TRUE
  126
 > # see how many numbers in the by=0.1 sequence are not equal to their rounded versions
 > table(seq(92,125,by=0.1) == round(seq(92,125,by=0.1),digits=1))

FALSE  TRUE
     2   329
 >
 > # the guilty parties
 > seq(92,125,by=0.1)[seq(92,125,by=0.1) != round(seq(92,125,by=0.1),digits=1)]
[1] 124.3 124.8
 >

(Also, note that the differences between the rounded and non-rounded 
sequences are far larger for sequences starting at 0).
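
If you need the membership test to behave as if the values were exact 
decimals, two possible workarounds are sketched below. in_grid is a 
hypothetical helper, not a base R function, and it assumes all values are 
meant to lie on a grid of 10^-digits:

```r
# Hypothetical helper: scale both vectors to whole numbers and round,
# so tiny representation errors no longer affect the comparison.
in_grid <- function(x, table, digits = 1) {
  scale <- 10^digits
  round(x * scale) %in% round(table * scale)
}

sub <- seq(100, 125, by = 0.2)
sup <- seq(0, 800, by = 0.1)
sum(in_grid(sub, sup))    # 126: every element of sub is found

# Alternative: build both sequences from whole numbers and divide once.
# Equal decimal values then come from the identical operation (k / 10),
# so they are the identical double.
sub2 <- seq(1000, 1250, by = 2) / 10
sup2 <- seq(0, 8000, by = 1) / 10
all(sub2 %in% sup2)       # TRUE
```

The second form avoids the problem at the source, since whole numbers of 
this size are represented exactly and only one rounding (the division) 
happens per element.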


At Friday 05:07 PM 4/4/2003 +0800, Nicholas Lewin-Koh wrote:
>Hi,
>Am I hitting some limit in match? Consider the following example:
>
> > tst<-seq(100,125,by=.2)%in%seq(0,800,by=.1)
> > sum(tst)
>[1] 76
> > seq(100,125,by=.2)
>   [1] 100.0 100.2 100.4 100.6 100.8 101.0 101.2 101.4 101.6 101.8 102.0
>102.2
>  [13] 102.4 102.6 102.8 103.0 103.2 103.4 103.6 103.8 104.0 104.2 104.4
>104.6
>  [25] 104.8 105.0 105.2 105.4 105.6 105.8 106.0 106.2 106.4 106.6 106.8
>107.0
>  [37] 107.2 107.4 107.6 107.8 108.0 108.2 108.4 108.6 108.8 109.0 109.2
>109.4
>  [49] 109.6 109.8 110.0 110.2 110.4 110.6 110.8 111.0 111.2 111.4 111.6
>111.8
>  [61] 112.0 112.2 112.4 112.6 112.8 113.0 113.2 113.4 113.6 113.8 114.0
>114.2
>  [73] 114.4 114.6 114.8 115.0 115.2 115.4 115.6 115.8 116.0 116.2 116.4
>116.6
>  [85] 116.8 117.0 117.2 117.4 117.6 117.8 118.0 118.2 118.4 118.6 118.8
>119.0
>  [97] 119.2 119.4 119.6 119.8 120.0 120.2 120.4 120.6 120.8 121.0 121.2
>121.4
>[109] 121.6 121.8 122.0 122.2 122.4 122.6 122.8 123.0 123.2 123.4 123.6
>123.8
>[121] 124.0 124.2 124.4 124.6 124.8 125.0
> > seq(100,125,by=.2)[tst]
>  [1] 100.0 100.2 100.4 101.0 101.2 101.4 102.0 102.2 102.4 103.0 103.2
>103.4
>[13] 104.0 104.2 104.4 105.0 105.2 105.4 106.0 106.2 106.4 107.0 107.2
>107.4
>[25] 108.0 108.2 108.4 109.0 109.2 109.4 110.0 110.2 110.4 111.0 111.2
>111.4
>[37] 112.0 112.2 112.4 113.0 113.2 113.4 114.0 114.2 114.4 115.0 115.2
>115.4
>[49] 116.0 116.2 116.4 117.0 117.2 117.4 118.0 118.2 118.4 119.0 119.2
>119.4
>[61] 120.0 120.2 120.4 121.0 121.2 121.4 122.0 122.2 122.4 123.0 123.2
>123.4
>[73] 124.0 124.2 124.4 125.0
>
>But
> > tst<-seq(100,125,by=.2)%in%seq(100,125,by=.1)
> > sum(tst)
>[1] 126
> > seq(100,125,by=.2)[tst]
>   [1] 100.0 100.2 100.4 100.6 100.8 101.0 101.2 101.4 101.6 101.8 102.0
>102.2
>  [13] 102.4 102.6 102.8 103.0 103.2 103.4 103.6 103.8 104.0 104.2 104.4
>104.6
>  [25] 104.8 105.0 105.2 105.4 105.6 105.8 106.0 106.2 106.4 106.6 106.8
>107.0
>  [37] 107.2 107.4 107.6 107.8 108.0 108.2 108.4 108.6 108.8 109.0 109.2
>109.4
>  [49] 109.6 109.8 110.0 110.2 110.4 110.6 110.8 111.0 111.2 111.4 111.6
>111.8
>  [61] 112.0 112.2 112.4 112.6 112.8 113.0 113.2 113.4 113.6 113.8 114.0
>114.2
>  [73] 114.4 114.6 114.8 115.0 115.2 115.4 115.6 115.8 116.0 116.2 116.4
>116.6
>  [85] 116.8 117.0 117.2 117.4 117.6 117.8 118.0 118.2 118.4 118.6 118.8
>119.0
>  [97] 119.2 119.4 119.6 119.8 120.0 120.2 120.4 120.6 120.8 121.0 121.2
>121.4
>[109] 121.6 121.8 122.0 122.2 122.4 122.6 122.8 123.0 123.2 123.4 123.6
>123.8
>[121] 124.0 124.2 124.4 124.6 124.8 125.0
>
>Gives the correct answer. Did I miss something?
>
>Nicholas
>---------------------------------
>pertinent R info
>
>platform i686-pc-linux-gnu
>arch     i686
>os       linux-gnu
>system   i686, linux-gnu
>status
>major    1
>minor    6.2
>year     2003
>month    01
>day      10
>language R
>


