[Rd] seq() function accuracy inacceptable
Thomas Lumley
tlumley at u.washington.edu
Tue Apr 18 20:04:18 CEST 2006
> The seq command produces unnecessarily inaccurate results, which can be extremely
> annoying. I absolutely do not see the necessity for numerical garbage
> to appear in the following simple case. E.g. try this:
> > seq(61.55, 62.00, by=0.01) - round(seq(61.55, 62.00, by=0.01), digits=2)
An even simpler case may help explain why this is not *unnecessary*
inaccuracy.
Consider the three expressions
2+0.01+0.01
2+0.01*2
2.02
These need not give the same answer. As it happens, on my computer 2.02
and 2+0.01*2 are the same, but they differ by the smallest representable
amount from 2+0.01+0.01. All three could be different in other examples.
Since you think the correct output of seq() is easy to determine, which of
these should be equal to the third element of seq(2, 3, by=0.01)?
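The comparison of the three expressions above can be reproduced with a short sketch. It assumes IEEE 754 double-precision arithmetic, which is what R normally uses. The variable names a, b and c3 are introduced here only for illustration, and the results may differ on other platforms.

```r
# Three ways of arriving at "2.02" in double-precision arithmetic.
a  <- 2 + 0.01 + 0.01   # two successive additions, each one rounded
b  <- 2 + 0.01 * 2      # one multiplication, then one addition
c3 <- 2.02              # the literal, rounded once when it is read

# Default printing (7 significant digits) hides any difference.
print(c(a, b, c3))

# Printing 17 significant digits exposes the trailing bits.
print(format(c(a, b, c3), digits = 17))

# Exact comparisons show which results coincide on this machine.
print(c(a_eq_b = a == b, b_eq_c = b == c3, a_eq_c = a == c3))

# Difference between c3 and a, in units of the spacing between
# adjacent doubles in [2, 4), which is 2 * .Machine$double.eps.
print((c3 - a) / (2 * .Machine$double.eps))
```

On a machine that behaves like the one described above, b and c3 compare equal and a differs from them by exactly one such unit.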
By the way, seq() is an interesting example, because the code goes to some
effort to do the sort of thing you want. It is designed to give less
accurate answers so as to be consistent with naive expectations when
'to'-'from' is close to a multiple of 'by'. This doesn't affect your
example, but if you had used seq(61.56,62,by=0.01) you would have
benefitted from the fact that, although (62-61.56)/0.01 is very slightly
less than 44, seq() still includes the 44th step. In general, though, R
is better off using as much accuracy as possible for a given computation
rather than trying to guess what a user will want to use it for.
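The seq(61.56,62,by=0.01) case mentioned above can be checked with a minimal sketch, again assuming IEEE 754 doubles. The name steps is introduced here only for illustration. The sketch looks at the result of seq(), not at how seq() arrives at it.

```r
# Number of 'by'-sized steps between 'from' and 'to', computed naively.
steps <- (62 - 61.56) / 0.01

# Shown with 17 significant digits to reveal whether it falls short of 44.
print(format(steps, digits = 17))
print(steps < 44)

# Truncating the quotient directly would lose the final step.
print(floor(steps))

# seq() returns from, from + by, from + 2*by, ...; the result has
# 45 elements when the 44th step is included.
x <- seq(61.56, 62, by = 0.01)
print(length(x))
```

If the quotient is below 44 but length(x) is still 45, seq() has kept the last step that a plain truncation would have dropped.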
-thomas
Thomas Lumley                   Assoc. Professor, Biostatistics
tlumley at u.washington.edu     University of Washington, Seattle