[Rd] seq() function accuracy inacceptable

Tue Apr 18 20:04:18 CEST 2006

> The seq-command produces unnecessarily inaccurate results, which can be extremely
> annoying.  I absolutely do not see the necessity of numerical garbage
> to appear in the following simple case.  E.g. try this:
> > seq(61.55, 62.00, by=0.01) - round(seq(61.55, 62.00, by=0.01), digits=2)

An even simpler case may help explain why this is not *unnecessary*
inaccuracy.

Consider the three expressions
   2+0.01+0.01
   2+0.01*2
   2.02

These need not give the same answer. As it happens, on my computer 2.02 
and 2+0.01*2 are the same, but they differ by the smallest representable 
amount from 2+0.01+0.01. All three could be different in other examples.
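For anyone who wants to check this on their own machine, here is a minimal sketch. The variable names (a, b, c_lit, ulp) are only illustrative, and the exact pattern of TRUE/FALSE may depend on the platform, so no particular output is claimed here.

```r
# Three ways of writing "2.02" in double-precision arithmetic.
a     <- 2 + 0.01 + 0.01   # two successive additions, each rounded
b     <- 2 + 0.01 * 2      # one multiplication, then one addition
c_lit <- 2.02              # the decimal literal, rounded once on input

# Print with enough digits to expose the last bits.
print(c(a, b, c_lit), digits = 17)

# Exact comparison may report differences ...
exact <- c(a_eq_b = a == b, b_eq_c = b == c_lit, a_eq_c = a == c_lit)
print(exact)

# ... while comparison up to a tolerance treats them as equal.
tolerant <- c(isTRUE(all.equal(a, b)), isTRUE(all.equal(b, c_lit)))
print(tolerant)

# Spacing between adjacent doubles in [2, 4), assuming IEEE 754 doubles.
ulp <- 2^-51
print((c_lit - a) / ulp)   # difference expressed in units of that spacing
```

Using all.equal() rather than == is the usual way to compare values that are only expected to agree up to rounding error.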

Since you think the correct output of seq() is easy to determine, which of 
these should be equal to the third element of seq(2, 3, by=0.01)?

By the way, seq() is an interesting example, because the code goes to some 
effort to do the sort of thing you want. It is designed to give less 
accurate answers so as to be consistent with naive expectations when 
'to'-'from' is close to a multiple of 'by'. This doesn't affect your 
example, but if you had used seq(61.56,62,by=0.01) you would have 
benefitted from the fact that, although (62-61.56)/0.01 is very slightly 
less than 44, seq() still includes the 44th step.  In general, though, R 
is better off using as much accuracy as possible for a given computation 
rather than trying to guess what a user will want to use it for.
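The seq(61.56, 62, by=0.01) case can be inspected with a short sketch like the one below. The names steps, naive_n and s are only illustrative, and the tolerance seq() applies internally is an implementation detail that is not reproduced here; the code just compares a naive step count with what seq() returns.

```r
from <- 61.56
to   <- 62
by   <- 0.01

# The quotient as actually computed in double precision.
steps <- (to - from) / by
print(steps, digits = 17)

# Step count a naive implementation would use, with no tolerance.
naive_n <- floor(steps)
print(naive_n)

# What seq() does: the number of steps taken and the last value returned.
s <- seq(from, to, by = by)
print(length(s) - 1)
print(s[length(s)], digits = 17)
```

If naive_n comes out one smaller than length(s) - 1, that is the extra step described above.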

 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle