[R] unique and precision of long integers
Thomas Lumley
tlumley at u.washington.edu
Mon May 14 17:41:15 CEST 2001
On Mon, 14 May 2001, Michael Herron wrote:
>
> Hello.
>
> I have a dataset with about 500,000 observations, most of which are
> not unique. The first 10 observations look like
>
> 901000000000100000010100101011002
> 901101101110100000010100101011002
> 901000000000100000010100000001002
> 901000000000100000010101001011002
> 901000000000100000010101010011002
> 901000000000100000010100110101002
> 901000000000100000010100101011002
> 900000000000100000010010101011002
> 901000000000100000010100101101002
> 901000000000100000010100101011002
>
> Each digit reflects a separate field, but in the listing above all
> spaces have been removed.
>
> I read in the data with scan(), and then use unique() to get the
> unique observations. But, when I print these elements to a file I
> lose precision. For instance, let x be a vector of the first 10
> observations from the dataset:
>
> > write (x,file="output",ncol=1)
>
> more output
>
> 9.01e+32
> 9.011011e+32
> 9.01e+32
> 9.01e+32
> 9.01e+32
> 9.01e+32
> 9.01e+32
> 9e+32
> 9.01e+32
> 9.01e+32
>
> Is there a way to get all the digits back?
>
> > write (format(x,digits=22),file="output",ncol=1)
>
> does not do it, and I cannot seem to increase digits >22.
>
You can't store numbers to more than the precision provided by your
compiler/hardware, so there are probably only about 16 accurate digits no
matter how many R prints.
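
A minimal sketch of that limit, assuming R's numerics are IEEE 754 doubles (the usual case, but an assumption about your platform). The variable name `gap` is only for illustration:

```r
# A double carries 53 binary digits, roughly 15 to 17 decimal digits.
.Machine$double.digits

# Above 2^53 not every whole number is representable any more:
# adding 1 is lost in rounding.
(2^53 + 1) == 2^53

# Spacing between neighbouring doubles near 9.01e32, the size of the
# values in the question. It is 2^57, about 1.4e17, so roughly the last
# 17 of the 33 digits cannot be told apart once stored as numbers.
gap <- 2^(floor(log2(9.01e32)) - 52)
gap
```

Records that differ only in those trailing digits are therefore likely to collapse to the same stored number before unique() ever sees them.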
In order to unique() them you can read them as strings, which have
essentially unlimited precision.
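
Something along these lines should work. It is only a sketch: the temporary files stand in for your real data file and your "output" file, and the four records are copied from your message.

```r
# Stand-in for the real data file: one 33-digit record per line.
infile <- tempfile()
writeLines(c("901000000000100000010100101011002",
             "901101101110100000010100101011002",
             "901000000000100000010100000001002",
             "901000000000100000010100101011002"), infile)

# what = "character" makes scan() return strings, so no digit is ever
# converted to floating point.
x <- scan(infile, what = "character", quiet = TRUE)

# unique() compares the strings exactly.
u <- unique(x)

# Character vectors are written back verbatim, one record per line.
outfile <- tempfile()
write(u, file = outfile, ncolumns = 1)
```

Since each digit is a separate field, strsplit(u, "") will split the records back into fields if you need them later.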
-thomas
Thomas Lumley Asst. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle