[R] strangely long floating point with write.table()
Mike Miller
mbmiller+l at gmail.com
Tue Mar 18 01:43:54 CET 2014
On Mon, 17 Mar 2014, Duncan Murdoch wrote:
> On 14-03-17 6:22 PM, Mike Miller wrote:
>
>> Thanks! Another thing I've figured out: Use of "drop0trailing=T" in
>> format() fixes the .00000 stuff that I didn't like:
>>
>> write.table(format(data[1:10,], digits=5, trim=TRUE, drop0trailing=TRUE), row.names=FALSE, col.names=FALSE, quote=FALSE)
[snip]
>>
>> I still have more to figure out, but for most smaller table-writing
>> jobs, I think something like the last command above will be my usual
>> approach. In real life, I would use a tab delimiter, though.
>>
>> I'm still unsure about the best way for dealing with very large data
>> frames, though. There's probably a good way to stream data into a file
>> so that it doesn't have to be written as an additional large object in
>> memory. There must be a way to make a connection and then just pipe
>> the formatted data into it. Maybe something related to sprintf() will
>> work.
>
> You've never explained why you want to write these gigantic text files.
> Text is a lossy way to store numbers: it takes 15 bytes to store about
> 8 bytes of information, and you'll probably lose a few bits at the end.
> Why not write your files in binary, storing exactly what you have in
> memory? It'll be a lot faster to write and to read, you won't need to
> duplicate it before writing, etc.
Thanks for asking, Duncan. A typical problem is that I am running 12
processes at once on a 12-core machine with 32 GB of RAM, so each process
has to be limited to about 2.5 GB total. Then I try to load as much data
as I can within that limitation. The output data does not always need to
be in text format, but it usually does because it has to be read by other
programs.

I was hoping I could read a line from a data frame and format it like
this:

> sprintf(c(rep("%s",2), rep("%d",2), rep("%.4f",4)), data[1,1:8])

But sprintf reads vectors, so they have to be of a single type.

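
One possible way around the single-type limitation, sketched here under some assumptions: sprintf() is vectorized over its arguments, so each column can be passed as its own argument through do.call(), and the formatted lines can be written to an open file connection a chunk at a time so that the whole formatted table never sits in memory at once. The example data frame, the helper names format_rows() and write_chunked(), and the chunk size are all made up for illustration; they are not from this thread.

```r
# Hypothetical example data standing in for the poster's `data`:
# two character columns, two integer columns, four numeric columns.
set.seed(1)
n <- 10
data <- data.frame(
  id1 = sprintf("s%02d", 1:n), id2 = rep(c("a", "b"), length.out = n),
  k1 = 1:n, k2 = as.integer(n:1),
  x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n),
  stringsAsFactors = FALSE
)

# One tab-delimited format string describing a whole row.
fmt <- paste(c(rep("%s", 2), rep("%d", 2), rep("%.4f", 4)), collapse = "\t")

# Pass each column as a separate argument to sprintf(), so every row is
# formatted while each column keeps its own type (hypothetical helper).
format_rows <- function(df, fmt) {
  do.call(sprintf, c(list(fmt), unname(as.list(df))))
}

# Write through an open connection in chunks, so only one chunk of
# formatted strings exists at a time (hypothetical helper; assumes the
# data frame has at least one row).
write_chunked <- function(df, path, fmt, chunk_size = 4L) {
  con <- file(path, open = "wt")
  on.exit(close(con))
  starts <- seq(1L, nrow(df), by = chunk_size)
  for (s in starts) {
    rows <- s:min(s + chunk_size - 1L, nrow(df))
    writeLines(format_rows(df[rows, , drop = FALSE], fmt), con)
  }
  invisible(length(starts))
}

out <- tempfile(fileext = ".txt")
write_chunked(data, out, fmt)
```

In practice the chunk size would be much larger than 4; it is kept small here only so that the loop runs more than once on ten rows.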
Thanks for your help.
Mike