[R] naive question
Tony Plate
tplate at blackmesacapital.com
Wed Jun 30 17:34:25 CEST 2004
As far as I know, read.table() in S-plus performs similarly to read.table()
in R with respect to speed, so I wouldn't hold out much hope of finding
satisfaction there.
I do frequently read large tables in S-plus, and with a considerable amount
of work was able to speed things up significantly, mainly by using scan()
with appropriate arguments. It's possible that some of the add-on modules
for S-plus (e.g., the data-mining module) have faster I/O, but I haven't
investigated those. I get the best read performance out of S-plus by using
a homegrown binary file format with each column stored in a contiguous
block of memory and metadata (i.e., column types and dimensions) stored at
the start of the file. The S-plus read function reads the columns one at a
time using readRaw(). One would be able to do something similar in R. If
you have to read from a text file, then, as others have suggested, writing
a C program wouldn't be that hard, as long as you make the format inflexible.
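As a rough illustration of the binary-column idea in R (this is a minimal
sketch, not Tony's actual S-plus format: the layout, function names, and the
all-numeric-columns assumption are mine), one can write each column as one
contiguous block with writeBin() and pull it back with readBin(), which avoids
the per-field text parsing that makes read.table() slow:

```r
## Sketch of a column-wise binary layout: a small header (column count,
## row count), then for each column its name followed by the data as one
## contiguous block. Assumes all columns are numeric (doubles).

write_cols <- function(df, path) {
  con <- file(path, "wb")
  on.exit(close(con))
  writeBin(ncol(df), con)               # number of columns
  writeBin(nrow(df), con)               # number of rows
  for (nm in names(df)) {
    writeBin(nchar(nm), con)            # name length, then name bytes
    writeBin(charToRaw(nm), con)
    writeBin(as.double(df[[nm]]), con)  # the column as one raw block
  }
}

read_cols <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  ncols <- readBin(con, "integer")
  nrows <- readBin(con, "integer")
  out <- vector("list", ncols)
  nms <- character(ncols)
  for (i in seq_len(ncols)) {
    len     <- readBin(con, "integer")
    nms[i]  <- rawToChar(readBin(con, "raw", len))
    out[[i]] <- readBin(con, "double", nrows)  # whole column in one call
  }
  names(out) <- nms
  as.data.frame(out)
}
```

For text input, the analogous trick with scan() is to pass an explicit
`what` list (one template entry per column) so no type guessing is done.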
-- Tony Plate
At Tuesday 06:19 PM 6/29/2004, Igor Rivin wrote:
>I was not particularly annoyed, just disappointed, since R seems like
>a much better thing than SAS in general, and doing everything with a
>combination of hand-rolled tools is too much work. However, I do need
>to work with very large data sets, and if it takes 20 minutes to read
>them in, I have to explore other options (one of which might be S-PLUS,
>which claims scalability as a major, er, PLUS over R).
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html