[R] naive question
bates at stat.wisc.edu
Wed Jun 30 02:56:09 CEST 2004
Igor Rivin wrote:
> I was not particularly annoyed, just disappointed, since R seems like
> a much better thing than SAS in general, and doing everything with a
> combination of hand-rolled tools is too much work. However, I do need
> to work with very large data sets, and if it takes 20 minutes to read
> them in, I have to explore other options (one of which might be S-PLUS,
> which claims scalability as a major, er, PLUS over R).
If you are routinely working with very large data sets, it would be
worthwhile to learn to use a relational database (PostgreSQL, MySQL,
even Access) to store the data and then access it from R with RODBC or
one of the specialized database packages.
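
As a rough illustration of that workflow, here is a minimal sketch that
assumes the DBI and RSQLite packages are installed. The in-memory SQLite
database, the table name "measurements" and its columns are all made up
for the example. With RODBC the pattern would be similar, using
odbcConnect() and sqlQuery() against a configured data source.

```r
library(DBI)

# Assumption: an in-memory SQLite database, used only so that the
# sketch is self-contained; a real setup would connect to a server.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# The column types are fixed once, in the table definition.
dbExecute(con, "CREATE TABLE measurements (id INTEGER, weight REAL, grp TEXT)")

# Hypothetical data standing in for a large data set.
dbWriteTable(con, "measurements",
             data.frame(id = 1:3,
                        weight = c(2.5, 3.1, 4.0),
                        grp = c("a", "b", "a")),
             append = TRUE)

# Pull only the rows and columns that are needed into R.
heavy <- dbGetQuery(con, "SELECT id, weight FROM measurements WHERE weight > 3")

dbDisconnect(con)
```

The query returns an ordinary data frame, so the rest of the analysis
proceeds in R as usual.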
R is slow at reading ASCII files because it assembles the meta-data on
the fly and continually checks the types of the variables being read.
If you already know this information and build it into your table
definitions, reading the data will be much faster.
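
The same principle can be applied inside R without a database (a
different technique from the table definitions mentioned above):
read.table() accepts a colClasses argument that declares the column
types in advance, and an nrows argument that helps with memory
allocation. A minimal sketch, in which the file and its columns are
invented for the example:

```r
# Write a small ASCII file to stand in for a large one (made-up data).
path <- tempfile(fileext = ".txt")
write.table(
  data.frame(id = 1:5,
             weight = c(2.5, 3.1, 4.0, 1.2, 5.5),
             group = c("a", "b", "a", "c", "b")),
  file = path, row.names = FALSE, quote = FALSE, sep = "\t"
)

# Declare the column types up front so read.table does not have to
# infer them. nrows is an upper bound on the number of rows to read;
# even a mild over-estimate helps memory usage.
dat <- read.table(path, header = TRUE, sep = "\t",
                  colClasses = c("integer", "numeric", "character"),
                  nrows = 5)

str(dat)
```

How much time this saves depends on the file, so it is worth timing
both variants with system.time() on the real data.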
A disadvantage of this approach is the need to learn yet another
language and system. I was going to do an example but found I could not
because I left all my SQL books at home (I'm travelling at the moment)
and I couldn't remember the particular commands for loading a table from
an ASCII file.