[R] efficient equivalent to read.csv / write.csv
David Scott
d.scott at auckland.ac.nz
Tue Sep 28 22:16:31 CEST 2010
On 29/09/2010 6:24 a.m., statquant2 wrote:
>
> Hi, after testing
> R) system.time(read.csv("myfile.csv"))
> user system elapsed
> 1.126 0.038 1.177
>
> R) system.time(read.csv.sql("myfile.csv"))
> user system elapsed
> 1.405 0.025 1.439
> Warning messages:
> 1: closing unused connection 4 ()
> 2: closing unused connection 3 ()
>
> It seems that the function is less efficient than the base one ... so ...
I presume you have had a good look at the R Data Import/Export manual?
It warns there about the inefficiency of read.table (and hence of read.csv)
and suggests using scan more directly, which in your case might mean working
through connections with readLines and writeLines.
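
As a rough illustration of the options above, here is a minimal sketch. The
sample file, its three columns and their types are assumptions made up for
the example, not taken from your data. It shows read.csv with the column
types declared up front, scan with a 'what' template, and writing through a
connection with writeLines.

```r
# Create a small sample file so the sketch is self-contained (made-up data).
path <- tempfile(fileext = ".csv")
df <- data.frame(id = 1:5,
                 x = c(0.5, 1.5, 2.5, 3.5, 4.5),
                 label = letters[1:5],
                 stringsAsFactors = FALSE)
write.csv(df, path, row.names = FALSE)

# Option 1: read.csv with column types and row count declared,
# so R does not have to guess the types from the data.
fast <- read.csv(path,
                 colClasses = c("integer", "numeric", "character"),
                 nrows = 5)

# Option 2: scan() with a 'what' template, skipping the header line.
cols <- scan(path,
             what = list(id = integer(), x = numeric(), label = character()),
             sep = ",", skip = 1, quiet = TRUE)
via_scan <- as.data.frame(cols, stringsAsFactors = FALSE)

# Writing: build the lines yourself and push them through a connection.
# This assumes no field contains a comma or a quote character.
out <- tempfile(fileext = ".csv")
con <- file(out, open = "w")
writeLines(paste(names(df), collapse = ","), con)
writeLines(paste(df$id, df$x, df$label, sep = ","), con)
close(con)
written <- readLines(out)
```

Whether any of these is actually faster on your file is something to check
with system.time, as you did above.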
If that doesn't work, why not use a database? Use RODBC or a similar package
to read and write tables in the database. There are many databases to choose
from (MySQL works for me). You can easily move data into and out of the
database in .csv format. If the .csv files are similar, there shouldn't be
too much overhead in defining table formats for the database.
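
A sketch of what the database route might look like with RODBC. The helper
names csv_to_db and db_to_csv are hypothetical, and 'dsn' is assumed to name
an ODBC data source you have already configured for your database. Nothing
here runs until you call the functions with a working DSN.

```r
# Hypothetical helpers; 'dsn' must name an ODBC data source set up on your
# machine (for example one pointing at a MySQL database).
csv_to_db <- function(csv_path, table, dsn) {
  channel <- RODBC::odbcConnect(dsn)
  if (!inherits(channel, "RODBC")) stop("could not connect to ", dsn)
  on.exit(RODBC::odbcClose(channel))
  dat <- read.csv(csv_path, stringsAsFactors = FALSE)
  # sqlSave creates the table if it does not exist and inserts the rows.
  RODBC::sqlSave(channel, dat, tablename = table, rownames = FALSE)
  invisible(nrow(dat))
}

db_to_csv <- function(table, csv_path, dsn) {
  channel <- RODBC::odbcConnect(dsn)
  if (!inherits(channel, "RODBC")) stop("could not connect to ", dsn)
  on.exit(RODBC::odbcClose(channel))
  # sqlFetch reads the whole table into a data frame.
  dat <- RODBC::sqlFetch(channel, table)
  write.csv(dat, csv_path, row.names = FALSE)
  invisible(nrow(dat))
}
```

Usage would be along the lines of csv_to_db("myfile.csv", "mytable", "mydsn"),
where "mytable" and "mydsn" are placeholders for your own table and data
source names.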
David Scott
--
_________________________________________________________________
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email: d.scott at auckland.ac.nz, Fax: +64 9 373 7018
Director of Consulting, Department of Statistics