[Rd] Importing csv files

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Dec 23 16:16:28 CET 2004


I think we need to know what you mean by `large' and why read.table is 
not fast enough (and hence if some of the planned improvements might be 
all that is needed).

Could you make some examples available for profiling?

It seems to me that there are some delicate licensing issues in 
distributing a product that writes .rda format except under GPL. See, for 
example, the GPL FAQ.

On Thu, 23 Dec 2004, Frank E Harrell Jr wrote:

> There is a recurring need for importing large csv files quickly.  David 
> Baird's dataload is a standalone program that will directly create .rda files 
> from .csv (it also handles many other conversions).  Unfortunately dataload 
> is no longer publicly available because of some kind of relationship with 
> Stat/Transfer.  The idea is a good one, though.  I wonder if anyone would 
> volunteer to replicate the csv->rda standalone functionality or to provide 
> some Perl or Python tools for making creation of .rda files somewhat easy 
> outside of R.
>
> As an aside, I routinely see 30-fold reductions in file sizes for .rda files 
> (made with save(..., compress=TRUE)) compared with the size of SAS binary 
> datasets.  And load( ) times are fast.
>
> It's been a great year for R.  Let me take this opportunity to thank the R 
> leaders for a fantastic job that gives immeasurable benefits to the 
> community.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


