[R] R tools for large files

Murray Jorgensen maj at stats.waikato.ac.nz
Mon Aug 25 07:16:30 CEST 2003


Andrew,

This is no doubt true, but some things in R work very well with big 
files without the need for any extra software:

readLines("c:/data/perry/data.csv", n = 12)
# prints out the first 12 lines as strings

flows <- read.csv("c:/data/perry/data.csv", na.strings = "?",
                  header = FALSE, nrows = 1000)
# makes a data frame from the first 1000 records
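
The same call can reach a block further into the file by adding the skip
argument of read.csv. What follows is only a sketch: the temporary file and
its two columns are made up for illustration and stand in for the real data
file.

```r
# Hypothetical demo file: 5000 records, no header line
path <- tempfile(fileext = ".csv")
write.table(data.frame(a = 1:5000, b = rnorm(5000)), path, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Records 2001 to 3000: skip the first 2000 lines, then read 1000
block <- read.csv(path, header = FALSE, na.strings = "?",
                  skip = 2000, nrows = 1000)
```

Note that the skipped lines are still read and discarded, so blocks near the
end of a big file take longer to reach than blocks near the start.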

I would like a solution that does not leave me generating large numbers
of derived files from the original data file.
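
Something along the following lines might do for the two tasks in my original
question, using only base R. It is a rough sketch, not a tested package: the
function names count_lines and read_selected are invented here, and it assumes
a file with no header line and exactly one record per line (no quoted fields
containing line breaks). The file is read through a connection in chunks, so
only one chunk of raw lines plus the selected lines are held in memory.

```r
# Count lines by reading the file in chunks through an open connection
count_lines <- function(path, chunk = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  n <- 0L
  repeat {
    got <- length(readLines(con, n = chunk))
    if (got == 0L) break
    n <- n + got
  }
  n
}

# Build a data frame from the lines whose numbers are in 'rows'
# (rows come back in file order; duplicates are dropped;
#  'rows' must select at least one existing line)
read_selected <- function(path, rows, chunk = 10000L, ...) {
  rows <- sort(unique(rows))
  con <- file(path, open = "r")
  on.exit(close(con))
  kept <- character(0)
  offset <- 0L
  repeat {
    lines <- readLines(con, n = chunk)
    if (length(lines) == 0L) break
    # line numbers that fall inside the current chunk, made chunk-relative
    idx <- rows[rows > offset & rows <= offset + length(lines)] - offset
    kept <- c(kept, lines[idx])
    offset <- offset + length(lines)
  }
  read.csv(text = kept, header = FALSE, ...)
}

# Hypothetical demo file standing in for the real data
path <- tempfile(fileext = ".csv")
write.table(data.frame(id = 1:250, x = (1:250) * 2), path, sep = ",",
            row.names = FALSE, col.names = FALSE)

n <- count_lines(path, chunk = 100L)
set.seed(1)
pick <- sample(n, 20)                 # a random subfile of 20 records
sub <- read_selected(path, pick, chunk = 100L, na.strings = "?")
```

Extra arguments such as na.strings are passed straight through to read.csv,
so no derived files need to be written for each random subsample.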

Murray


Andrew C. Ward wrote:
> Dear Murray,
> 
> One way that works very well for many people (including me)
> is to store the data in an external database, such as MySQL,
> and read in just the bits you want using the excellent
> package RODBC. Getting a database to do all the selecting
> is very fast and efficient, leaving R to concentrate on the
> analysis and visualisation. This is all described in the
> R Import/Export Manual.
> 
> 
> Regards,
> 
> Andrew C. Ward
> 
> CAPE Centre
> Department of Chemical Engineering
> The University of Queensland
> Brisbane Qld 4072 Australia
> andreww at cheque.uq.edu.au
> 
> 
> Quoting Murray Jorgensen <maj at stats.waikato.ac.nz>:
> 
> 
>>I'm wondering if anyone has written some functions or code for handling
>>very large files in R. I am working with a data file that is 41
>>variables times who knows how many observations making up 27MB
>>altogether.
>>
>>The sort of thing that I am thinking of having R do is
>>
>>- count the number of lines in a file
>>
>>- form a data frame by selecting all cases whose line numbers are in a
>>supplied vector (which could be used to extract random subfiles of
>>particular sizes)
>>
>>Does anyone know of a package that might be useful for this?
>>
>>Murray
>>
>>-- 
>>Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
>>Department of Statistics, University of Waikato, Hamilton, New Zealand
>>Email: maj at waikato.ac.nz                                Fax 7 838 4155
>>Phone  +64 7 838 4773 wk    +64 7 849 6486 home    Mobile 021 1395 862
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>>
> 
> 
> 

-- 
Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: maj at waikato.ac.nz                                Fax 7 838 4155
Phone  +64 7 838 4773 wk    +64 7 849 6486 home    Mobile 021 1395 862
