[R] Reading in 9.6GB .DAT File - OK with 64-bit R?

Gabor Grothendieck ggrothendieck at gmail.com
Fri Mar 9 01:04:55 CET 2012


On Thu, Mar 8, 2012 at 1:19 PM, RHelpPlease <rrumple at trghcsolutions.com> wrote:
> Hi there,
> I wish to read a 9.6GB .DAT file into R (64-bit R on 64-bit Windows machine)
> - to then delete a substantial number of rows & then convert to a .csv file.
> Upon the first attempt the computer crashed (at some point last night).
>
> I'm rerunning this now & am closely monitoring Processor/CPU/Memory.
>
> Apart from this crash being a computer issue alone (possibly), is R equipped
> to handle this much data?  I read up on the FAQs page that 64-bit R can
> handle larger data sets than 32-bit.
>
> I'm using the read.fwf function to read in the data.  I don't have access to
> a database program (SQL, for instance).

   # the next line installs the sqldf package and all of its
   # dependencies, including sqlite
   install.packages("sqldf")

   library(sqldf)
   DF <- read.csv.sql("bigfile.csv",
       sql = "select * from file where a > 3", ...other args...)

That single call creates an sqlite database, creates an appropriate
table layout for your data, reads your data into the table, runs the
sql statement, and only after all that reads the result into R.  It
then destroys the database it created.

Replace "bigfile.csv" with the name of your file and where a > 3 with
your condition.  Also the ...other args... parts should specify the
format of your file.

See ?read.csv.sql
and also http://sqldf.googlecode.com

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
