[R] How to read HUGE data sets?

Gabor Grothendieck ggrothendieck at gmail.com
Thu Feb 28 23:16:35 CET 2008


read.table's colClasses= argument can take a "NULL" for those columns
that you want ignored.  Also see the skip= argument.  See ?read.table
for details.
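
As a rough sketch of the colClasses= and skip= arguments, the example
below builds a small temporary file (a hypothetical stand-in for the
real data; the column names and values are made up) and reads only two
of its four columns while skipping one leading line.

```r
# Hypothetical miniature file standing in for the large data set
tmp <- tempfile(fileext = ".txt")
writeLines(c("file description line to skip",
             "id chrom pos geno",
             "rs1 1 100 AA",
             "rs2 1 200 AG",
             "rs3 2 300 GG"), tmp)

# The string "NULL" drops a column entirely; the other entries name the
# class to use for the columns that are kept.  skip = 1 ignores the
# first line of the file before the header is read.
dat <- read.table(tmp, header = TRUE, skip = 1,
                  colClasses = c("character", "NULL", "integer", "NULL"))
unlink(tmp)

print(dat)
```

Dropped columns are never stored in the resulting data frame, so this
lowers memory use, but the whole file is still scanned.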

The sqldf package can read a subset of rows and columns (actually any
SQL operation) from a file larger than R can otherwise handle.  It will
automatically set up a temporary SQLite database for you, load the file
into the database without going through R, extract just the data you
want into R, and then automatically delete the database.  All this can
be done in 2 lines of code.  See example 6 on the home page:
http://sqldf.googlecode.com
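
A minimal sketch of this approach, assuming the sqldf package is
installed and using its read.csv.sql() function.  This is not
necessarily the code of example 6 on the home page; the file contents,
column names, and the filter condition are made up for illustration.

```r
library(sqldf)

# Hypothetical miniature file standing in for the large data set
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,chrom,pos,geno",
             "rs1,1,100,AA",
             "rs2,1,200,AG",
             "rs3,2,300,GG"), tmp)

# The file is loaded into a temporary SQLite database and referred to
# as "file" in the SQL statement; only the selected rows and columns
# are returned to R.
sub <- read.csv.sql(tmp,
                    sql = "select id, pos from file where pos > 150",
                    header = TRUE, sep = ",")
unlink(tmp)

print(sub)
```

For a file that is not comma-separated, the sep= argument would need to
be changed to match.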

On Thu, Feb 28, 2008 at 12:03 AM, Jorge Iván Vélez
<jorgeivanvelez at gmail.com> wrote:
> Dear R-list,
>
> Does somebody know how I can read a HUGE data set using R? It is a hapmap
> data set (txt format) which is around 4GB. After reading it, I need to delete
> some specific rows and columns. I'm running R 2.6.2 patched on XP SP2
> using a 2.4 GHz Core 2 Duo processor and 4GB RAM. Any suggestions would be
> appreciated.
>
> Thanks in advance,
>
> Jorge


