[R] Read big data (>3G ) methods ?

Horace Tso Horace.Tso at pgn.com
Sat Apr 27 00:21:49 CEST 2013


A long, long time ago in a galaxy far, far away, I played with the LaF package for reading large CSV files. It has been a while, though, and I don't remember its performance or limitations. Give it a try.

Horace






-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Kevin Hao
Sent: Friday, April 26, 2013 12:53 PM
To: lcn
Cc: R help
Subject: Re: [R] Read big data (>3G ) methods ?

Thanks lcn,

I will try reading the data in chunks.
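
For reference, chunked reading with the 'nrows' argument of read.table might look like the sketch below. It is only an illustration, not a tested recipe for a 3G file: the helper name read_in_chunks and its arguments are made up for this example, and it assumes a plain comma-separated file with an unquoted header line. Keeping one connection open means each read.table call continues where the previous one stopped, so no 'skip' bookkeeping is needed.

```r
# Hypothetical helper (not from any package): read a delimited file in
# chunks of `chunk_rows` rows and apply `process` to each chunk.
read_in_chunks <- function(path, chunk_rows, process, sep = ",") {
  con <- file(path, open = "r")   # text-mode connection, kept open across reads
  on.exit(close(con))

  # Assumption: the first line is an unquoted header with the column names.
  col_names <- strsplit(readLines(con, n = 1), sep, fixed = TRUE)[[1]]

  results <- list()
  repeat {
    chunk <- tryCatch(
      read.table(con, nrows = chunk_rows, sep = sep, header = FALSE,
                 col.names = col_names, stringsAsFactors = FALSE),
      # read.table raises an error once no lines are left; this sketch
      # treats any error as end of input.
      error = function(e) NULL
    )
    if (is.null(chunk) || nrow(chunk) == 0) break
    results[[length(results) + 1]] <- process(chunk)
    if (nrow(chunk) < chunk_rows) break   # short chunk: end of file reached
  }
  results
}

# Small demonstration on a temporary file standing in for the big one.
demo_file <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:10, x = 1:10), demo_file, row.names = FALSE, quote = FALSE)
chunk_sums <- read_in_chunks(demo_file, chunk_rows = 4,
                             process = function(d) sum(d$x))
total <- sum(unlist(chunk_sums))
```

Only one chunk is held in memory at a time, so the per-chunk function should return something small (a sum, a count, a filtered subset). Column types are guessed separately for each chunk here; passing colClasses through to read.table would keep them consistent.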

Best,

Kevin


On Fri, Apr 26, 2013 at 3:05 PM, lcn <lcn918 at gmail.com> wrote:

> Do you really need to load all the data into memory?
>
> With a large data set, people usually read just a chunk of it while
> developing the analysis pipeline; once that is done, the finished
> script iterates through the entire data set. For example, the
> read.table function has 'nrows' and 'skip' parameters to control
> which chunk of the data is read.
>
> read.table(file, nrows = -1, skip = 0, ...)
>
> Another tip: you can split the large file into smaller ones.
>
>
>
> On Fri, Apr 26, 2013 at 8:09 AM, Kevin Hao <rfans4chemo at gmail.com> wrote:
>
>> Hi all scientists,
>>
>> Recently I have been dealing with big data (>3G, txt or csv format) on
>> my desktop (Windows 7, 64-bit), but I cannot read the files quickly,
>> even after searching the internet. [I have defined colClasses for
>> read.table and tried the colbycol and limma packages, but they are
>> still not fast enough.]
>>
>> Could you share your methods for reading big data into R faster?
>>
>> This may be an odd question, but we really need an answer.
>>
>> Any suggestions are appreciated.
>>
>> Thank you very much.
>>
>>
>> kevin
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>



