[R] How to deal with more than 6GB dataset using R?
Allan Engelhardt
allane at cybaea.com
Fri Jul 23 18:45:08 CEST 2010
On 23/07/10 17:36, Duncan Murdoch wrote:
> On 23/07/2010 12:10 PM, babyfoxlove1 at sina.com wrote:
>> [...]
>
> You probably won't get much faster than read.table with all of the
> colClasses specified. It will be a lot slower if you leave that at
> the default NA setting, because then R needs to figure out the types
> by reading them as character and examining all the values. If the
> file is very consistently structured (e.g. the same number of
> characters in every value in every row) you might be able to write a C
> function to read it faster, but I'd guess the time spent writing that
> would be a lot more than the time saved.
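A minimal sketch of the colClasses advice above; the tiny tab-separated demo file stands in for the real multi-gigabyte input, and the column names and types are hypothetical:

```r
## Write a small demo file in place of the real 6GB input.
tf <- tempfile(fileext = ".txt")
writeLines(c("id\tvalue\tlabel",
             "1\t2.5\ta",
             "2\t3.5\tb"), tf)

## Explicit colClasses skip R's type-guessing pass over every value.
dat <- read.table(tf, header = TRUE, sep = "\t",
                  colClasses = c(id    = "integer",
                                 value = "numeric",
                                 label = "character"),
                  comment.char = "",  # disable comment scanning for speed
                  nrows = 2)          # a (generous) row-count hint also helps
str(dat)
```

Supplying `comment.char = ""` and a `nrows` estimate are standard additional speedups documented for `read.table`.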
And try the utils::read.fwf() function before you roll your own C code
for this use case.
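For a fixed-width file like the one described (same number of characters in every value in every row), `utils::read.fwf()` might look like this; the field widths and column names here are made up for illustration:

```r
## Two fixed-width demo records: 3-char id, 5-char name, 4-char score.
tf <- tempfile()
writeLines(c("001ALPHA12.5",
             "002BETA 99.0"), tf)

## widths gives the character count of each field;
## colClasses is passed through to read.table as before.
dat <- read.fwf(tf, widths = c(3, 5, 4),
                col.names  = c("id", "name", "score"),
                colClasses = c("integer", "character", "numeric"),
                strip.white = TRUE)  # trim padding inside fields
dat
```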
If you do write C code, consider writing a converter to the .RData format,
which R seems to read quite efficiently.
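The R side of that one-time conversion is straightforward; this sketch writes a small hypothetical data frame to an .RData file with `save()` and reloads it with `load()` (writing the binary format directly from C is a separate, harder job):

```r
## One-time conversion: serialize the parsed data to R's binary format.
tf  <- tempfile(fileext = ".RData")
dat <- data.frame(id = 1:3, value = c(2.5, 3.5, 4.5))
save(dat, file = tf)   # pay the parsing cost once

## Later sessions skip parsing entirely.
rm(dat)
load(tf)               # restores 'dat' into the workspace
head(dat)
```

For a single object, `saveRDS()`/`readRDS()` do the same job without binding a fixed variable name on reload.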
Hope this helps.
Allan