[R] Reading large files in R
Berton Gunter
gunter.berton at gene.com
Mon Aug 8 21:35:52 CEST 2005
... and it is likely that even if you did have enough memory (several times
the size of the data is generally needed) it would take a very long time.
If you do have enough memory and the data are all of one type -- numeric
here -- you're better off treating them as a matrix rather than converting
them to a data frame.
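
A minimal sketch of the matrix approach, assuming a tab-separated file with
one header line, three numeric columns and "-99" as the missing-value code,
as in the script quoted below. The tiny temporary file and its values are
invented for illustration; they are not the poster's data. The last line is
a rough size estimate: one column of 13669627 doubles is about 106793 Kb,
which appears to match the failed allocation in the quoted error, and all
three columns together come to roughly 313 Mb for a single copy.

```r
# Illustrative stand-in for the real file: header line plus two data rows.
path <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc",
             "1.5\t2.5\t-99",
             "4\t5\t6"), path)

n_col <- 3  # assumed number of columns

# With a numeric 'what', scan() returns one flat double vector,
# filled in file order (row by row); "-99" is read as NA.
vals <- scan(path, what = numeric(), sep = "\t", skip = 1,
             na.strings = "-99", quiet = TRUE)

# byrow = TRUE restores the file layout: one file row per matrix row.
m <- matrix(vals, ncol = n_col, byrow = TRUE)
colnames(m) <- c("a", "b", "c")

# Rough size of one copy of the full data set:
# 13669627 records x 3 columns x 8 bytes per double, in Mb.
est_mb <- 13669627 * 3 * 8 / 1024^2
```

A matrix keeps all values in a single numeric vector, so it avoids the
extra copies that building a data frame from a list can make.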
-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
"The business of the statistician is to catalyze the scientific learning
process." - George E. P. Box
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of
> Adaikalavan Ramasamy
> Sent: Monday, August 08, 2005 12:02 PM
> To: Jean-Pierre Gattuso
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] Reading large files in R
>
> From the Note section of help("read.delim"):
>
> 'read.table' is not the right tool for reading large matrices,
> especially those with many columns: it is designed to read _data
> frames_ which may have columns of very different classes. Use
> 'scan' instead.
>
> So I am not sure why you used 'scan', then converted it to a
> data frame.
>
> 1) Can you provide a sample of the data that you are trying to read in?
> 2) How much memory does your machine have?
> 3) Try reading in the first few lines using the nmax argument in scan.
>
> Regards, Adai
>
>
>
> On Mon, 2005-08-08 at 12:50 -0600, Jean-Pierre Gattuso wrote:
> > Dear R-listers:
> >
> > I am trying to work with a big (262 Mb) file but apparently reach a
> > memory limit using R on Mac OS X as well as on a Unix machine.
> >
> > This is the script:
> >
> > > type=list(a=0,b=0,c=0)
> > > tmp <- scan(file="coastal_gebco_sandS_blend.txt", what=type,
> >       sep="\t", quote="\"", dec=".", skip=1, na.strings="-99",
> >       nmax=13669628)
> > Read 13669627 records
> > > gebco <- data.frame(tmp)
> > Error: cannot allocate vector of size 106793 Kb
> >
> >
> > Even tmp does not seem right:
> >
> > > summary(tmp)
> > Error: recursive default argument reference
> >
> >
> > Do you have any suggestion?
> >
> > Thanks,
> > Jean-Pierre Gattuso
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> >
>