[R] Importing big plain files from ERP-System/Data Mining with R

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Oct 26 13:02:37 CEST 2004

On Tue, 26 Oct 2004 r-help.20.stefan817 at spamgourmet.com wrote:

> >> how can I import really big plain text data files (several GB) from an
> >Unlikely unless you have a 64-bit platform.
> Why? I have a 32-bit Win XP Platform running R 2.0.0. With ACL 8.21 e.g.
> 10 GB were no problem.

For the two reasons stated in the next paragraph!

> >Only starting with R 2.0.0 can some 32-bit versions of R access files
> >larger than 2 GB, and to import the file into R you need enough address
> >space in R for the object, which is normally more than the file size.
> Is this really so? I want to summarize the data or calculate clusters,
> so only the aggregated information should be in memory. Does R first
> import the whole file and then calculate with it? In ACL the concept is
> to leave the file itself on the hard disk, scanning it for each
> calculation and doing only the calculation in memory. (Surely not very
> fast, but probably the only method for big files)

That's the definition of `import', which is what you actually asked.
You didn't ask if R can do what ACL does, which of course it can.
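
A minimal sketch of that scan-the-file approach, assuming a tab-delimited
export with a header line. The function name chunked_mean, its arguments
and the chunk size are illustrative only, not part of R. It keeps one chunk
plus two running totals in memory, so it suits summaries such as sums and
means; clustering would need a different strategy (for example, working on
a sample or on pre-aggregated data).

```r
# Hypothetical helper: mean of one numeric column of a large delimited
# file, read through a connection in fixed-size chunks.
chunked_mean <- function(path, column, chunk_rows = 10000L, sep = "\t") {
  con <- file(path, open = "r")
  on.exit(close(con))

  # Read the header line once to obtain the column names.
  header <- strsplit(readLines(con, n = 1L), sep, fixed = TRUE)[[1]]

  total <- 0
  n <- 0
  repeat {
    # An open connection remembers its position, so each call
    # returns the next block of lines.
    lines <- readLines(con, n = chunk_rows)
    if (length(lines) == 0L) break

    chunk <- read.table(text = lines, sep = sep, header = FALSE,
                        col.names = header, check.names = FALSE,
                        quote = "", comment.char = "",
                        stringsAsFactors = FALSE)

    # Only the running aggregates survive past this iteration.
    total <- total + sum(chunk[[column]])
    n <- n + nrow(chunk)
  }
  total / n
}

# Usage (file and column names are placeholders):
# chunked_mean("erp_export.txt", "amount", chunk_rows = 100000L)
```

Larger chunks mean fewer calls to read.table but more memory per pass, so
the chunk size is a trade-off to tune for the machine at hand.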

> >Almost certainly not if the unmentioned platform is Windows, but you could 
> >access the data from a DBMS.
> I can do this also, but with several limitations.

What limitations?

PLEASE do read carefully the posting guide, as well as the replies.

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
