[R] Importing big plain files from ERP-System/Data Mining with R

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Oct 26 13:02:37 CEST 2004


On Tue, 26 Oct 2004 r-help.20.stefan817 at spamgourmet.com wrote:

> >> how can I import really big plain text data files (several GB) from an
> 
> >Unlikely unless you have a 64-bit platform.
> 
> Why? I have a 32-bit Win XP Platform running R 2.0.0. With ACL 8.21 e.g.
> 10 GB were no problem.

For the two reasons stated in the next para!

> >Only starting with R 2.0.0 can some 32-bit versions of R access files
> >larger than 2 GB, and to import the file into R you need enough address
> >space in R for the object, which is normally more than the file size.
> 
> Is this really so? I want to summarize the data or calculate clusters,
> so only the aggregated information should be in memory. Does R first
> import the whole file and then calculate with it? In ACL the concept is
> to leave the file itself on the hard disk, scanning it for each
> calculation and doing only the calculation in memory. (Surely not very
> fast, but probably the only method for big files)

That's the definition of `import', which is what you actually asked.
You didn't ask if R can do what ACL does, which of course it can.
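
As a rough illustration of that scan-and-aggregate approach (a minimal sketch, not anything given in this thread): keep the file on disk, read it through a connection a block of lines at a time, and hold only running totals in memory. The function name summarise_in_chunks, the tab separator, the all-numeric columns with a header line, and the chunk size are all assumptions made for the example.

```r
# Sketch: column means of a large delimited text file, read in chunks.
# Assumes a header line and purely numeric, tab-separated columns.
summarise_in_chunks <- function(path, chunk_lines = 10000L, sep = "\t") {
  con <- file(path, open = "r")
  on.exit(close(con))                      # always release the connection
  header <- strsplit(readLines(con, n = 1L), sep, fixed = TRUE)[[1]]
  n <- 0                                   # rows seen so far
  total <- setNames(numeric(length(header)), header)  # running column sums
  repeat {
    lines <- readLines(con, n = chunk_lines)
    if (length(lines) == 0L) break         # end of file reached
    chunk <- read.table(text = lines, sep = sep, col.names = header,
                        colClasses = "numeric")
    n <- n + nrow(chunk)
    total <- total + colSums(chunk)        # only the aggregate is kept
  }
  list(n = n, mean = total / n)
}

# Small self-check on a temporary file (stands in for the multi-GB export).
tmp <- tempfile()
write.table(data.frame(x = 1:10, y = (1:10)^2), tmp, sep = "\t",
            row.names = FALSE, quote = FALSE)
res <- summarise_in_chunks(tmp, chunk_lines = 3L)
print(res)
```

Memory use is bounded by the chunk size rather than the file size, at the cost of one full pass over the file per calculation. Anything that needs all rows at once would still need a sample, a DBMS, or more address space.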

> >Almost certainly not if the unmentioned platform is Windows, but you could 
> >access the data from a DBMS.
> 
> I can do this also, but with several limitations.

What limitations?

PLEASE do read carefully the posting guide, as well as the replies.
       ^^^^^^^

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



