[Rd] cannot allocate vector of size in merge (PR#765)

Prof Brian D Ripley ripley@stats.ox.ac.uk
Fri, 15 Dec 2000 07:41:05 +0000 (GMT)


On Fri, 15 Dec 2000 viktorm@pdf.com wrote:

> Full_Name: Viktor Moravetski
> Version: Version 1.2.0 (2000-12-13)
> OS: Win-NT 4.0 SP5
> Submission from: (NULL) (209.128.81.199)
> 
> 
> I've started R (v.1.20) with command:
> rgui --vsize 450M --nsize 40M

Um.  What's `v.1.20', and where did you get it from?  In particular, how did
you compile it and which run-time are you using?  You clearly have not read
the documentation on the command-line flags for version 1.2.0, or even the
top item in NEWS.

> Then at the command prompt:
> > gc()
>           used (Mb) gc trigger (Mb)
> Ncells  358534  9.6   41943040 1120
> Vcells 3469306 26.5   58982400  450
> 
> >df <- data.frame(x=1:30000,y=2,z=3)
> >merge(df,df)
> Error: vector memory exhausted (limit reached?)
> 
> 
> In S-Plus it worked fine, no problems. 
> It looks like that "R" cannot merge dataframes with 
> more than 30K rows. It has enough memory, so what limit 
> was reached and what should I do?

How do you know it has enough memory?  It has just told you it has not.
I think you are using an unreleased version and not reading the
documentation on the changes.  The CHANGES file says

  New command-line option --max-mem-size to set the maximum memory
  allocation: it has a minimum allowed value of 10M.  This is intended
  to catch attempts to allocate excessive amounts of memory which may
  cause other processes to run out of resources.  The default is the
  smaller of the amount of physical RAM in the machine and 256Mb.

and NEWS says (first item)

    o   There is a new memory management system using a generational
        garbage collector.  This improves performance, sometimes
        marginally but sometimes by double or more.  The workspace is
        no longer statically sized and both the vector heap and the
        number of nodes can grow as needed.  (They can shrink again,
        but never below the initially allocated sizes.)  See ?Memory
        for a longer description, including the new command-line
        options to manage the settings.
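
As a rough check on the figures in the gc() output quoted above, the
"gc trigger" columns can be reproduced by simple arithmetic.  This is only a
sketch: the cell sizes (28 bytes per Ncell, 8 bytes per Vcell) are
assumptions inferred from that output for a 32-bit build, and whether the
requested sizes count against the --max-mem-size cap in exactly this way is
also an assumption.

```r
# Assumed cell sizes (inferred from the quoted gc() output, 32-bit build).
ncell_bytes <- 28
vcell_bytes <- 8
mb <- 1024^2

# "gc trigger" counts copied from the quoted gc() output.
ncells_trigger <- 41943040   # requested with --nsize 40M
vcells_trigger <- 58982400   # requested with --vsize 450M

ncells_mb <- ncells_trigger * ncell_bytes / mb
vcells_mb <- vcells_trigger * vcell_bytes / mb
total_mb  <- ncells_mb + vcells_mb

# Upper bound on the default cap, as stated in the CHANGES excerpt.
default_cap_mb <- 256

cat("Ncells:", ncells_mb, "Mb; Vcells:", vcells_mb, "Mb; total:",
    total_mb, "Mb; default cap at most", default_cap_mb, "Mb\n")
```

On these assumptions the two requested sizes together come to far more than
the default cap described in CHANGES.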


Beyond that, R's merge uses a flexible but memory-intensive algorithm. If
you want to do merges on this scale we recommend that you use one of the
RDBMS interfaces to a tool optimized for the job.
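
For the special case where the key is unique in the second data frame, a
lookup with match() is one lower-memory alternative worth trying before
reaching for a database.  The following is a minimal sketch only: it is not
the algorithm merge() uses, and the data frames `left' and `right' and their
columns are hypothetical, chosen to resemble the example in the report.

```r
# Hypothetical data: one key column x, unique in 'right'.
left  <- data.frame(x = 1:30000, y = 2)
right <- data.frame(x = 1:30000, z = 3)

# For each row of 'left', find the position of its key in 'right'
# (NA where there is no match).
idx  <- match(left$x, right$x)
keep <- !is.na(idx)

# Inner join: keep the matched rows of 'left' and attach the
# corresponding values of z from 'right'.
joined <- cbind(left[keep, , drop = FALSE], z = right$z[idx[keep]])
```

This handles only one-to-one or many-to-one matches on a single key; anything
more general needs merge() or a database.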


I really don't understand why you filed a bug report on your lack of
reading of documentation: please see the section on reporting bugs in the
FAQ.

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
