[Rd] cannot allocate vector of size in merge (PR#765)

Thomas Lumley thomas@biostat.washington.edu
Thu, 14 Dec 2000 16:05:17 -0800 (PST)


On Thu, 14 Dec 2000, Viktor Moravetski wrote:

> Hi Saikat,
> Yes, I don't need to specify nsize and vsize for version 1.2.0.
> But the error is the same even if I start R with default parameters.
> Memory use grows and then R gives an error.
> See the output below:
> > gc()
>           used (Mb) gc trigger (Mb)
> Ncells  358534  9.6     597831 16.0
> Vcells 3330352 25.5    3736477 28.6
> > df <- data.frame(x=1:30000,y=2,z=3)
> > merge(df,df)
> Error: cannot allocate vector of size 3515625 Kb
					^^^^^^^^^^!!

This is the problem. A merge of n rows takes n^2 space, because each row
of the first data frame is compared to each row of the second. You are
trying to allocate roughly 3.5Gb, which is almost certainly more memory
than you have: (30000^2)*4 bytes = 3515625 Kb.
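As a back-of-the-envelope sketch of where that figure comes from (assuming the pairwise comparison structure is stored as 4-byte integers, which is what the error message suggests):

```r
## merge() compares every row of the first data frame with every row
## of the second, so the intermediate structure holds n^2 entries;
## at 4 bytes each, that is exactly the size in the error message.
n <- 30000
bytes <- n^2 * 4      # 3,600,000,000 bytes for the n x n comparison
kb <- bytes / 1024    # 3515625 Kb -- the size in the error message
```

If the rows are keyed by a unique column such as x, a keyed lookup with match() needs only n index values rather than n^2, e.g. `cbind(df, df[match(df$x, df$x), c("y", "z")])` -- a workaround sketch for this particular data, not what merge() itself does internally.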

	-thomas

	

> Saikat DebRoy wrote:
> > 
> > >>>>> "viktorm" == viktorm  <viktorm@pdf.com> writes:
> > 
> >   viktorm> Full_Name: Viktor Moravetski
> >   viktorm> Version: Version 1.2.0 (2000-12-13)
> >   viktorm> OS: Win-NT 4.0 SP5
> >   viktorm> Submission from: (NULL) (209.128.81.199)
> > 
> >   viktorm> I've started R (v.1.20) with command:
> >   viktorm> rgui --vsize 450M --nsize 40M
> > 
> >   viktorm> Then at the command prompt:
> >   >> gc()
> >   viktorm>           used (Mb) gc trigger (Mb)
> >   viktorm> Ncells  358534  9.6   41943040 1120
> >   viktorm> Vcells 3469306 26.5   58982400  450
> > 
> >   >> df <- data.frame(x=1:30000,y=2,z=3)
> >   >> merge(df,df)
> >   viktorm> Error: vector memory exhausted (limit reached?)
> > 
> > Do you really need such a large number of Ncells? I think not. Try
> > starting R without specifying --nsize (and maybe even --vsize). In
> > 1.2.0, R would automatically allocate more memory if the initial value
> > is not enough. But (as far as I know) it would not decrease the amount
> > of memory below the initial amount.
> > 
> >   viktorm> In S-Plus it worked fine, no problems.
> >   viktorm> It looks like R cannot merge data frames with
> >   viktorm> more than 30K rows. It has enough memory, so what limit
> >   viktorm> was reached, and what should I do?
> 
> --
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> 

Thomas Lumley
Assistant Professor, Biostatistics
University of Washington, Seattle
