[R] Error: cannot allocate vector of size xxx Mb
Petr PIKAL
petr.pikal at precheza.cz
Thu Aug 5 12:17:16 CEST 2010
Hi
I am not an expert in such issues (I have never really run into problems
with memory size).
From what I have read in previous posts on this topic (and there are
numerous), the simplest way would be to move to a 64-bit system (Linux,
Windows Vista, 7), where the size of objects is limited only by the
amount of available memory.
There are also packages for dealing with big data (biglm, ...) and a
database approach (sqldf); a rough sketch of the latter is below.
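An untested sketch of the sqldf route (assuming the sqldf package is
installed; the on-disk dbname keeps the temporary SQLite table out of R's
address space, though the combined result still has to fit in memory when
it comes back):

library(sqldf)
## stack the two data frames in SQLite instead of calling rbind()
## "first" and "second" are the data frames from your example
both <- sqldf("select * from first union all select * from second",
              dbname = tempfile())   # file-backed database, not in-memory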
Your R version is a bit outdated, so upgrading could help, although
probably not with your final operation.
Sometimes it also helps to rethink why you need such a huge amount of
data in memory at once, and whether sampled data would be enough for
further study, as in the sketch below.
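A minimal base-R sketch of the sampling idea, reusing "first" from your
example (the 100000 row count is only an illustration):

set.seed(1)                           # make the sample reproducible
idx <- sample(nrow(first), 100000)    # 100,000 of the 5,000,000 rows
first_small <- first[idx, ]           # much smaller object to work with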
Regards
Petr
Ralf B <ralf.bierig at gmail.com> wrote on 05.08.2010 11:13:40:
> Thank you for such a careful and thorough analysis of the problem and
> the comparison with your configuration. I very much appreciate it.
> For completeness and (perhaps) further comparison, I have executed
> 'version' and sessionInfo() as well:
>
>
> > version
> _
> platform i386-pc-mingw32
> arch i386
> os mingw32
> system i386, mingw32
> status RC
> major 2
> minor 10.0
> year 2009
> month 10
> day 25
> svn rev 50206
> language R
> version.string R version 2.10.0 RC (2009-10-25 r50206)
> > sessionInfo()
> R version 2.10.0 RC (2009-10-25 r50206)
> i386-pc-mingw32
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] splines stats4 grid stats graphics grDevices utils
> [8] datasets methods base
>
> other attached packages:
> [1] flexmix_2.2-7 multcomp_1.1-7 survival_2.35-8 mvtnorm_0.9-9
> [5] modeltools_0.2-16 lattice_0.18-3 car_1.2-16 psych_1.0-88
> [9] nortest_1.0 gplots_2.8.0 caTools_1.10 bitops_1.0-4.1
> [13] gdata_2.8.0 gtools_2.6.2 ggplot2_0.8.7 digest_0.4.2
> [17] reshape_0.8.3 plyr_0.1.9 proto_0.3-8 RJDBC_0.1-5
> [21] rJava_0.8-2 DBI_0.2-5
>
> loaded via a namespace (and not attached):
> [1] tools_2.10.0
>
> > memory.limit()
> [1] 2047
>
>
>
> Also, the example I presented was a simplified reproduction of the
> real data structure. My real data structure does not have reused
> vectors. I merely wanted to show the error occurring when processing
> large vectors into data frames and then binding these data frames
> together. I hope this additional information helps. I might add that I
> am running this in StatET under Eclipse with 512 MB of allocated RAM
> in the environment.
>
> Besides adding more memory, can you spot simple ways to improve memory
> use? I know that I am carrying quite a bit of baggage. Unfortunately my
> script is rather comprehensive, and my example is really just a
> simplified part that I created to reproduce the problem.
>
> Thanks,
> Ralf
>
>
>
>
>
>
> On Thu, Aug 5, 2010 at 4:44 AM, Petr PIKAL <petr.pikal at precheza.cz> wrote:
> > Hi
> >
> > r-help-bounces at r-project.org wrote on 05.08.2010 09:53:21:
> >
> >> I am dealing with very large data frames, artificially created with
> >> the following code, which are then combined using rbind().
> >>
> >>
> >> a <- rnorm(5000000)
> >> b <- rnorm(5000000)
> >> c <- rnorm(5000000)
> >> d <- rnorm(5000000)
> >> first <- data.frame(one=a, two=b, three=c, four=d)
> >> second <- data.frame(one=d, two=c, three=b, four=a)
> >
> > Up to this point there is no error on my system
> >
> >> version
> > _
> > platform i386-pc-mingw32
> > arch i386
> > os mingw32
> > system i386, mingw32
> > status Under development (unstable)
> > major 2
> > minor 12.0
> > year 2010
> > month 05
> > day 31
> > svn rev 52164
> > language R
> > version.string R version 2.12.0 Under development (unstable) (2010-05-31 r52164)
> >
> >> sessionInfo()
> > R version 2.12.0 Under development (unstable) (2010-05-31 r52164)
> > Platform: i386-pc-mingw32/i386 (32-bit)
> >
> > attached base packages:
> > [1] stats grDevices datasets utils graphics methods base
> >
> > other attached packages:
> > [1] lattice_0.18-8 fun_1.0
> >
> > loaded via a namespace (and not attached):
> > [1] grid_2.12.0 tools_2.12.0
> >
> >> rbind(first, second)
> >
> > Although the size of first and second is only roughly 160 MB each,
> > their concatenation probably consumes all remaining memory, since you
> > already have a-d, first and second in memory; a sketch of freeing them
> > is below.
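> > A minimal sketch of that, using the object names from your example
> > (assuming the combined result itself still fits into the 2 GB limit):
> >
> > ## 5e6 doubles * 8 bytes = ~38 MB per column, 4 columns ~ 160 MB per frame
> > rm(a, b, c, d)                     # drop the intermediate vectors
> > gc()                               # hand the freed memory back
> > combined <- rbind(first, second)   # only first, second and the result coexist
> > rm(first, second); gc()            # keep just the combined data frame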
> >
> > Regards
> > Petr
> >
> >>
> >> which results in the following error for each of the statements:
> >>
> >> > a <- rnorm(5000000)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > b <- rnorm(5000000)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > c <- rnorm(5000000)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > d <- rnorm(5000000)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > first <- data.frame(one=a, two=b, three=c, four=d)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > second <- data.frame(one=d, two=c, three=b, four=a)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > rbind(first, second)
> >>
> >> When running memory.limit() I am getting this:
> >>
> >> memory.limit()
> >> [1] 2047
> >>
> >> This shows me that I have 2 GB of memory available. What is wrong?
> >> Shouldn't 38 MB be very feasible?
> >>
> >> Best,
> >> Ralf
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >