[R] Memory issues in R
David Winsemius
dwinsemius at comcast.net
Sun Apr 26 17:58:21 CEST 2009
On Apr 26, 2009, at 11:20 AM, Neotropical bat risk assessments wrote:
>
> How do people deal with R and memory issues?
They should read the R-FAQ and the Windows FAQ as you say you have.
http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021
>
> I have tried using gc() to see how much memory is used at each step.
> Scanned Crawley R-Book and all other R books I have available and the
> FAQ on-line but no help really found.
> Running WinXP Pro (32 bit) with 4 GB RAM.
> One SATA drive pair is in RAID 0 configuration with 10000 MB allocated
> as virtual memory.
On the basis of my Windows experience this may not be enough
information. (The drive information is fairly irrelevant.)
The R-Win-FAQ suggests:
?Memory
?memory.size  # "for information about memory usage. The limit can be
              #  raised by calling memory.limit"
Although you read the FAQs, have you zeroed in on the relevant
sections? What does memory.size report? And what happens when you run
R "alone" in WinXP and alter the default settings with memory.limit?
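For concreteness, here is a minimal sketch of the checks I am asking about. It assumes a Windows build of R of the vintage discussed here, where memory.size() and memory.limit() are available; the 2047 Mb target is only an illustrative value of mine, and what a 32-bit system will actually grant depends on how the OS is configured.

```r
# Sketch only: memory.size() and memory.limit() exist on Windows builds of R.
target_mb <- 2047  # assumed target in Mb, purely illustrative

if (.Platform$OS.type == "windows") {
  print(memory.size())            # Mb currently in use by this R session
  print(memory.size(max = TRUE))  # most Mb obtained from the OS so far
  print(memory.limit())           # current limit in Mb
  # Ask for a higher limit; a request below the current limit is ignored.
  print(memory.limit(size = target_mb))
} else {
  message("memory.size() and memory.limit() apply to Windows builds of R only")
}
```

Running this in a fresh session, before loading any data, shows how much headroom there is to begin with.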
>
> I do have another machine set up with Ubuntu but it only has 2 GB RAM
> and have not been able to get R installed on that system.
> I can run smaller sample data sets w/o problems and everything plots
> as needed.
> However I need to review large data sets.
> Using latest R version 2.9.0 (2009-04-17)
> My data is in CSV format with a header row and is a big data set with
> 1,200,240 rows!
It's long, but not particularly wide. Last year I was getting
satisfactory work done on a 990K-row by 50-60 column dataset under a
memory constraint of 4 GB on a different OS. Your constraint is in the
2.5-3.0 GB range, but your data frame is only a third of the size.
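As a rough check on how small the data itself is, the arithmetic can be sketched as follows. It assumes the eight columns in your sample header are all read as 8-byte doubles, which is my assumption rather than something your listing confirms.

```r
# Back-of-the-envelope size of the data frame, under the stated assumption.
n_rows <- 1200240        # row count given in the original post
n_cols <- 8              # Dur, TBC, Fmax, Fmin, Fmean, Fc, S1, Sc
bytes_per_double <- 8    # assumes every column is stored as double

data_mb <- n_rows * n_cols * bytes_per_double / 2^20
data_mb                  # roughly 73 Mb
```

That is in line with the 74.3 Mb of Vcells that gc() reports after read.csv() in the console listing quoted further down.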
>
> E.g. below:
> Dur,TBC,Fmax,Fmin,Fmean,Fc,S1,Sc,
> 9.81,0,28.78,24.54,26.49,25.81,48.84,14.78,
> 4.79,1838.47,37.21,29.41,31.76,29.52,241.77,62.83,
> 4.21,5.42,28.99,26.23,27.53,27.4,76.03,11.44,
> 10.69,193.48,30.53,25.4,27.69,25.4,-208.19,26.05,
> 15.5,248.18,30.77,24.32,26.57,24.92,-202.76,18.64,
> 14.85,217.47,31.25,24.62,26.93,25.56,-88.4,10.32,
> 11.86,158.01,33.61,25.24,27.66,25.32,83.32,17.62,
> 14.05,229.74,30.65,24.24,26.76,25.24,61.87,14.06,
> 8.71,264.02,31.01,25.72,27.56,25.72,253.18,19.2,
> 3.91,10.3,25.32,24.02,24.55,24.02,-71.67,16.83,
> 16.11,242.21,29.85,24.02,26.07,24.62,79.45,19.11,
> 16.81,246.48,28.57,23.05,25.46,23.81,-179.82,15.95,
> 16.93,255.09,28.78,23.19,25.75,24.1,-112.21,16.38,
> 5.12,107.16,32,29.41,30.46,29.41,134.45,20.88,
> 16.7,150.49,27.97,22.92,24.91,23.95,42.96,16.81
> .... etc
> I am getting the following warning/error message:
> Error: cannot allocate vector of size 228.9 Mb
So you got the data into memory. That does not appear to exceed the
capacity of your hardware setup, if you address the options offered
above.
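The 228.9 Mb figure itself can be accounted for, at least plausibly. The sketch below assumes kde2d() was called with its default grid of n = 25 points per axis and that it builds an n-by-N matrix of doubles internally (via outer()) for each coordinate; that is my reading of the MASS code, not something the error message states.

```r
# Size of one 25-by-N matrix of doubles, N being the number of rows read in.
n_rows <- 1200240        # row count given in the original post
n_grid <- 25             # kde2d() default grid size (assumed unchanged)
bytes_per_double <- 8

request_mb <- n_grid * n_rows * bytes_per_double / 2^20
request_mb               # about 228.9 Mb, matching the failed allocation
```

So the failure is a single contiguous block of that size, and several such blocks are needed at once, which is why the limit matters more than the size of the data frame.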
>
> Complete listing from R console below:
>> library(batcalls)
> Loading required package: ggplot2
> Loading required package: proto
> Loading required package: grid
> Loading required package: reshape
> Loading required package: plyr
> Attaching package: 'ggplot2'
> The following object(s) are masked from package:grid :
> nullGrob
>> gc()
>          used (Mb) gc trigger (Mb) max used (Mb)
> Ncells 186251  5.0     407500 10.9   350000  9.4
> Vcells  98245  0.8     786432  6.0   358194  2.8
>> BR <- read.csv ("C:/R-Stats/Bat calls/Reduced bats.csv")
>> gc()
>           used (Mb) gc trigger  (Mb) max used  (Mb)
> Ncells  188034  5.1     667722  17.9   378266  10.2
> Vcells 9733249 74.3   20547202 156.8 20535538 156.7
Looks like you need to use memory.limit(<some bigger number>)
>
>> attach(BR)
>> library(ggplot2)
>> library(MASS)
>> library(batcalls)
>> BRC<-kde2d(Sc,Fc)
> Error: cannot allocate vector of size 228.9 Mb
>> gc()
>            used  (Mb) gc trigger  (Mb)  max used  (Mb)
> Ncells   198547   5.4     667722  17.9    378266  10.2
> Vcells 19339695 147.6  106768803 814.6 124960863 953.4
>>
> Tnx for any insight,
> Bruce
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT