[R] Windows Memory Issues
Jason Turner
jasont at indigoindustrial.co.nz
Sat Dec 6 20:23:14 CET 2003
Richard Pugh wrote:
...
> I have run into some issues regarding the way R handles its memory,
> especially on NT.
...
Actually, you've run into NT's nasty memory management. Welcome! :)
R-core have worked very hard to work around Windows memory issues, so
they've probably got a better answer than I can give. I'll give you a
few quick answers, and then wait for correction when one of them replies.
> A typical call
> may look like this .
>
>
>>myInputData <- matrix(sample(1:100, 7500000, T), nrow=5000)
>>myPortfolio <- createPortfolio(myInputData)
>
>
> It seems I can only repeat this code process 2-3 times before I have to
> restart R (to get the memory back). I use the same object names
> (myInputData and myPortfolio) each time, so I am not creating more large
> objects ..
Actually, you do. Re-using a name does not re-use the same blocks of
memory. The size of the object may change, for example.
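A minimal illustration (the numbers will vary on your machine): the old
block is only freed when the garbage collector runs, not at the moment you
re-use the name.

  x <- rnorm(10000000)   # roughly 80 Mb of doubles
  gc()                   # note the Vcells in use
  x <- rnorm(10000000)   # a brand-new allocation; the old vector is now garbage
  gc()                   # only here is the old block actually reclaimed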
>
> I think the problems I have are illustrated with the following example
> from a small R session .
>
>
>># Memory usage for Rgui process = 19,800k
>>testData <- matrix(rnorm(10000000), 1000) # Create big matrix
>># Memory usage for Rgui process = 254,550k
>>rm(testData)
>># Memory usage for Rgui process = 254,550k
>>gc()
>
> used (Mb) gc trigger (Mb)
> Ncells 369277 9.9 667722 17.9
> Vcells 87650 0.7 24286664 185.3
>
>># Memory usage for Rgui process = 20,200k
>
>
> In the above code, R cannot recollect all memory used, so the memory
> usage increases from 19,800k to 20,200k. However, the following example is
> more typical of the environments I use .
>
>
>># Memory 128,100k
>>myTestData <- matrix(rnorm(10000000), 1000)
>># Memory 357,272k
>>rm(myTestData)
>># Memory 357,272k
>>gc()
>
> used (Mb) gc trigger (Mb)
> Ncells 478197 12.8 818163 21.9
> Vcells 9309525 71.1 31670210 241.7
>
>># Memory 279,152k
R can return memory to Windows, but it cannot *make* Windows take it
back. Exiting the app is the only guaranteed way to do this, for any
application.
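If you'd rather watch this from inside R than from the Task Manager, the
Windows builds have memory.size() and friends (see ?memory.size; a rough
sketch):

  memory.size()            # Mb currently allocated to the R process
  memory.size(max = TRUE)  # the most the process has obtained from Windows so far
  memory.limit()           # the ceiling R will ask for (set with --max-mem-size)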
The fact that you get this with matrices makes me suspect
fragmentation issues with memory, rather than pure lack of memory.
Here, the memory is disorganised, thanks to some programmers in Redmond.
When a matrix gets assigned, it needs all its memory to be contiguous.
If the memory on your machine has, say, 250 MB free, but only in 1 MB
chunks, and you need to build a 2 MB matrix, you're out of luck.
From the sounds of your calculations, they *must* be done as big
matrices (true?). If not, try a data structure that isn't a matrix or
array; these require *contiguous* blocks of memory. Lists, by
comparison, can store their components in separate blocks. Would a list
of smaller matrices work?
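For instance, something along these lines (an untested sketch; the 500-row
block size is arbitrary, and it assumes createPortfolio() can be run on
row-blocks independently):

  ## ten blocks of 500 rows instead of one 5000 x 1500 matrix --
  ## each block only needs its own, much smaller, contiguous chunk
  myInputList <- lapply(1:10, function(i)
      matrix(sample(1:100, 750000, TRUE), nrow = 500))
  myPortfolios <- lapply(myInputList, createPortfolio)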
> Could anyone point out what I could do to rectify this (if anything), or
> generally what strategy I could take to improve this?
Some suggestions:
1) call gc() somewhere inside your routines regularly. Not guaranteed
to help, but worth a try; there's a sketch of what I mean after this list.
2) Get even more RAM, and hope it stabilises.
3) Change data structures to something other than one huge matrix.
Matrices have huge computational advantages, but are pigs for memory.
4) Export the data crunching part of the application to an operating
system that isn't notorious for bad memory management. <opinion,
subjective=yes> I've almost never had anguish from Solaris. Linux and
FreeBSD are not bad. </opinion> Have you considered running the
calculations on a different machine, and storing the results in a fresh
table in the same database you pull the raw data from?
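On point 1, a trivial sketch of what I mean (nRuns here is just a
stand-in for however many times you repeat the process):

  for(i in 1:nRuns) {
      myInputData <- matrix(sample(1:100, 7500000, TRUE), nrow = 5000)
      myPortfolio <- createPortfolio(myInputData)
      ## ... store or summarise myPortfolio here ...
      rm(myInputData, myPortfolio)
      gc()   # give R a chance to hand the blocks back before the next big allocation
  }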
Hope that helps.
Jason
--
Indigo Industrial Controls Ltd.
http://www.indigoindustrial.co.nz
64-21-343-545
jasont at indigoindustrial.co.nz