[R] some problems with ram usage and warnings
Tom Knockinger
tomkn at gmx.at
Sat Dec 12 12:35:20 CET 2009
> Date: Fri, 11 Dec 2009 21:52:30 -0500
>
> On Dec 11, 2009, at 11:08 AM, Tom Knockinger wrote:
>
> > Hi,
> > I am new to the R project, but until now I have found solutions for
> > every problem in tutorials, R wikis and this mailing list. Now I
> > have some problems which I can't solve with that knowledge.
> >
> > [snip]
> > 2) ram usage and program shutdowns
> > length(data) is usually between 50 and 1000, so it takes some space
> > in RAM (approx. 100-200 MB), which is no problem. I use some
> > analysis code which results in about 500-700 MB of RAM usage, also
> > not a real problem.
> > The results are matrices (50x14 to 1000x14), so they are small
> > enough to work with afterwards: create plots or do some further
> > analysis.
> > So I wrote a function which does the analysis one file after another
> > and keeps only the results in a list. But after about 2-4 files
> > my R process uses about 1500 MB, and then the trouble begins.
>
> Windows?
Yes, I use R 2.9.1 under Windows.
> > [snip]
>
> It is possible to call the garbage collector with gc(). Supposedly
> that should not be necessary, since garbage collection is automatic,
> but I have the impression that it helps prevent situations that
> otherwise lead to virtual memory getting invoked on the Mac (which I
> also thought should not be happening, but I will swear that it does.)
>
> --
> David
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
Thanks for this advice. I tried gc(), but it seems that it doesn't do anything, or at least not enough, to reduce the process memory in my case.
I called the function as "tmp <- load.report(file)" several times, loading the data from the same file each time, and ran memory.size(), gc(), memory.size() after each call. Here are the results:
         before gc()                    |  after gc()
count    memory.size()  process RAM     |  memory.size()  process RAM
init        10 MB          20 MB        |
1           97 MB         220 MB        |     47 MB         202 MB
2          128 MB         363 MB        |     48 MB         357 MB
3          126 MB         466 MB        |     50 MB         466 MB
4          131 MB         629 MB        |     52 MB         629 MB
So it seems that at the beginning gc() releases some memory, but not enough. R itself (memory.size()) shows good values after gc(), but these values don't seem to have anything to do with the real memory usage of the process. Or there are huge memory leaks in the Windows binaries.
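In case it helps, the test loop was essentially the following sketch (the file name is only a placeholder, and load.report() is the function shown below):

library(XML)

file <- "report1.xml"   # placeholder file name, not the real report

for (i in 1:4) {
  tmp <- load.report(file)               # load.report() as shown below
  cat(i, "memory.size() before gc():", memory.size(), "MB\n")
  gc()                                   # explicit garbage collection
  cat(i, "memory.size() after gc(): ", memory.size(), "MB\n")
}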
The function I call is:
load.report <- function(reportname) {
  library(XML)
  # parse the XML report and pull out the id attributes and the data nodes
  xml <- xmlTreeParse(reportname, useInternalNodes = TRUE)
  globalid <- as.character(getNodeSet(xml, "//@gid"))
  sysid <- as.integer(getNodeSet(xml, "//@sid"))
  xmldataset <- getNodeSet(xml, "/test/data")
  free(xml)
  # extract the embedded CSV text from each data node
  xmldata <- sapply(xmldataset, xmlValue)
  # one list entry per data node: the ids plus the parsed CSV table
  dftlist <- lapply(seq_along(xmldata), function(i)
    list(data.frame(gid = globalid[i], sid = sysid[i]),
         load.csvTable(xmldata, i)))
  return(dftlist)
}
It uses this helper function, which I wrote to get rid of these warnings; it only reduced them from more than 50 to about 3 or 5 each time the main function is called.
load.csvTable <- function(xmldata, pos) {
  # read the semicolon-separated table embedded in the XML text
  res <- read.table(textConnection(xmldata[pos]),
                    header = TRUE, sep = ";")
  closeAllConnections()
  return(res)
}
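For what it's worth, a variant that closes only the connection it opened (instead of calling closeAllConnections()) might look roughly like the sketch below; the name load.csvTable2 is just for illustration, and I have not tested whether it changes the memory behaviour:

load.csvTable2 <- function(xmldata, pos) {
  # open a text connection on the embedded CSV block and make sure
  # it is closed again when the function exits
  con <- textConnection(xmldata[pos])
  on.exit(close(con))
  read.table(con, header = TRUE, sep = ";")
}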
Maybe you or someone else has some additional advice.
Thanks
Tom