[R] memory usage grows too fast

Ping-Hsun Hsieh hsiehp at ohsu.edu
Fri May 15 01:21:48 CEST 2009


Hi All,

I have a 1000 x 1000000 matrix.
The calculation I would like to do is actually very simple: for each row, calculate the frequency of a given pattern. For example, a toy dataset looks like this:

Col1  Col2  Col3  Col4
01    02    02    00     => Freq of "02" is 0.5
02    02    02    01     => Freq of "02" is 0.75
00    02    01    01     …
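For reference, the toy data can be built in R as follows (a sketch, assuming the entries are stored as character strings; the real data may use a different type):

# Toy dataset from the table above, as a character matrix
dataMatrix <- matrix(c("01", "02", "02", "00",
                       "02", "02", "02", "01",
                       "00", "02", "01", "01"),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(NULL, paste0("Col", 1:4)))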

My code, shown below, is quite simple; it looks for the pattern "02".

OccurrenceRate_Fun <- function(dataMatrix)
{
  tmp <- NULL
  # match() marks each matching entry with 1 and everything else with NA;
  # apply(..., 1, ...) puts one original row into each column of tmpMatrix
  tmpMatrix <- apply(dataMatrix, 1, match, "02")
  for (i in 1:ncol(tmpMatrix))
  {
    # count the non-NA entries (the matches) and divide by the row length
    tmpRate <- table(tmpMatrix[, i])[[1]] / nrow(tmpMatrix)
    tmp <- c(tmp, tmpRate)  # was tmpHET, an undefined name
  }
  rm(tmpMatrix)
  rm(tmpRate)
  gc()  # was placed after return() and therefore never ran
  return(tmp)
}
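Called on the toy matrix above, it gives the expected rates (note that table() would fail on a row containing no match at all):

OccurrenceRate_Fun(dataMatrix)
# [1] 0.50 0.75 0.25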

The problem is that memory usage grows very fast, which is hard to handle on machines with less RAM.
Could anyone please give me some comments on how to reduce the space complexity of this calculation?
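For comparison, here is a fully vectorized version of the same calculation, plus a blocked variant that bounds the size of the temporary logical matrix. This is only a sketch assuming character entries and a single literal pattern; OccurrenceRate_Vec, OccurrenceRate_Blocked, and blockSize are illustrative names:

# Vectorized: dataMatrix == pattern yields a logical matrix, and
# rowMeans() of a logical matrix is the fraction of TRUEs per row
OccurrenceRate_Vec <- function(dataMatrix, pattern = "02") {
  rowMeans(dataMatrix == pattern)
}

# Blocked variant: process rows in chunks so the temporary logical
# matrix never holds more than blockSize rows at a time
OccurrenceRate_Blocked <- function(dataMatrix, pattern = "02",
                                   blockSize = 10000) {
  n <- nrow(dataMatrix)
  out <- numeric(n)
  for (start in seq(1, n, by = blockSize)) {
    idx <- start:min(start + blockSize - 1, n)
    out[idx] <- rowMeans(dataMatrix[idx, , drop = FALSE] == pattern)
  }
  out
}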

Thanks,
Mike

