[R] memory usage grows too fast

hadley wickham h.wickham at gmail.com
Fri May 15 01:55:06 CEST 2009


On Thu, May 14, 2009 at 6:21 PM, Ping-Hsun Hsieh <hsiehp at ohsu.edu> wrote:
> Hi All,
>
> I have a 1000x1000000 matrix.
> The calculation I would like to do is actually very simple: for each row, calculate the frequency of a given pattern. For example, a toy dataset is as follows.
>
> Col1    Col2    Col3    Col4
> 01      02      02      00              => Freq of “02” is 0.5
> 02      02      02      01              => Freq of “02” is 0.75
> 00      02      01      01              …
>
> My code to find the pattern “02” is quite simple:
>
> OccurrenceRate_Fun <- function(dataMatrix)
> {
>   tmp <- NULL
>   # apply() returns an ncol(dataMatrix) x nrow(dataMatrix) matrix of match
>   # positions: 1 where an entry equals "02", NA otherwise
>   tmpMatrix <- apply(dataMatrix, 1, match, "02")
>   for (i in 1:ncol(tmpMatrix))
>   {
>     # count the matches and divide by the row length of dataMatrix
>     tmpRate <- table(tmpMatrix[, i])[[1]] / nrow(tmpMatrix)
>     tmp <- c(tmp, tmpRate)
>   }
>   rm(tmpMatrix)
>   rm(tmpRate)
>   gc()
>   return(tmp)
> }
>
> The problem is that memory usage grows very quickly, which makes the calculation hard to run on machines with less RAM.
> Could anyone please give me some comments on how to reduce the space complexity of this calculation?

rowMeans(dataMatrix == "02")  ?
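
For the toy data above, a quick check of that one-liner (assuming dataMatrix is
stored as a character matrix, so the comparison gives a logical matrix whose
row means are the frequencies):

toy <- matrix(c("01", "02", "02", "00",
                "02", "02", "02", "01",
                "00", "02", "01", "01"),
              nrow = 3, byrow = TRUE)
rowMeans(toy == "02")
# [1] 0.50 0.75 0.25

If even the intermediate logical matrix is too large to hold at once (1000 x
1000000 logicals is roughly 4 GB), the same idea can be applied over blocks of
rows so that only a small comparison matrix exists at any time. A minimal
sketch; the function name and block size are made up for illustration:

occurrence_rate <- function(dataMatrix, pattern = "02", block = 100) {
  out <- numeric(nrow(dataMatrix))
  for (s in seq(1, nrow(dataMatrix), by = block)) {
    idx <- s:min(s + block - 1, nrow(dataMatrix))
    # only a block-by-ncol(dataMatrix) logical matrix is allocated per pass
    out[idx] <- rowMeans(dataMatrix[idx, , drop = FALSE] == pattern)
  }
  out
}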

Hadley


-- 
http://had.co.nz/



