[R] Regarding the memory allocation problem

Milan Bouchet-Valat nalimilan at club.fr
Fri Oct 26 15:44:56 CEST 2012


On Thursday 25 October 2012 at 15:02 +0530, Purna chander wrote:
> Dear All,
> 
> 
> My main objective was to compute the distance of 100000 vectors from a
> set having 900 other vectors. I've a file named "seq_vec" containing
> 100000 records and 256 columns.
> While computing, the memory was not sufficient and resulted in error
> "cannot allocate vector of size 152.1Mb"
> 
> So I've approached the problem in the following:
> Rather than reading the data completely at a time, I read the data in
> chunks of 20000 records using scan() function. After reading each
> chunk, I've computed distance of each of these vectors with a set of
> another vectors.
> 
> Even though I was successful in computing the distances for first 3
> chunks, I obtained similar error (cannot allocate vector of size
> 102.3Mb).
> 
> Q) Here what I could not understand is, how come memory become
> insufficient when dealing with 4th chunk?
> Q) Suppose if i computed a matrix 'm' during calculation associated
> with chunk1, then is this matrix not replaced when I again compute 'm'
> when dealing with chunk 2?

R's memory management is relatively complex: an object is not
necessarily overwritten in place when you assign to the same name again.
The old value stays in memory until the garbage collector reclaims it,
which only happens from time to time. You may try calling gc() after
each chunk to limit memory fragmentation, which helps reduce allocation
problems a little.
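
A minimal sketch of that pattern, in case it helps. It is not your actual
code: the file is a small temporary stand-in for "seq_vec", the sizes are
scaled down, dist_to_ref() is a hypothetical helper, and I assume that a
per-record summary (index of the nearest reference vector) is enough to
keep from each chunk. If you need every distance, write each chunk's
matrix to disk instead of accumulating it.

```r
# Hypothetical small stand-in for "seq_vec": 1000 records x 8 columns
set.seed(1)
n_rec <- 1000; n_col <- 8; chunk_size <- 200
path <- tempfile("seq_vec")
write.table(matrix(rnorm(n_rec * n_col), n_rec, n_col), path,
            row.names = FALSE, col.names = FALSE)

# Stand-in for the set of 900 other vectors (9 here)
ref <- matrix(rnorm(9 * n_col), 9, n_col)

# Euclidean distances between every row of x and every row of y
dist_to_ref <- function(x, y) {
  d2 <- outer(rowSums(x^2), rowSums(y^2), "+") - 2 * tcrossprod(x, y)
  sqrt(pmax(d2, 0))  # clamp tiny negative values caused by rounding
}

con <- file(path, open = "r")
nearest <- integer(0)
repeat {
  # An open connection keeps its position, so each call reads the next chunk
  vals <- scan(con, what = numeric(), nlines = chunk_size, quiet = TRUE)
  if (length(vals) == 0) break
  chunk <- matrix(vals, ncol = n_col, byrow = TRUE)
  m <- dist_to_ref(chunk, ref)
  # Keep only a small summary of the chunk: index of the closest reference
  nearest <- c(nearest, max.col(-m, ties.method = "first"))
  # Drop the large temporaries explicitly, then ask for a collection
  rm(vals, chunk, m)
  gc()
}
close(con)
```

R runs the garbage collector on its own when it needs memory, so the
explicit rm() and gc() calls are not strictly required. They only make
sure the previous chunk's objects are gone before the next large
allocation is attempted.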

But please tell us how much RAM you have on the machine you're using,
and post the output of sessionInfo().
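
Something along these lines would collect that information from within
the session. The matrix `m` is only a hypothetical placeholder for one of
your own large objects, and its dimensions are made up for illustration.

```r
# R version, platform (32- or 64-bit build) and attached packages
print(sessionInfo())

# Memory currently used by R, as reported by the garbage collector
print(gc())

# Size of a single object; `m` is a hypothetical 2000 x 900 numeric matrix
m <- matrix(0, nrow = 2000, ncol = 900)
print(object.size(m), units = "Mb")
```

The amount of physical RAM itself has to come from the operating system,
since base R does not report it in a portable way.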


Regards



