[Rd] .Call and to reclaim the memory by allocVector

Yongchao Ge Yongchao.Ge at mssm.edu
Fri Aug 24 15:43:54 CEST 2007


Dear Prof. Ripley

I am using 32-bit Ubuntu 7.04 on a dual-core Intel Xeon 5140 processor. I 
do not think the problem is the OS failing to recognize memory released by 
free(), because the Calloc() and Free() pair works perfectly well in my 
C program. I am assuming that the free() in your post does not mean 
the standard C library function but the Free() of the R extension API, 
which the R extension manual recommends for releasing memory back to the 
OS.

It is not the 150MB that bothers me. I used the toy example to 
isolate the problem. My actual program needs to allocate around 660M bytes 
(maybe more, depending on the actual dataset) for the object returned from 
.Call. This object is stored in R and will be used by many other 
functions, which also use .Call to wrap C code. I found that my 
program reaches the memory limit (3G) very quickly, even though at most 
1.8G bytes of data should be in memory in the C and R code combined 
(potentially two copies of the same R object and a copy in the C 
program). The memory problem in .Call means that my program 
can run once or twice, but fails the third time. I need to run the 
same program more than twice.
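
As a sanity check on the toy example's figures, here is a minimal 
standalone C sketch of the size arithmetic. It is not part of my package: 
the helper names vector_bytes and vector_mib are hypothetical, and it 
assumes 8-byte doubles and ignores the small header that R adds to each 
vector.

```c
#include <stddef.h>

/* Hypothetical helper: payload in bytes of a double vector of
   length na * nb (the R vector header is ignored). */
static size_t vector_bytes(size_t na, size_t nb)
{
    return na * nb * sizeof(double);
}

/* Hypothetical helper: the same payload expressed in units of 2^20 bytes. */
static double vector_mib(size_t na, size_t nb)
{
    return (double) vector_bytes(na, nb) / (1024.0 * 1024.0);
}
```

With na = 10000 and nb = 1000 as in the toy example, vector_bytes gives 
80000000 bytes and vector_mib gives roughly 76.3, which is close to the 
77.3 Mb that gc() reports as "max used" for Vcells after the call.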

Why am I storing a large dataset in R? My program consists of two 
parts. The first part computes the intermediate results, which takes a 
lot of time. The second part contains many different functions to 
manipulate the intermediate results.

My current solution is to save the intermediate results in a temporary 
file, but my final goal is to save them as an R object. The "memory leak" 
in .Call stops me from doing this, and I'd like to know if there is a 
clean solution for the R package I am writing.

Yongchao

On Fri, 24 Aug 2007, Prof Brian Ripley wrote:

> Please do not post to multiple lists! I've removed R-help.
>
> You have not told us your OS ('linux', perhaps but what CPU), nor how you 
> know 'the memory was still not reclaimed back to the operating system'. But 
> that is how many OSes work: their malloc maintains a pool of memory pages, 
> and free() does not return the memory to the OS kernel, just to the process' 
> pool.  It depends on what you meant by 'the operating system'.
>
> Why does this bother you?  150Mb of virtual memory is nothing these days.
>
>
> On Thu, 23 Aug 2007, Yongchao Ge wrote:
>
>> Hi,
>> 
>> I am not sure if this is a bug and I apologize if it is something I
>> didn't read carefully in the R extension manual. My initial search on the
>> R help and R devel list archive didn't find useful information.
>
> Exactly this topic was thrashed to death under the misleading title of 
> 'Suspected memory leak' earlier this month in a thread that started on R-help 
> and moved to R-devel. See e.g.
>
> https://stat.ethz.ch/pipermail/r-devel/2007-August/046669.html
>
> from the author of the R memory allocator.
>
>
>> I am using .Call (as written in the R extension manual) for the C code
>> and have found that the .Call didn't release the memory claimed by
>> allocVector. Even after applying gc() function and removing the R object
>> created by the .Call function, the memory was still not reclaimed back to
>> the operating system.
>> 
>> Here is an example. It was modified from the convolve2 example from the R
>> extension manual. Now I am computing the crossproduct of a and b, which
>> returns a vector of size length(a)*length(b).
>> 
>> The C code is at the end of this message with the modification commented.
>> The R code is here
>> ----------------------------
>> dyn.load("crossprod2.so")
>> cp <- function(a, b) .Call("crossprod2", a, b)
>> gctorture()
>> a<-1:10000
>> b<-1:1000
>> gc() #i
>> 
>> c<-cp(a,b)
>> rm(c)
>> gc() #ii
>> --------------
>> 
>> When I run the above code in a freshly started R (version 2.5.0),
>> the gc() information is as below. I report the last column ("max
>> used (Mb)") here, which agrees with the linux command "ps aux". Apparently
>> even after I remove the object "c", we still have 70M bytes of unreclaimed
>> memory, which is approximately the memory size of the object "c".
>> 
>> If I run the command "c<-cp(a,b)" three or four times and then remove
>> the object "c" and apply the gc() function, the unreclaimed memory can
>> reach 150M bytes. I tried gc(reset=TRUE), and it doesn't seem to make a
>> difference.
>> 
>> Can someone suggest what caused this problem and what the solution will
>> be?  When you reply to this email, please cc me as I am not on the help
>> list.
>> 
>> Thanks,
>> 
>> Yongchao
>> 
>> ------------------------------------------------
>>> dyn.load("crossprod2.so")
>>> cp <- function(a, b) .Call("crossprod2", a, b)
>>> gctorture()
>>> a<-1:10000
>>> b<-1:1000
>> 
>> 
>>> gc() #i
>>          used (Mb) gc trigger (Mb) max used (Mb)
>> Ncells 173527  4.7     467875 12.5   350000  9.4
>> Vcells 108850  0.9     786432  6.0   398019  3.1
>>> 
>>> c<-cp(a,b)
>>> rm(c)
>>> gc() #ii
>>          used (Mb) gc trigger (Mb) max used (Mb)
>> Ncells 233998  6.3     467875 12.5   350000  9.4
>> Vcells 108866  0.9   12089861 92.3 10119856 77.3
>>> 
>> -----------------------------------------------
>> 
>> 
>> 
>> 
>> 
>> 
>> --------------------------------------------
>> //modified from convolve2 in the R extension manual
>> //R CMD SHLIB crossprod2.c
>> 
>> #include <R.h>
>> #include <Rinternals.h>
>> SEXP crossprod2(SEXP a, SEXP b)
>> {
>>      R_len_t i, j, na, nb, nab;
>>      double *xa, *xb, *xab;
>>      SEXP ab;
>>
>>      PROTECT(a = coerceVector(a, REALSXP));
>>      PROTECT(b = coerceVector(b, REALSXP));
>>      na = length(a); nb = length(b);
>>
>>      //nab = na + nb - 1;
>>      nab=na*nb;// we are doing the cross product
>>      PROTECT(ab = allocVector(REALSXP, nab));
>>      xa = REAL(a); xb = REAL(b);
>>      xab = REAL(ab);
>>      for(i = 0; i < nab; i++) xab[i] = 0.0;
>>      for(i = 0; i < na; i++)
>> 	  for(j = 0; j < nb; j++) //xab[i + j] += xa[i] * xb[j];
>> 	       xab[i*nb + j] += xa[i] * xb[j];//we are computing crossproduct
>>      UNPROTECT(3);
>>      return(ab);
>> }
>> 
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> 
>
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Yongchao Ge                                  Yongchao.Ge at mssm.edu
Mount Sinai School of Medicine               office: 212-241-3536
Department of Neurology
One Gustave L. Levy Place, Box 1137     New York, NY, 10029, USA
web url: www.mssm.edu/faculty/yongchao-ge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


