[Rd] .Call and to reclaim the memory by allocVector
Yongchao Ge
Yongchao.Ge at mssm.edu
Fri Aug 24 15:43:54 CEST 2007
Dear Prof. Ripley,

I am using 32-bit Ubuntu 7.04 on a dual-core Intel Xeon 5140 processor. I do
not think the problem lies in how the OS recognizes memory released by
free(), because the Calloc() and Free() pair works perfectly well in my
C program. I am assuming that the free() in your post does not mean
the standard C library function but the Free() of the R extension API, which
the R extension manual recommends for releasing memory back to the OS.
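
For reference, the distinction between freeing memory inside a process and
returning it to the kernel can be sketched in plain C. This is only an
illustration, assuming glibc on Linux: malloc_trim() is a glibc extension,
the helper names churn_heap and release_to_os are made up for this example,
and nothing here is taken from R's own allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#ifdef __GLIBC__
#include <malloc.h>   /* malloc_trim() is a glibc extension */
#endif

/* Hypothetical helper: allocate `count` blocks of `size` bytes, touch them
   so the pages become resident, then free every block again.
   Returns 0 on success, -1 if any allocation failed. */
int churn_heap(size_t count, size_t size)
{
    assert(count > 0 && size > 0);

    char **blocks = malloc(count * sizeof *blocks);
    if (blocks == NULL)
        return -1;

    int status = 0;
    size_t n = 0;
    for (; n < count; n++) {
        blocks[n] = malloc(size);
        if (blocks[n] == NULL) {
            status = -1;
            break;
        }
        memset(blocks[n], 1, size);
    }

    /* free() hands the blocks back to the allocator; whether the kernel
       gets the pages back at this point is up to the allocator. */
    for (size_t i = 0; i < n; i++)
        free(blocks[i]);
    free(blocks);
    return status;
}

/* Hypothetical helper: ask the allocator to return unused heap memory to
   the kernel. With glibc this is malloc_trim(0), which returns 1 if memory
   was released and 0 otherwise; elsewhere this sketch returns -1. */
int release_to_os(void)
{
#ifdef __GLIBC__
    return malloc_trim(0);
#else
    return -1;
#endif
}
```

Calling churn_heap(1000, 4096) and checking the process with "ps aux" before
and after release_to_os() is one way to see whether the resident size follows
free() on a given system.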

It is not the 150MB that bothers me; I used the toy example to
isolate the problem. My actual program needs to allocate around 660M bytes
(maybe more, depending on the actual dataset) for the value returned from
.Call. This returned object is stored in R and will be used by many other
functions, which also use .Call to wrap C code. I found that my
program reaches the memory limit (3G) very quickly, even though at most
1.8G bytes of data should be in memory in the C and R code combined
(potentially two copies of the same R object and a copy in the C
program). The memory problem in .Call means that my program
can run once or twice but fails the third time, and I need to run the
same program more than twice.
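
For scale, the payload of the vector returned in the toy example quoted
below can be worked out directly. The sketch is illustrative only: the
function name result_bytes is hypothetical, it counts just the double
payload (not R's vector header), and it assumes the element count must fit
in an int, since R_len_t is an int.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: bytes of double payload for a result holding
   na * nb elements. Returns 0 and stores the size in *bytes on success;
   returns -1 if the element count would not fit in an int or the byte
   count would overflow size_t (possible on a 32-bit system). */
int result_bytes(size_t na, size_t nb, size_t *bytes)
{
    assert(bytes != NULL);

    if (na != 0 && nb > (size_t)INT_MAX / na)
        return -1;                       /* na * nb would exceed INT_MAX */

    size_t n = na * nb;
    if (n > (size_t)-1 / sizeof(double))
        return -1;                       /* n * sizeof(double) would wrap */

    *bytes = n * sizeof(double);
    return 0;
}
```

For a of length 10000 and b of length 1000, with 8-byte doubles, this gives
80,000,000 bytes, about 76.3 MiB, which is in line with the roughly 70M
bytes mentioned in the original report.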

Why am I storing a large dataset in R? My program consists of two
parts. The first part computes the intermediate results, which takes a
lot of time. The second part contains many different functions that
manipulate the intermediate results.

My current solution is to save the intermediate results in a temporary file,
but my final goal is to save them as an R object. The "memory leak" in
.Call stops me from doing this, and I would like to know whether there is a
clean solution for the R package I am writing.

Yongchao
On Fri, 24 Aug 2007, Prof Brian Ripley wrote:
> Please do not post to multiple lists! I've removed R-help.
>
> You have not told us your OS ('linux', perhaps but what CPU), nor how you
> know 'the memory was still not reclaimed back to the operating system'. But
> that is how many OSes work: their malloc maintains a pool of memory pages,
> and free() does not return the memory to the OS kernel, just to the process'
> pool. It depends on what you meant by 'the operating system'.
>
> Why does this bother you? 150Mb of virtual memory is nothing these days.
>
>
> On Thu, 23 Aug 2007, Yongchao Ge wrote:
>
>> Hi,
>>
>> I am not sure if this is a bug and I apologize if it is something I
>> didn't read carefully in the R extension manual. My initial search on the
>> R help and R devel list archive didn't find useful information.
>
> Exactly this topic was thrashed to death under the misleading title of
> 'Suspected memory leak' earlier this month in a thread that started on R-help
> and moved to R-devel. See e.g.
>
> https://stat.ethz.ch/pipermail/r-devel/2007-August/046669.html
>
> from the author of the R memory allocator.
>
>
>> I am using .Call (as described in the R extension manual) for my C code
>> and have found that .Call did not release the memory claimed by
>> allocVector. Even after removing the R object created by the .Call
>> function and calling gc(), the memory was still not reclaimed back to
>> the operating system.
>>
>> Here is an example. It was modified from the convolve2 example from the R
>> extension manual. Now I am computing the crossproduct of a and b, which
>> returns a vector of size length(a)*length(b).
>>
>> The C code is at the end of this message with the modification commented.
>> The R code is here
>> ----------------------------
>> dyn.load("crossprod2.so")
>> cp <- function(a, b) .Call("crossprod2", a, b)
>> gctorture()
>> a<-1:10000
>> b<-1:1000
>> gc() #i
>>
>> c<-cp(a,b)
>> rm(c)
>> gc() #ii
>> --------------
>>
>> When I run the above code in a freshly started R (version 2.5.0),
>> the gc() information is as shown below. I report the last column ("max
>> used (Mb)") here, which agrees with the Linux command "ps aux". Apparently,
>> even after I remove the object "c", about 70M bytes of memory remain
>> unreclaimed, which is approximately the size of the object "c".
>>
>> If I run the command "c<-cp(a,b)" three or four times and then remove
>> the object "c" and call gc(), the unreclaimed memory can reach 150M
>> bytes. I tried gc(reset=TRUE), and it does not seem to make a difference.
>>
>> Can someone suggest what causes this problem and what the solution would
>> be? When you reply to this email, please cc me, as I am not on the help
>> list.
>>
>> Thanks,
>>
>> Yongchao
>>
>> ------------------------------------------------
>>> dyn.load("crossprod2.so")
>>> cp <- function(a, b) .Call("crossprod2", a, b)
>>> gctorture()
>>> a<-1:10000
>>> b<-1:1000
>>
>>
>>> gc() #i
>>          used (Mb) gc trigger (Mb) max used (Mb)
>> Ncells 173527  4.7     467875 12.5   350000  9.4
>> Vcells 108850  0.9     786432  6.0   398019  3.1
>>>
>>> c<-cp(a,b)
>>> rm(c)
>>> gc() #ii
>>          used (Mb) gc trigger (Mb) max used (Mb)
>> Ncells 233998  6.3     467875 12.5   350000  9.4
>> Vcells 108866  0.9   12089861 92.3 10119856 77.3
>>>
>> -----------------------------------------------
>>
>> --------------------------------------------
>> #include <R.h>
>> #include <Rinternals.h>
>>
>> // modified from convolve2 in the R extension manual
>> // build with: R CMD SHLIB crossprod2.c
>>
>> SEXP crossprod2(SEXP a, SEXP b)
>> {
>>     R_len_t i, j, na, nb, nab;
>>     double *xa, *xb, *xab;
>>     SEXP ab;
>>
>>     PROTECT(a = coerceVector(a, REALSXP));
>>     PROTECT(b = coerceVector(b, REALSXP));
>>     na = length(a); nb = length(b);
>>
>>     //nab = na + nb - 1;
>>     nab = na * nb; // we are doing the cross product
>>     PROTECT(ab = allocVector(REALSXP, nab));
>>     xa = REAL(a); xb = REAL(b);
>>     xab = REAL(ab);
>>     for(i = 0; i < nab; i++) xab[i] = 0.0;
>>     for(i = 0; i < na; i++)
>>         for(j = 0; j < nb; j++) //xab[i + j] += xa[i] * xb[j];
>>             xab[i*nb + j] += xa[i] * xb[j]; // we are computing the cross product
>>     UNPROTECT(3);
>>     return(ab);
>> }
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Yongchao Ge Yongchao.Ge at mssm.edu
Mount Sinai School of Medicine office: 212-241-3536
Department of Neurology
One Gustave L. Levy Place, Box 1137 New York, NY, 10029, USA
web url: www.mssm.edu/faculty/yongchao-ge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~