[Rd] Portability and Memory Issues for R-package
KNygren@us.imshealth.com
KNygren at us.imshealth.com
Tue Dec 27 07:46:05 CET 2005
I was able to get the memory issues resolved, so there is no need to post a response in that regard. As for the portability issues, I would still like to understand how best to deal with them with regard to the gsl library.
-----Original Message-----
From: r-devel-bounces at r-project.org
[mailto:r-devel-bounces at r-project.org]On Behalf Of Nygren, Kjell (Union
Meeting)
Sent: Sunday, December 25, 2005 2:35 PM
To: r-devel at r-project.org
Subject: [Rd] Portability and Memory Issues for R-package
I have an upcoming JASA paper with an iid sampling algorithm for Bayesian generalized linear models (e.g., Logit, Poisson Regression, and Conditional Logit models with multivariate normal priors). At this point, I have implemented the algorithms in C and hope to make the functions and corresponding source code available through an R package. I have successfully written the code necessary to build and install a package with most of the functions on my local machine (using R CMD check, R CMD build, and R CMD INSTALL). As my code makes extensive use of the GSL matrix library, however, I have some questions regarding the portability of my package. I am also running into some memory issues when making repeated calls to my functions, which I hope to fix before formally distributing the package. More specifically, the issues are the following:
I. Portability-
Since I make extensive use of the gsl library in my C code, I have the gsl library installed on my local machine (within the MinGW directory, so it is included in the path). The package then includes a Makevars file with the following line in order to link to the gsl library:
PKG_LIBS=-lgsl -lgslcblas
I also know that there is an R package (gsl) making use of some gsl functions which contains a Makevars.win file with the following code:
PKG_LIBS=-LF:/MinGW/usr/local/lib -lgsl -lgslcblas
# CPPFLAGS=-I$(R_HOME)/include -IF:/MinGW/usr/local/include
PKG_CPPFLAGS=-IF:/MinGW/usr/local/include
For my package to install properly on other machines, however, I take it they would have to have the gsl library files already installed in the proper location (or am I mistaken here?). In order to make it fully portable on other machines, it thus seems like I would need to either include instructions for how to first install the gsl library prior to installation (which would have to be platform specific), or to somehow have the gsl library files installed during the R package installation. Is the latter even possible? If so, how could it be done (the key files are likely the two library files)? I believe the gsl package requires the user to have the gsl library preinstalled.
I guess that, long-term, an option is for me to rework my C code to eliminate the dependence on the gsl library. This could, however, be a time-consuming effort. In the meantime, would it be possible to contribute the package with the existing dependence (as I think is the case for the gsl package)?
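On Unix-alike systems where GSL is already installed, the compiler and linker flags do not have to be hard-coded, because GSL ships a gsl-config script that reports them. The fragment below is a minimal sketch of a src/Makevars under the assumption that gsl-config is on the PATH at install time. It does not install GSL itself, and a Windows build would probably still need a Makevars.win with explicit paths like the one quoted above.

```make
# Sketch only: assumes GSL is preinstalled and gsl-config is on the PATH.
PKG_CPPFLAGS=`gsl-config --cflags`
PKG_LIBS=`gsl-config --libs`
```

The backticks are left for the shell to expand when the compile and link commands run, so the paths are looked up on the installing machine rather than fixed by the package author.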
II. Memory Issue-
The functions in my package are generally fast and seem to work well if I make a limited number of calls to them from my R code. If I try to use them as part of an R MCMC implementation (say, updating each Gibbs block 10,000 times in an R loop), I run into memory issues. Although my underlying C code frees the memory behind every pointer it allocates, Windows does not seem to recognize that the memory has been freed. This is apparent because the Mem Usage for RGUI.exe in the Windows Task Manager keeps growing throughout the loop, and the code slows down and eventually makes virtually no progress. I have noticed similar issues in the past when calling WinBUGS repeatedly using Gelman's functions, so the issue is likely not coming just from my code.
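One way to test the claim that every allocation is matched by a free is to route allocations through counting wrappers and check that the number of live objects returns to zero after each call. The following is a minimal sketch of that idea, not the package's actual code. The `matrix` type and the functions `matrix_alloc_counted`, `matrix_free_counted`, `gibbs_step`, and `run_chain` are hypothetical stand-ins (for gsl_matrix, gsl_matrix_alloc, gsl_matrix_free, and one sampler update) so that the example builds without GSL or R.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for gsl_matrix so the sketch builds without GSL. */
typedef struct {
    size_t rows, cols;
    double *data;
} matrix;

static long live_matrices = 0;  /* allocations that have not been freed yet */

/* Stand-in for gsl_matrix_alloc that also counts the allocation. */
matrix *matrix_alloc_counted(size_t rows, size_t cols) {
    if (rows == 0 || cols == 0) return NULL;
    matrix *m = malloc(sizeof *m);
    if (m == NULL) return NULL;
    m->data = calloc(rows * cols, sizeof *m->data);
    if (m->data == NULL) { free(m); return NULL; }
    m->rows = rows;
    m->cols = cols;
    live_matrices++;
    return m;
}

/* Stand-in for gsl_matrix_free that also counts the release. */
void matrix_free_counted(matrix *m) {
    if (m == NULL) return;
    assert(live_matrices > 0);  /* more frees than allocations is a bug */
    free(m->data);
    free(m);
    live_matrices--;
}

long matrix_live_count(void) { return live_matrices; }

/* One hypothetical sampler update: allocate scratch space, use it, free it.
   Returns 0 on success and -1 if the allocation failed. */
int gibbs_step(size_t dim) {
    matrix *scratch = matrix_alloc_counted(dim, dim);
    if (scratch == NULL) return -1;
    for (size_t i = 0; i < dim; i++)
        scratch->data[i * dim + i] = 1.0;  /* fill the diagonal */
    matrix_free_counted(scratch);
    return 0;
}

/* Run n_iter updates and report how many matrices are still live.
   A result of 0 means allocations and frees are balanced. */
long run_chain(size_t n_iter, size_t dim) {
    for (size_t i = 0; i < n_iter; i++)
        if (gibbs_step(dim) != 0) break;
    return matrix_live_count();
}
```

If `run_chain(10000, 4)` returns a nonzero count, some code path allocates without freeing. A count of zero would suggest that the growth seen in Task Manager comes from elsewhere, for example from objects created on the R side of the loop.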
I suspect that the memory issues could have something to do with the fact that my C code makes repeated use of the gsl_matrix_alloc and gsl_matrix_free functions rather than the R_alloc function (I suspect that the memory is not garbage collected). I searched the web and found the following suggestion from Brian Gough in response to a similar question posted on the gsl discussion forum:
"If you want to return an R object containing a gsl_matrix which can be garbage collected then you could use a C++ wrapper, as the C++ interface in R allows the use of separate constructors and destructors. "
Would this be a possible solution? If so, where can I find information on how to write such wrapper functions so that they work for gsl matrices? I must admit that I am not familiar with how the use of separate constructors and destructors would work. If that is not the solution, would anyone have any other ideas as to how I can solve the memory issues?
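The constructor/destructor idea in the quoted suggestion can be illustrated in plain C: a handle stores a resource together with the function that destroys it, and the destructor runs that function exactly once. The sketch below is only an illustration under stated assumptions. The `handle` type and the functions `handle_create`, `handle_destroy`, and `free_buffer` are hypothetical, and a plain buffer stands in for a gsl_matrix. R's C API offers a comparable mechanism through external pointers (R_MakeExternalPtr) with a finalizer registered via R_RegisterCFinalizer, which the garbage collector calls when the R object is no longer reachable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical handle pairing a resource with the function that frees it. */
typedef struct {
    void *resource;
    void (*finalizer)(void *);
} handle;

static int finalizer_calls = 0;  /* counts how often the finalizer ran */

/* Stand-in for gsl_matrix_free: releases the buffer and records the call. */
void free_buffer(void *p) {
    free(p);
    finalizer_calls++;
}

/* "Constructor": acquires the resource and records how to release it. */
handle handle_create(size_t n_doubles) {
    handle h;
    h.resource = calloc(n_doubles, sizeof(double));
    h.finalizer = free_buffer;
    return h;
}

/* "Destructor": runs the finalizer at most once, then clears the pointer
   so that a second call is harmless. */
void handle_destroy(handle *h) {
    assert(h != NULL);
    if (h->resource != NULL) {
        h->finalizer(h->resource);
        h->resource = NULL;
    }
}

int finalizer_call_count(void) { return finalizer_calls; }
```

In this sketch the caller invokes `handle_destroy` explicitly. With an R external pointer, the registered finalizer would play the role of `handle_destroy` and would be run by the garbage collector instead.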
Kjell Nygren
Kjell Nygren, Ph.D.
Director Pricing and Advanced Analytics
Statistical Services
IMS Health®
960 Harvest Drive, Building A
Blue Bell, PA 19422 USA
voice: 610.832.5586 * fax: 610.832.5850
email: <mailto:knygren at us.imshealth.com>
www.imshealth.com