[R] Writing own simulation function in C

Romain Francois romain at r-enthusiasts.com
Sun Mar 7 09:50:37 CET 2010


On 03/06/2010 05:51 AM, Sharpie wrote:
> TheSavageSam wrote:
>>
>> I am wishing to write my own random distribution simulation function using
>> C programmin language(for speed) via R. I am familiar with R programming
>> but somewhat new to C programming. I was trying to understand "Writing R
>> extensions" -guide and its part 6.16, but I found it hard to
>> understand(http://cran.r-project.org/doc/manuals/R-exts.html#Standalone-Mathlib).
>> I also tried to get familiar with the example codes, but it didn't make me
>> any wiser.
>>
>> The biggest problem seems to be how to get(what code to write in C) random
>> uniform numbers using Rmath. That seems to be too complicated to me at the
>> moment. So if someone of you could give a small example and it would
>> definitely help me a lot. All I wish to do first is to finally write(and
>> understand) my own function similar to what you run in R Command line via
>> command "runif".
>>
>> And all this I am hoping to do without recompiling my whole R. Instead of
>> that I think that it is possible to use dyn.load("code.so") in R. (Yes, I
>> use linux)
>>
>
> If the code in this message gets mangled by the mailing list, try viewing it
> on nabble:
>
> http://n4.nabble.com/Writing-own-simulation-function-in-C-td1580190.html#a1580190
>
> ...
>
> Actually, funny story... I wrote this post using Nabble, but the Spam
> Detector refuses to let me post it claiming that there are "too many sex
> words" due some constructs in the R language.  So I guess this confirms that
> using R is the quickest way to increase your statistical manhood/womanhood.
>
> As a result of this, all code chunks have been moved to a gist at GitHub--
> links are provided.
>
> ...
>
>
> I would suggest first learning how to implement callbacks to R functions
> from C functions rather than diving straight into calling the C
> implementations of R functions.  The reasons are:
>
>    1.  You already know how the R functions is supposed to be called.
>
>    2.  You don't have to search through the R source to track down the C
> implementation of the function you are after.
>
> I would also bet that the overhead of calling an R function that calls it's
> compiled implementation is not that significant-- but I haven't done any
> profiling.
>
> The way I would approach this problem would be to create a new package since
> the package structure helps manage the steps involved in loading and testing
> the compiled code.
>
> So, here's an example of calling the R function runif() from C (tested on
> Linux Mint 8 with R 2.10.1):
>
> Start an R terminal and execute the following:
>
>    setwd( 'path/to/wherever/you/like/to/work' )
>
>    # Create a dummy wrapper function to start the package with
>    myRunif<- function( n, min = 0, max = 1 ){}
>
>    # Start the package
>    package.skeleton( 'RtoC' )

And if you are willing to use C++ instead of C, then install Rcpp and 
use the function Rcpp.package.skeleton.

In addition to what package.skeleton does, it also creates C++ files and 
an R function that calls them.

Romain

> Now, quit R.  You'll notice that a folder called RtoC appeared in the
> directory where R was working.  Edit RtoC/R/myRunif.R to contain the
> following:
>
>
>    see:
>    http://gist.github.com/323498#file_my_runif.r
>
>
> The above R function is a wrapper for our compiled C function.  It states
> that we will be calling a C routine named "myRuinf" and passing the
> parameters n, min and max.  The .Call() interface will pass these arguments
> as separate SEXP objects, which are the C-level representation of R objects.
> The alternatives to .Call are .External which gathers all the arguments up
> and passes them as a single SEXP and .C() which passes them directly as
> basic C variables (care must be used with .C(), you may need to wrap things
> in as.double(), as.integer(), etc.).
>
> I am using .Call() over .C() because .Call() allows an SEXP (i.e. native R
> object) to be returned directly.   .External() also returns a SEXP, but we
> are calling back to an R function so we would have to split each argument
> into a separate SEXP anyway.  With .C() we would have to coerce the C types
> to separate SEXP variables. Furthermore, with .C(), an additional variable
> to hold the return value would have to be included as an argument the
> function as we are expecting to return a vector and all the arguments are
> scalars.
>
> So, now the task is to write the C implementation of myRunif.  Create the
> following folder in the package:
>
>    mkdir  RtoC/src
>
> Any C or Fortran code placed in the /src folder of an R package will be
> automagically compiled to a shared library that will be included when the
> package is installed.  So, let's add a C routine, RtoC/src/myRunif.c:
>
>
>    see:
>    http://gist.github.com/323498#file_m_runif.c
>
>
> The first thing that myRunif.c does is to locate the namespace of the stats
> function.  This is important as runif() lives in that namespace and we will
> be executing a call to it later.  The first thing to note is that results
> returned by R are assigned inside of a PROTECT() statement.  This is because
> SEXPs are *pointers* to data and we don't want the R garbage collecter to
> munch the data we're pointing to before we get a chance to use it.
>
> After finding the stats namespace, the second operation is to set up the
> call back to the R function ruinf()-- this is started by creating a vector
> of type "language" that has a length of 4-- one slot for the function name
> and then three more slots because runif takes three arguments.
>
> Next, the slots of the LANGSXP are filled- first by locating the name of the
> function we wish to call using findFun() and then by filling in the
> arguments.  The assignment of each argument is followed by a call to SET_TAG
> which tags the argument with it's name in the R function.  Basically, lines
> 18-29 construct an R function call of the form:
>
>    runif( n = n, min = min, max = max )
>
> Next, a SEXP is created to hold the results of the function call that was
> constructed.  The variable is then assigned the results of the function call
> inside a PROTECT()
>
> Finally, UNPROTECT(3) is called before the results are returned to R.  This
> is don because we called PROTECT() 3 times.  If you don't call UNPROTECT(),
> you will see warnings about a "Stack Imbalance" when running your code in R.
>
> It is worth noting that I am being incredibly verbose with lines 15-36, I
> could have replaced them with:
>
>
>    see:
>    http://gist.github.com/323498#file_my_runif_concise.c
>
> In the case above, lang4() provides a shortcut for building a LANGSXP with 4
> slots and the call arguments are matched by position.  I showed the long
> version in case you want to execute a call with more than 3 arguments or you
> need to execute a call using key=value matching.
>
> Finally, you need to add one more routine that causes the dynamic library to
> be loaded when the R package is loaded.  Since the package doesn't have a
> namespace, we'll use the function .First.lib() and place it in RtoC/R/zzz.R:
>
>    see:
>    http://gist.github.com/323498#file_zzz.r
>
> If you add a namespace to your package, you will need to use .onLoad()
> instead of .First.lib().
>
> Now build and install the package:
>
>    R CMD build RtoC
>    R CMD INSTALL RtoC_0.1.tar.gz
>
> You should be able to execute the following in R:
>
>    library( RtoC )
>
>    myRunif( 10 )
>
> [1] 0.8072035 0.3978969 0.6492380 0.3614537 0.9150126 0.7287221 0.4242656
> [8] 0.9401839 0.8275597 0.7270001
>
>    myRunif( 10, -10, 10 )
>
> [1]  0.669904 -7.603356 -6.632470  6.753585 -9.424831  8.758971 -4.430569
> [8]  3.737836 -5.297374 -9.968060
>
>
> Happy hacking!
>
> -Charlie
>
> -----
> Charlie Sharpsteen
> Undergraduate-- Environmental Resources Engineering
> Humboldt State University


-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7
`- http://tr.im/O1wO : highlight 0.1-5



More information about the R-help mailing list