[Bioc-devel] C++ code performance issues

Martin Morgan mtmorgan at fhcrc.org
Thu Mar 21 20:36:36 CET 2013


On 03/21/2013 11:30 AM, Peter Glaus wrote:
> Hi,
> I am working on BitSeq package, which has both command line C++ version and
> Bioconductor version in which R calls the same C++ code with .C function. While
> testing the development version of package on R 3.0.0 I noticed that the "R
> version" runs much slower: 2-3 TIMES slower than the pure C++ implementation.
> Interestingly, the stable release of the "R version" seems to be as fast as C++
> version. (The underlying code has changed slightly but there shouldn't be much
> difference)
> Is there any reason for such behavior? Has anyone encountered similar issue? Is
> there a way to make the C++ code called from R faster?
>
> More details:
> I compiled the C++ code with same g++ flags (... -O3 -pipe -fpic -g... ) and
> removed OpenMP support from both.
> The functions take exactly the same input (input is read from a file), and
> produce exactly same output (using same seed). A specific computation that took
> the C++ version 12 minutes, took the R(C++) version 47 minutes. There is no IO
> during that part of the code and there was just one R_CheckUserInterrupt() call
> during this time (I changed the code, so that there would not be many of these
> calls.).
> There are just few differences in the last stable release and that seems to run
> even faster than current C++ (10m). (The stable release uses -O2 while compiling
> the c++ code.)

Can you narrow this down to something more reproducible, e.g., a particular call 
that causes problems, including the platform(s) on which you are seeing issues?

Maybe you're running out of memory (because the R session is holding memory 
that the command-line version never allocates)?
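
One cheap way to check the memory question is to report the peak resident set 
size from inside the shared C++ code, once in the command-line build and once 
when called from R. This is only a sketch: the function name peak_rss is made 
up for illustration, it relies on POSIX getrusage() (so linux / mac only), and 
the units of ru_maxrss differ between platforms.

```cpp
#include <sys/resource.h>  // getrusage, struct rusage (POSIX)

// Hypothetical helper, not part of BitSeq.
// Returns the peak resident set size of the current process so far,
// or -1 if the query fails.
// Units are platform-dependent: kilobytes on Linux, bytes on macOS.
long peak_rss() {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return -1;
    }
    return usage.ru_maxrss;
}
```

Printing this value at the end of the slow computation in both builds would 
show whether the R-hosted run peaks much higher than the standalone one.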

Probably you spend most of your time 'in C' or 'in R', rather than moving 
between them?

You could try, on linux / mac, a cheap C-level guesstimate of where time is 
spent by running under gdb

   R -d gdb
   (gdb) run

and then periodically breaking with Ctrl-C and looking where you are

   (gdb) backtrace
   ## stack trace
   (gdb) continue

and comparing the same under the command line

   $ gdb ./bitseq

or doing some more serious profiling as outlined in section 3.4 of 'Writing R 
Extensions'; probably you would start by getting a short reproducible example.
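
To help narrow things down before reaching for a profiler, something like the 
following could be wrapped around suspect sections of the shared C++ code, so 
that the command-line and R builds print comparable timings. It is a minimal 
sketch, not BitSeq code: the class name ScopedTimer and its labels are 
hypothetical, and it writes to stderr with fprintf, which in the R build you 
would probably want to swap for REprintf.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical helper, not part of BitSeq.
// Measures wall-clock time from construction and reports it to stderr
// when the object goes out of scope.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* label)
        : label_(label), start_(std::chrono::steady_clock::now()) {}

    // Seconds elapsed since the timer was constructed.
    double elapsed_seconds() const {
        const auto now = std::chrono::steady_clock::now();
        return std::chrono::duration<double>(now - start_).count();
    }

    ~ScopedTimer() {
        std::fprintf(stderr, "%s: %.3f s\n", label_, elapsed_seconds());
    }

private:
    const char* label_;
    std::chrono::steady_clock::time_point start_;
};
```

Declaring, say, ScopedTimer t("sampling loop"); at the top of a block times 
that block in both builds, and the section whose ratio differs most is a good 
candidate for the short reproducible example.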

Martin

>
> Thanks,
> Peter.
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


