[Bioc-devel] C++ code performance issues

Peter Glaus glaus at cs.man.ac.uk
Fri Mar 22 16:11:34 CET 2013


Hi Martin,
thanks for the tips. I did a bit more investigation, and it turned out 
that the development version of R is not compiling with optimization 
flags when installing packages.
I am not sure whether this was also the case initially, but I know for 
sure that it was using -O3 when running CMD check. Maybe I just got 
confused and never noticed that it was not being used during installation.

Is it safe to assume that optimization flags will be used in the stable 
release version, or is it better to specify them in the package's Makevars?
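
One way to confirm which case applies is to have the compiled code report it. The following is a minimal sketch, not part of BitSeq: the function name optimizationStatus is hypothetical, and it assumes GCC or Clang, both of which define the __OPTIMIZE__ macro whenever optimization (-O1 or higher) is enabled. The flags R itself passes to the compiler can be printed with "R CMD config CXXFLAGS".

```cpp
// Hypothetical helper (not part of BitSeq): reports whether this
// translation unit was compiled with optimization enabled.
// Assumption: GCC or Clang, which define __OPTIMIZE__ at -O1 and above.
// The signature follows the .C convention: void return, pointer arguments.
extern "C" void optimizationStatus(int *optimized) {
#ifdef __OPTIMIZE__
    *optimized = 1;  // built with -O1, -O2, -O3, or -Os
#else
    *optimized = 0;  // built without optimization (e.g. -O0)
#endif
}
```

If a function like this were compiled into the package, it could be called from R as .C("optimizationStatus", optimized = integer(1))$optimized, which would return 1 for an optimized build and 0 otherwise.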

Peter.


On 21/03/13 19:36, Martin Morgan wrote:
> On 03/21/2013 11:30 AM, Peter Glaus wrote:
>> Hi,
>> I am working on BitSeq package, which has both command line C++ version and
>> Bioconductor version in which R calls the same C++ code with .C function. While
>> testing the development version of package on R 3.0.0 I noticed that the "R
>> version" runs much slower: 2-3 TIMES slower than the pure C++ implementation.
>> Interestingly, the stable release of the "R version" seems to be as fast as C++
>> version. (The underlying code has changed slightly but there shouldn't be much
>> difference)
>> Is there any reason for such behavior? Has anyone encountered similar issue? Is
>> there a way to make the C++ code called from R faster?
>>
>> More details:
>> I compiled the C++ code with same g++ flags (... -O3 -pipe -fpic -g... ) and
>> removed OpenMP support from both.
>> The functions take exactly the same input (input is read from a file), and
>> produce exactly same output (using same seed). A specific computation that took
>> the C++ version 12 minutes, took the R(C++) version 47 minutes. There is no IO
>> during that part of the code and there was just one R_CheckUserInterrupt() call
>> during this time (I changed the code, so that there would not be many of these
>> calls.).
>> There are just few differences in the last stable release and that seems to run
>> even faster than current C++ (10m). (The stable release uses -O2 while compiling
>> the C++ code.)
>
> Can you narrow this down to something more reproducible, e.g., a 
> particular call that causes problems, including the platform(s) on 
> which you are seeing issues?
>
> Maybe you're running out of memory (because R is holding memory that 
> the command line does not access)?
>
> Probably you spend most of your time 'in C' or 'in R', rather than 
> moving between them?
>
> You could try, on linux / mac, a cheap C-level guesstimate of where 
> time is spent by running under gdb
>
>   R -d gdb
>   (gdb) run
>
> and then periodically breaking with Ctrl-C and looking where you are
>
>   (gdb) backtrace
>   ## stack trace
>   (gdb) continue
>
> and comparing the same under the commandline
>
>   > gdb ./bitseq
>
> or doing some more serious profiling as outlined in section 3.4 of 
> 'Writing R Extensions'; probably you would start by getting a short 
> reproducible example.
>
> Martin
>
>>
>> Thanks,
>> Peter.
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>


