[Bioc-devel] C++ code performance issues

Martin Morgan mtmorgan at fhcrc.org
Fri Mar 22 21:34:06 CET 2013


On 03/22/2013 08:11 AM, Peter Glaus wrote:
> Hi Martin,
> thanks for the tips. I did a bit more investigation and it turned out that the
> development version of R is not compiling with optimization flags while
> installing the packages.
> I am not sure whether this was also the case initially, but I know for sure that
> it was using -O3 when running CMD check, maybe I just got confused and never
> noticed that it's not using it during the installation.
>
> Is it safe to assume that optimization flags will be used in the stable release
> version, or is it better to specify them in the package's Makevars?

A 2x or 3x speed-up from compiler flags alone would be surprisingly large (to
me); a few percent seems more likely...
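
One way to settle how much the flags matter is to time the same routine in both builds. The following is a minimal sketch, not BitSeq code: hot_loop is a hypothetical stand-in for the real computation and seconds_per_run is an illustrative helper name. It uses only std::chrono::steady_clock from the standard library.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the package's expensive computation;
// replace the body with a call into the real code.
double hot_loop(std::size_t n) {
    std::vector<double> x(n, 1.0);
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += x[i] * 0.5;
    }
    return sum;
}

// Run `work` `repeats` times and return the mean wall-clock seconds per run.
template <typename F>
double seconds_per_run(F work, int repeats) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / repeats;
}
```

A call might look like `volatile double sink = 0.0; double t = seconds_per_run([&] { sink = hot_loop(1000000); }, 5);`, where the volatile sink keeps an optimizing compiler from discarding the work. Compiling the same file once with -O0 and once with -O2 and comparing the two values of t gives a direct measurement instead of a guess.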

By default R uses the same compiler flags for package installation as were used
to build R itself, so perhaps your development version of R was compiled with
CXXFLAGS="-O0". I believe the Bioc builders use the 'default' values for these
flags, and that the default in the R distribution remains -O2, though this
could be platform- (Linux / Mac / Windows) or compiler-specific. Probably the
intention is that R compiles with -O2 'out of the box'. The flags used on the
build machines can be checked at

   http://bioconductor.org/checkResults/devel/bioc-LATEST/

by clicking on the different machine names (george2, moscato2, petty).
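
The compiled code can also report for itself whether optimization was on when it was built. This is a sketch under one stated assumption: the compiler is GCC or Clang, both of which define the __OPTIMIZE__ macro at -O1 and above. The function names optimization_status and describe_build are illustrative, not part of R or BitSeq.

```cpp
#include <cassert>
#include <string>

// Report whether this translation unit was compiled with optimization.
// Assumption: GCC or Clang, which define __OPTIMIZE__ at -O1 and above.
// Other compilers may never define it, so the second answer is not proof
// that optimization was off.
std::string optimization_status() {
#ifdef __OPTIMIZE__
    return "optimized";
#else
    return "unoptimized or unknown compiler";
#endif
}

// Prefix the status with a caller-supplied label, e.g. the package name.
std::string describe_build(const char* label) {
    assert(label != nullptr);
    return std::string(label) + ": " + optimization_status();
}
```

Printing describe_build("mypackage") once from the installed package and once from the command-line build would show directly whether the two were compiled at different optimization levels.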

Martin



>
> Peter.
>
>
> On 21/03/13 19:36, Martin Morgan wrote:
>> On 03/21/2013 11:30 AM, Peter Glaus wrote:
>>> Hi,
>>> I am working on the BitSeq package, which has both a command-line C++ version and
>>> a Bioconductor version in which R calls the same C++ code via the .C function. While
>>> testing the development version of the package on R 3.0.0 I noticed that the "R
>>> version" runs much slower: 2-3 TIMES slower than the pure C++ implementation.
>>> Interestingly, the stable release of the "R version" seems to be as fast as C++
>>> version. (The underlying code has changed slightly but there shouldn't be much
>>> difference)
>>> Is there any reason for such behavior? Has anyone encountered similar issue? Is
>>> there a way to make the C++ code called from R faster?
>>>
>>> More details:
>>> I compiled the C++ code with same g++ flags (... -O3 -pipe -fpic -g... ) and
>>> removed OpenMP support from both.
>>> The functions take exactly the same input (input is read from a file), and
>>> produce exactly same output (using same seed). A specific computation that took
>>> the C++ version 12 minutes, took the R(C++) version 47 minutes. There is no IO
>>> during that part of the code and there was just one R_CheckUserInterrupt() call
>>> during this time (I changed the code, so that there would not be many of these
>>> calls.).
>>> There are just few differences in the last stable release and that seems to run
>>> even faster than current C++ (10m). (The stable release uses -O2 while compiling
>>> the C++ code.)
>>
>> Can you narrow this down to something more reproducible, e.g., a particular
>> call that causes problems, including the platform(s) on which you are seeing
>> issues?
>>
>> Maybe you're running out of memory (because R is holding memory that the
>> command line does not access)?
>>
>> Probably you spend most of your time 'in C' or 'in R', rather than moving
>> between them?
>>
>> You could try, on linux / mac, a cheap C-level guesstimate of where time is
>> spent by running under gdb
>>
>>   R -d gdb
>>   (gdb) run
>>
>> and then periodically breaking with Ctrl-C and looking where you are
>>
>>   (gdb) backtrace
>>   ## stack trace
>>   (gdb) continue
>>
>> and comparing the same under the commandline
>>
>>   > gdb ./bitseq
>>
>> or doing some more serious profiling as outlined in section 3.4 of 'Writing R
>> Extensions'; probably you would start by getting a short reproducible example.
>>
>> Martin
>>
>>>
>>> Thanks,
>>> Peter.
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


