[Bioc-devel] C++ parallel computing

Martin Morgan mtmorg@n@b|oc @end|ng |rom gm@||@com
Tue May 25 19:39:43 CEST 2021

If the BAM files are each processed independently, and each processing task takes a while, then it is probably 'good enough' to use R-level parallel evaluation using BiocParallel (currently the recommendation for Bioconductor packages) or another evaluation framework. Also, presumably you will use Rhtslib, which provides C-level access to the hts library. This will require writing C / C++ code to interface between R and the hts library, and will of course be a significant undertaking.
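
If the per-file work does end up in C++, the same "one independent task per BAM file" structure can be sketched with standard-library threads rather than OpenMP, which sidesteps the macOS toolchain question. This is only a rough sketch under stated assumptions: process_file and process_all are hypothetical names, the body of process_file is a placeholder and not a real Rhtslib / hts call, and the worker tasks are assumed never to call back into the R API, which is not thread-safe.

```cpp
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical placeholder for the real per-file work (for example, reading
// one BAM file through the hts library). Here it only returns the length of
// the path so that the sketch stays self-contained.
std::size_t process_file(const std::string& path) {
    return path.size();
}

// Launch one asynchronous task per file, then collect the results in the
// same order as the input paths. No R API calls are made inside the tasks.
std::vector<std::size_t> process_all(const std::vector<std::string>& paths) {
    std::vector<std::future<std::size_t>> tasks;
    tasks.reserve(paths.size());
    for (const auto& p : paths) {
        // std::launch::async requests that each task run on its own thread.
        tasks.push_back(std::async(std::launch::async, process_file, p));
    }

    std::vector<std::size_t> results;
    results.reserve(tasks.size());
    for (auto& t : tasks) {
        results.push_back(t.get());  // blocks until that task has finished
    }
    return results;
}
```

Calling process_all with a vector of file paths returns one result per path, in input order. With many files a bounded pool of workers would probably be preferable to one thread per file; the sketch leaves that out for brevity.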

It might be worth outlining in a bit more detail what your task is and how (not too much detail!) you've tried to implement this in Rsamtools.

Martin Morgan

On 5/24/21, 10:01 AM, "Bioc-devel on behalf of Oleksii Nikolaienko" <bioc-devel-bounces using r-project.org on behalf of oleksii.nikolaienko using gmail.com> wrote:

    Dear Bioc team,
    I'd like to ask for your advice on the parallelization within a Bioc
    package. Please point me to a better place if this mailing list is not
    appropriate.
    After a bit of thinking I decided that I'd like to parallelize processing
    at the level of C++ code. Would you strongly recommend not to and use an R
    approach instead (e.g. "future")?
    If parallel C++ is ok, what would be the best solution for all major OSs?
    My initial choice was OpenMP, but then it seems that Apple has something
    against it (https://mac.r-project.org/openmp/). My own dev environment is
    mostly Big Sur/ARM64, but I wouldn't want to drop its support anyway.

    (On the actual task: loading and specific processing of very large BAM
    files, ideally significantly faster than by means of Rsamtools as a backend)

    Oleksii Nikolaienko
