[Bioc-devel] Question about external algorithms to Bioconductor package

Ioannis Vardaxis ioannis.vardaxis at ntnu.no
Fri Nov 24 15:57:36 CET 2017


I tried the Rsubread package you suggested, and the mapping runs.
However, it takes a very long time to finish. Even in parallel it needs
several days to run, whereas bowtie, for example, needs only a couple of
hours on 4 cores. Is there any way of speeding up Rsubread? Otherwise I
don't see any reason to use it, and it is a big problem if I cannot use
bowtie inside a Bioconductor package.

Ioannis Vardaxis

Stipendiat IMF

On 12/11/2017, 23:54, "A.E.S." <adrian.salatino at conicet.gov.ar> wrote:

>On Sun, 12 Nov 2017 22:22:56 +0000
>Ryan Thompson <rct at thompsonclan.org> wrote:
>> Hi,
>> I don't know the Bioconductor policy for packages that rely on
>> external tools, but for the specific features you mention, there are
>> Bioconductor packages to accomplish most or all of them. You can use
>> samtools via Rsamtools, you can use the Rsubread package in place of
>> bowtie for alignment, and you can use the SRAdb package for SRA
>> access. (I believe there are also several other alignment methods
>> available in Bioconductor, if Rsubread doesn't do what you need.)
>> Using these packages should ensure that biocLite() can fully satisfy
>> all the requirements for your package without the need for separate
>> installation of other command-line tools.
>> Regards,
>> Ryan Thompson
>> On Sun, Nov 12, 2017 at 2:12 PM Ioannis Vardaxis
>> <ioannis.vardaxis at ntnu.no> wrote:
>> > Hi,
>> > I have developed a package that is currently under review at
>> > Bioconductor. In the future I am considering making some changes
>> > to the package, basically adding more functions etc.
>> > My package is currently a peak-calling algorithm that takes its
>> > input in either BAM or SAM format. In general, a user who runs
>> > such an analysis first needs to, for example, map the DNA sequences
>> > to the reference genome to obtain the BAM/SAM file, and only then
>> > turn to my algorithm for the rest. I was wondering if I am allowed
>> > to add those processes to my package as preliminary stages, so that
>> > it becomes easier for the user to have everything in one place.
>> > To do so, my package would need to make use of SRAtoolkit, bowtie
>> > and SAMtools, which I could run in the terminal (using system() in
>> > R). To run those stages, the user would of course need to have
>> > those algorithms installed.
>> > I was wondering if I am allowed to make use of those algorithms in
>> > my Bioconductor package, with the appropriate references of course.
>> > Best,
>> > --
>> > Ioannis Vardaxis
>> > Stipendiat IMF
>> > NTNU
>> >
>> > _______________________________________________
>> > Bioc-devel at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>I will quote the "dependencies" part of the package guidelines. I
>recommend reading all of it, including the whole developer section,
>which has plenty of information...
>Package Dependencies
>
>Packages you depend on must be available via Bioconductor or CRAN;
>users and the automated build system have no way to install packages
>from other sources. Reuse, rather than re-implement or duplicate,
>well-tested functionality from other packages. Specify package
>dependencies in the DESCRIPTION file, listed as follows:
>
>- Imports: is for packages that provide functions, methods, or classes
>  that are used inside your package name space. Most packages are
>  listed here.
>- Depends: is for packages that provide essential functionality for
>  users of your package, e.g., the GenomicRanges package is listed in
>  the Depends: field of GenomicAlignments. It is unusual for more than
>  three packages to be listed as 'Depends:'.
>- Suggests: is for packages used in vignettes or examples, or in
>  conditional code.
>- Enhances: is for packages such as Rmpi or parallel that enhance the
>  performance of your package, but are not strictly needed for its
>  functionality.
>- SystemRequirements: is for listing any external software which is
>  required, but not automatically installed by the normal package
>  installation process. If the installation process is non-trivial, a
>  top-level README file should be included to document the process.
>
>A package may rarely offer optional functionality, e.g., visualization
>with rgl when that package is available. Authors then list the package
>in the Suggests field, and use requireNamespace() (or loadNamespace())
>to condition code execution. Functions from the loaded namespace should
>be accessed using :: notation, e.g.,
>
>    x <- sort(rnorm(1000))
>    y <- rnorm(1000)
>    z <- rnorm(1000) + atan2(x,y)
>    if (requireNamespace("rgl", quietly=TRUE)) {
>        rgl::plot3d(x, y, z, col=rainbow(1000))
>    } else {
>        ## code when "rgl" is not available
>    }
>
>This approach does not alter the user search() path, and ensures that
>the necessary function (plot3d(), from the rgl package) is used. Such
>conditional code increases complexity of the package and frustrates
>users who do not understand why behavior differs between
>installations, so is often best avoided.
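
As a rough illustration of the SystemRequirements route quoted above, here is a minimal sketch of how a package might check for external command-line tools before calling them. It assumes the tools are on the user's PATH. The function names find_external_tools() and run_external_tool() are hypothetical, as are the executable names passed to them and the SystemRequirements line shown in the comment; only Sys.which() and system2() are base R.

```r
# Hypothetical helpers; not taken from any existing package.
# The tools would also be declared in DESCRIPTION, for example:
#   SystemRequirements: bowtie, samtools, SRA Toolkit

# Report which external executables can be found on the PATH.
find_external_tools <- function(tools = c("bowtie", "samtools", "fastq-dump")) {
  paths <- Sys.which(tools)              # "" for tools that are not found
  found <- !is.na(paths) & nzchar(paths)
  names(found) <- tools
  found
}

# Run one external tool, failing with a clear message if it is missing.
run_external_tool <- function(tool, args = character()) {
  path <- unname(Sys.which(tool))
  if (is.na(path) || !nzchar(path)) {
    stop("External tool '", tool, "' was not found on the PATH; ",
         "see SystemRequirements in DESCRIPTION.", call. = FALSE)
  }
  # system2() takes the arguments as a character vector and, with the
  # default stdout/stderr settings, returns the exit status of the command.
  status <- system2(path, args = args)
  if (status != 0L) {
    stop("'", tool, "' exited with status ", status, call. = FALSE)
  }
  invisible(status)
}
```

A call such as find_external_tools() could be made at the start of the preliminary stages, so that a user without bowtie installed gets an informative error rather than a failed system() call.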
