[Bioc-devel] External software dependencies, bioc builders, and docs maintenance

Spencer Nystrom nystromdev using gmail.com
Wed Aug 17 15:37:03 CEST 2022

Hi Vince,

Thanks for the thoughtful reply. I agree about the license in this case
being an issue, too.

I agree having builder images available could be very helpful for dev
purposes. What I have done for my own testing is build a container from the
base bioc docker image and install meme and its system deps on top of
that, then use this image for testing. My architecture for that has gone a
little stale, but the temporary solution I was planning on was a gh-action
to rebuild weekly after the bioc docker image is built, install a few
different MEME versions, then run some rounds of integration testing across
versions & release/dev branches. I know the bioc docker image isn't a build
system mirror, but it seems to do pretty well for most things. As for the
MEME official docker images, the source for those is closed and I'm not
sure how the images are licensed (I imagine restrictively). This whole
license mess around their tooling was really an oversight on my part. If
this turns out to be incompatible with Bioconductor's principles, I'm happy
to pull the package and host on R-universe.
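
To make the weekly rebuild idea a bit more concrete, here is a minimal
dry-run sketch of the image matrix. It only prints the `docker build`
commands rather than running them. The bioc image tags, the `memes-test`
image name, and the `BIOC_TAG`/`MEME_VERSION` build args are all assumptions
for illustration (a Dockerfile would have to declare those args); 5.4.1 is
just the MEME version mentioned in this thread.

```shell
#!/bin/sh
# Dry-run sketch: print one `docker build` command per
# (bioc image tag, MEME version) pair. Nothing is actually built.
# Tags and versions are illustrative assumptions; override via the environment.
BIOC_TAGS="${BIOC_TAGS:-RELEASE_3_15 devel}"
MEME_VERSIONS="${MEME_VERSIONS:-5.4.1}"

build_matrix() {
  for tag in $BIOC_TAGS; do
    for meme in $MEME_VERSIONS; do
      # BIOC_TAG and MEME_VERSION are hypothetical build args that the
      # Dockerfile in the current directory would need to declare with ARG.
      echo "docker build --build-arg BIOC_TAG=$tag --build-arg MEME_VERSION=$meme -t memes-test:$tag-meme$meme ."
    done
  done
}

build_matrix
```

A scheduled CI job could pipe this output to `sh` once the commands look
right, then run the test suite inside each resulting image.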

As for your issues with deps, I don't think I've ever had every `make
check` or `make test` pass for a MEME install on any system I've tried.
The tests that fail for me are the ones that check the webserver
utilities, though, and I only use the binaries on the cli, so failing these
hasn't been an issue. The webserver needs a series of extra system
dependencies that the cli can do without. In case it helps,
here is the base Dockerfile I use, including some of the cpanm deep magic
required to get the install working & most tests passing:

Finally, to your point about integration testing, I completely agree. For
what it's worth, on the builders I only skip the tests that call out to the
MEME binaries (the skip triggers when the package can't detect a local meme
install, so should that resolve on the builders, the tests will
auto-enable). I packaged a lot of pre-generated results so that I can run
as many unit & integration tests as possible without actually running the
MEME tools, but this too has its shortcomings.
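
The skip logic itself lives in the package's R tests, but the idea can be
sketched in shell roughly as follows. This is not the {memes}
implementation: `MEME_BIN` and both function names are hypothetical, and I'm
assuming the `meme` binary accepts `-version`.

```shell
#!/bin/sh
# Sketch of a "skip unless MEME is installed" guard.
# MEME_BIN is a hypothetical override pointing at a directory that holds
# the MEME binaries; it is not a variable defined by {memes}.
meme_available() {
  if [ -n "${MEME_BIN:-}" ]; then
    # Explicit location given: require an executable `meme` there.
    [ -x "$MEME_BIN/meme" ]
  else
    # Otherwise fall back to whatever is on PATH.
    command -v meme >/dev/null 2>&1
  fi
}

run_or_skip() {
  # $1 is a label for the report; the remaining arguments are the command.
  label="$1"; shift
  if meme_available; then
    "$@"
  else
    echo "SKIP: $label (no local MEME install detected)"
  fi
}

# Runs only where MEME is found; prints a SKIP line everywhere else.
run_or_skip "meme version check" meme -version
```

Because the guard checks for the install at run time, the same test file
works unchanged on machines with and without MEME.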


On Sat, Aug 13, 2022 at 6:44 AM Vincent Carey <stvjc using channing.harvard.edu>
wrote:

> Thanks for this note Spencer.  I started looking at the meme
> infrastructure to start to learn how to proceed.
> An initial concern is the licensing of meme, with which I was
> unfamiliar.  The license includes
>     Those  desiring to  incorporate this  software into commercial
>     products  or use for  commercial  purposes  should contact the
>     Technology  Transfer  Office,  University of California,   San
>     Diego,  9500 Gilman Drive,  La Jolla,  California, 92093-0910,
>     Phone: (858) 534-5815.
> Let's put that to one side for now.  Then there's the issue of
> installability of meme-5.4.1.  After updating my xml/xslt infrastructure,
> make succeeded, but make check gave
>
>     make[5]: Entering directory '/home/stvjc/meme-5.4.1/tests/scripts'
>     Can't locate Sys/Info.pm in @INC (you may need to install the
>     Sys::Info module) (@INC contains: /home/stvjc/meme
>
> This was not remedied with a simple cpanm operation, so some dev ops
> expertise is needed.
> What about the dockerized meme?  Could we use that?
> ---
> I looked at the specifics of meme to get somewhat concrete about the
> tradeoffs between expansion of build system infrastructure and its
> maintenance, and full testability of all contributed packages in the
> build system.  Frequent integrated testing is an important component of
> the Bioconductor ecosystem, and blocking package code from being checked
> in the build system should be minimized.  I would like to have an
> arrangement in which images of the "build systems" (for linux, mac and
> windows) are accessible for experimentation and enhancement by
> contributor-developers.  This would allow contributor-developers to become
> very precise about the "System requirements" of their packages, to define
> precisely the configurations needed for successful integrated testing,
> and to obtain data on resource consumption useful in determining the
> cadence of testing (choosing between long-running and daily-running
> tests).  "Containerization" of the linux build system could form a step
> in this direction.
> Further discussion is most welcome.
> On Tue, Aug 2, 2022 at 10:34 AM Spencer Nystrom <nystromdev using gmail.com>
> wrote:
> >
> > Hi all,
> >
> > Related to the recent discussion of packages building "fake" docs, I
> > wanted to call myself out here as being guilty of it and discuss how to
> > mediate the issue.
> >
> > In my case, the {memes} package (
> > https://bioconductor.org/packages/release/bioc/html/memes.html) has core
> > functionality that depends on an external install of the meme suite
> > family of tools. This is currently missing on the builders, so there are
> > a few vignettes and examples that use a `NOT_CRAN` style check if the
> > software is missing at build time to skip those executions. In order to
> > build more useful docs, I render them with a github action using a
> > container with the software installed and post them to my own site,
> > which I also point users towards. I completely agree this is suboptimal,
> > and it has haunted me for some time now.
> >
> > I see 3 potential solutions to the problem:
> >
> > 1. Install the latest Meme Suite version on bioc builders. Without
> > enumerating all the ways this could cause issues, I foresee mismatches
> > between the version I support vs installed on the builders could make
> > this a recurring maintenance burden for the core team.
> >
> > 2. Use {basilisk} to provide a conda env with the current supported MEME
> > Suite version within the {memes} package. This has its own issues because
> > the current conda envs for the MEME Suite don't really work right, and
> > I'm not sure I can take on the maintenance burden of something that
> > extensive. There have historically been a few cryptic bugs in their
> > configs.
> >
> > 3. Some kind of {memes}-driven installer. I've seen other bioc packages
> > provide helper functions that install software for the user with their
> > permission & in a user-provided location. Provided the builders have the
> > system dependencies installed, I guess the package could install the
> > software to a tmpdir and clean up after itself on bioc builders. This
> > also seems pretty hacky and error prone from a builder perspective.
> >
> > Anyway, I'd appreciate any feedback here. In the course of writing this
> > I've talked myself into doing another round of due diligence with the
> > conda versions but curious to hear others' thoughts. In particular this
> > pattern of relying on an external dependency is one I've seen a few times
> > in other packages and have seen it solved in a variety of creative ways.
> >
> > Best,
> >   -Spencer
> >
