[R-pkg-devel] Bioconductor reverse dependency checks for a CRAN package

Martin Morgan mtmorg@n@xyz @end|ng |rom gm@||@com
Tue Jan 30 17:24:40 CET 2024


BiocManager (the recommended way to install Bioconductor packages) at the end of the day does essentially install.packages(repos = BiocManager::repositories()), ensuring that the right versions of Bioconductor packages are installed for the version of R in use. Maybe this is moot though if the goal is to emulate CRAN, since I believe they use utils:::.BioC_version_associated_with_R_version().

There is a Bioconductor docker container, extending rocker https://github.com/Bioconductor/bioconductor_docker ; I don't know whether this would be more or less work than alternatives. And an exploratory https://github.com/Bioconductor/bioc2u; I don't know the status of that project.

At some point the scale tips from installing packages from standard repositories to using a local clone https://bioconductor.org/about/mirrors/mirror-how-to/.

Martin Morgan

From: R-package-devel <r-package-devel-bounces using r-project.org> on behalf of Ivan Krylov via R-package-devel <r-package-devel using r-project.org>
Date: Tuesday, January 30, 2024 at 10:57 AM
To: r-package-devel using r-project.org <r-package-devel using r-project.org>
Subject: [R-pkg-devel] Bioconductor reverse dependency checks for a CRAN package
Hello R-package-devel,

What would you recommend in order to run reverse dependency checks for
a package with 182 direct strong dependencies from CRAN and 66 from
Bioconductor (plus 3 more from annotations and experiments)?

Without extra environment variables, R CMD check requires the Suggested
packages to be available, which means installing...

revdepdep <- package_dependencies(revdep, which = 'most')
revdeprest <- package_dependencies(
 unique(unlist(revdepdep)),
 which = 'strong', recursive = TRUE
)
length(setdiff(
 unlist(c(revdepdep, revdeprest)),
 unlist(standard_package_names())
))

...up to 1316 packages. 7 of these suggested packages aren't on CRAN or
Bioconductor (because they've been archived or have always lived on
GitHub), but even if I filter those out, it's not easy. Some of the
Bioconductor dependencies are large; I now have multiple gigabytes of
genome fragments and mass spectra, but also a 500-megabyte arrow.so in
my library. As long as a data package declares a dependency on your
package, it still has to be installed and checked, right?

Manually installing the SystemRequirements is no fun at all, so I've
tried the rocker/r2u container. It got me most of the way there, but
there were a few remaining packages with newer versions on CRAN. For
these, I had to install the system packages manually in order to build
them from source.

Someone told me to try the rocker/r-base container together with pak.
It was more proactive at telling me about dependency conflicts and
would have got me most of the way there too, except it somehow got me a
'stringi' binary without the corresponding libicu*.so*, which stopped
the installation process. Again, nothing that a bit of manual work
wouldn't fix, but I don't feel comfortable setting this up on a CI
system. (Not on every commit, of course - that would be extremely
wasteful - but it would be nice if it was possible to run these checks
before release on a different computer and spot more problems this way.)

I can't help but notice that neither install.packages() nor pak() is
the recommended way to install Bioconductor packages. Could that
introduce additional problems with checking the reverse dependencies?

Then there's the check_packages_in_dir() function itself. Its behaviour
about the reverse dependencies is not very helpful: they are removed
altogether or at least moved away. Something may be wrong with my CRAN
mirror, because some of the downloaded reverse dependencies come out
with a size of zero and subsequently fail the check very quickly.

I am thinking of keeping a separate persistent library with all the
1316 dependencies required to check the reverse dependencies and a
persistent directory with the reverse dependencies themselves. Instead
of using the reverse=... argument, I'm thinking of using the following
scheme:

1. Use package_dependencies() to determine the list of packages to test.
2. Use download.packages() to download the latest version of everything
if it doesn't already exist. Retry if got zero-sized or otherwise
damaged tarballs. Remove old versions of packages if a newer version
exists.
3. Run check_packages_in_dir() on the whole directory with the
downloaded reverse dependencies.

For this to work, I need a way to run step (3) twice, ensuring that one
of the runs is performed with the CRAN version of the package in the
library and the other one is performed with the to-be-released version
of the package in the library. Has anyone already come up with an
automated way to do that?

No wonder nobody wants to maintain the XML package.

--
Best regards,
Ivan

______________________________________________
R-package-devel using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

	[[alternative HTML version deleted]]



More information about the R-package-devel mailing list