[R-pkg-devel] Bioconductor reverse dependency checks for a CRAN package

Ivan Krylov
Tue Jan 30 16:56:28 CET 2024


Hello R-package-devel,

What would you recommend in order to run reverse dependency checks for
a package with 182 direct strong reverse dependencies on CRAN and 66 on
Bioconductor (plus 3 more among its annotation and experiment data
packages)?
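
For illustration, a list like this could be obtained roughly as follows. This is a minimal sketch rather than the exact code behind the numbers above: direct_revdeps() is a made-up helper name, and `db` is assumed to be a package database as returned by available.packages(), for example available.packages(repos = BiocManager::repositories()) if BiocManager is installed.

```r
library(tools)

# Hypothetical helper: direct strong reverse dependencies of `pkg`,
# grouped by the repository each of them would be downloaded from.
# `db` is assumed to look like the result of available.packages().
direct_revdeps <- function(pkg, db) {
  revdep <- package_dependencies(
    pkg, db = db, which = 'strong', reverse = TRUE
  )[[pkg]]
  # The Repository column records where each package lives
  split(revdep, db[match(revdep, db[, 'Package']), 'Repository'])
}
```

Calling lengths() on the result would then give the per-repository counts.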

Without extra environment variables, R CMD check requires the Suggested
packages to be available, which means installing...

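library(tools)
# revdep: character vector naming the reverse dependencies to check
# Everything needed to check them, including their Suggests:
revdepdep <- package_dependencies(revdep, which = 'most')
# ...plus the strong dependencies of all of those, recursively:
revdeprest <- package_dependencies(
 unique(unlist(revdepdep)),
 which = 'strong', recursive = TRUE
)
# Count the distinct packages, not counting base and recommended ones:
length(setdiff(
 unlist(c(revdepdep, revdeprest)),
 unlist(standard_package_names())
))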

...up to 1316 packages. Seven of these suggested packages aren't on
CRAN or Bioconductor (because they've been archived or have always
lived on GitHub), but even with those filtered out, installing the rest
is not easy. Some of the
Bioconductor dependencies are large; I now have multiple gigabytes of
genome fragments and mass spectra, but also a 500-megabyte arrow.so in
my library. As long as a data package declares a dependency on your
package, it still has to be installed and checked, right?
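
As a rough sketch of the filtering step and of the environment variable alluded to above: unavailable_packages() is a made-up helper, `needed` is assumed to be the character vector of package names computed above, and `db` the result of available.packages() for the repositories in use. Setting _R_CHECK_FORCE_SUGGESTS_ to a false value asks R CMD check not to stop when a suggested package is missing, although individual checks may still fail if a package uses its Suggests unconditionally.

```r
# Hypothetical helper: which of the `needed` packages cannot be found
# in the package database `db` (as returned by available.packages())?
unavailable_packages <- function(needed, db) {
  setdiff(needed, db[, 'Package'])
}

# Let R CMD check proceed even if some suggested packages are missing
Sys.setenv('_R_CHECK_FORCE_SUGGESTS_' = 'false')
```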

Manually installing the SystemRequirements is no fun at all, so I've
tried the rocker/r2u container. It got me most of the way there, but
a few packages had newer versions on CRAN than the container offered.
For these, I had to install the system packages manually in order to
build them from source.

Someone told me to try the rocker/r-base container together with pak.
It was more proactive at telling me about dependency conflicts and
would have got me most of the way there too, except it somehow got me a
'stringi' binary without the corresponding libicu*.so*, which stopped
the installation process. Again, nothing that a bit of manual work
wouldn't fix, but I don't feel comfortable setting this up on a CI
system. (Not on every commit, of course - that would be extremely
wasteful - but it would be nice if it was possible to run these checks
before release on a different computer and spot more problems this way.)

I can't help but notice that neither install.packages() nor pak() is
the recommended way to install Bioconductor packages. Could that
introduce additional problems with checking the reverse dependencies?

Then there's the check_packages_in_dir() function itself. Its handling
of the downloaded reverse dependencies is not very helpful: they are
removed altogether or at least moved away. Something may be wrong with
my CRAN
mirror, because some of the downloaded reverse dependencies come out
with a size of zero and subsequently fail the check very quickly.

I am thinking of keeping a separate persistent library with all the
1316 dependencies required to check the reverse dependencies and a
persistent directory with the reverse dependencies themselves. Instead
of using the reverse=... argument, I'm thinking of using the following
scheme:

1. Use package_dependencies() to determine the list of packages to test.
2. Use download.packages() to download the latest version of every
package that isn't already present. Retry if a tarball comes out
zero-sized or otherwise damaged. Remove old versions of packages when
a newer version exists.
3. Run check_packages_in_dir() on the whole directory with the
downloaded reverse dependencies.
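
Step 2 of this scheme might look something like the sketch below. All three function names are made up for illustration. The sketch assumes source tarballs named <package>_<version>.tar.gz and a `db` matrix from available.packages() whose row names are the package names; "damaged" is approximated by "the archive cannot be listed", which is only a heuristic.

```r
# Hypothetical helper: TRUE for tarballs that are missing, empty, or
# whose contents cannot be listed
is_damaged <- function(path) {
  if (!file.exists(path) || file.size(path) == 0) return(TRUE)
  listing <- tryCatch(
    suppressWarnings(untar(path, list = TRUE, tar = 'internal')),
    error = function(e) character()
  )
  length(listing) == 0
}

# Hypothetical helper: tarballs in `dir` for which a newer version of
# the same package is also present
superseded_tarballs <- function(dir) {
  files <- list.files(dir, pattern = '_.*\\.tar\\.gz$')
  pkg <- sub('_.*$', '', files)
  ver <- package_version(sub('^[^_]*_(.*)\\.tar\\.gz$', '\\1', files))
  keep <- logical(length(files))
  for (p in unique(pkg)) {
    i <- which(pkg == p)
    keep[i[ver[i] == max(ver[i])]] <- TRUE
  }
  file.path(dir, files[!keep])
}

# Sketch of step 2: download whatever is missing or damaged, retry a
# few times, then remove the superseded tarballs
fetch_revdeps <- function(pkgs, dir, db, retries = 3) {
  expected <- file.path(
    dir, sprintf('%s_%s.tar.gz', pkgs, db[pkgs, 'Version'])
  )
  for (attempt in seq_len(retries)) {
    bad <- vapply(expected, is_damaged, logical(1))
    if (!any(bad)) break
    unlink(expected[bad])
    download.packages(pkgs[bad], destdir = dir, available = db,
                      type = 'source')
  }
  unlink(superseded_tarballs(dir))
  # Report the packages that still could not be fetched intact
  invisible(pkgs[vapply(expected, is_damaged, logical(1))])
}
```

Keeping the damage check separate from the download loop means the same check can be run again right before step 3.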

For this to work, I need a way to run step (3) twice, ensuring that one
of the runs is performed with the CRAN version of the package in the
library and the other one is performed with the to-be-released version
of the package in the library. Has anyone already come up with an
automated way to do that?
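
One possible shape for this, as an untested sketch: keep two small libraries, each holding one version of the package, and put one of them in front of the shared dependency library for each run. with_library() and the directory names below are made up, and the sketch assumes that check_packages_in_dir() makes the libraries on the calling session's .libPaths() (or in R_LIBS) visible to the checks, which would need to be confirmed against its documentation.

```r
# Hypothetical helper: evaluate `code` with `lib` at the front of the
# library search path, then restore the previous state
with_library <- function(lib, code) {
  old_paths <- .libPaths()
  old_env <- Sys.getenv('R_LIBS', unset = NA)
  on.exit({
    .libPaths(old_paths)
    if (is.na(old_env)) Sys.unsetenv('R_LIBS')
    else Sys.setenv(R_LIBS = old_env)
  })
  .libPaths(c(lib, old_paths))
  # Child R processes (such as R CMD check) read R_LIBS at startup
  Sys.setenv(R_LIBS = paste(.libPaths(), collapse = .Platform$path.sep))
  code  # lazily evaluated here, with the new paths in effect
}

# Possible use, with made-up directory names and one copy of the
# downloaded tarballs per run:
# with_library('lib-cran', tools::check_packages_in_dir('revdeps-cran'))
# with_library('lib-new',  tools::check_packages_in_dir('revdeps-new'))
# tools::check_packages_in_dir_changes('revdeps-new', 'revdeps-cran')
```

Two copies of the tarball directory are used because the check results are written next to the tarballs, so a second run in the same directory would overwrite the first.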

No wonder nobody wants to maintain the XML package.

-- 
Best regards,
Ivan


