[Rd] binary R packages for GNU/Linux
Tobias Verbeke
tob|@@@verbeke @end|ng |rom open@n@|yt|c@@eu
Tue Feb 11 23:18:56 CET 2025
----- Original Message -----
> From: "Dirk Eddelbuettel" <edd using debian.org>
> To: "Jeroen Ooms" <jeroenooms using gmail.com>
> Cc: "Simon Urbanek" <simon.urbanek using r-project.org>, "r-devel using r-project.org" <r-devel using r-project.org>, "Dirk Eddelbuettel"
> <edd using debian.org>
> Sent: Tuesday, February 11, 2025 12:07:19 AM
> Subject: Re: [Rd] binary R packages for GNU/Linux
> On 10 February 2025 at 23:19, Jeroen Ooms wrote:
>| The "naked binaries" are widely used, and therefore probably useful to
>| many folks, me included. Some standardisation on paths would be
>| incredibly useful, even if CRAN would not offer such binaries, other
>| repositories could.
>
> Sure, but naked binaries can (and do) break more often. A real example from
> just last week in another project: p3m served a stringr dependency via a
> stringi binary with a wrong libicu* outside of the particular release,
> which broke things. That simply cannot and hence will not happen at r2u.
>
> But choice is good. If people want to manually manage their system
> dependencies they surely can; they can also outsource it to another layer. As you
> say, it is all trade-offs, and if you are happy with naked binaries more power to
> you. I prefer the proper integration and that's what I built.
Thank you, all, for the valuable input and reactions. Let me summarize.
I. choice is good
Many people find 'naked binaries' useful and many people find 'distribution binaries' useful. In our own work, we see a myriad of R deployments and installations with very different processes in place. In certain cases 'naked binaries' are more convenient than 'distribution binaries'; in other cases, it is the other way around. Both deserve their place in the R world.
II. outsource to another layer
In the case of 'naked binaries', dependency management indeed needs to be outsourced to another layer. People typically use a data store (e.g. the GitHub repo Iñaki mentioned, https://github.com/cran4linux/sysreqs, or others such as https://github.com/r-hub/r-system-requirements). Other people add a REST API on top, which is e.g. what the PPM does (sysreqs resources in https://packagemanager.posit.co/__api__/swagger/index.html#/). To give a taste of what outsourcing to another layer looks like: to build reproducible environments we automatically generate Dockerfiles starting from the R code of a project. While the Dockerfile is being generated, the system dependencies of the relevant packages are queried (using one of the sources above) and used to automagically insert the appropriate 'apt install' commands ahead of the relevant install.packages commands in the Dockerfile.
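As a rough illustration of the Dockerfile generation described above, here is a minimal sketch in Python. It is not our actual tooling. The SYSREQS table, the render_dockerfile name and the base image tag are assumptions made for the example; a real tool would query one of the sources named above instead of a hard-coded table, and 'apt-get install' stands in for the 'apt install' commands.

```python
from typing import Dict, List

# Illustrative mapping only (assumption): R package -> Debian/Ubuntu system packages.
# A real tool would look these up in a sysreqs data store or via a REST API.
SYSREQS: Dict[str, List[str]] = {
    "stringi": ["libicu-dev"],
    "xml2": ["libxml2-dev"],
    "curl": ["libcurl4-openssl-dev"],
}


def render_dockerfile(
    packages: List[str],
    sysreqs: Dict[str, List[str]] = SYSREQS,
    base_image: str = "rocker/r-ver:4.4",  # assumed base image, swap in your own
) -> str:
    """Return Dockerfile text with system dependencies installed before the R packages."""
    # Collect the system packages of all requested R packages, without duplicates.
    apt_pkgs = sorted({dep for pkg in packages for dep in sysreqs.get(pkg, [])})

    lines = [f"FROM {base_image}"]
    if apt_pkgs:
        # The apt step is emitted first so it precedes install.packages().
        lines.append(
            "RUN apt-get update && apt-get install -y --no-install-recommends "
            + " ".join(apt_pkgs)
            + " && rm -rf /var/lib/apt/lists/*"
        )
    quoted = ", ".join(f'"{p}"' for p in packages)
    lines.append(f"RUN Rscript -e 'install.packages(c({quoted}))'")
    return "\n".join(lines) + "\n"
```

Calling render_dockerfile(["stringi", "xml2"]) yields a FROM line, one apt-get line listing libicu-dev and libxml2-dev, and one install.packages line. Packages without an entry in the table simply contribute no system dependencies.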
III. goal in itself or ingredient for distribution packages
Building 'naked binaries' is a necessary step in the process of building 'distribution packages'. To some, the naked binary is the end goal (often used in combination with a data source on dependencies). To others, it is one of the ingredients for the end goal (the distribution package). In both scenarios, it makes sense to store these naked binaries somewhere in order not to duplicate the builds. I personally think CRAN would be a good place to store them, since it would be a neutral, central place that immediately gives a seal of quality and trust.
IV. repository layout
Many organizations have internal repositories mimicking the CRAN layout. Many tool developers also make certain assumptions about the repository layout. E.g. if one wants versioned installs of source packages, one goes to src/contrib/Archive/$PACKAGE_NAME/ to retrieve the version. This is not explicitly documented and I am glad to hear Jeroen thinks such standardization would be incredibly useful. Currently, the hosting of naked binaries actually makes use of a trick: the packages are put under /src/contrib, as in https://cran.r-universe.dev/bin/linux/noble/4.4/src/contrib/BayesERtools_0.2.0.tar.gz
While handy, it is a bit odd and may be confusing to some. We can further work out the specs, but a generalization of the repository layout that also allows an Archive for binary packages (for Windows, macOS, naked binaries) would be very useful for fast versioned package installation. This is important if you want to combine performance with reproducible environments (which are impossible to achieve with the latest binary versions, even when using distribution binaries). This is not a request for CRAN to offer all versions of binary packages, but if the layout is clearly specified, other package manager infrastructure (public or internal to organizations) can fill this gap.
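To make the layout assumptions above concrete, here is a small Python sketch of the URL construction that tools tend to hard-code. The function names are hypothetical. The PACKAGE_VERSION.tar.gz file name pattern is taken from the example URL above, and the Linux binary path simply mirrors that one URL; it is not a documented specification.

```python
def archive_url(repo: str, package: str, version: str) -> str:
    """URL of an archived source version under src/contrib/Archive/$PACKAGE_NAME/."""
    base = repo.rstrip("/")
    return f"{base}/src/contrib/Archive/{package}/{package}_{version}.tar.gz"


def linux_binary_url(
    repo: str, distro: str, r_version: str, package: str, version: str
) -> str:
    """URL of a 'naked binary' using the /src/contrib trick seen in the example URL.

    Assumption: the path shape bin/linux/<distro>/<R version>/src/contrib/ is
    inferred from that single URL and may differ between repositories.
    """
    base = repo.rstrip("/")
    return (
        f"{base}/bin/linux/{distro}/{r_version}/src/contrib/"
        f"{package}_{version}.tar.gz"
    )
```

Note that on CRAN the current version of a package sits directly under src/contrib/ and only older versions are found in Archive, so a versioned installer typically has to try both locations. A specified binary Archive would allow the same lookup for binaries.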
My message reminds me of the New Year Wish List mails Gabor Grothendieck tended to send out roughly two decades ago (and maybe more recently), and I hope it triggers further discussion and ideas.
Thanks again!
Kind regards,
Tobias