[R] when installing packages for R on Linux, is it better to use my distro's package manager, or install.packages()?

Ivan Krylov
Sun Sep 29 11:16:39 CEST 2024


On Sat, 28 Sep 2024 18:05:08 -0400
"Christopher W. Ryan" <cwr using agencystatistical.com> wrote:

> To install a new R package, is it better to use Linux Mint's package
> manager (e.g. synaptic, apt-get, or similar), or to install it within
> R with install.packages("some_new_package")?

Since you've already installed R itself from the same package manager,
installing the packages from there will bring you a lot of convenience:

 - the packages needing compilation are pre-built, so you won't spend
   time compiling their C/C++/Fortran/... source code
 - the packages that depend on third-party libraries have these
   dependencies specified, so you won't spend time hunting around
   ./configure failures
 - when you update your Mint to the next version, R and the packages
   installed from the package manager will be updated too (assuming the
   package still exists in its .deb form!)
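
As a rough illustration of the package-manager route: Debian-derived
distributions such as Mint conventionally name the binary package for a
CRAN package "r-cran-" followed by the lowercased CRAN name. The helper
name deb_name_for_cran below is made up for this sketch, and the install
command is only printed, not run, because whether a given package
actually exists depends on the repositories you have configured.

```shell
# Hypothetical helper (not part of apt or R): map a CRAN package name to
# the conventional Debian-style binary package name.
deb_name_for_cran() {
    printf 'r-cran-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Dry run: print the command instead of running it, since it needs root.
for pkg in Rcpp data.table; do
    echo "sudo apt-get install $(deb_name_for_cran "$pkg")"
done
```

Before installing, "apt-cache policy" followed by the package name shows
which version, if any, your repositories offer.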

As for the downsides:

 - The choice of CRAN packages available as part of your Linux distro
   is limited, and their versions may be outdated compared to CRAN.

   Fortunately, since you're running R-4.4.1 on Linux Mint 20.04, you
   must have installed it from CRAN, for which compatible builds of the
   latest versions of all CRAN packages are also available [1], as
   already suggested by Ben Bolker.

 - You can only install one version of an R package as a system
   package. While well-designed packages with lean dependency trees
   will stay compatible with old scripts (and package-level dependencies)
   for a long time, there will eventually come a time when you need
   package <P> version exactly <V>, and general dependency management
   is one of the scariest spectres haunting the IT industry.

   A pure-R package will probably install cleanly; you just need to
   download it (and compatible versions of its dependencies!) from the
   CRAN archive. The 'renv' package already mentioned by Jeff Newmiller
   may help with this.
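
A minimal sketch of fetching an old source tarball by hand, assuming the
usual layout of the CRAN archive (src/contrib/Archive/<P>/<P>_<V>.tar.gz).
The helper name and the package "somepkg" version "1.2.3" are
placeholders, not a real package; the download and install commands are
printed rather than executed.

```shell
# Hypothetical helper: build the CRAN archive URL for package $1, version $2.
cran_archive_url() {
    printf 'https://cran.r-project.org/src/contrib/Archive/%s/%s_%s.tar.gz\n' \
        "$1" "$1" "$2"
}

pkg=somepkg     # placeholder for <P>
ver=1.2.3       # placeholder for <V>

# Dry run: drop the leading "echo" to really download and install.
echo curl -fLO "$(cran_archive_url "$pkg" "$ver")"
echo R CMD INSTALL "${pkg}_${ver}.tar.gz"
```

R CMD INSTALL does not fetch dependencies, so compatible versions of
those have to be installed first in the same way. The current release of
a package is not under Archive/ but directly under src/contrib/.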

   Some old versions of R packages will fail with a new R version, so
   now you need an old version of R. On Linux, you will probably have
   to compile it from source and then install all the packages from
   old sources as well.
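
Building an old R from source might look roughly like this. The version
4.3.0 and the install prefix are arbitrary choices for the sketch, the
helper name is made up, and the build needs a compiler toolchain and
development headers that are not shown. The commands are printed as a
dry run.

```shell
# Hypothetical helper: URL of the R source tarball for version $1 on CRAN.
# "${1%%.*}" strips everything after the first dot, leaving the major version.
r_source_url() {
    printf 'https://cran.r-project.org/src/base/R-%s/R-%s.tar.gz\n' \
        "${1%%.*}" "$1"
}

ver=4.3.0                     # example version, pick the one you need
prefix="$HOME/opt/R-${ver}"   # arbitrary per-version install directory

# Dry run: drop the leading "echo" to execute each step.
echo curl -fLO "$(r_source_url "$ver")"
echo tar xzf "R-${ver}.tar.gz"
echo "cd R-${ver} && ./configure --prefix=${prefix} && make && make install"
```

Installing under a per-version prefix keeps the old R separate from the
one managed by the distribution; it is then started as bin/R under that
prefix.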

   Some old versions of R packages will fail with modern versions of
   their third-party library dependencies. (For example, some old
   versions of R itself fail to check the 'zlib' version properly,
   which can give you very fun crashes when they mix two versions of
   'zlib' in one process.) Some may even fail to compile altogether and
   require emergency patching or an old compiler, and so on.

   The solutions used nowadays include containers (Docker, podman,
   OCI), which accept the fact that a program depends on the whole
   operating system and ship that operating system together with the
   program; and special-purpose reproducible distributions (Nix, GNU
   Guix, some HPC-centred tools) that install every package in a
   versioned directory, enabling multiple versions of the same package
   to coexist. You will just need to make sure that some years from
   now, you will still be able to download the container image from the
   registry, or at least procure all the sources required to install
   package <P> version <V>: old versions of the package and of R
   itself, an old compiler, old third-party dependencies, and so on.
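
One way to hedge against a registry disappearing is to keep your own
copy of the image. The sketch below assumes Docker and uses the
rocker/r-ver image tagged with the R version as an example; the choice
of image and the archive file name are assumptions for illustration, not
something taken from this thread. The commands are printed as a dry run.

```shell
r_version=4.4.1
image="rocker/r-ver:${r_version}"       # assumed example image
archive="r-ver_${r_version}.tar"        # arbitrary local file name

# Dry run: drop the leading "echo" to execute.
echo docker pull "$image"
echo docker save -o "$archive" "$image"
# Later, restore the image from the file without contacting any registry:
echo docker load -i "$archive"
```

The saved file only preserves the image as built; anything installed
later inside a running container has to be captured separately.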

To summarise, for Linux Mint, binary *.deb packages (especially r2u)
offer great convenience, but the more dependencies your project has,
the more effort it will take to reproduce it later, when the binaries
are no longer available.

-- 
Best regards,
Ivan

[1] https://eddelbuettel.github.io/r2u/


