[R-pkg-devel] suggestion: conda for third-party software

Kevin Ushey kevinushey using gmail.com
Wed Jan 8 19:24:52 CET 2020


It would also be worth looking at the basilisk package:

https://github.com/LTLA/basilisk

The approach there is to embed a Conda installation in the R package
itself. The benefit is that maintaining the Conda installation becomes
the package author's responsibility (not CRAN's or the users'); the
drawback is that installing or upgrading that Conda environment may
become more challenging.

One other large benefit of this approach is that it forces R package
authors who want to use Python through reticulate to standardize on
the same environment. Note that reticulate can only bind to a single
Python session per R session, so R packages with incompatible Python
dependencies could quickly become an issue. (Python packages tend to
rely on virtual environments, and so tend to declare narrower
dependency version requirements.) Hence, having a "standardized"
Python environment that can be used by R packages through reticulate
(or other Python-wrapping packages) should be very useful.
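
The conflict described above can be illustrated with a minimal Python
sketch. This is not how reticulate or basilisk resolve dependencies.
The R package names (pkgA, pkgB), the requirement strings, the list of
available versions, and all helper names are hypothetical, and only
simple "name<op>version" requirements are handled.

```python
import operator
import re

# Matches simple requirements such as "numpy>=1.17" (an assumed format).
SPEC = re.compile(r"^\s*([A-Za-z0-9_.-]+)\s*(>=|<=|==|<|>)\s*(\d+(?:\.\d+)*)\s*$")
OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
       "<": operator.lt, ">": operator.gt}


def parse_requirement(text):
    """Split e.g. 'numpy>=1.17' into ('numpy', '>=', (1, 17))."""
    match = SPEC.match(text)
    if match is None:
        raise ValueError(f"unsupported requirement: {text!r}")
    name, op, version = match.groups()
    return name, op, tuple(int(part) for part in version.split("."))


def satisfies(candidate, op, bound):
    """Compare version tuples, padding with zeros so (1, 17) == (1, 17, 0)."""
    width = max(len(candidate), len(bound))
    left = candidate + (0,) * (width - len(candidate))
    right = bound + (0,) * (width - len(bound))
    return OPS[op](left, right)


def shared_versions(declared, available):
    """Map each Python package to the versions acceptable to every R package.

    An empty list means no single environment can satisfy all R packages.
    """
    constraints = {}
    for requirements in declared.values():
        for text in requirements:
            name, op, bound = parse_requirement(text)
            constraints.setdefault(name, []).append((op, bound))
    return {
        name: [v for v in available.get(name, [])
               if all(satisfies(v, op, bound) for op, bound in rules)]
        for name, rules in constraints.items()
    }


# Hypothetical declarations from two R packages sharing one R session.
declared = {"pkgA": ["numpy>=1.17"], "pkgB": ["numpy<1.16"]}
available = {"numpy": [(1, 15), (1, 16), (1, 17), (1, 18)]}
print(shared_versions(declared, available))
```

With these made-up constraints the list for numpy comes back empty: no
single Python environment satisfies both R packages, which is the
situation a shared, standardized environment is meant to avoid.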

If you're curious, there's a more detailed discussion here:

https://github.com/LTLA/basilisk/issues/2

Best,
Kevin

On Wed, Jan 8, 2020 at 8:34 AM Kevin Ushey <kevinushey using gmail.com> wrote:
>
> On Tue, Jan 7, 2020 at 10:42 PM Sokol Serguei <serguei.sokol using gmail.com> wrote:
> >
> > Thanks for this hint.
> >
> > On Jan 7, 2020 at 20:47, Kevin Ushey wrote:
> > > The newest version of reticulate does something very similar: R
> > > packages can declare their Python package dependencies in the
> > > Config/reticulate field of a DESCRIPTION file, and reticulate can read
> > > and use those dependencies to provision a Python environment for the
> > > user when requested (currently using Miniconda).
> >
> > If Miniconda is used, does that mean that any conda package, not only
> > Python ones, can be declared as a dependency?
>
> In theory yes, but reticulate only accepts Python package dependencies
> since its primary goal is interoperation with Python.
>
> > And another question: do you know if Miniconda is installed on the CRAN
> > test machines? (Without it, I cannot see how your packages with conda
> > dependencies could be tested during their submission.)
>
> I don't think so. I can't speak for CRAN, but their time is precious
> and it seems unlikely to me that they would be willing to expend the
> time needed to maintain Conda installations across their fleet of CRAN
> machines.
>
> Packages using Miniconda in this way could still run their tests on
> different types of infrastructure, though (e.g. Travis CI).
>
> > Best,
> >
> > Serguei.
> >
> > >
> > > Similarly, rather than putting this in SystemRequirements, package
> > > authors could declare these dependencies in a separate field called
> > > e.g. Config/conda. Then, you could have an R package that knows how
> > > to read and parse these configuration requests, and install those
> > > packages for the user.
> > >
> > > That said, maintaining a Conda installation and its environments is
> > > non-trivial, and things do not always work as expected when mixing
> > > Conda applications with non-Conda applications. Most notably, Conda
> > > installations bundle their own copies of libraries; e.g. the C++
> > > standard library, Qt, OpenSSL, and so on. If an application tries to
> > > mix and match both system-provided and Conda-provided libraries in the
> > > same process, bad things often happen. This was still the
> > > lowest-friction way forward for us with reticulate, but it's worth
> > > being aware that Conda is not a total panacea.
> > >
> > > Best,
> > > Kevin
> > >
> > > On Tue, Jan 7, 2020 at 6:50 AM Serguei Sokol <serguei.sokol using gmail.com> wrote:
> > >> Best wishes for 2020!
> > >>
> > >> I would like to suggest a new feature for R package management. Its aim
> > >> is to enable package developers and end-users to rely on conda (
> > >> https://docs.conda.io/en/latest/ ) for managing third-party software
> > >> (TPS) on major platforms: linux64, win64 and osx64. Currently, many R
> > >> packages bundle TPS, which bloats their size and often duplicates
> > >> files on a given system. And even when TPS is not bundled with an R
> > >> package but is simply installed on the system, finding the right path
> > >> to it is not always straightforward. Sometimes pkg-config helps, but
> > >> it is not always present.
> > >>
> > >> So, the new feature would be to let R package developers write in the
> > >> SystemRequirements field of DESCRIPTION something like
> > >> 'conda:boost-cpp>=1.71', where 'boost-cpp' is an example of a conda
> > >> package and '>=1.71' is an optional version requirement. This would
> > >> allow install.packages() to install TPS on a CRAN test machine or on
> > >> an end-user's one. (There is just one line to execute in a shell:
> > >> conda install <pkg-name>. It installs the package itself as well as
> > >> all its dependencies.)
> > >>
> > >> To my mind, this feature would have the following advantages:
> > >>    - disk space savings, as the same TPS does not have to be included
> > >> in the R package itself and can be shared with other language
> > >> wrappers, e.g. Python;
> > >>    - easy configuration of flags in Makevars, as paths to TPS will be
> > >> known in advance;
> > >>    - CRAN machines could test packages relying on a wide range of TPS
> > >> without bothering with manual installation;
> > >>    - TPS installation can become transparent for the end-user on major
> > >> platforms.
> > >>
> > >> Note that even though R itself is available through conda (
> > >> https://anaconda.org/conda-forge/r-base ), using conda's R build is
> > >> not mandatory for this feature. Here, conda is just meant to
> > >> facilitate access to TPS. However, a minimal requirement is obviously
> > >> to have conda itself installed.
> > >>
> > >> Does this look reasonable? Appealing?
> > >> Best,
> > >> Serguei.
> > >>
> > >> ______________________________________________
> > >> R-package-devel using r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >
> >


