[R-pkg-devel] Error in creating virtual environment on Debian machines

Ivan Krylov |kry|ov @end|ng |rom d|@root@org
Fri Sep 6 12:45:03 CEST 2024


Hello Taeyong and welcome to R-package-devel!

В Thu, 5 Sep 2024 23:40:00 +0300
Taeyong Park <taeyongp using andrew.cmu.edu> пишет:

>   # Define paths
>   venv_path <- file.path(Sys.getenv("HOME"), ".virtualenvs",
> "pytrends-in-r-new")
>   python_path <- file.path(venv_path, "bin", "python")

Please don't require placing the virtual environments under
~/.virtualenvs. 'reticulate' will use this path as a default if you only
provide the name of the virtual environment, but will also let the user
configure virtualenv_root() using environment variables.

>   TrendReq <<- reticulate::import("pytrends.request", delay_load =
> TRUE)$TrendReq
>   ResponseError <<- reticulate::import("pytrends.exceptions",
> delay_load = TRUE)$ResponseError

I'm afraid this defeats the purpose of 'delay_load' by immediately
accessing the module object and forcing 'reticulate' to load it.

I understand the desire to get everything working automatically right
when the package is loaded, but Python dependency management is a
complex topic and not all of it is safe to perform from .onLoad. In
particular, if .onLoad fails, you don't get to let the user call
PytrendsLongitudinalR::install_pytrends() because the namespace will
not be available. Try following the guidance in the 'reticulate'
vignettes [1]:

1. In .onLoad, only express a soft preference for a named virtual
environment and create but do not access lazy-load Python module
objects:

.onLoad <- function(libname, pkgname) {
 use_virtualenv("pytrends-in-r-new", required = FALSE)
 pytrends.request <<- reticulate::import("pytrends.request", delay_load
= TRUE)
 pd <<- reticulate::import("pandas", delay_load = TRUE)
 # and so on
}

2. Move all the installation work into a separate function:

install_pytrends <- function(envname = "pytrends-in-r-new", ...)
 # the vignette suggests wiping the "pytrends-in-r-new" venv here,
 # just in case
 py_install(
  c("pandas", "requests", "pytrends", "rich"),
  envname = envname, ...
 )

3. In tests and examples, wrap all uses of Python in checks for
py_module_available(...). In regular code, you can suggest running
install_pytrends(), but don't run it for the user. _Automatically_
installing additional software, whether Python modules or Python
itself, is prohibited by the CRAN policy [2]:

>> Packages should not write in the user’s home filespace <...> nor
>> anywhere else on the file system apart from the R session’s
>> temporary directory (or during installation in the location pointed
>> to by TMPDIR: and such usage should be cleaned up). <...>
>> Limited exceptions may be allowed in interactive sessions if the
>> package obtains confirmation from the user.

Admittedly, this complicates the tests and the examples for your
package with boilerplate. I see that 'tensorflow', for example, wraps
all its examples in \dontrun{}, but it's an exceptional package. A few
other packages that depend on 'reticulate' that I've just taken a look
at do wrap their examples in checks for the Python packages being
available.

-- 
Best regards,
Ivan

[1]
https://cran.r-project.org/package=reticulate/vignettes/package.html
https://cran.r-project.org/package=reticulate/vignettes/python_dependencies.html

[2]
https://cran.r-project.org/web/packages/policies.html



More information about the R-package-devel mailing list