[Bioc-devel] Python module "tensorflow_probability" not found

Charlotte Soneson ch@r|otte@one@on @end|ng |rom gm@||@com
Thu Jul 6 13:42:55 CEST 2023


Hi, 

in case it's useful: we have a package (orthos) in review (https://github.com/Bioconductor/Contributions/issues/3042) which uses basilisk to set up a conda environment with tensorflow and keras. It builds and runs fine both on GitHub Actions (GitHub repo here: https://github.com/fmicompbio/orthos) and on the Single Package Builder. We have also tested (locally) that it will use GPUs if available, and that the GPU configuration can be controlled from the R session (outside of the package), e.g. by setting the CUDA_VISIBLE_DEVICES environment variable. 
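
As a minimal sketch of that last point, assuming only base R: the helper name with_visible_gpus is made up for illustration and is not part of orthos or basilisk. The variable has to be in place before the Python process initializes TensorFlow, because CUDA reads it when it initializes.

```r
# Hypothetical helper: evaluate `expr` with CUDA_VISIBLE_DEVICES set to
# `devices`, then restore the previous state of the variable.
#   devices = "0"   exposes only the first GPU
#   devices = "-1"  hides all GPUs (an invalid index, so none are visible)
with_visible_gpus <- function(devices, expr) {
  old <- Sys.getenv("CUDA_VISIBLE_DEVICES", unset = NA)
  on.exit({
    if (is.na(old)) {
      Sys.unsetenv("CUDA_VISIBLE_DEVICES")
    } else {
      Sys.setenv(CUDA_VISIBLE_DEVICES = old)
    }
  }, add = TRUE)
  Sys.setenv(CUDA_VISIBLE_DEVICES = devices)
  expr  # lazily evaluated here, i.e. after the variable has been set
}

# Example: the variable is visible inside the call and restored afterwards.
seen <- with_visible_gpus("-1", Sys.getenv("CUDA_VISIBLE_DEVICES"))
```

In practice the second argument would be the call that starts the Python session, so that the setting is in effect when TensorFlow is first loaded.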

Charlotte

> On 5 Jul 2023, at 23:12, Kasper Daniel Hansen <kasperdanielhansen using gmail.com> wrote:
> 
> So I think Kim is interfacing to tensorflow by using the keras package from
> CRAN (partly authored by the RStudio people). This package leaves it to the
> user to install tensorflow, which is a highly non-trivial installation
> task. There are some partly helpful instructions for using conda together
> with reticulate (see the macOS tab on
> https://tensorflow.rstudio.com/install/local_gpu.html). This is the job
> that basilisk handles for you. In essence, basilisk allows the developer to
> specify an R-package-specific conda environment. Tensorflow can be run on a
> CPU or a GPU. Getting it to run on a user's GPU is extra complicated and I
> am not sure basilisk can handle this.
> 
> Going forward, we (Bioc) want to decide if we want to support keras on our
> build system. This will require some work, because it is definitely not
> trivial to get to work (but much more possible if we limit ourselves to
> running on CPU). If we decide to support keras, we should try to figure out
> how to wrap keras into a basilisk container; perhaps something like
> creating a keras-basilisk R package, because IF we decide to support keras,
> this is going to be a major headache (to add to the frustration, tensorflow
> often rearranges everything, so I foresee future issues keeping it
> operational).
> 
> For Kim: I think you should consider if there are any alternatives to
> keras. Even if we get it to work on our build system, users will have major
> headaches getting this to work (I think). If there are no alternatives to
> keras, you should perhaps think about doing the keras-basilisk option I
> outline above (assuming that is feasible; I don't know how keras interfaces
> with tensorflow). You might also have major headaches in your future: I have
> found quite big differences in convergence and optimizers over time in TF
> and you're basically saying it should work with every version of
> tensorflow >= 2.2. That's a .... strong promise considering also the
> difference between GPU and CPU.
> 
> In the meantime, in case the core build team wants some input, I have some
> experience with tensorflow, although so far my experience is mostly
> frustration.
> 
> Best,
> Kasper
> 
> 
> On Wed, Jul 5, 2023 at 3:12 PM Vincent Carey <stvjc using channing.harvard.edu>
> wrote:
> 
>> I'll try to get clearer on the basilisk situation; I forked your repo
>> and will plunge in soon.
>> It may take a while.  In the meantime I hope the BBS python stack can
>> be looked at
>> to see what the issue might be.  @Jennifer Wokaty is it possible with
>> the recent R upgrades that
>> tensorflow and tensorflow-probability might need to be installed/updated?
>> 
>> On Wed, Jul 5, 2023 at 3:01 PM Kim Philipp Jablonski
>> <kim.philipp.jablonski using gmail.com> wrote:
>>> 
>>> Thanks a lot for your response!
>>> 
>>> The Config/reticulate approach in DESCRIPTION looks very neat. I hope we
>>> can use it at some point.
>>> 
>>> Could the issue you are facing with the conda install be that the
>>> package is called "tensorflow-probability" instead of
>>> "tensorflow_probability"
>>> (https://anaconda.org/conda-forge/tensorflow-probability)?
>>> 
>>> I have followed the basilisk advice and incorporated it into my package
>>> here: https://github.com/cbg-ethz/pareg/tree/feature-basilisk.
>>> The basilisk docs say "Any R functions that use Python code should do so
>>> via basiliskRun()".
>>> This does not seem possible in my case as I am using
>>> tensorflow(probability) wrappers instead of calling them directly.
>>> Building the vignettes with GitHub Actions fails because tensorflow
>>> could not be found:
>>> https://github.com/cbg-ethz/pareg/actions/runs/5466980003/jobs/9952690137#step:15:39
>>> (I might have to activate basilisk somehow, but using basiliskStart seems
>>> to only make sense with basiliskRun). I could also not find any advice on
>>> this elsewhere.
>>> Do you have a suggestion how to best handle this (or should it *just*
>>> work)?
>>> 
>>> 
>>> On Tue, Jul 4, 2023 at 1:04 PM Vincent Carey <stvjc using channing.harvard.edu>
>>> wrote:
>>>> 
>>>> Sorry to hear about this.  Our informal outlook on python
>>>> interoperation in Bioconductor packages is that
>>>> the basilisk discipline should be used.  This allows you to pin
>>>> specific versions of all python dependencies
>>>> and use python for your R functions from an insulated conda environment.
>>>> 
>>>> There is also information about specifying python dependencies in the
>>>> DESCRIPTION file at
>>>> https://rstudio.github.io/reticulate/articles/python_dependencies.html
>>>> I do not know if our build system or R CMD INSTALL takes effective
>>>> advantage of that approach at this time.
>>>> 
>>>> I was surprised to see that my attempt to use reticulate::py_install()
>>>> to install tensorflow_probability
>>>> on my laptop failed:
>>>> 
>>>> '/home/stvjc/.local/share/r-miniconda/bin/conda' 'install' '--yes'
>>>> '--prefix' '/home/stvjc/.local/share/r-miniconda/envs/r-reticulate'
>>>> '-c' 'conda-forge' 'tensorflow_probability'
>>>> Retrieving notices: ...working... done
>>>> Collecting package metadata (current_repodata.json): done
>>>> Solving environment: unsuccessful initial attempt using frozen solve.
>>>> Retrying with flexible solve.
>>>> Collecting package metadata (repodata.json): done
>>>> Solving environment: unsuccessful initial attempt using frozen solve.
>>>> Retrying with flexible solve.
>>>> 
>>>> PackagesNotFoundError: The following packages are not available from
>>>> current channels:
>>>> 
>>>>  - tensorflow_probability
>>>> 
>>>> Current channels:
>>>> 
>>>>  - https://conda.anaconda.org/conda-forge/linux-64
>>>>  - https://conda.anaconda.org/conda-forge/noarch
>>>>  - https://repo.anaconda.com/pkgs/main/linux-64
>>>>  - https://repo.anaconda.com/pkgs/main/noarch
>>>>  - https://repo.anaconda.com/pkgs/r/linux-64
>>>>  - https://repo.anaconda.com/pkgs/r/noarch
>>>> 
>>>> To search for alternate channels that may provide the conda package you're
>>>> looking for, navigate to
>>>> 
>>>>    https://anaconda.org
>>>> 
>>>> and use the search bar at the top of the page.
>>>> 
>>>> 
>>>> Error: one or more Python packages failed to install [error code 1]
>>>> 
>>>> Enter a frame number, or 0 to exit
>>>> 
>>>> 1: py_install("tensorflow_probability")
>>>> 
>>>> So the "current channels" used by reticulate in my pretty stock
>>>> installation of R/reticulate seem flawed
>>>> for this purpose.  I had hoped to write a bit of code that would check
>>>> for the desired module and install
>>>> it if missing, that you could include in your package.
>>>> 
>>>> # Path to pip in reticulate's default "r-reticulate" conda environment.
>>>> get_pippath = function() {
>>>>   pypath = reticulate::conda_list() |>
>>>>     (\(x) x[x$name=="r-reticulate",])()
>>>>   gsub("python$", "pip", pypath$python)
>>>> }
>>>> 
>>>> # TRUE if `pip list` reports a package named "tensorflow_probability".
>>>> check_tfp = function() {
>>>>   pippath = get_pippath()
>>>>   peek = system(paste0(pippath, " list | grep tensor"), intern=TRUE)
>>>>   peek = gsub(" +", " ", peek)
>>>>   tfdf = do.call(rbind, lapply(strsplit(peek, " "), function(x)
>>>>     data.frame(pkg=x[1], version=x[2])))
>>>>   "tensorflow_probability" %in% tfdf$pkg
>>>> }
>>>> 
>>>> # Install via pip, then report pip's output and whether the check passes.
>>>> install_tfp = function() {
>>>>   pippath = get_pippath()
>>>>   chk = system(paste(pippath, "install tensorflow_probability"),
>>>>                intern=TRUE)
>>>>   chk2 = check_tfp()
>>>>   list(sysout=chk, tfp_installed=chk2)
>>>> }
>>>> 
>>>> The first two functions seem to do what I want, the last does not.
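
A caveat for name checks of this kind, separate from the install problem itself: the Python module is imported as tensorflow_probability, the conda-forge package is named tensorflow-probability, and pip treats '-', '_' and '.' as equivalent in project names (the PEP 503 normalization). A check that normalizes both sides does not depend on which spelling a tool prints. The sketch below uses base R only; the helper names and the sample listing are illustrative and not part of reticulate or pip.

```r
# Normalize a Python project name the way PEP 503 does:
# lower-case, with runs of '-', '_' and '.' collapsed to a single '-'.
normalize_pkg_name <- function(x) {
  gsub("[-_.]+", "-", tolower(x))
}

# Parse lines shaped like `pip list` output ("name   version") into a data
# frame. `lines` is plain text here, so nothing is run on the system.
parse_pip_list <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]
  fields <- strsplit(lines, " +")
  data.frame(
    pkg = vapply(fields, `[`, character(1), 1),
    version = vapply(fields, `[`, character(1), 2),
    stringsAsFactors = FALSE
  )
}

# TRUE if `pkg` appears in the listing under any equivalent spelling.
has_pkg <- function(lines, pkg) {
  listed <- parse_pip_list(lines)
  normalize_pkg_name(pkg) %in% normalize_pkg_name(listed$pkg)
}

# Made-up listing for illustration; the versions are placeholders.
example_lines <- c("tensorflow              2.0.0",
                   "tensorflow-probability  0.0.0")
has_pkg(example_lines, "tensorflow_probability")
```

Feeding it the output of system(paste(pippath, "list"), intern=TRUE) would replace the made-up listing with the real one.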
>>>> 
>>>> Suggestion -- read the basilisk vignettes, use it if at all possible.
>>>> Certainly we can take
>>>> care of this in our build system when the holiday ends, but it would
>>>> be great to have the software
>>>> arrange the solution when necessary, and I don't see a way to
>>>> accomplish this at the moment.
>>>> 
>>>> 
>>>> 
>>>> On Tue, Jul 4, 2023 at 5:32 AM Kim Philipp Jablonski
>>>> <kim.philipp.jablonski using gmail.com> wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>> The latest builds of my package 'pareg' started failing on BioC 3.17 for
>>>>> Linux and macOS but not Windows:
>>>>> https://master.bioconductor.org/checkResults/3.17/bioc-LATEST/pareg/nebbiolo1-buildsrc.html
>>>>> In both cases, the error message is "Python module tensorflow_probability
>>>>> was not found". This wasn't an issue in earlier releases.
>>>>> 
>>>>> Do you know what the reason for this is and how I can fix it?
>>>>> Thanks a lot for your help!
>>>>> 
>>>>> Best regards,
>>>>> Kim
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> Bioc-devel using r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel


