[Bioc-devel] Python module "tensorflow_probability" not found

Aaron Lun |n||n|te@monkey@@w|th@keybo@rd@ @end|ng |rom gm@||@com
Sat Jul 8 23:09:09 CEST 2023


Not hard to have OS-specific environments, see for example:

https://github.com/alanocallaghan/snifter/blob/devel/R/basilisk.R
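
A minimal sketch of the idea, not copied from the linked file: choose the conda pins per operating system in plain R, then hand the result to basilisk. The function name `tf_pins_for_os` and the exact version strings are assumptions for illustration; the thread only mentions that 2.11.1 is unavailable for Windows and that a 2.10 build exists on the anaconda channel.

```r
# Pick conda pins by operating system, using base R only.
# The version strings below are illustrative assumptions, not tested pins.
tf_pins_for_os <- function(os = Sys.info()[["sysname"]]) {
    switch(os,
        Windows = "tensorflow=2.10.0",  # older build, assumed for Windows
        Darwin  = "tensorflow=2.11.1",  # macOS could get its own pin here
        "tensorflow=2.11.1"             # default branch: Linux and others
    )
}

# Resolve the pin for the machine running this code.
pins <- tf_pins_for_os()
print(pins)
```

The returned vector would then go into the `packages` argument of `basilisk::BasiliskEnvironment()`. The Windows pin would probably also need a different channel, which this sketch leaves out.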

-A

On 7/6/23 20:23, Kasper Daniel Hansen wrote:
> This sounds excellent Kim!
> 
> Here you can get 2.10 for Windows: https://anaconda.org/anaconda/tensorflow
> although my experience is that I hate mixing channels on conda. It is also
> quite interesting that this conda package also has Windows at an older
> version (but just 2.10 vs. 2.12)
> 
> This really speaks to a potential need for having basilisk dependencies
> being platform specific. This would also come in handy for macOS. But
> AFAIK, this is not supported by basilisk currently. Might be something we
> need to address.
> 
> Best,
> Kasper
> 
> 
> 
> On Thu, Jul 6, 2023 at 5:38 PM Kim Philipp Jablonski <
> kim.philipp.jablonski using gmail.com> wrote:
> 
>> Thank you all so much for your input and the references!
>>
>> @Kasper: I mostly rely on tensorflow and tensorflow-probability, so I might
>> somehow get rid of the keras dependency but it would require some work.
>>
>> After being inspired by the lovely orthos package (thanks Charlotte!), I
>> decided to play around further in the basilisk direction and updated my
>> project (https://github.com/cbg-ethz/pareg/tree/feature-basilisk).
>> The hardest part was figuring out a set of package versions which satisfy
>> conda's package manager (@Vincent, I feel you!).
>> But then it just magically worked on my local machine.
>>
>> When testing with GitHub Actions, the windows runner crashes with
>> PackagesNotFoundError: The following packages are not available from
>> current channels:
>>    - tensorflow=2.11.1
>> A look at conda-forge (https://anaconda.org/conda-forge/tensorflow) reveals
>> that for win-64, there's only v1.14.0 available... I guess ignoring the
>> windows build is not an option for my bioc package?
>>
>> For the ubuntu runner, my vignettes were created successfully ("* creating
>> vignettes ... OK"). My tests still fail, but that is expected because I
>> have not wrapped them with basiliskRun. Do I have to do this manually for
>> every function call which may interact with tensorflow (so much
>> boilerplate), or can I somehow implicitly use the created conda env for
>> every function in my package?
>>
>> On Thu, Jul 6, 2023 at 2:08 PM Vincent Carey <stvjc using channing.harvard.edu>
>> wrote:
>>
>>> That's great news.  FWIW I am finding that the advice at
>>> https://rstudio.github.io/reticulate/articles/python_dependencies.html
>>> can work to produce properly resolved python dependencies.  Just don't
>>> follow the example literally; the requested
>>> scipy version may not exist.  Version 1.11.1 does. Stay tuned.
>>>
>>> On Thu, Jul 6, 2023 at 7:43 AM Charlotte Soneson <
>>> charlottesoneson using gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> in case it's useful: we have a package (orthos) in review (
>>>> https://github.com/Bioconductor/Contributions/issues/3042) which uses
>>>> basilisk to set up a conda environment with tensorflow and keras. It builds
>>>> and runs fine both on GitHub Actions (GitHub repo here:
>>>> https://github.com/fmicompbio/orthos) and on the Single Package Builder.
>>>> We have also tested (locally) that it will use GPUs if available, and that
>>>> the GPU configuration can be controlled from the R session (outside of the
>>>> package), e.g. by setting the CUDA_VISIBLE_DEVICES environment variable.
>>>>
>>>> Charlotte
>>>>
>>>>> On 5 Jul 2023, at 23:12, Kasper Daniel Hansen <kasperdanielhansen using gmail.com> wrote:
>>>>>
>>>>> So I think Kim is interfacing to tensorflow by using the keras package from
>>>>> CRAN (partly authored by the Rstudio people). This package leaves it to the
>>>>> user to install tensorflow, which is a highly non-trivial installation
>>>>> task. There are some partly helpful instructions for using conda together
>>>>> with reticulate (see the macOS tab on
>>>>> https://tensorflow.rstudio.com/install/local_gpu.html). This is the job
>>>>> that basilisk handles for you. In essence, basilisk allows the developer to
>>>>> specify an R-package-specific conda environment. Tensorflow can be run on a
>>>>> CPU or a GPU. Getting it to run on a user-GPU is extra complicated and I am
>>>>> not sure basilisk can handle this.
>>>>>
>>>>> Going forward, we (Bioc) want to decide if we want to support keras on our
>>>>> build system. This will require some work, because it is definitely not
>>>>> trivial to get to work (but much more possible if we limit ourselves to
>>>>> running on CPU). If we decide to support keras, we should try to figure out
>>>>> how to wrap keras into a basilisk container; perhaps something like
>>>>> creating a keras-basilisk R package, because IF we decide to support keras,
>>>>> this is going to be a major headache (to add to the frustration, tensorflow
>>>>> often rearranges everything so I foresee future issues keeping it
>>>>> operational).
>>>>>
>>>>> For Kim: I think you should consider if there are any alternatives to
>>>>> keras. Even if we get it to work on our build system, users will have major
>>>>> headaches getting this to work (I think). If there are no alternatives to
>>>>> keras, you should perhaps think about doing the keras-basilisk option I
>>>>> outline above (assuming that is feasible; I don't know how keras interfaces
>>>>> with tensorflow). You might also have major headaches in your future: I have
>>>>> found quite big differences in convergence and optimizers over time in TF
>>>>> and you're basically saying it should work with every version of
>>>>> tensorflow >= 2.2. That's a .... strong promise considering also the
>>>>> difference between GPU and CPU.
>>>>>
>>>>> In the meantime, in case the core build team wants some input, I have some
>>>>> experience with tensorflow, although so far my experience is mostly
>>>>> frustration.
>>>>>
>>>>> Best,
>>>>> Kasper
>>>>>
>>>>>
>>>>> On Wed, Jul 5, 2023 at 3:12 PM Vincent Carey <stvjc using channing.harvard.edu>
>>>>> wrote:
>>>>>
>>>>>> I'll try to get clearer on the basilisk situation; I forked your repo
>>>>>> and will plunge in soon.
>>>>>> It may take a while.  In the meantime I hope the BBS python stack can
>>>>>> be looked at to see what the issue might be.  @Jennifer Wokaty is it
>>>>>> possible with the recent R upgrades that tensorflow and
>>>>>> tensorflow-probability might need to be installed/updated?
>>>>>>
>>>>>> On Wed, Jul 5, 2023 at 3:01 PM Kim Philipp Jablonski
>>>>>> <kim.philipp.jablonski using gmail.com> wrote:
>>>>>>>
>>>>>>> Thanks a lot for your response!
>>>>>>>
>>>>>>> The Config/reticulate approach in DESCRIPTION looks very neat. I hope we
>>>>>>> can use it at some point.
>>>>>>>
>>>>>>> Could the issue you are facing with the conda install be that the
>>>>>>> package is called "tensorflow-probability" instead of
>>>>>>> "tensorflow_probability"
>>>>>>> (https://anaconda.org/conda-forge/tensorflow-probability)?
>>>>>>>
>>>>>>> I have followed the basilisk advice and incorporated it into my package
>>>>>>> here: https://github.com/cbg-ethz/pareg/tree/feature-basilisk.
>>>>>>> The basilisk docs say "Any R functions that use Python code should do so
>>>>>>> via basiliskRun()".
>>>>>>> This does not seem possible in my case as I am using
>>>>>>> tensorflow(probability) wrappers instead of calling them directly.
>>>>>>> Building the vignettes with GitHub Actions fails because tensorflow
>>>>>>> could not be found:
>>>>>>> https://github.com/cbg-ethz/pareg/actions/runs/5466980003/jobs/9952690137#step:15:39
>>>>>>> (I might have to activate basilisk somehow, but using basiliskStart seems
>>>>>>> to only make sense with basiliskRun). I could also not find any advice on
>>>>>>> this elsewhere.
>>>>>>> Do you have a suggestion how to best handle this (or should it *just*
>>>>>>> work)?
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 4, 2023 at 1:04 PM Vincent Carey <stvjc using channing.harvard.edu>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Sorry to hear about this.  Our informal outlook on python
>>>>>>>> interoperation in Bioconductor packages is that
>>>>>>>> the basilisk discipline should be used.  This allows you to pin
>>>>>>>> specific versions of all python dependencies
>>>>>>>> and use python for your R functions from an insulated conda environment.
>>>>>>>>
>>>>>>>> There is also information about specifying python dependencies in the
>>>>>>>> DESCRIPTION file at
>>>>>>>> https://rstudio.github.io/reticulate/articles/python_dependencies.html
>>>>>>>> I do not know if our build
>>>>>>>> system or R CMD install take effective advantage of that approach at
>>>>>>>> this time.
>>>>>>>>
>>>>>>>> I was surprised to see that my attempt to use reticulate::py_install()
>>>>>>>> to install tensorflow_probability
>>>>>>>> on my laptop failed:
>>>>>>>>
>>>>>>>> '/home/stvjc/.local/share/r-miniconda/bin/conda' 'install' '--yes'
>>>>>>>> '--prefix' '/home/stvjc/.local/share/r-miniconda/envs/r-reticulate'
>>>>>>>> '-c' 'conda-forge' 'tensorflow_probability'
>>>>>>>> Retrieving notices: ...working... done
>>>>>>>> Collecting package metadata (current_repodata.json): done
>>>>>>>> Solving environment: unsuccessful initial attempt using frozen solve.
>>>>>>>> Retrying with flexible solve.
>>>>>>>> Collecting package metadata (repodata.json): done
>>>>>>>> Solving environment: unsuccessful initial attempt using frozen solve.
>>>>>>>> Retrying with flexible solve.
>>>>>>>>
>>>>>>>> PackagesNotFoundError: The following packages are not available from
>>>>>>>> current channels:
>>>>>>>>
>>>>>>>>   - tensorflow_probability
>>>>>>>>
>>>>>>>> Current channels:
>>>>>>>>
>>>>>>>>   - https://conda.anaconda.org/conda-forge/linux-64
>>>>>>>>   - https://conda.anaconda.org/conda-forge/noarch
>>>>>>>>   - https://repo.anaconda.com/pkgs/main/linux-64
>>>>>>>>   - https://repo.anaconda.com/pkgs/main/noarch
>>>>>>>>   - https://repo.anaconda.com/pkgs/r/linux-64
>>>>>>>>   - https://repo.anaconda.com/pkgs/r/noarch
>>>>>>>>
>>>>>>>> To search for alternate channels that may provide the conda package you're
>>>>>>>> looking for, navigate to
>>>>>>>>
>>>>>>>>     https://anaconda.org
>>>>>>>>
>>>>>>>> and use the search bar at the top of the page.
>>>>>>>>
>>>>>>>>
>>>>>>>> Error: one or more Python packages failed to install [error code 1]
>>>>>>>>
>>>>>>>> Enter a frame number, or 0 to exit
>>>>>>>>
>>>>>>>> 1: py_install("tensorflow_probability")
>>>>>>>>
>>>>>>>> So the "current channels" used by reticulate in my pretty stock
>>>>>>>> installation of R/reticulate seem flawed
>>>>>>>> for this purpose.  I had hoped to write a bit of code that would check
>>>>>>>> for the desired module and install
>>>>>>>> it if missing, that you could include in your package.
>>>>>>>>
>>>>>>>> get_pippath = function() {
>>>>>>>>    pypath = reticulate::conda_list() |> (\(x)x[x$name=="r-reticulate",])()
>>>>>>>>    gsub("python$", "pip", pypath$python)
>>>>>>>> }
>>>>>>>>
>>>>>>>> check_tfp = function() {
>>>>>>>>    pippath = get_pippath()
>>>>>>>>    peek = system(paste0(pippath, " list | grep tensor"), intern=TRUE)
>>>>>>>>    peek = gsub(" +", " ", peek)
>>>>>>>>    tfdf = do.call(rbind, lapply(strsplit(peek, " "), function(x)
>>>>>>>>        data.frame(pkg=x[1], version=x[2])))
>>>>>>>>    "tensorflow_probability" %in% tfdf$pkg
>>>>>>>> }
>>>>>>>>
>>>>>>>> install_tfp = function() {
>>>>>>>>    pippath = get_pippath()
>>>>>>>>    chk = system(paste(pippath, "install tensorflow_probability"), intern=TRUE)
>>>>>>>>    chk2 = check_tfp()
>>>>>>>>    list(sysout=chk, tfp_installed=chk2)
>>>>>>>> }
>>>>>>>>
>>>>>>>> The first two functions seem to do what I want, the latter does not.
>>>>>>>>
>>>>>>>> Suggestion -- read the basilisk vignettes, use it if at all possible.
>>>>>>>> Certainly we can take care of this in our build system when the holiday
>>>>>>>> ends, but it would be great to have the software arrange the solution
>>>>>>>> when necessary, and I don't see a way to accomplish this at the moment.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jul 4, 2023 at 5:32 AM Kim Philipp Jablonski
>>>>>>>> <kim.philipp.jablonski using gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> The latest builds of my package 'pareg' started failing on BioC 3.17 for
>>>>>>>>> Linux and macOS but not Windows:
>>>>>>>>> https://master.bioconductor.org/checkResults/3.17/bioc-LATEST/pareg/nebbiolo1-buildsrc.html
>>>>>>>>> In both cases, the error message is "Python module tensorflow_probability
>>>>>>>>> was not found". This wasn't an issue in earlier releases.
>>>>>>>>>
>>>>>>>>> Do you know what the reason for this is and how I can fix it?
>>>>>>>>> Thanks a lot for your help!
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Kim
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Bioc-devel using r-project.org mailing list
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>>>>>
>>>>>>>> --
>>>>>>>> The information in this e-mail is intended only for the person to whom it
>>>>>>>> is addressed. If you believe this e-mail was sent to you in error and the
>>>>>>>> e-mail contains patient information, please contact the Partners Compliance
>>>>>>>> HelpLine at http://www.partners.org/complianceline . If the e-mail was sent
>>>>>>>> to you in error but does not contain patient information, please contact
>>>>>>>> the sender and properly dispose of the e-mail.


