[Bioc-devel] Tensorflow support for bioconductor packages

Kieran Campbell kieranrcampbell at gmail.com
Thu Mar 29 18:44:30 CEST 2018


Hi Hervé, Michael,

Thanks for your feedback. I will add the reticulate check to ensure
tensorflow is installed before anything runs, and add appropriate
sections to the vignettes. We have one package essentially ready for
submission to bioc, so would the best route forward be to submit now,
or to wait until tensorflow is installed on the build servers?
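
For concreteness, the check could look roughly like the sketch below.
The helper names tensorflow_available() and require_tensorflow() are
placeholders of mine, not functions from reticulate or tensorflow; the
only real calls are reticulate::py_module_available() and the pointer
to tensorflow::install_tensorflow() in the error message.

```r
# Hypothetical helper: TRUE only if reticulate is installed and can
# see the tensorflow Python module.
tensorflow_available <- function() {
  requireNamespace("reticulate", quietly = TRUE) &&
    reticulate::py_module_available("tensorflow")
}

# Hypothetical helper: call this first in every function that relies
# on tensorflow, so that it fails with an informative message.
# 'available' is an argument only so the check can be exercised
# without a Python installation.
require_tensorflow <- function(available = tensorflow_available()) {
  if (!isTRUE(available)) {
    stop("The tensorflow Python module is not available. ",
         "Please run tensorflow::install_tensorflow() and restart R.",
         call. = FALSE)
  }
  invisible(TRUE)
}
```

Nothing is downloaded or installed here; the user is only told what
to run.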

Many thanks

Kieran


On 28 March 2018 at 15:10, Hervé Pagès <hpages at fredhutch.org> wrote:
> On 03/28/2018 02:41 PM, Hervé Pagès wrote:
>>
>> Hi Kieran,
>>
>> Note that you can execute arbitrary code at load time by defining
>> an .onLoad() hook in your package. So you *could* put something
>> like this in your package:
>>
>>    .onUnload <- function(libpath)
>>    {
>>      if (!reticulate::py_module_available("tensorflow"))
>>          tensorflow::install_tensorflow()
>>    }
>
>
> should be .onLoad() in the above code
>
> more below...
>
>>
>> However, having things automatically downloaded/installed on the
>> user's machine at package load-time is not a good idea. There
>> are just too many things that can go wrong.
>>
>> For example, I just tried to run tensorflow::install_tensorflow()
>> on my laptop (Ubuntu 16.04) and was successful only after the 3rd
>> attempt (I had to make some changes/adjustments to my system between
>> each attempt). And Debian Linux is probably the easiest target!
>>
>> Also note that install.packages() tries to load the package at the
>> end of the installation when installing from source so if the
>> .onUnload() hook fails, install.packages() considers that
>
>   ^^^^^^^^^^^
>    .onLoad()
>
> same here, sorry
>
> H.
>
>
>> the installation of the package failed and it removes it.
>>
>> Finally, note that this installation needs to download hundreds of
>> MB of Python stuff.
>>
>> So these are probably the reasons why the authors of the tensorflow
>> CRAN package chose to separate installation of the tensorflow Python
>> module from the installation of the package itself. There are plenty
>> of good reasons for doing that.
>>
>> What I would suggest instead is that you start your vignette with a
>> note reminding the user to run tensorflow::install_tensorflow() if
>> s/he didn't already do it. As a side note: the man page for
>> tensorflow::install_tensorflow() doesn't say how to programmatically
>> figure out whether the tensorflow Python module is already
>> installed. I had to dig in the source code of the unit tests to find
>> reticulate::py_module_available("tensorflow").
>>
>> In addition, you could also start each of your functions that rely on
>> the tensorflow Python module with a check to see whether the module is
>> available, and fail gracefully (with an informative error message) if
>> it's not.
>>
>> We'll figure out a way to install the tensorflow Python module on our
>> build machines.
>>
>> Hope this helps,
>> H.
>>
>>
>> On 03/28/2018 09:23 AM, Kieran Campbell wrote:
>>>
>>> Hi all,
>>>
>>> Rstudio have released the Tensorflow package for R -
>>>
>>> https://tensorflow.rstudio.com/tensorflow/
>>> - and we have started
>>> incorporating it into some of our genomics packages for the heavy
>>> numerical computation.
>>>
>>> We would ideally like these to be submitted to Bioconductor, but
>>> there's a custom line required for Tensorflow installation in that
>>> after calling
>>>
>>> install.packages("tensorflow")
>>>
>>> then Tensorflow must be installed via
>>>
>>> tensorflow::install_tensorflow()
>>>
>>> which would break package testing if tensorflow was simply imported
>>> into the R package and wasn't already installed. Is there any way to
>>> customise a package installation within Bioconductor to trigger the
>>> tensorflow::install_tensorflow() ?
>>>
>>> As more people use tensorflow / deep learning in genomics I can see
>>> this being a problem so it would be good to have a solution in place.
>>>
>>> Many thanks,
>>>
>>> Kieran Campbell
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>>
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fredhutch.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319


