[R-pkg-devel] Question on best approach to develop R package that wraps old complex Python2 software

Vladimir Dergachev vo|ody@ @end|ng |rom m|nd@pr|ng@com
Tue Jan 25 21:12:07 CET 2022



On Tue, 25 Jan 2022, Stefan McKinnon Høj-Edwards wrote:

> If the Python2 package is mainly system() calls, I would write an R package
> that essentially did the same, without relaying calls via the Python
> routines. I.e. let R call the commands directly.
>
> The only downside to this approach is that R doesn't handle multithreading
> as well as Python does. In Python, you have the subcommand (subprocess?)
> module, I believe, with which you can call an external command and send it
> input, check its output, send new input, or just leave it be, until you
> want to check in on your subcommand. But to my knowledge, no similar method
> exists in R.

In R, you can use Tcl (tcltk) to do the same, which actually works better 
than Python.

The problem with Python is that, last time I checked, there were three 
different interfaces for process control and none of them were 
complete - it was impossible to reliably control a third-party program 
while performing other tasks.

In contrast, Tcl has a well-designed interface that you can use in 
non-blocking mode.
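
A minimal sketch of what I mean, assuming a Unix-like system where the 
tcltk package loads; `cat` is only a stand-in for the real third-party 
program, and the channel variable name `ch` is made up for illustration:

```r
library(tcltk)

# Open the external command as a Tcl pipeline channel for reading and writing.
# Assumption: `cat` is available on the PATH (it simply echoes its input).
.Tcl('set ch [open "|cat" r+]')

# Non-blocking mode: reads return immediately instead of waiting for data.
# Line buffering flushes each line we write as soon as it ends.
.Tcl('fconfigure $ch -blocking 0 -buffering line')

# Send one line of input to the child process.
.Tcl('puts $ch "hello"')

# R is free to do other work here; the sleep just stands in for that work.
Sys.sleep(0.5)

# Poll for output. An empty string means no complete line is available yet.
reply <- tclvalue(.Tcl('gets $ch'))
print(reply)

# Closing the channel closes the child's stdin, which lets it exit.
.Tcl('close $ch')
```

For a program that runs for a long time you would poll with `gets` 
repeatedly (or register a Tcl `fileevent` handler) rather than sleep once.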

best

Vladimir Dergachev

> Which brings us to the point: for how long do these external commands run?
> Regardless of module used in Python or directly called from R, the R
> process will wait. Mostly an issue for interactive uses.
>
> Another approach could be to have your R package handle data formatting,
> setup settings etc. and compiling a command with arguments, that the user
> may call at their leisure, whether on their laptop, cloud or HPC. When
> results have been produced, they can return to your package to analyse
> the results.
> I used this approach for my badly named R package Siccuracy, for aiding
> with the imputation software AlphaImpute.
>
> Kindly,
> Stefan
>
> On Tue, 25 Jan 2022 at 16:27, Andrew Simmons <akwsimmo using gmail.com> wrote:
>
>> I would suggest the reticulate library in R. The most important functions
>> for your case are reticulate::use_python_version and reticulate::import.
>> For example, in your R package, you should start with:
>>
>>
>> # change this to the name of the module you need
>> numpy <- NULL
>>
>>
>> .onLoad <- function (libname, pkgname)
>> {
>>     reticulate::use_python_version("2.7")  # change this as you need to
>>
>>
>>     # .onLoad happens before the namespace is locked, so this is legitimate
>>     numpy <<- reticulate::import("numpy", delay_load = list(
>>         on_error = function(c) stop(
>>             "unable to import 'numpy', try ",
>>             sQuote("reticulate::py_install(\"numpy\")"),
>>             " if it is not installed:\n  ",
>>             conditionMessage(c)
>>         )
>>     ))
>> }
>>
>>
>> when your package's namespace is loaded, this will load the version of
>> python you need to use, and will lazy-import the module you need for your
>> python session.
>>
>> On Tue, Jan 25, 2022 at 8:52 AM Alexandru Voda <
>> alexandru.voda using seh.ox.ac.uk>
>> wrote:
>>
>>> Hi!
>>>
>>> How would one best write an R wrapper package over a complex Python2
>>> software (such as https://github.com/bulik/ldsc), that is still very
>>> widely used in statistical genetics?
>>>
>>> I'm writing an R package (that currently passes all --as-cran checks) for
>>> multiple other C++ software packages on the same topic as the one above,
>>> but this Python2 one I've difficulties with - it just looks like a bunch
>>> of hackish system() calls... And while it works on Linux and Mac, I've no
>>> idea whether it'd work on Windows.
>>>
>>> While it may seem easy to dismiss, actually LDSC is widely used in the
>>> statistical genetics field, and lots of people find it difficult to work
>>> with because of all the dependency files and weirdly documented commands,
>>> and because... well... Python2...
>>>
>>> Any tips? Or do you know anyone that I should contact/ask?
>>>
>>> Best wishes,
>>> Alexandru
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-package-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>>>
>>
>

