[Rd] Including a binary Python Interpreter into a binary R-package for MS Windows

gvsteen at yahoo.com
Tue Sep 1 23:41:54 CEST 2009


2009/8/30 Uwe Ligges <ligges at statistik.tu-dortmund.de>:
[snip]
> Guido van Steen wrote:
[snip]
>> Something that interests me too: What about R's policy with respect to
>> including binary files? I saw that developers should include a file
[snip]
> Please do not include binary files and carefully watch for licenses of those
> files (e.g. if GPL'ed, you need to ship sources!). If Python is required, I
> highly suggest to state it in the SystemDependencies and be fine with it -
> users can learn to install Python themselves, I'm pretty sure.

Hi Uwe, 

Note: I am cc'ing this email to the R-devel list, which I joined today, as I think it may be of interest to other people as well.

Thank you for your answer, although it disappointed me a bit. I had already spent quite some time building a stand-alone Windows binary of a new package, "write2xls". This package provides the same R interface to Python as the other package, "dataframes2xls". As you know, it enables users to create xls files. The special thing about "write2xls" is that it does not have any dependencies at all. It is, so to speak, a turn-key solution.

Of course I should have read a bit more before I started. Only after your mail did I read the PDF file "Writing R Extensions". It says "A source package if possible should not contain binary executable files: they are not portable, and a security risk if they are of the appropriate architecture. R CMD check will warn about them unless they are listed (one filepath per line) in a file 'BinaryFiles' at the top level of the package or bundle. Note that CRAN will no longer accept submissions containing binary files even if they are listed."
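
To illustrate what such a check is concerned with, here is a minimal Python sketch that walks a package directory and reports files that begin with the magic bytes of common executable formats. This is not how R CMD check works internally; the function name find_binary_executables and the short list of magic prefixes are assumptions made for this illustration, and a plain text file that happens to start with "MZ" would be flagged as well.

```python
import os

# Leading bytes of two common executable formats.
# Illustrative and incomplete: an assumption of this sketch, not R's own list.
MAGIC_PREFIXES = {
    b"MZ": "Windows PE (exe/dll)",
    b"\x7fELF": "ELF (Linux executables and shared objects)",
}


def find_binary_executables(package_dir):
    """Return sorted relative paths of files that start with a known magic prefix."""
    hits = []
    for root, _dirs, files in os.walk(package_dir):
        for name in files:
            path = os.path.join(root, name)
            # Only the first few bytes are needed to recognise the format.
            with open(path, "rb") as fh:
                head = fh.read(4)
            if any(head.startswith(magic) for magic in MAGIC_PREFIXES):
                rel = os.path.relpath(path, package_dir)
                # Use forward slashes so the output looks the same on every platform.
                hits.append(rel.replace(os.sep, "/"))
    return sorted(hits)
```

Calling find_binary_executables on an unpacked package directory and printing the result one path per line gives a listing in the shape that the quoted 'BinaryFiles' file expects.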

So, yes, you are right. I was actually hoping that CRAN could make some exceptions, but after some thought I fully understand that many people would object to this for good reasons: R code depending on a C compiler will not work without a C compiler either, and for security reasons we cannot allow packages to install a binary C compiler. So I understand the reasons, but it is still a pity.

The current situation is that many MS Windows users cannot easily use "dataframes2xls". There are a few reasons:

* Most users of MS Windows will be unfamiliar with Python, which will make them reluctant to install Python. 

* Installing Python will be impossible on many MS Windows platforms due to limited user rights. 

* A standard Python installer is a download of about 15 MB, whereas my newly created "write2xls" package is only 1.3 MB.

So only a few R users can benefit from "dataframes2xls". An alternative to "dataframes2xls" is "write.xls". "dataframes2xls" is technically superior, as it allows the specification of proper formatting and fonts. "dataframes2xls" has also existed longer. However, "write.xls" is available to many more R users because it depends on Perl, which is installed as a part of the R-tools.

So, I think it would be a pity not to provide "write2xls", since I have it readily available now. Therefore, I will probably host "write2xls" on a different repository for as long as no Python interpreter is included in the R-tools. Does anyone know of an alternative repository that accepts "trustworthy" R packages with a binary Python interpreter?

Thanks! 

Best wishes, 

Guido van Steen 

P.S. For those who are interested or who would like to test it, at the moment "write2xls" can be downloaded from "http://www.heppel.net/write2xls_0.4.4.9.zip". The "source" package is available from "http://www.heppel.net/write2xls_0.4.4.9.tar.gz".

P.P.S. I think that on MS Windows the combination of R and the R-tools is just as much a potential security risk as allowing a Python interpreter to be included in a binary package. The R website should pay more serious attention to this.

P.P.P.S. Uwe also brings up the issue of licensing. However, this is not a problem at all. The Python license is one of the most permissive licenses around. For the Python interpreter that I included in the "write2xls" package, I used pyMingW, which is distributed under an MIT license. It is a version of Python compiled with the MinGW compiler. Thanks to this pyMingW distribution I also avoid the need for any Microsoft-owned DLLs. "dataframes2xls" and "write2xls" are also distributed under an MIT license.


