[R] Protecting R code
spencer.graves at structuremonitoring.com
Mon Jul 4 17:05:59 CEST 2011
On 7/4/2011 7:28 AM, Uwe Ligges wrote:
> On 04.07.2011 09:47, Vaishali Sadaphal wrote:
>> Hi All,
>> I need to give my R code to my client to use. I would like to protect
>> logic/algorithms that have been coded in R. This means that I would not
>> like anyone to be able to read the code.
>> I am searching for ways to protect R code. I would like to create a .exe
>> kind of file which could be executed without using R or requiring to
>> install R. I would not like the R code to be loaded in R. This is so
>> because, after R loads a function, if you type the function name on the
>> command prompt, you can see the complete code. I would not like to give
>> this type of access to the R code.
>> I explored the option of creating a .bat file (using the command R CMD
>> BATCH) and byte code (using the command: compile). These are not useful
>> since they open R, load these functions, and then the R code is visible.
>> Is there any other way to protect the R code which would help me package
>> all my files/source files and give me an executable file which would be
>> run without opening R? Another problem is that R is freely downloadable.
>> Is it somehow possible to protect the code from being loaded in R and
>> being seen.
> Hmmmm, R is open source software under the GPL (which is infective)
> and designed as such. Good luck: it is almost impossible to hide the
> source code in R. And people who tried to generate C-based binary
> packages found that those can be used only on a small subset of
> platforms and with few versions of R.
> Since R is distributed under the GPL: when you write code and make it
> available to others, you should be aware that you may have to
> distribute the sources under the GPL as well, under some circumstances
> that your lawyer can explain much better than I can.
Linux is distributed under the GPL, and people distribute
software that runs on Linux without having to release their source
code. There are different versions of the GPL. You should read them
carefully and consult with an attorney. However, if you honestly read
the GPL verbiage, you may find that you know more than your attorney --
but you still need the attorney. I'm not an attorney and I haven't read
GPL verbiage in a while, but as I recall a key issue is whether your
code is your creation or a modification of some other GPL code. If the
latter, you could lose in court if challenged.
I see two options:
1. Write the proprietary portion of your code in a
compiled language like C, C++, or Fortran, and link from R to your
compiled subroutines. If you do not already write R packages, I
strongly urge you to first learn how to produce and use R packages.
Documentation on "Creating R Packages" is available from any standard
CRAN mirror. I suggest you create two separate R packages (with different
names), each complete with documentation: an internal-only version written
purely in R, and a public version that uses compiled code. This allows
you to prototype your new ideas quickly in R before you spend the money
to convert them to compiled code. It also encourages you to build test
cases in a way that increases software quality. Then you can distribute
the public R package in its standard compiled format, which your users
can install using the standard procedure to "Install package(s) from
local zip file" (available on the "Packages" menu in Rgui). This is
arguably the cleanest legally, because then it's clear that your
proprietary code has an existence independent of R. You can distribute
your package with an appropriate end user license agreement and
instructions for how to install R and any CRAN packages you use plus
your own code.
2. You can write something to encrypt your R code. I know
someone who has done this. However, the legal status is not as clean as
if you wrote your proprietary algorithm in a compiled language, because
if someone with a larger budget for attorneys wants to take you to court
demanding your source code, you might lose. I doubt if that would
happen, but I'm not an attorney, so I don't know. I do know that people
often lose legal battles just because their opponents have much better
attorneys. The advantage of this approach is that you could distribute your
latest changes immediately after you get them working. A further
disadvantage is that your code will have to decrypt the R code prior to
running it, which means that your code might still be available to
anyone clever enough to interrupt your code while it's running. Thus,
it's not as secure as writing compiled code, in addition to not having
as strong a claim to having an existence independent of R. You could
also combine this with the first, where your latest release would
encrypt your latest enhancements while you are working to translate
those into compiled code.
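
A minimal sketch of what the compiled side of option 1 might look like,
assuming the package uses R's .C() interface. The routine name
secret_score, its arguments, and the weighted sum it computes are all
hypothetical placeholders for the real proprietary algorithm; the R
wrapper shown in the trailing comment is likewise an assumption, not
something from the original post.

```c
/* Hypothetical proprietary routine. A weighted sum stands in for the
 * real algorithm. R's .C() interface passes every argument as a pointer
 * and expects a void return, so the answer goes back through an output
 * argument (here: result). */
void secret_score(double *x, double *w, int *n, double *result)
{
    double total = 0.0;
    for (int i = 0; i < *n; i++) {
        total += x[i] * w[i];
    }
    *result = total;
}

/* Assumed R-side wrapper, kept in the package's R/ directory:
 *
 *   secret_score <- function(x, w) {
 *     stopifnot(length(x) == length(w))
 *     .C("secret_score", as.double(x), as.double(w),
 *        as.integer(length(x)), result = double(1))$result
 *   }
 */
```

With this layout, users who type the wrapper's name at the R prompt see
only the thin R wrapper; the algorithm itself ships as compiled object
code in the binary package.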
Few people with university appointments have to worry about these
issues, because they get paid for generating new knowledge and sharing
it with the world. The rest of us must find different answers for how
to provide for ourselves and our families without a university salary.
Hope this helps.
>> Notice: The information contained in this e-mail
>> message and/or attachments to it may contain
>> confidential or privileged information.
> If it is "confidential or privileged information", you should not send
> it to a mailing list where the archives are published.
> Uwe Ligges