[Rd] On implementing zero-overhead code reuse

Martin Morgan martin.morgan at roswellpark.org
Tue Oct 4 00:17:12 CEST 2016


On 10/03/2016 01:51 PM, Kynn Jones wrote:
> Thank you all for your comments and suggestions.
>
> @Frederik, my reason for mucking with environments is that I want to
> minimize the number of names that import adds to my current
> environment.  For instance, if module foo defines a function bar, I
> want my client code to look like this:
>
>   import("foo")
>   foo$bar(1,2,3)
>
> rather than
>
>   import("foo")
>   bar(1,2,3)
>
> (Just a personal preference.)
>
> @Dirk, @Kasper, as I see it, the benefit of scripting languages like
> Python, Perl, etc., is that they allow very quick development, with
> minimal up-front cost.  Their main strength is precisely that one can,
> without much difficulty, *immediately* start *programming
> productively*, without having to worry at all about (to quote Dirk)
> "repositories.  And package management.  And version control (at the
> package level).  And ... byte compilation.  And associated
> documentation.  And unit tests.  And continuous integration."
>
> Of course, *eventually*, and for a fraction of one's total code base
> (in my case, a *very small* fraction), one will want to worry about
> all those things, but I see no point in burdening *all* my code with
> all those concerns from the start.  Again, please keep in mind that
> those concerns come into play for at most 5% of the code I write.
>
> Also, I'd like to point out that the Python, Perl, etc. communities
> are no less committed to all the concerns that Dirk listed (version
> control, package management, documentation, testing, etc.) than the R
> community is.  And yet, Python, Perl, etc. support the "zero-overhead"
> model of code reuse.  There's no contradiction here.  Support for
> "zero-overhead" code reuse does not preclude forms of code reuse with
> more overhead.
>
> One benefit of the zero-overhead model is that the concerns of
> documentation, testing, etc. can be addressed with varying degrees of
> thoroughness, depending on the situation's demands.  (For example,
> documentation that would be perfectly adequate for me as the author of
> a function would not be adequate for the general user.)
>
> This means that the transition from writing private code to writing
> code that can be shared with the world can be made much more
> gradually, according to the programmer's needs and means.
>
> Currently, in the R world, the choice for programmers is much starker:
> either stay writing little scripts that one sources from an
> interactive session, or learn to implement packages.  There's too
> little in-between.

I know it's flogging the same horse, but for the non-expert: I create, 
install, and attach a complete package

   devtools::create("myutils")
   devtools::install("myutils")  # library() needs an installed package
   library(myutils)

Of course it doesn't do anything, so I write my code by editing a plain 
text file myutils/R/foo.R to contain

   foo = function() "hello wirld"

then return to my still-running R session and install the updated 
package and use my new function

   devtools::install("myutils")
   foo()
   myutils::foo()  # same, but belt-and-suspenders

I notice my typo, update the file, and use the updated package

   devtools::install("myutils")
   foo()
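
For anyone without devtools, roughly the same loop can be sketched in 
base R. This is only an illustration under my own assumptions: the 
temporary directories, the DESCRIPTION values, and the catch-all 
NAMESPACE line are made up for the example and are not what 
devtools::create() writes.

```r
# Hypothetical scratch locations, so the real library is left untouched.
pkg_dir <- file.path(tempdir(), "myutils")
lib_dir <- file.path(tempdir(), "myutils-lib")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)
dir.create(lib_dir, showWarnings = FALSE)

# Package metadata (all values are placeholders for illustration).
writeLines(c(
  "Package: myutils",
  "Title: Personal Utilities",
  "Version: 0.0.1",
  "Author: A. Nonexpert",
  "Maintainer: A. Nonexpert <nonexpert@example.org>",
  "Description: Functions I reuse across projects.",
  "License: GPL-3"
), file.path(pkg_dir, "DESCRIPTION"))

# Export every object whose name starts with a letter.
writeLines('exportPattern("^[[:alpha:]]+")', file.path(pkg_dir, "NAMESPACE"))

# The actual code lives in plain text files under R/.
writeLines('foo <- function() "hello world"', file.path(pkg_dir, "R", "foo.R"))

# Install from the source directory into the scratch library, then attach.
install.packages(pkg_dir, lib = lib_dir, repos = NULL, type = "source",
                 quiet = TRUE)
library(myutils, lib.loc = lib_dir)
foo()
```

Editing R/foo.R and re-running the install.packages() call plays the 
same role as repeating devtools::install("myutils").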

The transition from here to a robust package can be gradual: updating 
the DESCRIPTION file, adding roxygen2 documentation and unit tests, 
using version control, etc., in a completely incremental way. At the 
end of it all, I'll still install and use my package with

   devtools::install("myutils")
   foo()

maybe graduating to

   devtools::install_github("mtmorgan/myutils")
   library(myutils)
   foo()

when it's time to share my work with the wirld.
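
As a rough sketch of what one such incremental step might look like, 
here is myutils/R/foo.R with a roxygen2 comment block added, followed by 
a first test. The wording of the documentation is invented for the 
example, and the test uses base stopifnot() so that it runs without 
extra packages; in a real package it would more likely be a testthat 
expectation in a file under tests/testthat/.

```r
#' Return a greeting
#'
#' A placeholder utility, documented with roxygen2-style comments.
#' To plain R these lines are ordinary comments, so the file still
#' sources and installs exactly as before.
#'
#' @return A length-one character vector.
#' @export
foo <- function() "hello world"

# A first unit test can be a single assertion.
stopifnot(identical(foo(), "hello world"))
```

Nothing about how the package is installed or used changes when these 
pieces are added.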

Martin

>
> Of course, from the point of view of someone who has already written
> several packages, the barrier to writing a package may seem too small
> to fret over, but adopting the expert's perspective is likely to
> result in excluding the non-experts.
>
> Best, kj
>
>
> On Mon, Oct 3, 2016 at 12:06 PM, Kasper Daniel Hansen
> <kasperdanielhansen at gmail.com> wrote:
>>
>>
>> On Mon, Oct 3, 2016 at 10:18 AM, <frederik at ofb.net> wrote:
>>>
>>> Hi Kynn,
>>>
>>> Thanks for expanding.
>>>
>>> I wrote a function like yours when I first started using R. It's
>>> basically the same up to your "new.env()" line; I don't do anything
>>> with environments. I just called my function "mysource" and it's
>>> essentially a "source with path". That allows me to find code I reuse
>>> in standard locations.
>>>
>>> I don't know why R does not have built-in support for such a thing.
>>> You can get it in C compilers with CPATH, and as you say in Perl with
>>> PERL5LIB, in Python, etc. Obviously when I use my "mysource" I have to
>>> remember that my code is now not portable without copying over some
>>> files from other locations in my home directory. However, as a
>>> beginner I find this tool to be indispensable, as R lacks several
>>> functions which I use regularly, and I'm not necessarily ready to
>>> confront the challenges associated with creating a package.
>>
>>
>> I can pretty much guarantee that when you finally confront the "challenge"
>> of making your own package you'll realize (1) it is pretty easy if the
>> intention is only to use it yourself (and perhaps a couple of
>> collaborators); by easy I mean I can make a package in 5 minutes max; and
>> (2) you'll ask yourself "why didn't I do this earlier?".  I still get that
>> feeling now, when I have done it many times for internal use.  Almost
>> every time I think I should have made an internal package earlier in the
>> process.
>>
>> Of course, all of this is hard to see when you're standing in the middle of
>> your work.
>>
>> Best,
>> Kasper
>>
>>
>>
>>
>>
>>>
>>> However, I guess since we can get your functionality pretty easily
>>> using some lines in .Rprofile, that makes it seem less important to
>>> have it built-in. In fact, if everyone has to implement their own
>>> version of your "import", this almost guarantees that the function
>>> won't appear by accident in any public code. My choice of name
>>> "mysource" was meant to serve as a more visible lexical reminder that
>>> the function is not meant to be seen by the public.
>>>
>>> By the way, why do you do the stuff with environments in your "import"
>>> function?
>>>
>>> Dirk's take is interesting. I don't use version control for my
>>> personal projects, just backing-up. Obviously not all R users are
>>> interested in becoming package maintainers, in fact I think it would
>>> clutter things a bit if this were the case. Or maybe it would be good
>>> to have everyone publish their personal utility functions, who knows?
>>> Anyway I appreciate Dirk's arguments, but I'm also a bit surprised
>>> that Kynn and I seem to be the only ones who have written personal
>>> functions to do what Kynn calls "zero-overhead code reuse". FWIW.
>>>
>>> Cheers,
>>>
>>> Frederick
>>>
>>> On Sun, Oct 02, 2016 at 08:01:58PM -0400, Kynn Jones wrote:
>>>> Hi Frederick,
>>>>
>>>> I described what I meant in the post I sent to R-help
>>>> (https://stat.ethz.ch/pipermail/r-help/2016-September/442174.html),
>>>> but in brief, by "zero overhead" I mean that the only thing needed for
>>>> library code to be accessible to client code is for it to be located
>>>> in a designated directory.  No additional meta-files,
>>>> packaging/compiling, etc. are required.
>>>>
>>>> Best,
>>>>
>>>> G.
>>>>
>>>> On Sun, Oct 2, 2016 at 7:09 PM,  <frederik at ofb.net> wrote:
>>>>> Hi Kynn,
>>>>>
>>>>> Do you mind defining the term "zero-overhead model of code reuse"?
>>>>>
>>>>> I think I understand what you're getting at, but not sure.
>>>>>
>>>>> Thank you,
>>>>>
>>>>> Frederick
>>>>>
>>>>> On Sun, Oct 02, 2016 at 01:29:52PM -0400, Kynn Jones wrote:
>>>>>> I'm looking for a way to approximate the "zero-overhead" model of
>>>>>> code reuse available in languages like Python, Perl, etc.
>>>>>>
>>>>>> I've described this idea in more detail, and the motivation for this
>>>>>> question in an earlier post to R-help
>>>>>> (https://stat.ethz.ch/pipermail/r-help/2016-September/442174.html).
>>>>>>
>>>>>> (One of the responses I got advised that I post my question here
>>>>>> instead.)
>>>>>>
>>>>>> The best I have so far is to configure my PROJ_R_LIB environment
>>>>>> variable to point to the directory with my shared code, and put a
>>>>>> function like the following in my .Rprofile file:
>>>>>>
>>>>>>     import <- function(name){
>>>>>>         ## usage:
>>>>>>         ## import("foo")
>>>>>>         ## foo$bar()
>>>>>>         path <- file.path(Sys.getenv("PROJ_R_LIB"),paste0(name,".R"))
>>>>>>         if(!file.exists(path)) stop('file "',path,'" does not exist')
>>>>>>         mod <- new.env()
>>>>>>         source(path,local=mod)
>>>>>>         list2env(setNames(list(mod),list(name)),envir=parent.frame())
>>>>>>         invisible()
>>>>>>     }
>>>>>>
>>>>>> (NB: the idea above is an elaboration of the one I showed in my first
>>>>>> post.)
>>>>>>
>>>>>> But this is very much an R noob's solution.  I figure there may
>>>>>> already be more solid ways to achieve "zero-overhead" code reuse.
>>>>>>
>>>>>> I would appreciate any suggestions/critiques/pointers/comments.
>>>>>>
>>>>>> TIA!
>>>>>>
>>>>>> kj
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-devel at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>>
>>>>
>>>
>>> On Sun, Oct 02, 2016 at 08:05:53PM -0400, Kynn Jones wrote:
>>>> On Sun, Oct 2, 2016 at 8:01 PM, Kynn Jones <kynnjo at gmail.com> wrote:
>>>>> Hi Frederick,
>>>>>
>>>>> I described what I meant in the post I sent to R-help
>>>>> (https://stat.ethz.ch/pipermail/r-help/2016-September/442174.html),
>>>>> but in brief, by "zero overhead" I mean that the only thing needed for
>>>>> library code to be accessible to client code is for it to be located
>>>>> in designed directory.  No additional meta-files, packaging/compiling,
>>>>      ^^^^^^^^
>>>>
>>>> Sorry, I meant to write "designated".
>>>>
>>>>> etc. are required.
>>>>
>>>
>>> On Sun, Oct 02, 2016 at 07:18:41PM -0500, Dirk Eddelbuettel wrote:
>>>>
>>>> Kynn,
>>>>
>>>> How much homework have you done researching any other "alternatives" to
>>>> the package system?  I know of at least one...
>>>>
>>>> In short, just about everybody here believes in packages.  And
>>>> repositories.  And package management.  And version control (at the
>>>> package level).  And maybe byte compilation.  And associated
>>>> documentation.  And unit tests.  And continuous integration.
>>>>
>>>> You don't have to -- that's cool.  Different strokes for different
>>>> folks.
>>>>
>>>> But if you think you need something different you may just have to
>>>> build that yourself.
>>>>
>>>> Cheers, Dirk
>>>>
>>>> --
>>>> http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
>>>>
>>>
>>
>>
>

