[Rd] R vs. C
Patrick Burns
pburns at pburns.seanet.com
Tue Jan 18 12:54:39 CET 2011
I think we agree.
Having the examples run in the
tests is a good thing, I think.
They might strengthen the tests
some (especially if there are
no other tests). But mainly if
examples don't work, then it's
hard to have much faith in the
On 18/01/2011 11:36, Claudia Beleites wrote:
> On 01/18/2011 10:53 AM, Patrick Burns wrote:
>> I'm not at all a fan of thinking
>> of the examples as being tests.
>> Examples should clarify the thinking
>> of potential users. Tests should
>> clarify the space in which the code
>> is correct. These two goals are
>> generally at odds.
> Patrick, I completely agree with you that
> - Tests should not clutter the documentation and go to their proper place.
> - Examples are there for the user's benefit - and must be written
> accordingly.
> - Often, tests should cover far more situations than good examples.
> Yet it seems to me that (part of the) examples are justly considered a
> (small) subset of the tests:
> As a potential user, I request two things from good examples that have an
> implicit testing message/side effect:
> - I like the examples to roughly outline the space in which the code
> works: they should tell me what I'm supposed to do.
> - Depending on the function's purpose, I like to see a demonstration of
> the correctness for some example calculation.
> (I don't want to see all further tests - I can look them up if I feel
> the need)
> The fact that the very same line of example code serves a testing (side)
> purpose doesn't mean that it should be copied into the tests, does it?
> Thus, I think of the "public" part (the "preface") of the tests living
> in the examples.
> My 2 ct,
> Best regards,
> Claudia
>> On 17/01/2011 22:15, Spencer Graves wrote:
>>> Hi, Paul:
>>> The "Writing R Extensions" manual says that *.R code in a "tests"
>>> directory is run during "R CMD check". I suspect that many R programmers
>>> do this routinely. I probably should do that also. However, for me, it's
>>> simpler to have everything in the "examples" section of *.Rd files. I
>>> think the examples with independently developed answers provide useful
>>> documentation.
>>> Spencer
>>> On 1/17/2011 1:52 PM, Paul Gilbert wrote:
>>>> Spencer
>>>> Would it not be easier to include this kind of test in a small file in
>>>> the tests/ directory?
>>>> Paul
>>>> -----Original Message-----
>>>> From: r-devel-bounces at r-project.org
>>>> [mailto:r-devel-bounces at r-project.org] On Behalf Of Spencer Graves
>>>> Sent: January 17, 2011 3:58 PM
>>>> To: Dominick Samperi
>>>> Cc: Patrick Leyshock; r-devel at r-project.org; Dirk Eddelbuettel
>>>> Subject: Re: [Rd] R vs. C
>>>> For me, a major strength of R is the package development
>>>> process. I've found this so valuable that I created a Wikipedia entry
>>>> by that name and made additions to a Wikipedia entry on "software
>>>> repository", noting that this process encourages good software
>>>> development practices that I have not seen standardized for other
>>>> languages. I encourage people to review this material and make
>>>> additions or corrections as they like (or send me suggestions for me to
>>>> make appropriate changes).
>>>> While R has other capabilities for unit and regression testing, I
>>>> often include unit tests in the "examples" section of documentation
>>>> files. To keep from cluttering the examples with unnecessary material,
>>>> I often include something like the following:
>>>> A1<- myfunc() # to test myfunc
>>>> A0<- ("manual generation of the correct answer for A1")
>>>> \dontshow{stopifnot(} # so the user doesn't see "stopifnot("
>>>> all.equal(A1, A0) # compare myfunc output with the correct answer
>>>> \dontshow{)} # close paren on "stopifnot(".
>>>> This may not be as good in some ways as a full suite of unit
>>>> tests, which could be provided separately. However, this has the
>>>> distinct advantage of including unit tests with the documentation in a
>>>> way that should help users understand "myfunc". (Unit tests too
>>>> detailed to show users could be completely enclosed in "\dontshow".)
>>>> Spencer
>>>> On 1/17/2011 11:38 AM, Dominick Samperi wrote:
>>>>> On Mon, Jan 17, 2011 at 2:08 PM, Spencer Graves <spencer.graves at structuremonitoring.com> wrote:
>>>>>> Another point I have not yet seen mentioned: If your code is
>>>>>> painfully slow, that can often be fixed without leaving R by
>>>>>> experimenting with different ways of doing the same thing -- often
>>>>>> after profiling your code to find the slowest part, as described
>>>>>> in chapter 3 of "Writing R Extensions".
>>>>>> If I'm given code already written in C (or some other language),
>>>>>> unless it's really simple, I may link to it rather than recode it
>>>>>> in R. However, the problems with portability, maintainability,
>>>>>> transparency to others who may not be very facile with C, etc., all
>>>>>> suggest that it's well worth some effort experimenting with
>>>>>> alternate ways of doing the same thing in R before jumping to C or
>>>>>> something else.
>>>>>> Hope this helps.
>>>>>> Spencer
>>>>>> On 1/17/2011 10:57 AM, David Henderson wrote:
>>>>>>> I think we're also forgetting something, namely testing. If you
>>>>>>> write your routine in C, you have placed additional burden upon
>>>>>>> yourself to test your C code through unit tests, etc. If you write
>>>>>>> your code in R, you still need the unit tests, but you can rely on
>>>>>>> the well tested nature of R to allow you to reduce the number of
>>>>>>> tests of your algorithm. I routinely tell people at Sage
>>>>>>> Bionetworks where I am working now that your new C code needs to
>>>>>>> experience at least one order of magnitude increase in performance
>>>>>>> to warrant the effort of moving from R to C.
>>>>>>> But, then again, I am working with scientists who are not
>>>>>>> primarily, or even secondarily, coders...
>>>>>>> Dave H
>>>>> This makes sense, but I have seen some very transparent algorithms
>>>>> turned into vectorized R code that is difficult to read (and thus to
>>>>> maintain or to change). These chunks of optimized R code are like
>>>>> embedded assembly, in the sense that nobody is likely to want to mess
>>>>> with it. This could be addressed by including pseudo code for the
>>>>> original (more transparent) algorithm as a comment, but I have never
>>>>> seen this done in practice (perhaps it could be enforced by R CMD
>>>>> check?!).
>>>>> On the other hand, in principle a well-documented piece of C/C++ code
>>>>> could be much easier to understand, without paying a performance
>>>>> penalty...but "coders" are not likely to place this high on their
>>>>> list of priorities.
>>>>> The bottom line is that R is an adaptor ("glue") language like Lisp
>>>>> that makes it easy to mix and match functions (using classes and
>>>>> generic functions), many of which are written in C (or C++ or
>>>>> Fortran) for performance reasons. Like any object-based system there
>>>>> can be a lot of object copying, and like any functional programming
>>>>> system, there can be a lot of function calls, resulting in poor
>>>>> performance for some applications.
>>>>> If you can vectorize your R code then you have effectively found a
>>>>> way to benefit from somebody else's C code, thus saving yourself some
>>>>> time. For operations other than pure vector calculations you will
>>>>> have to do the C/C++ programming yourself (or call a library that
>>>>> somebody else has written).
>>>>> Dominick
>>>>>>> ----- Original Message ----
>>>>>>> From: Dirk Eddelbuettel<edd at debian.org>
>>>>>>> To: Patrick Leyshock<ngkbr8es at gmail.com>
>>>>>>> Cc: r-devel at r-project.org
>>>>>>> Sent: Mon, January 17, 2011 10:13:36 AM
>>>>>>> Subject: Re: [Rd] R vs. C
>>>>>>> On 17 January 2011 at 09:13, Patrick Leyshock wrote:
>>>>>>> | A question, please about development of R packages:
>>>>>>> |
>>>>>>> | Are there any guidelines or best practices for deciding when and
>>>>>>> | why to implement an operation in R, vs. implementing it in C? The
>>>>>>> | "Writing R Extensions" recommends "working in interpreted R code
>>>>>>> | . . . this is normally the best option." But we do write
>>>>>>> | C-functions and access them in R - the question is, when/why is
>>>>>>> | this justified, and when/why is it NOT justified?
>>>>>>> |
>>>>>>> | While I have identified helpful documents on R coding standards,
>>>>>>> | I have not seen notes/discussions on when/why to implement in R,
>>>>>>> | vs. when to implement in C.
>>>>>>> The (still fairly recent) book 'Software for Data Analysis:
>>>>>>> Programming with R' by John Chambers (Springer, 2008) has a lot to
>>>>>>> say about this. John also gave a talk in November which stressed
>>>>>>> 'multilanguage' approaches; see e.g.
>>>>>>> http://blog.revolutionanalytics.com/2010/11/john-chambers-on-r-and-multilingualism.html
>>>>>>> In short, it all depends, and it is unlikely that you will get a
>>>>>>> coherent answer that is valid for all circumstances. We all love R
>>>>>>> for how expressive and powerful it is, yet there are times when
>>>>>>> something else is called for. Exactly when that time is depends on
>>>>>>> a great many things and you have not mentioned a single metric in
>>>>>>> your question. So I'd start with John's book.
>>>>>>> Hope this helps, Dirk
>>>>>> ______________________________________________
>>>>>> R-devel at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
Patrick Burns
pburns at pburns.seanet.com
twitter: @portfolioprobe
(home of 'Some hints for the R beginner'
and 'The R Inferno')
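
For readers weighing the two approaches discussed in the thread, here is a minimal sketch of how the check from the quoted \dontshow example might look as a standalone file in a package's tests/ directory, which is the placement Paul Gilbert suggests. The body of myfunc() and the input values are hypothetical stand-ins chosen only so the script runs on its own; they are not taken from any poster's package.

```r
# Hypothetical function under test; it stands in for a package's myfunc().
myfunc <- function(x) cumsum(x)

# Independently derived answer, worked out by hand for the input 1:4.
A0 <- c(1, 3, 6, 10)

# Output of the function being checked.
A1 <- myfunc(1:4)

# all.equal() returns TRUE or a character description of the differences,
# so wrapping it in isTRUE() gives stopifnot() a strict logical value.
stopifnot(isTRUE(all.equal(A1, A0)))
```

Because R CMD check runs the *.R files under tests/, a failing stopifnot() here would stop the check, while the help page's examples stay free to show only what helps a user.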