[Rd] NA warnings for r<distr>() {aka "patch for random.c"}
Martin Maechler
maechler at stat.math.ethz.ch
Fri Mar 7 15:01:26 CET 2008
Hi Berwin (and "listeners"),
>>>>> "BAT" == Berwin A Turlach <statba at nus.edu.sg>
>>>>> on Wed, 5 Mar 2008 20:26:24 +0800 writes:
BAT> G'day Martin, On Mon, 3 Mar 2008 10:16:45 +0100 Martin
BAT> Maechler <maechler at stat.math.ethz.ch> wrote:
>> >>>>> "BAT" == Berwin A Turlach <statba at nus.edu.sg> >>>>>
>> on Fri, 29 Feb 2008 18:19:40 +0800 writes:
>>
BAT> while looking for some inspiration of how to organise
BAT> some code, I studied the code of random.c and noticed
BAT> that for distributions with 2 or 3 parameters the user
BAT> is not warned if NAs are created while such a warning
BAT> is issued for distributions with 1 parameter. [...] The
BAT> attached patch rectifies this. [...]
>>
>> I cannot imagine a design reason for that. If there was,
>> it should have been mentioned as a comment in the C code.
>>
>> I'll commit your patch (if it passes the checks).
BAT> Sorry, I was a bit in a hurry when writing the e-mail,
BAT> so I forgot to mention that the source code modified by
BAT> this patch compiles and passes "make check FORCE=FORCE"
BAT> on my machine.
ok.
BAT> And in my hurry, I also posted from my NUS account,
BAT> without realising it, which forced you to intervene as
BAT> moderator and to approve the posting. My apologies for
BAT> the extra work. But this gave me the idea to also
BAT> subscribe to r-devel with my NUS account and configure
BAT> the subscriptions so that I only receive e-mail at my
BAT> UWA account. Thus, hopefully, you will not have to
BAT> intervene again. (Which this e-mail should test.)
(and it succeeded)
BAT> BTW, there are other places in the code were NAs can be
BAT> created but no warning is issued. E.g:
>>
>> >> rexp(2, rate=numeric())
BAT> [1] NA NA
>> >> rnorm(2, mean=numeric())
BAT> [1] NA NA
>>
BAT> I wonder whether a warning should be issued in this
BAT> case too.
>>
>> Yes, "should in principle".
>>
>> If you feel like finding another elegant patch...
BAT> Well, elegance is in the eye of the beholder. :-)
BAT> I attach two patches. One that adds warning messages at
BAT> the other places where NAs can be generated.
ok. The result is very simple ``hence elegant''.
But actually, part of the changed behavior may be considered
undesirable:
rnorm(2, mean = NA)
which gives two NaN's would now produce a warning,
where I could argue that
'arithmetic with NAs should give NAs without a warning'
since
1:2 + NA
also gives NAs without a warning.
So we could argue that a warning should *only* be produced in a
case where the parameters of the distribution are not NA.
What do others (particularly R-core !) think?
BAT> The second one in additiona rearranges the code a bit
BAT> such that in the case when all the "vectors" that
BAT> contain the parameter values of the distribution, from
BAT> which one wants to simulate, are of length one some
BAT> unnecessary calculations is taken out of the for loop.
BAT> I am not sure how much time is actually saved in this
BAT> situation, but I belong to the school that things such
BAT> kind of optimisation should be done. :)
I understand, but I think most of R-core are "from the school"
that teaches
"if you want to optimize, first *measure*"
BAT> If you think it bloats the code too much (or duplicates
BAT> things too much leading to hard to maintain code), then
BAT> feel free to ignore this second patch.
For now, I will ignore this second patch.
- it does bloat the code slightly (as you conceded)
- it uses an if(<Simple>) test which makes the code slightly
*slower* for the (probably rare) case when the <Simple> is false.
but most importantly:
- we have no idea if the speedup (when <Simple> is TRUE)
is in the order of 10%, 1% or 0.1%
My guess would be '0.1%' for rnorm(), and that would
definitely not warrant the extra check.
>> Thank you Berwin, for your contribution!
and thanks again!
Martin
BAT> My pleasure.
BAT> Cheers,
BAT> Berwin
More information about the R-devel
mailing list