[Rd] NA warnings for r<distr>() {aka "patch for random.c"}
Berwin A Turlach
statba at nus.edu.sg
Fri Mar 7 16:54:06 CET 2008
G'day Martin (and "listeners"),
On Fri, 7 Mar 2008 15:01:26 +0100
Martin Maechler <maechler at stat.math.ethz.ch> wrote:
[...]
> >> If you feel like finding another elegant patch...
>
> BAT> Well, elegance is in the eye of the beholder. :-)
>
> BAT> I attach two patches. One that adds warning messages at
> BAT> the other places where NAs can be generated.
>
> ok. The result is very simple ``hence elegant''.
>
> But actually, part of the changed behavior may be considered
> undesirable:
>
> rnorm(2, mean = NA)
>
> which gives two NaN's would now produce a warning,
> where I could argue that
> 'arithmetic with NAs should give NAs without a warning'
> since
> 1:2 + NA
> also gives NAs without a warning.
>
> So we could argue that a warning should *only* be produced in a
> case where the parameters of the distribution are not NA.
>
> What do others (particularly R-core !) think?
I can agree with that point of view. But then, should the warning be
suppressed only if one of the parameters of the distribution is NA, or
should it also be suppressed whenever any parameter is non-finite?
After all,
> 1:2 + Inf
[1] Inf Inf
does not create any warning either. In that case, a patch like the one
attached should do: it first checks all parameters for finiteness and then
checks whether the generated number is non-finite.
Thus, with the patch applied I see the following behaviour:
> rnorm(2, mean=NA)
[1] NaN NaN
> rnorm(5, mean=c(0,Inf, -Inf, NA, NaN))
[1] 1.897874 NaN NaN NaN NaN
> rbinom(2, size=20, p=1.2)
[1] NaN NaN
Warning message:
In rbinom(2, size = 20, p = 1.2) : NAs produced
> rexp(2, rate=-2)
[1] NaN NaN
Warning message:
In rexp(2, rate = -2) : NAs produced
Without the patch:
> rnorm(2, mean=NA)
[1] NaN NaN
> rnorm(5, mean=c(0,Inf, -Inf, NA, NaN))
[1] -0.1719657 NaN NaN NaN NaN
> rbinom(2, size=20, p=1.2)
[1] NaN NaN
> rexp(2, rate=-2)
[1] NaN NaN
Warning message:
In rexp(2, rate = -2) : NAs produced
On my machine, "make check FORCE=FORCE" passes with this patch.
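For readers without the attachment, the rule described above can be sketched roughly as follows. This is not the code from the patch: the helper name should_warn() is hypothetical, and the standard C macro isfinite() is used as a stand-in for R's R_FINITE() so that the fragment compiles outside the R sources.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper, NOT taken from the patch: decide whether one draw
 * from a two-parameter distribution should count towards "NAs produced".
 * isfinite() from <math.h> stands in for R's R_FINITE(). */
bool should_warn(double value, double a, double b)
{
    /* A non-finite parameter is treated like arithmetic on NA or Inf:
     * the result may be NaN, but no warning is wanted. */
    if (!isfinite(a) || !isfinite(b))
        return false;

    /* All parameters are finite, so a non-finite draw means the
     * parameters were outside the valid range (e.g. p = 1.2). */
    return !isfinite(value);
}
```

Under this rule a NaN draw with mean NA or Inf stays silent, while a NaN draw from size = 20, p = 1.2 is flagged, which matches the transcripts above.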
[...]
> For now, I will ignore this second patch.
>
> - it does bloat the code slightly (as you conceded)
Fair enough. :) I also proved my point that more complicated code is
harder to maintain. In the two-parameter case I was testing na twice
for being one, instead of testing na and nb...
[...]
> but most importantly:
>
> - we have no idea if the speedup (when <Simple> is TRUE)
> is in the order of 10%, 1% or 0.1%
>
> My guess would be '0.1%' for rnorm(), and that would
> definitely not warrant the extra check.
I would not bet against this. Probably even with all the additional
checks for finiteness of parameters there would not be much speed
difference. The question might be whether you would rather replace the
(new) R_FINITE() calls with ISNA() calls (or something else). One could
also discuss in which order the checks should be made (first the generated
number, then the parameters of the distribution, or vice versa). But I will
leave this to R-core to decide. :)
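To make the two open choices concrete, here is a rough sketch, again not taken from the patch. Both function names are hypothetical. isfinite() stands in for R_FINITE(), and isnan() is only an approximate stand-in for ISNA(), which as far as I understand is narrower and tests for NA only. Both variants look at the generated number first, so in the common case of a finite draw the parameters are never inspected.

```c
#include <math.h>
#include <stdbool.h>

/* Variant A (hypothetical): any non-finite parameter suppresses the warning.
 * The draw is tested first; && short-circuits when it is finite. */
bool warn_if_params_finite(double value, double a, double b)
{
    return !isfinite(value) && isfinite(a) && isfinite(b);
}

/* Variant B (hypothetical): only NaN parameters suppress the warning,
 * so an infinite parameter that yields NaN is still reported. */
bool warn_if_params_not_nan(double value, double a, double b)
{
    return !isfinite(value) && !isnan(a) && !isnan(b);
}
```

The two variants differ only for infinite parameters: a NaN draw with mean Inf is silent under variant A but flagged under variant B.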
> >> Thank you Berwin, for your contribution!
>
> and thanks again!
Still my pleasure. :)
Cheers,
Berwin
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: R-patch4
Url: https://stat.ethz.ch/pipermail/r-devel/attachments/20080307/bdbf223b/attachment.pl