[Rd] NA warnings for r<distr>() {aka "patch for random.c"}
Martin Maechler
maechler at stat.math.ethz.ch
Mon Mar 10 12:06:01 CET 2008
>>>>> "BAT" == Berwin A Turlach <statba at nus.edu.sg>
>>>>> on Fri, 7 Mar 2008 23:54:06 +0800 writes:
BAT> G'day Martin (and "listeners"), On Fri, 7 Mar 2008
BAT> 15:01:26 +0100 Martin Maechler
BAT> <maechler at stat.math.ethz.ch> wrote:
BAT> [...]
>> >> If you feel like finding another elegant patch...
>>
BAT> Well, elegance is in the eye of the beholder. :-)
>>
BAT> I attach two patches. One that adds warning messages
BAT> at the other places where NAs can be generated.
>>
>> ok. The result is very simple ``hence elegant''.
>>
>> But actually, part of the changed behavior may be
>> considered undesirable:
>>
>> rnorm(2, mean = NA)
>>
>> which gives two NaN's would now produce a warning, where
>> I could argue that 'arithmetic with NAs should give NAs
>> without a warning' since 1:2 + NA also gives NAs without
>> a warning.
>>
>> So we could argue that a warning should *only* be
>> produced in a case where the parameters of the
>> distribution are not NA.
>>
>> What do others (particularly R-core !) think?
BAT> I can agree with that point of view. But then, should
BAT> a warning not be issued only if one of the parameters
BAT> of the distribution is NA, or should it also not be
BAT> issued if any non-finite parameter is encountered?
BAT> After all,
>> 1:2 + Inf
BAT> [1] Inf Inf
BAT> does not create any warning either.
but it doesn't give NA either.
A bit more relevant (and I'm sure you meant implicitly)
> Inf/Inf
[1] NaN
However, I think we shouldn't go as far.
BAT> In that case, a patch as the attached should do, it
BAT> checks all parameters for finiteness and then checks
BAT> whether the generated number is not finite.
BAT> Thus, with the patch applied I see the following
BAT> behaviour:
>> rnorm(2, mean=NA)
BAT> [1] NaN NaN
>> rnorm(5, mean=c(0,Inf, -Inf, NA, NaN))
BAT> [1] 1.897874 NaN NaN NaN NaN
>> rbinom(2, size=20, p=1.2)
BAT> [1] NaN NaN Warning message: In rbinom(2, size = 20, p
BAT> = 1.2) : NAs produced
>> rexp(2, rate=-2)
BAT> [1] NaN NaN Warning message: In rexp(2, rate = -2) :
BAT> NAs produced
I've committed a bit a different patch for now (svn rev 44715),
because I felt we were getting too sophisticated.
The current (since about 20 minutes) situation is to
warn when the *result* is NA or NaN and not to warn in any other
cases.
This maybe not be perfect, but at least is very consistent and
in particular consistent to what the warning itself "says".
BTW, I've also found that I
rnorm(n, mu=Inf) |--> NA was not ok, and changed that to
rnorm(n, mu=Inf) |--> Inf
More feedback is welcome,
but please now "start" with R-devel rev >= 44715
Martin
BAT> Without the patch:
>> rnorm(2, mean=NA)
BAT> [1] NaN NaN
>> rnorm(5, mean=c(0,Inf, -Inf, NA, NaN))
BAT> [1] -0.1719657 NaN NaN NaN NaN
>> rbinom(2, size=20, p=1.2)
BAT> [1] NaN NaN
>> rexp(2, rate=-2)
BAT> [1] NaN NaN Warning message: In rexp(2, rate = -2) :
BAT> NAs produced
BAT> On my machine, "make check FORCE=FORCE" passes with
BAT> this patch.
BAT> [...]
>> For now, I will ignore this second patch.
>>
>> - it does bloat the code slightly (as you conceded)
BAT> Fair enough. :) I also proved my point that more
BAT> complicated code is harder to maintain. In the two
BAT> parameter case I was testing twice na for being one
BAT> instead of testing na and nb.......
BAT> [...]
>> but most importantly:
>>
>> - we have no idea if the speedup (when <Simple> is TRUE)
>> is in the order of 10%, 1% or 0.1%
>>
>> My guess would be '0.1%' for rnorm(), and that would
>> definitely not warrant the extra check.
BAT> I would not bet against this. Probably even with all
BAT> the additional checks for finiteness of parameters
BAT> there would be not much speed difference. The question
BAT> might be whether you want to replace the (new)
BAT> R_FINITE()'s rather by ISNA()'s (or something else).
BAT> One could also discuss in which order the checks should
BAT> be made (first generated number then parameters of
BAT> distribution or vice versa). But I will leave this to
BAT> R-core to decide. :)
>> >> Thank you Berwin, for your contribution!
>>
>> and thanks again!
BAT> Still my pleasure. :)
BAT> Cheers,
BAT> Berwin Index: src/main/random.c
BAT> ===================================================================
BAT> --- src/main/random.c (revision 44693) +++
BAT> src/main/random.c (working copy) @@ -44,7 +44,7 @@ for
BAT> (i = 0; i < n; i++) { ai = a[i % na]; x[i] = f(ai); -
BAT> if (!R_FINITE(x[i])) naflag = 1; + if (!R_FINITE(x[i])
BAT> && R_FINITE(ai)) naflag = 1; } return(naflag); } @@
BAT> -81,6 +81,7 @@ if (na < 1) { for (i = 0; i < n; i++)
BAT> REAL(x)[i] = NA_REAL; + warning(_("NAs produced")); }
BAT> else { PROTECT(a = coerceVector(CADR(args), REALSXP));
BAT> @@ -116,14 +117,14 @@ ai = a[i % na]; bi = b[i % nb];
BAT> x[i] = f(ai, bi); - if (!R_FINITE(x[i])) naflag = 1; +
BAT> if (!R_FINITE(x[i]) && R_FINITE(ai) && R_FINITE(bi))
BAT> naflag = 1; } return(naflag); }
BAT> #define RAND2(num,name) \ case num: \ - random2(name,
BAT> REAL(a), na, REAL(b), nb, REAL(x), n); \ + naflag =
BAT> random2(name, REAL(a), na, REAL(b), nb, REAL(x), n); \
BAT> break
BAT> /* "do_random2" - random sampling from 2 parameter
BAT> families. */ @@ -155,6 +156,7 @@ if (na < 1 || nb < 1)
BAT> { for (i = 0; i < n; i++) REAL(x)[i] = NA_REAL; +
BAT> warning(_("NAs produced")); } else { PROTECT(a =
BAT> coerceVector(CADR(args), REALSXP)); @@ -200,14 +202,14
BAT> @@ bi = b[i % nb]; ci = c[i % nc]; x[i] = f(ai, bi,
BAT> ci); - if (!R_FINITE(x[i])) naflag = TRUE; + if
BAT> (!R_FINITE(x[i]) && R_FINITE(ai) && R_FINITE(bi) &&
BAT> R_FINITE(ci)) naflag = TRUE; } return(naflag); }
BAT> #define RAND3(num,name) \ case num: \ - random3(name,
BAT> REAL(a), na, REAL(b), nb, REAL(c), nc, REAL(x), n); \ +
BAT> naflag = random3(name, REAL(a), na, REAL(b), nb,
BAT> REAL(c), nc, REAL(x), n); \ break
BAT> @@ -244,6 +246,7 @@ if (na < 1 || nb < 1 || nc < 1) {
BAT> for (i = 0; i < n; i++) REAL(x)[i] = NA_REAL; +
BAT> warning(_("NAs produced")); } else { PROTECT(a =
BAT> coerceVector(a, REALSXP));
BAT> ______________________________________________
BAT> R-devel at r-project.org mailing list
BAT> https://stat.ethz.ch/mailman/listinfo/r-devel
More information about the R-devel
mailing list