[Rd] R-devel (r81196) hanging at dchisq(large) (PR#13309)

Tue Nov 16 14:11:37 CET 2021

On Tue, Nov 16, 2021 at 8:43 AM Martin Maechler
<maechler using stat.math.ethz.ch> wrote:
>
> >>>>> Avraham Adler
> >>>>>     on Tue, 16 Nov 2021 02:35:56 +0000 writes:
>
>     > I am building r-devel on Windows 10 64bit using Jeroen's mingw system,
>     > and I am finding that my make check-devel hangs on the above issue.
>     > Everything is vanila except that I am using OpenBLAS 0.3.18. I have
>     > been using OpenBLAS for over a decade and have not had this issue
>     > before. Is there anything I can do to dig deeper into this issue from
>     > my end? Could there be anything that changed in R-devel that may have
>     > triggered this? The bugzilla report doesn't have any code attached to
>     > it.
>
>     > Thank you,
>     > Avi
>
> Hmm.. it would've be nice to tell a bit more, instead of having all
> your readers to search links, etc.
>
> In the bugzilla bug report PR#13309
> https://bugs.r-project.org/show_bug.cgi?id=13309 ,the example was
>
>  dchisq(x=Inf, df=10, ncp=1)
>
> I had fixed the bug 13 years ago, in svn rev 47005
> with regression test in <Rsrc>/tests/d-p-q-r-tests.R :
>
>
> ## Non-central Chi^2 density for large x
> stopifnot(0 == dchisq(c(Inf, 1e80, 1e50, 1e40), df=10, ncp=1))
> ## did hang in 2.8.0 and earlier (PR#13309).
>
>
> and you are seeing your version of R hanging at exactly this
> location?
>
>
> I'd bet quite a bit that the underlying code in these
> non-central chi square computations *never* calls BLAS and hence
> I cannot imagine how openBLAS could play a role.
>
> However, there must be something peculiar in your compiler setup,
> compilation options, ....
> as of course the above regression test has been run 100s of
> 1000s of times also under Windows in the last 13 years ..
>
> Last but not least (but really only vaguely related):
>    There is still a FIXME in the source code (but not about
> hanging, but rather of loosing some accuracy in border cases),
> see e.g. https://svn.r-project.org/R/trunk/src/nmath/dnchisq.c
> and for that reason I had written an R version of that C code
> even back in 2008 which I've made available in  CRAN package
> DPQ  a few years ago (together with many other D/P/Q
> distribution computations/approximations).
>              -> https://cran.r-project.org/package=DPQ
>
> Best,
> Martin
>

Hello, Martin.

Apologies, I thought the PR # was sufficient. Yes, I am seeing this at
this exact location. This is what I saw in d-p-q-r-tst-2.Rout.fail and
I then ran d-p-q-r-tst.R line-by-line and R hung precisely after
`stopifnot(0 == dchisq(c(Inf, 1e80, 1e50, 1e40), df=10, ncp=1))`.

Is it at all possible that this has to do with the recent change from
bd0 to ebd0 (PR #15628) [1]?

For completeness, I ran all the code _beneath_ the call, and while
nothing else cause an infinite loop, I posted what I believe may be
unexpected results below,

Thank you,

Avi

[1]: https://bugs.r-project.org/show_bug.cgi?id=15628

> ## FIXME ?!:  MxM/2 seems +- ok ??
> (dLmM <- dnbinom(xL, mu = 1, size = MxM))  # all NaN but the last
Warning in dnbinom(xL, mu = 1, size = MxM) : NaNs produced
 [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN   0
> (dLpI <- dnbinom(xL, prob=1/2, size = Inf))#  ditto
Warning in dnbinom(xL, prob = 1/2, size = Inf) : NaNs produced
 [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN   0
> (dLpM <- dnbinom(xL, prob=1/2, size = MxM))#  ditto
Warning in dnbinom(xL, prob = 1/2, size = MxM) : NaNs produced
 [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN   0

> d <- dnbinom(x,  mu = mu, size = Inf) # gave NaN (for 0 and L), now all 0
Warning in dnbinom(x, mu = mu, size = Inf) : NaNs produced
> p <- pnbinom(x,  mu = mu, size = Inf) # gave all NaN, now uses ppois(x, mu)
Warning in pnbinom(x, mu = mu, size = Inf) : NaNs produced

> pp <- (0:16)/16
> q <- qnbinom(pp, mu = mu, size = Inf) # gave all NaN
> set.seed(1); NI <- rnbinom(32, mu = mu, size = Inf)# gave all NaN
> set.seed(1); N2 <- rnbinom(32, mu = mu, size = L  )
> stopifnot(exprs = {
+     all.equal(d, c(0.006737947, 0.033689735, 0.0842243375,
0.140373896, 0,0,0,0), tol = 9e-9)# 7.6e-10
+     all.equal(p, c(0.006737947, 0.040427682, 0.1246520195,
0.265025915, 1,1,1,1), tol = 9e-9)# 7.3e-10
+     all.equal(d, dpois(x, mu))# current implementation: even identical()
+     all.equal(p, ppois(x, mu))
+     q == c(0, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 6, 7, 8, 9, Inf)
+     q == qpois(pp, mu)
+     identical(NI, N2)
+ })
Error: d and c(0.006737947, 0.033689735, 0.0842243375, 0.140373896, 0,
0,  .... are not equal:
  'is.NA' value mismatch: 0 in current 1 in target

> if(!(onWindows && arch == "x86")) {
+  ## This gave a practically infinite loop (on 64-bit Lnx, Windows;
not in 32-bit)
+     tools::assertWarning(p <- pchisq(1.00000012e200, df=1e200, ncp=100),
+                          "simpleWarning", verbose=TRUE)
+     stopifnot(p == 1)
+ }
Asserted warning: pnchisq(x=1e+200, f=1e+200, theta=100, ..): not
converged in 1000000 iter.

[This may be OK, AA]

> ## Show the (mostly) small differences :
> all.equal( qs, qpU, tol=0)
[1] "Mean relative difference: 1.572997e-16"
> all.equal(-qs, qp., tol=0)
[1] "Mean relative difference: 1.572997e-16"
> all.equal(-qp.,qpU, tol=0) # typically TRUE (<==> exact equality)
[1] "Mean relative difference: 4.710277e-16"
> stopifnot(exprs = {
+     all.equal( qs,  qpU, tol=1e-15)
+     all.equal(-qs,  qp., tol=1e-15)
+     all.equal(-qp., qpU, tol=1e-15)# diff of 4.71e-16 in 4.1.0 w/icc
(Eric Weese)
+ })
> ## both failed very badly in  R <= 4.0.x