[Rd] Calculation of e^{z^2/2} for a normal deviate z

Martin Maechler m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Mon Jun 24 10:37:54 CEST 2019


>>>>> William Dunlap via R-devel 
>>>>>     on Sun, 23 Jun 2019 10:34:47 -0700 writes:
>>>>> William Dunlap via R-devel 
>>>>>     on Sun, 23 Jun 2019 10:34:47 -0700 writes:

    > include/Rmath.h declares a set of 'logspace' functions for use at the C
    > level.  I don't think there are core R functions that call them.

    > /* Compute the log of a sum or difference from logs of terms, i.e.,
    > *
    > *     log (exp (logx) + exp (logy))
    > * or  log (exp (logx) - exp (logy))
    > *
    > * without causing overflows or throwing away too much accuracy:
    > */
    > double  Rf_logspace_add(double logx, double logy);
    > double  Rf_logspace_sub(double logx, double logy);
    > double  Rf_logspace_sum(const double *logx, int nx);

    > Bill Dunlap
    > TIBCO Software
    > wdunlap tibco.com

Yes, indeed, thank you, Bill!

But they *have* been in use by core R functions for a long time
in pgamma, pbeta and related functions.

[and I have had changes in *hyper.c where logspace_add() is used too, 
 for several years now (since 2015) but I no longer know which
 concrete accuracy problem that addresses, so have not yet committed it]


Martin Maechler
ETH Zurich and R Core Team



    > On Sun, Jun 23, 2019 at 1:40 AM Ben Bolker <bbolker using gmail.com> wrote:

    >> 
    >> I agree with many the sentiments about the wisdom of computing very
    >> small p-values (although the example below may win some kind of a prize:
    >> I've seen people talking about p-values of the order of 10^(-2000), but
    >> never 10^(-(10^8)) !).  That said, there are a several tricks for
    >> getting more reasonable sums of very small probabilities.  The first is
    >> to scale the p-values by dividing the *largest* of the probabilities,
    >> then do the (p/sum(p)) computation, then multiply the result (I'm sure
    >> this is described/documented somewhere).  More generally, there are
    >> methods for computing sums on the log scale, e.g.
    >> 
    >> 
    >> https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.misc.logsumexp.html
    >> 
    >> I don't know where this has been implemented in the R ecosystem, but
    >> this sort of computation is the basis of the "Brobdingnag" package for
    >> operating on very large ("Brobdingnagian") and very small
    >> ("Lilliputian") numbers.
    >> 
    >> 
    >> On 2019-06-21 6:58 p.m., jing hua zhao wrote:
    >> > Hi Peter, Rui, Chrstophe and Gabriel,
    >> >
    >> > Thanks for your inputs --  the use of qnorm(., log=TRUE) is a good point
    >> in line with pnorm with which we devised log(p)  as
    >> >
    >> > log(2) + pnorm(-abs(z), lower.tail = TRUE, log.p = TRUE)
    >> >
    >> > that could do really really well for large z compared to Rmpfr. Maybe I
    >> am asking too much since
    >> >
    >> > z <-20000
    >> >> Rmpfr::format(2*pnorm(mpfr(-abs(z),100),lower.tail=TRUE,log.p=FALSE))
    >> > [1] "1.660579603192917090365313727164e-86858901"
    >> >
    >> > already gives a rarely seen small p value. I gather I also need a
    >> multiple precision exp() and their sum since exp(z^2/2) is also a Bayes
    >> Factor so I  get log(x_i )/sum_i log(x_i) instead. To this point, I am
    >> obliged to clarify, see
    >> https://statgen.github.io/gwas-credible-sets/method/locuszoom-credible-sets.pdf
    >> .
    >> >
    >> > I agree many feel geneticists go to far with small p values which I
    >> would have difficulty to argue againston the other hand it is also expected
    >> to see these in a non-genetic context. For instance the Framingham study
    >> was established in 1948 just got $34m for six years on phenotypewide
    >> association which we would be interesting to see.
    >> >
    >> > Best wishes,
    >> >
    >> >
    >> > Jing Hua
    >> >
    >> >
    >> > ________________________________
    >> > From: peter dalgaard <pdalgd using gmail.com>
    >> > Sent: 21 June 2019 16:24
    >> > To: jing hua zhao
    >> > Cc: Rui Barradas; r-devel using r-project.org
    >> > Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
    >> >
    >> > You may want to look into using the log option to qnorm
    >> >
    >> > e.g., in round figures:
    >> >
    >> >> log(1e-300)
    >> > [1] -690.7755
    >> >> qnorm(-691, log=TRUE)
    >> > [1] -37.05315
    >> >> exp(37^2/2)
    >> > [1] 1.881797e+297
    >> >> exp(-37^2/2)
    >> > [1] 5.314068e-298
    >> >
    >> > Notice that floating point representation cuts out at 1e+/-308 or so. If
    >> you want to go outside that range, you may need explicit manipulation of
    >> the log values. qnorm() itself seems quite happy with much smaller values:
    >> >
    >> >> qnorm(-5000, log=TRUE)
    >> > [1] -99.94475
    >> >
    >> > -pd
    >> >
    >> >> On 21 Jun 2019, at 17:11 , jing hua zhao <jinghuazhao using hotmail.com>
    >> wrote:
    >> >>
    >> >> Dear Rui,
    >> >>
    >> >> Thanks for your quick reply -- this allows me to see the bottom of
    >> this. I was hoping we could have a handle of those p in genmoics such as
    >> 1e-300 or smaller.
    >> >>
    >> >> Best wishes,
    >> >>
    >> >>
    >> >> Jing Hua
    >> >>
    >> >> ________________________________
    >> >> From: Rui Barradas <ruipbarradas using sapo.pt>
    >> >> Sent: 21 June 2019 15:03
    >> >> To: jing hua zhao; r-devel using r-project.org
    >> >> Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
    >> >>
    >> >> Hello,
    >> >>
    >> >> Well, try it:
    >> >>
    >> >> p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
    >> >> z <- qnorm(p/2)
    >> >>
    >> >> pnorm(z)
    >> >> # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
    >> >> # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
    >> >> #[11] 1.110223e-16
    >> >> p/2
    >> >> # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
    >> >> # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
    >> >> #[11] 1.110223e-16
    >> >>
    >> >> exp(z*z/2)
    >> >> # [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
    >> >> # [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
    >> >> #[11] 4.314798e+14
    >> >>
    >> >>
    >> >> p is the smallest possible such that 1 + p != 1 and I couldn't find
    >> >> anything to worry about.
    >> >>
    >> >>
    >> >> R version 3.6.0 (2019-04-26)
    >> >> Platform: x86_64-pc-linux-gnu (64-bit)
    >> >> Running under: Ubuntu 19.04
    >> >>
    >> >> Matrix products: default
    >> >> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
    >> >> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
    >> >>
    >> >> locale:
    >> >>  [1] LC_CTYPE=pt_PT.UTF-8       LC_NUMERIC=C
    >> >>  [3] LC_TIME=pt_PT.UTF-8        LC_COLLATE=pt_PT.UTF-8
    >> >>  [5] LC_MONETARY=pt_PT.UTF-8    LC_MESSAGES=pt_PT.UTF-8
    >> >>  [7] LC_PAPER=pt_PT.UTF-8       LC_NAME=C
    >> >>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
    >> >> [11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
    >> >>
    >> >> attached base packages:
    >> >> [1] stats     graphics  grDevices utils     datasets  methods
    >> >> [7] base
    >> >>
    >> >> other attached packages:
    >> >>
    >> >> [many packages loaded]
    >> >>
    >> >>
    >> >> Hope this helps,
    >> >>
    >> >> Rui Barradas
    >> >>
    >> >> �s 15:24 de 21/06/19, jing hua zhao escreveu:
    >> >>> Dear R-developers,
    >> >>>
    >> >>> I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very
    >> small. I wonder if anyone has experience with this?
    >> >>>
    >> >>> Thanks very much in advance,
    >> >>>
    >> >>>
    >> >>> Jing Hua
    >> >>>
    >> >>>       [[alternative HTML version deleted]]
    >> >>>
    >> >>> ______________________________________________
    >> >>> R-devel using r-project.org mailing list
    >> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> >>>
    >> >>
    >> >>        [[alternative HTML version deleted]]
    >> >>
    >> >> ______________________________________________
    >> >> R-devel using r-project.org mailing list
    >> >> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> >
    >> > --
    >> > Peter Dalgaard, Professor,
    >> > Center for Statistics, Copenhagen Business School
    >> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
    >> > Phone: (+45)38153501
    >> > Office: A 4.23
    >> > Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com
    >> >
    >> >
    >> >
    >> >
    >> >
    >> >
    >> >
    >> >
    >> >
    >> >
    >> >       [[alternative HTML version deleted]]
    >> >
    >> > ______________________________________________
    >> > R-devel using r-project.org mailing list
    >> > https://stat.ethz.ch/mailman/listinfo/r-devel
    >> >
    >> 
    >> ______________________________________________
    >> R-devel using r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> 

    > [[alternative HTML version deleted]]

    > ______________________________________________
    > R-devel using r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/r-devel



More information about the R-devel mailing list