[Rd] log(i, base=i) not giving 1

Martin Maechler maechler at stat.math.ethz.ch
Wed Sep 3 10:47:15 CEST 2014


>>>>> Prof Brian Ripley <ripley at stats.ox.ac.uk>
>>>>>     on Wed, 3 Sep 2014 06:46:47 +0100 writes:

    > On 02/09/2014 22:43, Ben Bolker wrote:
    >> On 14-09-02 08:48 AM, Martin Maechler wrote:
    >>>>>>>> peter dalgaard <pdalgd at gmail.com>
    >>>>>>>> on Tue, 2 Sep 2014 13:43:21 +0200 writes:
    >>> 
    >>> > Impressive. Never ceases to amaze me what computers can do these days. ;-)
    >>> 
    >>> Indeed,
    >>> 
    >>> > It's even more impressive given that we have
    >>> 
    >>> 
    >>> > static double logbase(double x, double base)
    >>> > {
    >>> > #ifdef HAVE_LOG10
    >>> > if(base == 10) return x > 0 ? log10(x) : x < 0 ? R_NaN : R_NegInf;
    >>> > #endif
    >>> > #ifdef HAVE_LOG2
    >>> > if(base == 2) return x > 0 ? log2(x) : x < 0 ? R_NaN : R_NegInf;
    >>> > #endif
    >>> > return R_log(x) / R_log(base);
    >>> > }
    >>> 
    >>> > which, except possibly for base-10 and base-2, would seem to be quite hard to convince to return anything other than 1 if x == base....
    >>> 
    >>> Amazingly indeed, it does:  From the few platforms I can try
    >>> here, I only see the problem
    >>> on 32 bit Linux, both an (old) ubuntu 12.04.5 and Fedora 19.
    >>> 
    >>>> i <- 2:99; i[log(i, base=i) != 1]
    >>> [1]  5  8 14 18 19 25 58 60 64 65 66 67 75 80 84 86 91
    >>> 
    >>> so '8' is not so special (as Ben suggested) and also not the
    >>> only power of two for which this happens :
    >>> 
    >>>> i <- 2^(1:20); i[log(i, base=i) != 1]
    >>> [1] 8    64   128  4096  8192 16384
    >>> 
    >>> Interestingly, it does not happen on 32-bit Windows, nor on any
    >>> 64-bit version of R I've tried.
    >>> Yes, it is amazing that with computer arithmetic, we can't
    >>> even guarantee that   U / U = 1  exactly.
    >>> 
    >>> Is it basically free to check for x == base in there, so we
    >>> could change to
    >>> 
    >>> return (x == base) ? 1. : R_log(x) / R_log(base);
    >>> 
    >>> ?
    >> 
    >> Should I submit a formal bug report about this, or should I assume it
    >> has now been sufficiently noted by R-core?

    > It is under consideration by the author (not me).

    >> Opinions about whether it is better to fix in logbase (according to
    >> Martin's suggestion) or in version.R (e.g. BDR's suggestion of using
    >> log2(x)/3 or some other way that makes the function less sensitive) or both?

    > Yet again, it is better to write code which does not make false 
    > assumptions about machine arithmetic.  Fixing one very rare case at the 
    > expense of speed for all others is not a good idea.

Well, of course, that's why I asked about the expense. If it is
neglible, as it may well be here,  we have in other circumstances
opted for more platform-independence with very slight penalties,
e.g., when wrapping system library functions by our own.

After all, arithmetic in R *has* been somewhat more portable
between platforms than arithmetic in C (C++, Fortran,..) exactly
because of that.
For the present case, I'm not making a strong argument, and find
it ok to keep the current implementation, though possibly we
should mention the issue in the documentation then. 

    > Studying http://www.validlab.com/goldberg/addendum.html is a good idea 
    > for those unfamiliar with the pitfalls of i86 FPUs.  It has an example 
    > essentially the same as this one right at the top.

Indeed.  
OTOH, with this reasoning:

    > Also, Ben Bolker needs to change his compiler options to ones better 
    > than the defaults in his unstated Linux distro.   Why anyone is using 
    > ix86 Linux nowadays is hard to understand when x86_64 is (on all but the 
    > very smallest machines) faster and more reliable.

one could argue that the platforms where  log(i, base=i) != 1
should become much more rare in the future, and when 99.x % of
implementations provide a feature, chances are very high that user
code will make this assumption {after all, R core code just did
for a while!} in cases that are untested and occuring very rarely,
exactly the dangerous kinds of bugs.

As R code--even R packages--will continue to be written mostly by
people who do not know about possible FP pitfalls (*) and would
attribute them to their "software", i.e. in this case R in this case, 
I still see a reason for special casing, possibly even nicely
#ifdef'ed  , not causing any penalty on "-ffloat-store" or
x86_64 platforms ?

(*) Our 99.x % of R users would lack  too much background
    context to understand the problem even when urged to "learn"
    about it in some way.  They will always blame R in such cases.

Martin


    >> 
    >> Ben Bolker
    >> 
    >>> 
    >>> > On 02 Sep 2014, at 03:27 , Ben Bolker <bbolker at gmail.com> wrote:
    >>> 
    >>> >> log(8, base=8L)-1
    >>> >> log(8, base=8)-1
    >>> >> logvals <- setNames(log(2:25,base=2:25)-1,2:25)
    >>> >> logvals[logvals!=0]  ## 5,8,14,18,19,25 all == .Machine$double.eps/2
    >>> 
    >>> > --
    >>> > Peter Dalgaard, Professor,
    >>> > Center for Statistics, Copenhagen Business School
    >>> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
    >>> > Phone: (+45)38153501
    >>> > Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
    >>> 
    >>> > ______________________________________________
    >>> > R-devel at r-project.org mailing list
    >>> > https://stat.ethz.ch/mailman/listinfo/r-devel
    >>> 
    >> 
    >> ______________________________________________
    >> R-devel at r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> 


    > -- 
    > Brian D. Ripley,                  ripley at stats.ox.ac.uk
    > Emeritus Professor of Applied Statistics, University of Oxford
    > 1 South Parks Road, Oxford OX1 3TG, UK

    > ______________________________________________
    > R-devel at r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/r-devel



More information about the R-devel mailing list