[Rd] log(i, base=i) not giving 1
Martin Maechler
maechler at stat.math.ethz.ch
Wed Sep 3 10:47:15 CEST 2014
>>>>> Prof Brian Ripley <ripley at stats.ox.ac.uk>
>>>>> on Wed, 3 Sep 2014 06:46:47 +0100 writes:
> On 02/09/2014 22:43, Ben Bolker wrote:
>> On 14-09-02 08:48 AM, Martin Maechler wrote:
>>>>>>>> peter dalgaard <pdalgd at gmail.com>
>>>>>>>> on Tue, 2 Sep 2014 13:43:21 +0200 writes:
>>>
>>> > Impressive. Never ceases to amaze me what computers can do these days. ;-)
>>>
>>> Indeed,
>>>
>>> > It's even more impressive given that we have
>>>
>>>
>>> > static double logbase(double x, double base)
>>> > {
>>> > #ifdef HAVE_LOG10
>>> > if(base == 10) return x > 0 ? log10(x) : x < 0 ? R_NaN : R_NegInf;
>>> > #endif
>>> > #ifdef HAVE_LOG2
>>> > if(base == 2) return x > 0 ? log2(x) : x < 0 ? R_NaN : R_NegInf;
>>> > #endif
>>> > return R_log(x) / R_log(base);
>>> > }
>>>
>>> > which, except possibly for base-10 and base-2, would seem to be quite hard to convince to return anything other than 1 if x == base....
>>>
>>> Amazingly indeed, it does: From the few platforms I can try
>>> here, I only see the problem
>>> on 32 bit Linux, both an (old) ubuntu 12.04.5 and Fedora 19.
>>>
>>>> i <- 2:99; i[log(i, base=i) != 1]
>>> [1] 5 8 14 18 19 25 58 60 64 65 66 67 75 80 84 86 91
>>>
>>> so '8' is not so special (as Ben suggested) and also not the
>>> only power of two for which this happens :
>>>
>>>> i <- 2^(1:20); i[log(i, base=i) != 1]
>>> [1] 8 64 128 4096 8192 16384
>>>
>>> Interestingly, it does not happen on 32-bit Windows, nor on any
>>> 64-bit version of R I've tried.
>>> Yes, it is amazing that with computer arithmetic, we can't
>>> even guarantee that U / U = 1 exactly.
>>>
>>> Is it basically free to check for x == base in there, so we
>>> could change to
>>>
>>> return (x == base) ? 1. : R_log(x) / R_log(base);
>>>
>>> ?
>>
>> Should I submit a formal bug report about this, or should I assume it
>> has now been sufficiently noted by R-core?
> It is under consideration by the author (not me).
>> Opinions about whether it is better to fix in logbase (according to
>> Martin's suggestion) or in version.R (e.g. BDR's suggestion of using
>> log2(x)/3 or some other way that makes the function less sensitive) or both?
> Yet again, it is better to write code which does not make false
> assumptions about machine arithmetic. Fixing one very rare case at the
> expense of speed for all others is not a good idea.
Well, of course, that's why I asked about the expense. If it is
neglible, as it may well be here, we have in other circumstances
opted for more platform-independence with very slight penalties,
e.g., when wrapping system library functions by our own.
After all, arithmetic in R *has* been somewhat more portable
between platforms than arithmetic in C (C++, Fortran,..) exactly
because of that.
For the present case, I'm not making a strong argument, and find
it ok to keep the current implementation, though possibly we
should mention the issue in the documentation then.
> Studying http://www.validlab.com/goldberg/addendum.html is a good idea
> for those unfamiliar with the pitfalls of i86 FPUs. It has an example
> essentially the same as this one right at the top.
Indeed.
OTOH, with this reasoning:
> Also, Ben Bolker needs to change his compiler options to ones better
> than the defaults in his unstated Linux distro. Why anyone is using
> ix86 Linux nowadays is hard to understand when x86_64 is (on all but the
> very smallest machines) faster and more reliable.
one could argue that the platforms where log(i, base=i) != 1
should become much more rare in the future, and when 99.x % of
implementations provide a feature, chances are very high that user
code will make this assumption {after all, R core code just did
for a while!} in cases that are untested and occuring very rarely,
exactly the dangerous kinds of bugs.
As R code--even R packages--will continue to be written mostly by
people who do not know about possible FP pitfalls (*) and would
attribute them to their "software", i.e. in this case R in this case,
I still see a reason for special casing, possibly even nicely
#ifdef'ed , not causing any penalty on "-ffloat-store" or
x86_64 platforms ?
(*) Our 99.x % of R users would lack too much background
context to understand the problem even when urged to "learn"
about it in some way. They will always blame R in such cases.
Martin
>>
>> Ben Bolker
>>
>>>
>>> > On 02 Sep 2014, at 03:27 , Ben Bolker <bbolker at gmail.com> wrote:
>>>
>>> >> log(8, base=8L)-1
>>> >> log(8, base=8)-1
>>> >> logvals <- setNames(log(2:25,base=2:25)-1,2:25)
>>> >> logvals[logvals!=0] ## 5,8,14,18,19,25 all == .Machine$double.eps/2
>>>
>>> > --
>>> > Peter Dalgaard, Professor,
>>> > Center for Statistics, Copenhagen Business School
>>> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>> > Phone: (+45)38153501
>>> > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
>>>
>>> > ______________________________________________
>>> > R-devel at r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Emeritus Professor of Applied Statistics, University of Oxford
> 1 South Parks Road, Oxford OX1 3TG, UK
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
More information about the R-devel
mailing list