[R] [External] Funky calculations

Avi Gross avigross using verizon.net
Wed Feb 2 20:23:47 CET 2022


As discussed, Tim, your version of R already has built in all kinds of values, limits, and even ways to do approximate tests in the variable .Machine, such as:

> .Machine$double.eps
[1] 2.220446e-16
> .Machine$sizeof.longdouble
[1] 16
> .Machine$longdouble.eps
[1] 1.084202e-19

If you can figure out which ones correspond to what you are looking for, you can stay within bounds, or set up your code to catch the rare cases where you land in a gray area in which comparisons are not reliable, and assume all other regions are valid for tests of equality or inequality.
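
As an illustration (my sketch, not part of the original exchange), the usual workaround is a tolerance-based comparison; the tolerance below is a common convention, not a requirement:

near_equal <- function(a, b, tol = .Machine$double.eps^0.5) {
  # compare with a tolerance scaled to the size of the inputs, not with ==
  abs(a - b) < tol * max(1, abs(a), abs(b))
}
0.1 + 0.2 == 0.3                   # FALSE, the classic surprise
near_equal(0.1 + 0.2, 0.3)         # TRUE
isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE; base R's own version of the idea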

Consider another common problem. You want to multiply two integers stored in integer variables of some capacity, ranging from one byte to four or even more. On many machines, if the product needs more bits than are available, the result is just the bits that fit, with the rest tossed away, and is effectively nonsense. On other machines an error is thrown that you have to catch.

But if your program runs portably on many machines, how can you be sure it worked properly when you multiply, say, a billion by a billion in an integer type that only holds up to the following number on my machine at this time:

> .Machine$integer.max
[1] 2147483647

The above number is a bit more than 2 billion, so clearly the product of those two numbers will not fit. One solution, and not a great one, is to verify that each multiplicand is less than the square root of the limit. Another is to do the multiplication, divide the result by one of the multiplicands, and verify that you get back the other multiplicand. If not, you had a problem, probably an overflow.
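
A sketch of both checks in R (my illustration; note that R itself turns integer overflow into NA with a warning rather than silently keeping the low bits):

safe_mult <- function(a, b) {
  limit <- .Machine$integer.max
  # check 1: both operands below the square root of the limit
  if (abs(a) < sqrt(limit) && abs(b) < sqrt(limit)) return(a * b)
  # check 2: multiply, then divide back and compare
  p <- a * b
  if (!is.na(p) && p %/% a == b) return(p)
  stop("overflow: product exceeds integer.max")
}
safe_mult(30000L, 30000L)                 # 900000000, fine
try(safe_mult(1000000000L, 1000000000L))  # caught: a billion squared won't fit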

It should not be necessary to do this, and in some ways it isn't. There are probably R packages supporting larger integers (and I don't mean by converting to floating point), and I know Python integers are already of arbitrary length, limited only by available memory.
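
For instance (my addition; no package is named in the original message), the CRAN package gmp provides arbitrary-precision integers, assuming it is installed:

library(gmp)                # arbitrary-precision arithmetic via GNU MP
a <- as.bigz("1000000000")  # a billion, held as a big integer
a * a                       # 1000000000000000000, computed exactly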

But it is an example of how real computers using current methods are not mathematically complete: you cannot demand that they do everything that is mathematically possible, especially in areas where things are truly meant to be continuous with no gaps, or infinite in various ways. The real world is not the ideal Platonic Realm.

Now I have to think about what may happen with quantum computers and algorithms using them. Qubits have a certain sense of infinity in them, being a sort of superposition of many quantum states, and thus may turn out to allow arbitrary precision, unless they hit some quantum wall.

But that is a topic for some futuristic R.



-----Original Message-----
From: Ebert,Timothy Aaron <tebert using ufl.edu>
To: J C Nash <profjcnash using gmail.com>; r-help using r-project.org <r-help using r-project.org>
Sent: Wed, Feb 2, 2022 1:27 pm
Subject: Re: [R] [External] Funky calculations


Punch cards and reams of green-and-white striped fanfold paper. A portable Cromemco we called a boat anchor, and hardly used because the Apple II was more in line with our computing tasks and skills. The relevant part is learning assembly and machine language on the Apple. Trying to fit a value with an infinite representation into a finite space produces inaccuracy no matter how you work it. The problem applies to all programs in all languages. The problem, while present, does not always matter. One way to identify the limits of accuracy is something like this:
1 + 1 == 2          # returns TRUE
1 + 0.9 == 2        # returns FALSE
1 + 0.9999999 == 2  # also returns FALSE
Keep going and at some point you will get TRUE.
In my version of R, here is the last time I get FALSE:
1 + 0.999999999999999 == 2   # 15 nines after the decimal point
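
One can hunt for the flip point programmatically. A quick sketch (mine, added for illustration; note that 1 - 10^-n is only a stand-in for typing n literal nines, so the exact flip point could differ by a digit):

for (n in 12:18) {
  nines <- 1 - 10^-n          # approximately 0.99...9 with n nines
  cat(n, "nines:", 1 + nines == 2, "\n")
}
# on my machine the answer flips from FALSE to TRUE at 16 nines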

I can do something similar in Excel. In its display, Excel rounds up to 1 when 10 nines are present, and the IF test quits returning FALSE after 14 decimal places: =IF(1 + 0.9999999 = 2, "TRUE", "FALSE")

One can play the game another way. This matters because it is NOT the number of digits after the decimal point that counts; it is the total number of significant digits.

Here is the last correct answer (that is, the last FALSE) in this game:
999999999999999 == 1000000000000000
That is 15 digits, add another 9 on the left and a zero on the right and you get TRUE.

Here is the last FALSE, though note that I have only added a number with 8 decimal places:
123456789 + 0.00000001 == 123456789
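
Both games are really the same game. The gap between adjacent representable doubles near a value x grows with x, roughly x times .Machine$double.eps (a back-of-envelope sketch of mine):

.Machine$double.eps               # ~2.2e-16, the relative gap near 1
123456789 * .Machine$double.eps   # ~2.7e-8, the rough gap near 123456789
1e16 * .Machine$double.eps        # ~2.2: past 1e16, even whole integers blur

The last line is why the 16-digit comparison above flips to TRUE.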

There are many other ways to play this game:
log10(10.00000000000001) == 1  # one more digit before the decimal point; how many zeros fit after it before the trailing 1 vanishes?

In statistics it is well known that it is a bad idea to model data that spans many orders of magnitude. At least for me, this is the simplest presentation of why that is the case. For a more exact reason, consider the above results when taking the inverse of a matrix whose elements differ by several orders of magnitude.
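
To make the matrix remark concrete (my sketch, not from the original message), the condition number reported by kappa() estimates how many digits an inversion loses, and entries spanning many orders of magnitude drive it up:

A <- diag(c(1e8, 1e-8))   # diagonal entries 16 orders of magnitude apart
kappa(A)                  # ~1e16; rule of thumb: inverting A loses about
                          # log10(kappa) digits, here essentially all of them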

The issue is well known, and programs like R and SAS and SPSS (etc.) take care of it, up to a point. However, it is not hard to break the system if you try, or if you are not careful. The point at which the system breaks is easy to identify in simple terms. However, just because base R's limit is about 15 digits does not mean that all packages in R share that limit. Just keep in mind that your data range in R needs to be such that the operations on that data all fit within this 15-digit limitation. Other programs may have different limitations, but there is always a limitation.

Tim 



-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of J C Nash
Sent: Wednesday, February 2, 2022 8:36 AM
To: r-help using r-project.org
Subject: Re: [R] [External] Funky calculations

[External Email]

I was one of the 31 names on the 1985 IEEE standard. If anyone thinks things are awkward now, try looking at the swamp we had beforehand.

What I believe IS useful is to provide examples and to explain them in tutorial fashion.
We need to recognize that our computations have limitations. Most common computing platforms use IEEE binary arithmetic, but not all.

This was much more "in our face" when we used slide rules or hand-crank calculators. I still have slide rules and a Monroe "Portable" calculator -- 5 kg! It's worth bringing them out every so often and being thankful for the power and speed of modern computing, while remembering to watch for the cowpats of REAL and REAL*8 arithmetic.

JN

On 2022-02-01 22:45, Avi Gross via R-help wrote:
> This is a discussion forum, Richard, and I welcome requests to clarify what I wrote or to be corrected, especially when my words have been read with an attempt to understand. I get private responses too, and some days I wonder if I am not communicating the way people like!
>
> But let me repeat. The question we started with asked about R. My answer applies to quite a few languages besides R and maybe just about all of them.
>
> I got private email insisting the numbers being added were not irrational, so why would they not be represented easily as a sum. I know my answers included parts at various levels of abstraction, as well as examples of cases where decimal notation for a number like 1/7 results in an infinitely repeating sequence. So I think it wise to follow up with what binary looks like and why hardly ANYTHING that looks reasonable is easy to represent exactly.
>
> Consider that binary means POWERS OF TWO. The sequence 1101 before a decimal point means (starting from the right and heading left) that you have one ONE, no TWOS, one FOUR, and one EIGHT: powers of two ranging from two to the zero power up to two cubed. You can make any integer whatsoever using as long a sequence of zeros and ones as you like. Compare this to decimal notation, where you use powers of ten and of course can use any of the digits 0-9.
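>
> (A one-line check of that claim in R, my addition: strtoi("1101", base = 2) returns 13, which is indeed 8 + 4 + 0 + 1.)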
>
> But looking at fractional numbers, like 1/7 and 1/10, it gets hard and inexact.
>
> Remember now we are in BINARY. Here are some fractions with everything not shown to the right being zeros and thus not needed to be shown explicitly. Starting with the decimal point, read this from left to right to see the powers in the denominator rising so 1/2 then 1/4 then 1/8 ...:
>
> 0.0 would be 0.
> 0.1 would be 1/2
> 0.101 would be 1/2 + 1/8 or 5/8
> 0.11 would be 1/2 + 1/4 or 3/4
> 0.111 would be 1/2 + 1/4 + 1/8 or 7/8
>
> We are now using negative powers where 2 raised to the minus one power is one over two raised to the plus one power, or 1/2 and so on. As you head to the right you get to fairly small numbers like 1/2048 ...
>
> Every single binary fraction is thus a possibly infinite sum of negative powers of two, or rather the reciprocals of those in positive terms.
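>
> As a concrete sketch (mine, added for illustration), those digits can be generated by the standard multiply-by-2 hand algorithm:
>
> bin_frac <- function(x, n = 20) {
>   # emit the first n binary digits of a fraction 0 < x < 1
>   digits <- integer(n)
>   for (i in seq_len(n)) {
>     x <- 2 * x
>     digits[i] <- floor(x)   # 1 exactly when 2^-i belongs in the expansion
>     x <- x - digits[i]
>   }
>   paste0("0.", paste(digits, collapse = ""))
> }
> bin_frac(1/7)  # "0.00100100100100100100", the 001 pattern repeats forever
> bin_frac(0.1)  # "0.00011001100110011001", 0011 repeats (until the stored
>                # double's own 53-bit truncation eventually shows through)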
>
> If you want to make 1/7 to some number of binary places, it looks like this, up to the point where I stop:
>
> 0.00100100100100100101
>
> So no halves, no quarters, 1/8, no sixteenths, no thirty-seconds, 1/64, and so on. But if you add all that up, and note the sequence was STOPPED before it could continue further, you get this translated into decimal:
>
> 0.142857 55157470703125
>
> Recall 1/7 in decimal notation is
> 0.142857 142857142857142857...
>
> Note the divergence at the seventh digit after the decimal point. I left a space to show where they diverge. If I used more binary digits, I can get as close as I want but computers these days do not allow too many more digits unless you use highly specialized programs. There are packages that give you access such as "mpfr" but generally nothing can give you infinite precision. R will not handle an infinite number of infinitesimals.
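>
> As a sketch of mine (assuming the CRAN package Rmpfr, which wraps that library, is installed), extended precision looks like this:
>
> library(Rmpfr)              # interface to the GNU MPFR library
> mpfr(1, precBits = 200)/7   # 1/7 carried to roughly 60 decimal digits
>
> More bits, but still finitely many.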
>
> The original problem that began our thread was about numbers like 0.1 and 0.2 and so on. In base ten, they look nice but I repeat in base 2 only powers of TWO reign.
>
> 0.1 in base two is about 0.0001100110011001101
>
> that reads as 1/16 + 1/32 + 1/256 + 1/512 + ...
>
> If I convert the above segment, which I repeat was stopped short, I get 0.1000003814697265625, which is a tad over; had I taken the last 1 and changed it to a zero, as in 0.0001100110011001100, then we would have come in a bit under, at 0.09999847412109375.
>
> So the only way to write 0.1 exactly is, again, to continue infinitely. Do the analysis and you will understand why most rational numbers will not convert easily to a small number of bits. But the advantages of computers doing operations in binary are huge and need not be explained. You may THINK you are entering numbers in decimal form, but they rarely remain that way for long; they simply become binary and often remain binary unless and until you ask to print them out, usually in decimal.
>
> BTW, I used a random web site to do the above conversion calculations:
>
> https://www.rapidtables.com/convert/number/binary-to-decimal.html
>
> Since I am writing in plain text, I cannot show what it says in the box on that page further down under Decimal Calculation Steps so I wonder what the rest of this message looks like:
>
> (0.0001100110011001100)₂ = (0 × 2⁰) + (0 × 2⁻¹) + (0 × 2⁻²) + (0 × 2⁻³) + (1 × 2⁻⁴) + (1 × 2⁻⁵) + (0 × 2⁻⁶) + (0 × 2⁻⁷) + (1 × 2⁻⁸) + (1 × 2⁻⁹) + (0 × 2⁻¹⁰) + (0 × 2⁻¹¹) + (1 × 2⁻¹²) + (1 × 2⁻¹³) + (0 × 2⁻¹⁴) + (0 × 2⁻¹⁵) + (1 × 2⁻¹⁶) + (1 × 2⁻¹⁷) + (0 × 2⁻¹⁸) + (0 × 2⁻¹⁹) = (0.09999847412109375)₁₀
>
> I think my part in this particular discussion can now finally come to an end. R and everything else can be incomplete. Deal with it!
>
> -----Original Message-----
> From: Richard M. Heiberger <rmh using temple.edu>
> To: Avi Gross <avigross using verizon.net>
> Cc: nboeger using gmail.com <nboeger using gmail.com>; r-help using r-project.org <r-help using r-project.org>
> Sent: Tue, Feb 1, 2022 9:04 pm
> Subject: Re: [External] [R] Funky calculations
>
>
> I apologize if my tone came across wrong.  I enjoy reading your comments on this list.
>
> My goal was to describe what the IEEE and R interpret "careful coding" to be.
>
>
>> On Feb 01, 2022, at 20:42, Avi Gross <avigross using verizon.net> wrote:
>>
>> Richard,
>>
>> I think it was fairly clear I was explaining how people do arithmetic manually and often truncate or round to some number of decimal places. I said nothing about what R does or what the IEEE standards say and I do not particularly care when making MY point.
>>
>> My point is that humans before computers also had trouble writing down decimals that continue indefinitely. It cannot be expected that computer versions of arithmetic will do much better. Different people can opt to do the calculation with the same or different numbers of digits, and when compared to each other the results may not match.
>>
>> I do care what it does in my programs, of course. My goal here was to explain to someone that the anomaly found was not really an anomaly and that careful coding may be required in these situations.
>>
>>
>> -----Original Message-----
>> From: Richard M. Heiberger <rmh using temple.edu>
>> To: Avi Gross <avigross using verizon.net>
>> Cc: Nathan Boeger <nboeger using gmail.com>; r-help using r-project.org <r-help using r-project.org>
>> Sent: Tue, Feb 1, 2022 2:44 pm
>> Subject: Re: [External] [R] Funky calculations
>>
>>
>> RShowDoc('FAQ')
>>
>>
>> then search for 7.31
>>
>>
>> This statement
>> "If you stop at a 5 or 7 or 8 and back up to the previous digit, you round up. Else you leave the previous result alone."
>> is not quite right.  The recommendation in IEEE 754, and this is how R does arithmetic, is to Round Even.
>>
>> I illustrate here with decimal, even though R and other programs use binary.
>>
>>> x <- c(1.4, 1.5, 1.6, 2.4, 2.5, 2.6, 3.4, 3.5, 3.6, 4.4, 4.5, 4.6)
>>> r <- round(x)
>>> cbind(x, r)
>>         x r
>>  [1,] 1.4 1
>>  [2,] 1.5 2
>>  [3,] 1.6 2
>>  [4,] 2.4 2
>>  [5,] 2.5 2
>>  [6,] 2.6 3
>>  [7,] 3.4 3
>>  [8,] 3.5 4
>>  [9,] 3.6 4
>> [10,] 4.4 4
>> [11,] 4.5 4
>> [12,] 4.6 5
>>>
>>
>> Numbers whose last digit is not 5 (when in decimal) round to the nearest integer.
>> Numbers whose last digit is 5 (1.5, 2.5, 3.5, 4.5 above) round to the nearest EVEN integer.
>> Hence 1.5 and 3.5 round up to the even numbers 2 and 4, while
>> 2.5 and 4.5 round down to the even numbers 2 and 4.
>>
>> This way the round-ups and round-downs average out to 0. If we always went
>> up from .5, we would have an upward drift over time.
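>>
>> A quick numerical check of that claim (a sketch added for illustration, not in the original message):
>>
>> x <- seq(0.5, 9.5, by = 1)   # nothing but .5 ties
>> mean(round(x) - x)           # 0: round-half-to-even is unbiased on ties
>> mean(ceiling(x) - x)         # 0.5: always rounding .5 up drifts upward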
>>
>> For even more detail, click on the link in FAQ 7.31 to my appendix
>> https://link.springer.com/content/pdf/bbm%3A978-1-4939-2122-5%2F1.pdf
>> and search for "Appendix G".
>>
>> Section G.5 explains Round to Even.
>> Sections G.6 onward illustrate specific examples, such as the one that started this email thread.
>>
>> Rich
>>
>
>
>


