[Rd] comparing doubles

Fri Aug 31 16:00:38 CEST 2018

> On Aug 31, 2018, at 9:36 AM, Iñaki Ucar <iucar using fedoraproject.org> wrote:
> 
> On Fri, Aug 31, 2018 at 15:10, Felix Ernst
> (<felix.gm.ernst using outlook.com>) wrote:
>> 
>> Dear all,
>> 
>> I am a bit unsure whether this qualifies as a bug, but it is definitely strange behaviour. That is why I wanted to discuss it.
>> 
>> With the following function, I want to test for evenly spaced numbers, starting from anywhere.
>> 
>> .is_continous_evenly_spaced <- function(n){
>>  if(length(n) < 2) return(FALSE)
>>  n <- n[order(n)]
>>  n <- n - min(n)
>>  step <- n[2] - n[1]
>>  test <- seq(from = min(n), to = max(n), by = step)
>>  if(length(n) == length(test) &&
>>     all(n == test)){
>>    return(TRUE)
>>  }
>>  return(FALSE)
>> }
>> 
>>> .is_continous_evenly_spaced(c(1,2,3,4))
>> [1] TRUE
>>> .is_continous_evenly_spaced(c(1,3,4,5))
>> [1] FALSE
>>> .is_continous_evenly_spaced(c(1,1.1,1.2,1.3))
>> [1] FALSE
>> 
>> I expect the results for 1 and 2, but not for 3. Upon investigation, it turns out that n == test is TRUE for every pair except the pair at 0.2.
>> 
>> The types reported are always double; however, n[2] == 0.1 reports FALSE as well.
>> 
>> The whole problem is solved by switching from all(n == test) to all(as.character(n) == as.character(test)). However, that is weird, isn’t it?
>> 
>> Does this work as intended? Thanks in advance for any help, advice and suggestions.
> 
> I guess this has something to do with how the sequence is built and
> the inherent error of floating point arithmetic. In fact, if you
> return test minus n, you'll get:
> 
> [1] 0.000000e+00 0.000000e+00 2.220446e-16 0.000000e+00
> 
> and the error gets bigger when you continue the sequence; e.g., this
> is for c(1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7):
> 
> [1] 0.000000e+00 0.000000e+00 2.220446e-16 2.220446e-16 4.440892e-16
> [6] 4.440892e-16 4.440892e-16 0.000000e+00
> 
> So, regardless of whether this is considered a bug, instead of
> 
> length(n) == length(test) && all(n == test)
> 
> I would use the following condition:
> 
> isTRUE(all.equal(n, test))
> 
> Iñaki
> 
>> 
>> Best regards,
>> Felix

Hi,

This is essentially FAQ 7.31:

  https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

Review that and the references therein to gain some insights into binary representations of floating point numbers.
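
As a quick illustration of the point made in the FAQ, here is a minimal sketch. The values 0.1, 0.2 and 0.3 and the variable names are only examples chosen here, not taken from the original post:

```r
# Decimal fractions such as 0.1 generally have no exact binary representation,
# so arithmetic on them can differ from the "obvious" result in the last bits.
x <- 0.1 + 0.2

exact_equal  <- (x == 0.3)                  # exact comparison of the stored doubles
close_enough <- isTRUE(all.equal(x, 0.3))   # comparison with a numeric tolerance

print(exact_equal)
print(close_enough)

# Print more digits than the default 7 to see the stored values
print(sprintf("%.20f", c(x, 0.3)))
```

The exact comparison fails while the tolerance-based one succeeds, which is the same effect seen with n == test in the original function.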

Rather than the more complicated code you have above, try the following:

evenlyspaced <- function(x) {
  gaps <- diff(sort(x))
  all(gaps[-1] == gaps[1])
}

Note the use of ?diff:

> diff(c(1, 2, 3, 4))
[1] 1 1 1

> diff(c(1, 3, 4, 5))
[1] 2 1 1

> diff(c(1, 1.1, 1.2, 1.3))
[1] 0.1 0.1 0.1

However, in reality, due to the floating point representation issues noted above:

> print(diff(c(1, 1.1, 1.2, 1.3)), 20)
[1] 0.100000000000000088818 0.099999999999999866773
[3] 0.100000000000000088818

So the differences between the numbers are not exactly 0.1.
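
To get a feel for how small these deviations are, the gaps can be compared against the first gap directly. This is only a sketch: the absolute threshold of sqrt(.Machine$double.eps) is an assumption made here for illustration (it is the same magnitude as the default tolerance of all.equal, which by default compares relative differences):

```r
gaps <- diff(c(1, 1.1, 1.2, 1.3))

# Deviation of each gap from the first gap
deviation <- gaps - gaps[1]

# Exact comparison versus comparison with a small absolute threshold
exactly_equal <- gaps == gaps[1]
within_tol    <- abs(deviation) < sqrt(.Machine$double.eps)

print(deviation)
print(exactly_equal)
print(within_tol)
```

The middle gap differs from the others only in the last bits, so it fails the exact comparison but passes any reasonable tolerance.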

Using the function above, you get:

> evenlyspaced(c(1, 2, 3, 4))
[1] TRUE

> evenlyspaced(c(1, 3, 4, 5))
[1] FALSE

> evenlyspaced(c(1, 1.1, 1.2, 1.3))
[1] FALSE

As has been noted, if you want the gap comparison to be based upon some margin of error, use ?all.equal rather than the explicit equals comparison that I have in the function above. Something along the lines of:

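evenlyspaced <- function(x) {
  gaps <- diff(sort(x))
  # all.equal() returns a character description (not FALSE) when values
  # differ, so wrap it in isTRUE() to get a logical result
  all(vapply(gaps[-1], function(g) isTRUE(all.equal(g, gaps[1])), logical(1)))
}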

In which case, you now get:

> evenlyspaced(c(1, 1.1, 1.2, 1.3))
[1] TRUE
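
The same idea can also be written without looping over the gaps. The following is a hedged sketch rather than a definitive implementation: the name evenlyspaced_tol and the tol argument are hypothetical, the relative tolerance is an assumption, and returning FALSE for fewer than two values simply follows the convention in the original poster's function:

```r
# Hypothetical variant: name and `tol` argument are illustrative only
evenlyspaced_tol <- function(x, tol = sqrt(.Machine$double.eps)) {
  # Assumption: fewer than two values is not considered "evenly spaced"
  if (length(x) < 2) return(FALSE)
  gaps <- diff(sort(x))
  # Compare every gap with the first one, relative to the size of that gap
  all(abs(gaps - gaps[1]) <= tol * abs(gaps[1]))
}

print(evenlyspaced_tol(c(1, 2, 3, 4)))
print(evenlyspaced_tol(c(1, 3, 4, 5)))
print(evenlyspaced_tol(c(1, 1.1, 1.2, 1.3)))
```

Making the tolerance an explicit argument lets the caller decide how much floating point noise is acceptable for the data at hand.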

Regards,

Marc Schwartz