[Rd] Question about quantile fuzz and GPL license

Martin Maechler maechler at stat.math.ethz.ch
Wed Sep 15 10:46:45 CEST 2021


>>>>> GILLIBERT, Andre 
>>>>>     on Tue, 14 Sep 2021 16:13:05 +0000 writes:

    > On 9/14/21 9:22 AM, Abel AOUN wrote:
    >> However I don't get why epsilon is multiplied by 4 instead of simply using epsilon.
    >> Is there someone who can explain this 4 ?

    > .Machine$double.eps is the "precision" of floating point values for values close to 1.0 (between 0.5 and 2.0).

    > Using fuzz = .Machine$double.eps would have no effect if nppm is greater than or equal to 2.
    > Using fuzz = 4 * .Machine$double.eps can fix rounding errors for nppm < 8; for greater nppm, it has no effect.

    > Indeed:
    > 2 + .Machine$double.eps == 2
    > 8 + 4 * .Machine$double.eps == 8

    > Since nppm is approximately equal to the quantile multiplied by the sample size, it can be much greater than 2 or 8.

hmm: not "quantile":
 it is approximately equal to the *'prob'* multiplied by the sample size
 {the quantiles themselves can be on any scale anyway, but
  fortunately they do not matter yet in these parts of the calculations}

but you're right in the main point that they are
approx. proportional to  n.
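
A small sketch of the magnitude effect discussed above, using hand-picked
illustrative values for nppm (they are assumptions, not taken from
quantile() itself). It checks whether adding the absolute fuzz changes
the value at all:

```r
eps  <- .Machine$double.eps
fuzz <- 4 * eps

# Illustrative nppm values (roughly prob * n); chosen by hand for this sketch.
nppm <- c(1, 4, 8, 500)

# TRUE where adding the absolute fuzz actually changes the stored double.
changed <- (nppm + fuzz) != nppm
print(changed)
# [1]  TRUE  TRUE FALSE FALSE
```

From 8 upwards the spacing between adjacent doubles is at least
8 * .Machine$double.eps, so an absolute fuzz of 4 * .Machine$double.eps
is rounded away for these values.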

    > Maybe the rounding errors are only problematic for small nppm; or only that case is taken into account.

    > Moreover, if rounding errors are cumulative, they can be much greater than the precision of the floating point value. I do not know how this constant was chosen and what the use-cases were.

I vaguely remember I've been wondering about this also (back at the time).

Experiential wisdom would tell us to take such  fuzz values as
*relative* to the magnitude of the values they are added to,
here 'nppm' (which is always >= 0, hence no need for  abs(.) as would usually be required).

So, instead of

    j <- floor(nppm + fuzz)
    h <- nppm - j
    if(any(sml <- abs(h) < fuzz, na.rm = TRUE)) h[sml] <- 0

it would be (something like)

    j <- floor(nppm*(1 + fuzz))
    h <- nppm - j
    if(any(sml <- abs(h) < fuzz*nppm, na.rm = TRUE)) h[sml] <- 0

or rather we would define fuzz as
   nppm * (k * .Machine$double.eps) 
for a small k.
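
A minimal sketch comparing the two variants above. The helper names
split_abs() and split_rel() and the choice k = 4 are assumptions made
for illustration only; neither is part of R. The test values are
constructed to sit one representable step below an integer, mimicking a
rounding error in nppm:

```r
eps <- .Machine$double.eps

# Hypothetical helper: absolute fuzz, as in the first snippet above.
split_abs <- function(nppm, fuzz = 4 * eps) {
  j <- floor(nppm + fuzz)
  h <- nppm - j
  h[which(abs(h) < fuzz)] <- 0      # which() drops NAs
  list(j = j, h = h)
}

# Hypothetical helper: fuzz relative to nppm, with an assumed k = 4.
split_rel <- function(nppm, k = 4) {
  fuzz <- k * eps
  j <- floor(nppm * (1 + fuzz))
  h <- nppm - j
  h[which(abs(h) < fuzz * nppm)] <- 0
  list(j = j, h = h)
}

# Largest doubles below 4 and below 32 (exactly representable).
nppm <- c(4 - 2 * eps, 32 - 16 * eps)

res_abs <- split_abs(nppm)
res_rel <- split_rel(nppm)

print(res_abs$j)   # 4 31 : the absolute fuzz no longer helps near 32
print(res_rel$j)   # 4 32
print(res_rel$h)   # 0 0
```

For the value just below 4 both variants agree; for the value just
below 32 only the relative variant snaps to the integer.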

- - -

OTOH,  type=7 is the default, and I guess used in 99.9% of
all uses of quantile, *and* never uses any fuzz ....
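
For reference, a sketch of the usual type 7 definition (linear
interpolation at index 1 + (n - 1) * prob), written without any fuzz.
q7_sketch() is a hypothetical helper for illustration, not the source
of quantile.default(), and the data are made up:

```r
# Hypothetical helper: type 7 quantiles by plain linear interpolation, no fuzz.
q7_sketch <- function(x, probs) {
  x <- sort(x)
  n <- length(x)
  index <- 1 + (n - 1) * probs
  lo <- floor(index)
  hi <- ceiling(index)
  h  <- index - lo                  # interpolation weight in [0, 1)
  (1 - h) * x[lo] + h * x[hi]
}

x <- c(1, 3, 4, 8, 10)              # made-up data
p <- c(0.1, 0.5, 0.9)

agree <- all.equal(q7_sketch(x, p), unname(quantile(x, p, type = 7)))
print(agree)
```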

Martin

    > --
    > Sincerely
    > Andre GILLIBERT

