[Rd] Quantile issue
maechler at stat.math.ethz.ch
Thu May 1 22:55:01 CEST 2014
[this was meant as "reply to all" to R-devel :]
>>>>> Therneau, Terry M , Ph D <therneau at mayo.edu>
>>>>> on Wed, 30 Apr 2014 12:00:33 -0500 writes:
> This is likely yet another instance of round off error,
> but it caught me by surprise. tmt% R --vanilla (headers
> skipped, version 3.0.2 on Linux)
>> load('qtest.rda') length(temp)
>  3622
>> max(temp) >= quantile(temp, .98)
> 98% FALSE
> I can send the file to anyone who would like to understand
> this more deeply. The top 3% of the vector is a single
> repeated value.
Hmm... this may be the same as PR#15746 :
on which I commented that it was an example of FAQ 7.31 :
> Martin Maechler 2014-04-10 07:10:41 UTC
> What really happens is simply
> > x0 <- 0.000384478262997113
> > h <- (1:9)/10
> > diff(range((1-h)*x0 + h*x0))
>  1.084202e-19
> and that of course *is* just R FAQ 7.31
> I'm not sure this is something we want to amend; we *could*
> min(x) <= quantile(x, *) <= max(x)
> with a bit of extra code that makes *all* calls of quantile
> slower than
> now... just for the sake of this one very very rare exception ?
> Alternatively we just document it ..
> Maybe this needs public discussion on R-devel
and so I did bring it to R-devel.
If look at the fuller report, you also see that I claim that
this seems to happen only for very rare numbers...,
I had a loop running, searching for such numbers as the above x0
and it did not find one in many hours...,
OTOH, you now found a second example (of the same phenomenon I think).
So, yes, please, send me the qtest.rda...
or to all of us a dput(.) of the top 5% of the data which may
suffice for a reproducible example.
Also, it seems that for these rare examples we may need to rewrite
the code of quantile.default() ... making it less nice, less
transparent, not something I think we should commit lightly.
> Terry Therneau
> R-devel at r-project.org mailing list
More information about the R-devel