quantile() bug (PR#1852)

frohne@mtaonline.net frohne@mtaonline.net
Tue, 30 Jul 2002 10:17:37 +0200 (MET DST)


              
platform i386-pc-mingw32
arch     i386           
os       mingw32        
system   i386, mingw32  
status                  
major    1              
minor    5.0            
year     2002           
month    04             
day      29             
language R

The function quantile, in base, sometimes gives incorrect results
and is unnecessarily complicated.  For example,

> x <- c(-Inf, -Inf, Inf, Inf)
> quantile(x)
  0%  25%  50%  75% 100% 
-Inf -Inf -Inf  Inf  Inf

The correct result is
 0%  25%  50%  75%  100%
-Inf -Inf    NaN    Inf     Inf

Here the calculation at probs = 0.5 is
    50% quantile = x[2] + 0.5 * (x[3] - x[2])
                       = -Inf + 0.5 * (Inf - (-Inf))
                       = -Inf + Inf
                       = NaN

The responsible code is:

n <- length(x)
np <- length(probs)
if(n > 0 && np > 0) {
index <- 1 + (n - 1) * probs
lo <- floor(index)
hi <- ceiling(index)
x <- sort(x, partial = unique(c(lo, hi)))
i <- index > lo
qs <- x[lo]
    i <- seq(along=i)[i & !is.na(i)][qs[i] > -Inf]
    .minus <- function(x,y) ifelse(x == y, 0, x - y)# ok for Inf - Inf
    qs[i] <- qs[i] + .minus(x[hi[i]], x[lo[i]]) * (index[i] - lo[i])
}

I suggest:

n <- length(x)
np <- length(probs)
If(n > 0 && np > 0){
    index <- 1 + (n - 1) * probs
    lo <- pmin(floor(index), n - 1)
    hi <- lo + 1
    x <- sort(x, partial = unique(c(lo, hi)))
    qs <- x[lo]
    dif <- ifelse(x[hi] != qs, x[hi] - qs, 0) # Avoid Inf-Inf NaN.
    qs <- ifelse((fac <- location - k) != 0, qs + fac * dif, qs)
}
              



-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._