[R] quantile() depends on order of probs?
Joshua Wiley
jwiley.psych at gmail.com
Sat Jun 19 15:23:54 CEST 2010
Dear Christos,
Thank you, the code implemented in 2.10.1 is actually slightly
different; the qs variable in the call to quantile is determined by a
series of ifelse() statements:
qs <- ifelse(h == 0, x[j + 2],
             ifelse(h == 1, x[j + 3], (1 - h) * x[j + 2] + h * x[j + 3]))
so if h is neither 0 nor 1, qs is (1 - h) * x[j + 2] + h * x[j + 3].
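A minimal sketch of why the nested ifelse() form cannot depend on the order of probs: each element of qs is computed only from its own h and j. The padded vector xp and the helper name qs_ifelse are assumptions for illustration; the padding is meant to mirror what quantile.default does to the sorted data, and a = b = 0 as for type 6.

```r
# Sorted data padded with two copies of each endpoint (assumed to mirror
# what quantile.default does before indexing with j + 2 and j + 3).
x  <- c(54, 72, 83, 112)
n  <- length(x)
xp <- c(x[1], x[1], x, x[n], x[n])

# Hypothetical helper: type 6 (a = b = 0) using the nested ifelse() form.
qs_ifelse <- function(probs) {
  nppm <- probs * (n + 1)
  j <- floor(nppm + 4 * .Machine$double.eps)
  h <- nppm - j
  ifelse(h == 0, xp[j + 2],
         ifelse(h == 1, xp[j + 3], (1 - h) * xp[j + 2] + h * xp[j + 3]))
}

p1 <- c(0, .25, .5, .75, 1)
p2 <- c(.25, .5, .75, 1, 0)
qs_ifelse(p1)        # 54.00 58.50 77.50 104.75 112.00
sort(qs_ifelse(p2))  # same values, whatever the order of probs
```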
It was a particularly small example, so I also looked at a sample of
rnorm(500). The differences are less pronounced in the larger sample,
but still present (again in 2.11.1).
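To make the order dependence concrete, here is a rough sketch that re-implements just the type 6 branch quoted below, once with the scalar && test and once with the elementwise & test. The function name type6_sketch and its elementwise argument are hypothetical. The && case is emulated with h[1], on the assumption that 2.11.1 silently used only the first element of each operand; newer versions of R may reject a length > 1 operand to && outright.

```r
x  <- c(54, 72, 83, 112)
n  <- length(x)
xp <- c(x[1], x[1], x, x[n], x[n])  # padded sorted data, as in the sketch of quantile.default

# Hypothetical re-implementation of the type 6 branch (a = b = 0).
type6_sketch <- function(probs, elementwise = TRUE) {
  fuzz <- 4 * .Machine$double.eps
  nppm <- probs * (n + 1)
  j <- floor(nppm + fuzz)
  h <- nppm - j
  qs <- xp[j + 2L]
  qs[h == 1] <- xp[j + 3L][h == 1]
  other <- if (elementwise) {
    (h > 0) & (h < 1)          # proposed fix: one logical per probability
  } else {
    (h[1] > 0) && (h[1] < 1)   # emulates && looking only at the first element
  }
  if (any(other))
    qs[other] <- ((1 - h) * xp[j + 2L] + h * xp[j + 3L])[other]
  qs
}

p1 <- c(0, .25, .5, .75, 1)
p2 <- c(.25, .5, .75, 1, 0)

type6_sketch(p1, elementwise = FALSE)  # 54 54 72 83 112: no interpolation at all
type6_sketch(p2, elementwise = FALSE)  # 58.50 77.50 104.75 112.00 54.00
type6_sketch(p1)                       # 54.00 58.50 77.50 104.75 112.00
```

With 0 first, h[1] is 0, so the interpolation step is skipped for every probability; with .25 first, it is applied to every probability, which happens to give the right answer because h == 0 reduces the formula to x[j + 2].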
Josh
On Sat, Jun 19, 2010 at 4:35 AM, Christos Argyropoulos
<argchris at hotmail.com> wrote:
>
>
> Hi,
> It seems to me that the results are actually the same but they are not returned in the same order (R 2.10.1 in Windows Vista). If you call sort on the output the results will be the same:
>> sort(quantile(c(54, 72, 83, 112), type=6, probs=c(0, .25, .5, .75, 1)))
> 0% 25% 50% 75% 100%
> 54.00 58.50 77.50 104.75 112.00
>> sort(quantile(c(54, 72, 83, 112), type=6, probs=c(.25, .5, .75, 1, 0)))
> 0% 25% 50% 75% 100%
> 54.00 58.50 77.50 104.75 112.00
>
> With such a small sample, the actual quantile values may critically depend on the interpolatory algorithm used in their calculation, so exercise caution:
>
>> sort(quantile(c(54, 72, 83, 112), type=7, probs=c(0, .25, .5, .75, 1)))
> 0% 25% 50% 75% 100%
> 54.00 67.50 77.50 90.25 112.00
>> sort(quantile(c(54, 72, 83, 112), type=7, probs=c(.25, .5, .75, 1, 0)))
> 0% 25% 50% 75% 100%
> 54.00 67.50 77.50 90.25 112.00
>
> Christos Argyropoulos
>
>
> ----------------------------------------
>> Date: Fri, 18 Jun 2010 21:02:41 -0700
>> From: jwiley.psych at gmail.com
>> To: r-help at r-project.org
>> Subject: [R] quantile() depends on order of probs?
>>
>> Hello All,
>>
>> I am trying to figure out the rationale behind why quantile() returns
>> different values for the same probabilities depending on whether 0 is
>> first.
>>
>> Here is an example:
>>
>> quantile(c(54, 72, 83, 112), type=6, probs=c(0, .25, .5, .75, 1))
>> quantile(c(54, 72, 83, 112), type=6, probs=c(.25, .5, .75, 1, 0))
>>
>> It seems to come down to this part of the code for quantile:
>>
>> fuzz <- 4 * .Machine$double.eps
>> nppm <- a + probs * (n + 1 - a - b)
>> j <- floor(nppm + fuzz)
>> h <- nppm - j
>> qs <- x[j + 2L]
>> qs[h == 1] <- x[j + 3L][h == 1]
>> other <- (h > 0) && (h < 1)
>> if (any(other))
>> qs[other] <- ((1 - h) * x[j + 2L] + h * x[j + 3L])[other]
>>
>> In my example, a and b are both 0, and n = 4. In particular, the
>> alternate formula for qs is only used when the first element of h is
>> both > 0 and < 1. Any ideas on this? It seems like a simple
>> alternative would be
>>
>> other <- (h > 0) & (h < 1)
>>
>> but I do not know if that would cause problems for other quantile
>> formulae. By the way, this comes around lines 39-70 in
>> quantile.default in:
>>
>>> version
>> _
>> platform x86_64-pc-mingw32
>> arch x86_64
>> os mingw32
>> system x86_64, mingw32
>> status
>> major 2
>> minor 11.1
>> year 2010
>> month 05
>> day 31
>> svn rev 52157
>> language R
>> version.string R version 2.11.1 (2010-05-31)
>>
>>
>> Best regards,
>>
>> Josh
>>
>> --
>> Joshua Wiley
>> Ph.D. Student
>> Health Psychology
>> University of California, Los Angeles
>>
--
Joshua Wiley
Ph.D. Student
Health Psychology
University of California, Los Angeles