[R] quantile function

Knut M. Wittkowski kmw at mail.rockefeller.edu
Fri Feb 6 18:19:47 CET 2004


Another problem with the R function "quantile" is that its definition of 
"quantiles" may not be what you expect. Consider the following:

 > x <- matrix(c(1:4))
 > quantile(x,c(0,.25,.5,.75,1))
  0%  25%  50%  75% 100%
1.00 1.75 2.50 3.25 4.00

 > x <- matrix(c(1:6))
 > quantile(x,c(0,.25,.5,.75,1))
  0%  25%  50%  75% 100%
1.00 2.25 3.50 4.75 6.00

 > x <- matrix(c(1:8))
 > quantile(x,c(0,.25,.5,.75,1))
  0%  25%  50%  75% 100%
1.00 2.75 4.50 6.25 8.00
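
The values above are consistent with linear interpolation between order
statistics at position 1 + (n - 1) * p, which current versions of R
document as the default (type 7). A minimal sketch under that assumption;
interpQuantile is a made-up name for illustration, not an R function:

```r
# Hypothetical helper (not part of R): linear interpolation between
# order statistics at fractional position h = 1 + (n - 1) * p.
interpQuantile <- function(x, probs) {
  xs <- sort(as.vector(x))
  n  <- length(xs)
  h  <- 1 + (n - 1) * probs   # fractional position in the sorted data
  lo <- floor(h)
  hi <- pmin(lo + 1, n)       # clamp so p = 1 does not index past the end
  xs[lo] + (h - lo) * (xs[hi] - xs[lo])
}

interpQuantile(1:6, c(0, .25, .5, .75, 1))
```

For 1:6 this should give the same numbers as the quantile() call above,
because the 25% point sits at position 2.25 in the sorted data.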

With your implicit definition of quantiles (splitting the data set into 
classes of equal size), each of the four classes should contain n/4 
observations (1.5 in the six-point example), so that the quantiles 
should be

 > x <- matrix(c(1:4))
 > equalSizeClasses(x,c(0,.25,.5,.75,1))
  0%  25%  50%  75% 100%
-Inf 1.50 2.50 3.50 +Inf

 > x <- matrix(c(1:6))
 > equalSizeClasses(x,c(0,.25,.5,.75,1))
  0%  25%  50%  75% 100%
-Inf 2.00 3.50 5.00 +Inf

 > x <- matrix(c(1:8))
 > equalSizeClasses(x,c(0,.25,.5,.75,1))
  0%  25%  50%  75% 100%
-Inf 2.50 4.50 6.50 +Inf
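
Note that equalSizeClasses is not a function that ships with R. One
possible way to get the numbers in the three examples above, assuming a
current R in which quantile() accepts a "type" argument, is to use
type = 5 (the k-th sorted value is placed at p = (k - 0.5) / n) and then
open up the two outer breakpoints. This is only a sketch of one reading
of the definition:

```r
# Hypothetical implementation; the name comes from the examples above,
# the body is an assumption about what "equal size classes" should mean.
equalSizeClasses <- function(x, probs) {
  # type = 5: piecewise linear, k-th order statistic at p = (k - 0.5) / n
  q <- quantile(as.vector(x), probs, type = 5)
  q[probs <= 0] <- -Inf   # lowest class is open to the left
  q[probs >= 1] <- Inf    # highest class is open to the right
  q
}

equalSizeClasses(1:6, c(0, .25, .5, .75, 1))
```

The result can be passed to cut() as the breaks argument. When the data
contain tied values, no set of breakpoints can be guaranteed to give
classes of exactly equal size.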

Knut

At 09:30 2004-02-06 -0600, Giovanni Petris wrote:

>I am trying to `cut' a continuous variable into contiguous classes
>containing approximately an equal number of observations. I thought
>quantile() was the appropriate function to use in order to find the
>breakpoints, but I end up with classes of different sizes - see
>example below. Does anybody have an explanation for that? And what is
>the `recommended' way of computing what I am looking for?
>
>Example:
>
> > ca$age
>  [1] 28 42 46 45 34 44 48 45 38 45 49 45 41 46 49 46 44 48 52 48 45 50 53 57 46
> [26] 52 54 57 47 52 55 59 50 54 57 60 51 55 46 63 51 59 48 35 53 59 57 37 55 32
> [51] 60 43 59 37 30 47 60 38 34 48 32 38 36 49 33 42 38 58 35 43 39 59 39 43 42
> [76] 60 40 44
> > table(cut(ca$age,breaks=c(-Inf,quantile(ca$age, seq(0,1,length=11)[-1]))))
>
>(-Inf,35] (35,38.4] (38.4,43]   (43,45] (45,46.5] (46.5,49]   (49,52]   (52,55]
>        9         7        10         8         5        10         7         7
>  (55,59]   (59,63]
>       10         5
>
>Thanks in advance,
>Giovanni
>
>--
>
>  __________________________________________________
>[                                                  ]
>[ Giovanni Petris                 GPetris at uark.edu ]
>[ Department of Mathematical Sciences              ]
>[ University of Arkansas - Fayetteville, AR 72701  ]
>[ Ph: (479) 575-6324, 575-8630 (fax)               ]
>[ http://definetti.uark.edu/~gpetris/              ]
>[__________________________________________________]

Knut M. Wittkowski, PhD,DSc
------------------------------------------
The Rockefeller University, GCRC
Experimental Design and Biostatistics
1230 York Ave #121B, Box 322, NY,NY 10021
+1(212)327-7175, +1(212)327-8450 (Fax)
kmw at rockefeller.edu
http://www.rucares.org/clinicalresearch/dept/biometry/



