[R] quantile function
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Fri Feb 6 18:16:44 CET 2004
On Fri, 6 Feb 2004 09:30:31 -0600 (CST)
Giovanni Petris <GPetris at uark.edu> wrote:
>
> I am trying to `cut' a continuous variable into contiguous classes
> containing approximately an equal number of observations. I thought
> quantile() was the appropriate function to use in order to find the
> breakpoints, but I end up with classes of different sizes - see
> example below. Does anybody have an explanation for that? And what is
> the `recommended' way of computing what I am looking for?
>
> Example:
>
> > ca$age
> [1] 28 42 46 45 34 44 48 45 38 45 49 45 41 46 49 46 44 48 52 48 45 50 53 57 46
> [26] 52 54 57 47 52 55 59 50 54 57 60 51 55 46 63 51 59 48 35 53 59 57 37 55 32
> [51] 60 43 59 37 30 47 60 38 34 48 32 38 36 49 33 42 38 58 35 43 39 59 39 43 42
> [76] 60 40 44
> > table(cut(ca$age,breaks=c(-Inf,quantile(ca$age,
> > seq(0,1,length=11)[-1]))))
>
> (-Inf,35] (35,38.4] (38.4,43]   (43,45] (45,46.5] (46.5,49]   (49,52]   (52,55]
>         9         7        10         8         5        10         7         7
>   (55,59]   (59,63]
>        10         5
>
> Thanks in advance,
> Giovanni
>
> --
>
The cut2 function in the Hmisc package tries to do this the best it can.
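
A rough sketch of how this might look with the data from the example (not code from the original exchange). It assumes Hmisc is installed and uses cut2's g argument for the number of quantile groups; the variable names are only illustrative. Two things make exactly equal classes impossible here: the ages are integers with many ties, and tied values must all land in the same class, and 78 observations cannot be split evenly into 10 groups.

```r
# Ages copied from the quoted example (78 observations).
age <- c(28, 42, 46, 45, 34, 44, 48, 45, 38, 45, 49, 45, 41, 46, 49, 46, 44,
         48, 52, 48, 45, 50, 53, 57, 46, 52, 54, 57, 47, 52, 55, 59, 50, 54,
         57, 60, 51, 55, 46, 63, 51, 59, 48, 35, 53, 59, 57, 37, 55, 32, 60,
         43, 59, 37, 30, 47, 60, 38, 34, 48, 32, 38, 36, 49, 33, 42, 38, 58,
         35, 43, 39, 59, 39, 43, 42, 60, 40, 44)

# Decile breakpoints, as in the original call.
brk <- quantile(age, seq(0, 1, length = 11)[-1])
sizes <- table(cut(age, breaks = c(-Inf, brk)))
print(sizes)

# Number of observations sharing each distinct age: heavily tied values
# cannot be divided between two neighbouring classes.
print(table(age))

# cut2 from Hmisc, asking for 10 quantile groups (skipped if Hmisc is absent).
if (requireNamespace("Hmisc", quietly = TRUE)) {
  print(table(Hmisc::cut2(age, g = 10)))
}
```

With ties like these, cut2 can still return groups of somewhat different sizes.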
Frank
---
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University