[R] quantile / centile

Peter Dalgaard p.dalgaard at biostat.ku.dk
Sat Sep 27 16:21:47 CEST 2008


Donald Braman wrote:
> Thanks for the response!
> Unfortunately, I was unclear; my problem is not that I need to know what the
> percentile ranges are, but that I need to assign the appropriate percentile
> range to each of the records in my data frame. My data frame contains
> somewhere between 1000 and 9000 rows/records (depending on context), not a
> hundred rows. That is, I'd like to assign to each row the quantile bin that
> its value falls into, based on the quantile() result for my 1000-9000 row
> data frame.
>
> Thanks again for any help!
>
>
>   


You can use

cnt <- cut(x, quantile(x, seq(0,1,0.01)), include.lowest=TRUE)
levels(cnt) <- 1:100 # if you want to get rid of ugly interval labels
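
As a rough sketch of how the cut()/quantile() approach behaves, here is a
self-contained run on made-up data. The vector x and its size are
assumptions for illustration only; labels=FALSE is used so that cut()
returns integer bin codes directly instead of a factor. Note that if x has
many tied values, some quantiles coincide and cut() stops with an error
about non-unique breaks.

```r
# Hypothetical data; x stands in for a column such as my.df$my.var
set.seed(1)
x <- rnorm(5000)

# Breaks at the 0%, 1%, ..., 100% sample quantiles (101 break points)
brks <- quantile(x, probs = seq(0, 1, 0.01))

# labels = FALSE returns integer bin codes instead of a factor;
# include.lowest = TRUE keeps the minimum of x inside the first bin
cnt <- cut(x, breaks = brks, include.lowest = TRUE, labels = FALSE)

# Inspect the range of bin codes and the sizes of the first few bins
print(range(cnt))
print(head(table(cnt)))
```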

With Harrell's Hmisc package, there's also

cnt <- cut2(x, g=100)
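
A minimal sketch of the cut2() variant on made-up data follows. It assumes
the Hmisc package is installed and skips the call otherwise; the variable
grp is a name introduced here for illustration. cut2() returns a factor
with interval labels, so as.integer() is used to get plain group numbers.

```r
# Hypothetical data for illustration
set.seed(1)
x <- rnorm(5000)

grp <- NULL
if (requireNamespace("Hmisc", quietly = TRUE)) {
  cnt <- Hmisc::cut2(x, g = 100)   # factor with interval labels
  grp <- as.integer(cnt)           # integer group codes, in level order
  print(head(data.frame(x = x, interval = cnt, group = grp)))
}
```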

Or you can take a more basic approach and do

N <- sum(!is.na(x))
# multiply before dividing: e.g. 0.07*100 is slightly above 7 in floating point
cnt <- ceiling(100*rank(x)/N)
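
To show the rank-based approach end to end, here is a sketch that adds the
result as a column to a data frame. The data frame contents and the column
name centile are made up for illustration; the names my.df and my.var
follow the original question. The sketch multiplies by 100 before dividing
by N, a choice made here to avoid floating-point rounding at bin edges,
and passes na.last="keep" explicitly so that missing values stay NA.

```r
# Hypothetical data frame: 2000 observed values plus two missing ones
set.seed(1)
my.df <- data.frame(my.var = c(rnorm(2000), NA, NA))

x <- my.df$my.var
N <- sum(!is.na(x))   # number of non-missing values

# Rank each value, scale ranks to (0, 100], and round up to a bin number;
# na.last = "keep" gives NA ranks for NA inputs
my.df$centile <- ceiling(100 * rank(x, na.last = "keep") / N)

print(head(my.df))
print(range(my.df$centile, na.rm = TRUE))
```

Unlike the cut() approach, this does not fail when x contains ties; tied
values simply get the same (averaged) rank and land in the same bin.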

> On Sat, Sep 27, 2008 at 8:54 AM, Henrique Dallazuanna <wwwhsd at gmail.com> wrote:
>
>   
>> Try this:
>>
>> my.df$my.newvar <- quantile(my.df$my.var, probs = seq(0.01,1, 0.01))
>>
>>
>> On Sat, Sep 27, 2008 at 3:50 AM, Donald Braman <dbraman at law.gwu.edu>
>> wrote:
>>     
>>> I'm wondering if there is a simple way to assign a quantile to a vector in a
>>> data frame, much like one could in Stata using centile. Let's say I want 100
>>> slices in my assignation. I can easily see what the limits of each slice are
>>> by using quantile:
>>> quantile(my.df$my.var, probs=seq(0, 1, 0.01))
>>>
>>> But how do I assign the appropriate value to each row/record in my data
>>> frame? Clearly the following won't work, but what will?
>>>
>>> my.df$my.new.var <- quantile(my.df$my.var, probs=seq(0, 1, 0.01))
>>>
>>>        [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> --
>> Henrique Dallazuanna
>> Curitiba-Paraná-Brasil
>> 25° 25' 40" S 49° 16' 22" O
>>
>>     


-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907


