[R] get the percentage rank of a value based on an empirical data vector

Wed Jan 11 16:07:43 CET 2012

If performance is an issue, I think mean(x < y) will be about as quick as this can be done in R alone. (If needed, you could do it in C in a single pass, which might be a good first exercise in using compiled code.)
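
As a rough sketch of how the three approaches in this thread line up (the seed and the variable names p_strict, p_ecdf and p_find are my own choices for illustration, not part of the original question):

```r
set.seed(42)                  # arbitrary seed, only so the draw is reproducible
x <- rnorm(1000, 5, 2)        # the 'population' vector from the question
y <- 6.2                      # the single value to rank

p_strict <- mean(x < y)       # proportion of x strictly below y
p_ecdf   <- ecdf(x)(y)        # proportion of x less than or equal to y
p_find   <- findInterval(y, sort(x)) / length(x)  # count of sorted x <= y, over n

100 * p_strict                # the same proportion expressed as a percent rank
```

Note that mean(x < y) counts strictly smaller values, while ecdf and findInterval count values less than or equal to y. For continuous data such as this the results should agree; if x can contain values tied with y, mean(x <= y) is the form that matches ecdf.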

Michael

On Jan 11, 2012, at 8:58 AM, David Winsemius <dwinsemius at comcast.net> wrote:

> 
> On Jan 11, 2012, at 8:12 AM, Martin Batholdy wrote:
> 
>> Hi,
>> 
>> I have a vector with values:
>> 
>> x <- rnorm(1000, 5, 2)
>> 
>> 
>> and one single value:
>> y <- 6.2
>> 
>> Now I would like to know the percent rank of y based on the 'population' vector x.
>> Is there a convenient function that calculates the percent rank of y for the given vector x?
> 
> Two options :
> 1) sort x and use findInterval, divide the index by length(x) and multiply by 100
> (It can all be done as a one-liner.)
> 
> 2) I generally "reach for" the `ecdf` "function making machine" when I see sample quantile problems, and check whether the problem can be cast in terms to which it applies.
> 
> For my random draw I get:
> > findInterval(6.2, sort(x))
> [1] 704
> > xecdf <- ecdf(x)
> > xecdf(6.2)
> [1] 0.704
> 
> -- 
> David Winsemius, MD
> West Hartford, CT