[R] Percentiles/Quantiles with Weighting
Stavros Macrakis
macrakis at alum.mit.edu
Tue Feb 17 23:48:49 CET 2009
Here is one kind of weighted quantile function.
The basic idea is very simple:
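wquantile <- function( v, w, p )
  {
    ord <- order(v)    # compute the ordering once, before v is overwritten
    v <- v[ord]
    w <- w[ord]        # weights must follow the same permutation as v
    v [ which.max( cumsum(w) / sum(w) >= p ) ]
  }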
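To make the idea concrete, here is a small worked sketch. The data and the names ord, vs, ws and cumfrac are made up for illustration; it is only meant to show the steps, not to be a finished function. The ordering is computed once and applied to both vectors, and the quantile is the first sorted value whose cumulative weight fraction reaches p.

```r
# Illustrative data (hypothetical): three observations, unsorted
v <- c(30, 10, 20)
w <- c(1, 1, 2)          # the weight 2 belongs to the observation 20

ord <- order(v)          # compute the ordering once: 2 3 1
vs  <- v[ord]            # 10 20 30
ws  <- w[ord]            # 1 2 1  (weights travel with their values)

cumfrac <- cumsum(ws) / sum(ws)   # 0.25 0.75 1.00

# which.max() on a logical vector returns the index of the first TRUE
median_w <- vs[which.max(cumfrac >= 0.5)]   # 20
upper_w  <- vs[which.max(cumfrac >= 0.9)]   # 30
```

Note that sorting v first and then calling order() on the already-sorted v would leave the weights in their original positions, which is why the ordering is saved in ord before either vector is touched.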
With some more error-checking and general clean-up, it looks like this:
# Simple weighted quantile
#
# v A numeric vector of observations
# w A numeric vector of non-negative weights (not all zero)
# p The probability 0<=p<=1
#
# Nothing fancy: no interpolation etc.
wquantile <- function(v,w=rep(1,length(v)),p=.5)
  {
    if (!is.numeric(v) || !is.numeric(w) || length(v) != length(w))
      stop("Values and weights must be equal-length numeric vectors")
    if ( !is.numeric(p) || any( p<0 | p>1 ) )
      stop("Quantiles must be 0<=p<=1")
    # Check every weight (not just the first) before accumulating
    if ( any(is.na(w)) || any(w<0) ) stop("Weights must be non-negative numbers")
    if ( sum(w) == 0 ) stop("Weights must not all be zero")
    ranking <- order(v)
    sumw <- cumsum(w[ranking])
    plist <- sumw/sumw[length(sumw)]
    # For each p, take the first sorted value whose cumulative weight share reaches p
    sapply(p, function(p) v [ ranking [ which.max( plist >= p ) ] ])
  }
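As a quick sanity check, here is a sketch that compares the weighted result with the "repeat each value" approach from the original question. The helper wq_sketch is a hypothetical compact stand-in for the function above (no error checking), and the data are made up. For integer weights, expanding the data with rep() and applying the same no-interpolation rule without weights should give the same values. Note that this will not in general match quantile() with its default type, which interpolates.

```r
# Compact stand-in for the function above (hypothetical name wq_sketch):
# same rule, "first value whose cumulative weight fraction reaches p".
wq_sketch <- function(v, w = rep(1, length(v)), p = 0.5) {
  ord  <- order(v)
  frac <- cumsum(w[ord]) / sum(w)
  sapply(p, function(q) v[ord[which.max(frac >= q)]])
}

# Time-ordered data with weights 1, 2, ..., n, as in the original question
x <- c(5, 3, 8, 1, 9, 2)
w <- seq_along(x)
p <- c(0.25, 0.5, 0.75)

weighted <- wq_sketch(x, w, p)

# Same rule applied to the expanded, unweighted data
expanded <- wq_sketch(rep(x, times = w), p = p)

print(weighted)
print(expanded)
```

The weighted version avoids building the expanded vector, whose length grows as n(n+1)/2 for weights 1..n, and it also accepts non-integer weights, which the rep() approach cannot represent.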
I would appreciate any comments people have on this -- whether
correctness, efficiency, style, ....
-s
On Tue, Feb 17, 2009 at 11:57 AM, Brigid Mooney <bkmooney at gmail.com> wrote:
> Hi All,
>
> I am looking at applications of percentiles to time sequenced data. I had
> just been using the quantile function to get percentiles over various
> periods, but am more interested in if there is an accepted (and/or
> R-implemented) method to apply weighting to the data so as to weigh recent
> data more heavily.
>
> I wrote the following function, but it seems quite inefficient, and not
> really very flexible in its applications - so if anyone has any suggestions
> on how to look at quantiles/percentiles within R while also using a
> weighting schema, I would be very interested.
>
> Note - this function supposes the data in X is time-sequenced, with the most
> recent (and thus heaviest weighted) data at the end of the vector
>
> WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
> {
> Xprime <- NA
>
> for(i in 1:length(X))
> {
> Xprime <- c(Xprime, rep(X[i], times=i))
> }
>
> print("Percentiles:")
> print(quantile(X, pctile))
> print("Weighted:")
> print(Xprime)
> print("Weighted Percentiles:")
> print(quantile(Xprime, pctile, na.rm=TRUE))
> }
>
> WtPercentile(1:10)
> WtPercentile(rnorm(10))