[R] standard error for quantile

Wed Oct 31 19:10:20 CET 2012

[see in-line below]

On 31-Oct-2012 10:26:14 PIKAL Petr wrote:
> Hi Ted
> 
>> -----Original Message-----
>> From: ted at deb [mailto:ted at deb] On Behalf Of Ted Harding
>> Sent: Tuesday, October 30, 2012 6:41 PM
>> To: r-help at r-project.org
> 
> <snip>
> 
>> 
>> The general asymptotic result for the pth quantile (0<p<1) X.p of a
>> sample of size n is that it is asymptotically Normally distributed with
>> mean the pth quantile Q.p of the parent distribution and
>> 
>>   var(X.p) = p*(1-p)/(n*f(Q.p)^2)
>> 
>> where f(x) is the probability density function of the parent
>> distribution.
> 
> So if I understand correctly p*(1-p) is biggest when p=0.5 and decreases with
> smaller or bigger p. The var(X.p) then depends on ratio to parent
> distribution at this p probability. For lognorm distribution and 200 values
> the resulting var is
> 
>> (0.5*(1-.5))/(200*qlnorm(.5, log(200), log(2))^2)
> [1] 3.125e-08
>> (0.1*(1-.1))/(200*qlnorm(.1, log(200), log(2))^2)
> [1] 6.648497e-08
> 
> so 0.1 var is slightly bigger than 0.5 var. For different distributions this
> can be reversed as Jim pointed out.
> 
> Did I manage to understand?
> 
> Thank you very much.
> Regards
> Petr 

Yes, it looks as though you understand! As a further illustration,
here is some R code applied to examples where the parent distrbution
is uniform or Normal. For each case, the reraults are stated as
first: simulated; second: by the formula. It can be seen that for
n=200 the formula and the simulations are close.
Ted.

###################################################################
## Test of formula for var(quantile)
varQ <- function(p,n,f.p) {
  p*(1-p)/(n*(f.p^2))
}

## Test 1: Uniform (0,1), n = 200
n <- 200
## Pick one of (a), (b), (c):
## (a)# p <- 0.50 ; q <- 100 ; f.p <- 1
## (b)# p <- 0.25 ; q <-  50 ; f.p <- 1
## (c)# p <- 0.10 ; q <-  25 ; f.p <- 1
Nsim <- 1000
Qs   <- numeric(Nsim)
for( i in (1:Nsim) ){
  Qs[i] <- sort(runif(n))[q]
}
var(Qs)
varQ(p,n,f.p)
## (a) 0.001239982
##     0.00125
## (b) 0.0008877879
##     0.0009375
## (c) 0.0005619348
##     0.00045

## Test 2: N(0,1), n = 200
n <- 200
## Pick one of (a), (b), (c):
## (a)# p <- 0.50 ; q <- 100 ; f.p <- dnorm(qnorm(0.50))
## (b)# p <- 0.25 ; q <-  50 ; f.p <- dnorm(qnorm(0.25))
## (c)# p <- 0.10 ; q <-  20 ; f.p <- dnorm(qnorm(0.10))
Nsim <- 1000
Qs   <- numeric(Nsim)
for( i in (1:Nsim) ){
  Qs[i] <- sort(rnorm(n))[q]
}
var(Qs)
varQ(p,n,f.p)
## (a) 0.007633568
##     0.007853982
## (b) 0.009370099
##     0.009283837
## (c) 0.01420517
##     0.01461055
###################################################################

>> This is not necessarily very helpful for small sample sizes (depending
>> on the parent distribution).
>> 
>> However, it is possible to obtain a general result giving an exact
>> confidence interval for Q.p given the entire ordered sample, though
>> there is only a restricted set of confidence levels to which it
>> applies.
>> 
>> If you'd like more detail about the above, I could write up derivations
>> and make the write-up available.
>> 
>> Hoping this helps,
>> Ted.
>> 
>> -------------------------------------------------
>> E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
>> Date: 30-Oct-2012  Time: 17:40:55
>> This message was sent by XFMail
>> -------------------------------------------------
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 31-Oct-2012  Time: 18:10:16
This message was sent by XFMail