[R] standard error for quantile
(Ted Harding)
Ted.Harding at wlandres.net
Wed Oct 31 19:10:20 CET 2012
[see in-line below]
On 31-Oct-2012 10:26:14 PIKAL Petr wrote:
> Hi Ted
>
>> -----Original Message-----
>> From: ted at deb [mailto:ted at deb] On Behalf Of Ted Harding
>> Sent: Tuesday, October 30, 2012 6:41 PM
>> To: r-help at r-project.org
>
> <snip>
>
>>
>> The general asymptotic result for the pth quantile (0<p<1) X.p of a
>> sample of size n is that it is asymptotically Normally distributed with
>> mean the pth quantile Q.p of the parent distribution and
>>
>> var(X.p) = p*(1-p)/(n*f(Q.p)^2)
>>
>> where f(x) is the probability density function of the parent
>> distribution.
>
> So if I understand correctly p*(1-p) is biggest when p=0.5 and decreases with
> smaller or bigger p. The var(X.p) then depends on ratio to parent
> distribution at this p probability. For lognorm distribution and 200 values
> the resulting var is
>
>> (0.5*(1-.5))/(200*qlnorm(.5, log(200), log(2))^2)
> [1] 3.125e-08
>> (0.1*(1-.1))/(200*qlnorm(.1, log(200), log(2))^2)
> [1] 6.648497e-08
>
> so 0.1 var is slightly bigger than 0.5 var. For different distributions this
> can be reversed as Jim pointed out.
>
> Did I manage to understand?
>
> Thank you very much.
> Regards
> Petr
Yes, it looks as though you understand! As a further illustration,
here is some R code applied to examples where the parent distrbution
is uniform or Normal. For each case, the reraults are stated as
first: simulated; second: by the formula. It can be seen that for
n=200 the formula and the simulations are close.
Ted.
###################################################################
## Test of formula for var(quantile)
varQ <- function(p,n,f.p) {
p*(1-p)/(n*(f.p^2))
}
## Test 1: Uniform (0,1), n = 200
n <- 200
## Pick one of (a), (b), (c):
## (a)# p <- 0.50 ; q <- 100 ; f.p <- 1
## (b)# p <- 0.25 ; q <- 50 ; f.p <- 1
## (c)# p <- 0.10 ; q <- 25 ; f.p <- 1
Nsim <- 1000
Qs <- numeric(Nsim)
for( i in (1:Nsim) ){
Qs[i] <- sort(runif(n))[q]
}
var(Qs)
varQ(p,n,f.p)
## (a) 0.001239982
## 0.00125
## (b) 0.0008877879
## 0.0009375
## (c) 0.0005619348
## 0.00045
## Test 2: N(0,1), n = 200
n <- 200
## Pick one of (a), (b), (c):
## (a)# p <- 0.50 ; q <- 100 ; f.p <- dnorm(qnorm(0.50))
## (b)# p <- 0.25 ; q <- 50 ; f.p <- dnorm(qnorm(0.25))
## (c)# p <- 0.10 ; q <- 20 ; f.p <- dnorm(qnorm(0.10))
Nsim <- 1000
Qs <- numeric(Nsim)
for( i in (1:Nsim) ){
Qs[i] <- sort(rnorm(n))[q]
}
var(Qs)
varQ(p,n,f.p)
## (a) 0.007633568
## 0.007853982
## (b) 0.009370099
## 0.009283837
## (c) 0.01420517
## 0.01461055
###################################################################
>> This is not necessarily very helpful for small sample sizes (depending
>> on the parent distribution).
>>
>> However, it is possible to obtain a general result giving an exact
>> confidence interval for Q.p given the entire ordered sample, though
>> there is only a restricted set of confidence levels to which it
>> applies.
>>
>> If you'd like more detail about the above, I could write up derivations
>> and make the write-up available.
>>
>> Hoping this helps,
>> Ted.
>>
>> -------------------------------------------------
>> E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
>> Date: 30-Oct-2012 Time: 17:40:55
>> This message was sent by XFMail
>> -------------------------------------------------
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 31-Oct-2012 Time: 18:10:16
This message was sent by XFMail
More information about the R-help
mailing list