[R] quantile from quantile table calculation without original data
David Winsemius
dw|n@em|u@ @end|ng |rom comc@@t@net
Fri Mar 5 19:23:40 CET 2021
On 3/5/21 1:14 AM, PIKAL Petr wrote:
> Dear all
>
> I have table of quantiles, probably from lognormal distribution
>
> dput(temp)
> temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
> 0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
> 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
> ), row.names = c(NA, -9L), class = "data.frame")
>
> and I need to calculate quantile for size 0.1
>
> plot(temp$size, temp$percent, pch=19, xlim=c(0,2))
> ss <- approxfun(temp$size, temp$percent)
> points((0:100)/50, ss((0:100)/50))
> abline(v=.1)
>
> If I had original data it would be quite easy with ecdf/quantile function but without it I am lost what function I could use for such task.
The quantiles are in reverse order (size falls as percent rises), so trying
to match the data to quantiles from candidate parameters requires
subtracting the percent values from unity:
> temp$size
[1] 1.6000 0.9466 0.8062 0.6477 0.5069 0.3781 0.3047 0.2681 0.1907
> qlnorm(1-temp$percent, -.5)
[1] 6.21116124 3.14198142 2.18485959 1.19063854 0.60653066 0.30897659
0.16837670 0.11708517 0.05922877
> qlnorm(1-temp$percent, -.9)
[1] 4.16346589 2.10613313 1.46455518 0.79810888 0.40656966 0.20711321
0.11286628 0.07848454 0.03970223
> qlnorm(1-temp$percent, -2)
[1] 1.38589740 0.70107082 0.48750807 0.26566737 0.13533528 0.06894200
0.03756992 0.02612523 0.01321572
> qlnorm(1-temp$percent, -1.6)
[1] 2.06751597 1.04587476 0.72727658 0.39632914 0.20189652 0.10284937
0.05604773 0.03897427 0.01971554
> qlnorm(1-temp$percent, -1.6, .5)
[1] 0.64608380 0.45951983 0.38319004 0.28287360 0.20189652 0.14410042
0.10637595 0.08870608 0.06309120
> qlnorm(1-temp$percent, -1, .5)
[1] 1.1772414 0.8372997 0.6982178 0.5154293 0.3678794 0.2625681
0.1938296 0.1616330 0.1149597
> qlnorm(1-temp$percent, -1, .4)
[1] 0.9328967 0.7103066 0.6142340 0.4818106 0.3678794 0.2808889
0.2203318 0.1905308 0.1450700
> qlnorm(1-temp$percent, -0.5, .4)
[1] 1.5380866 1.1710976 1.0127006 0.7943715 0.6065307 0.4631076
0.3632657 0.3141322 0.2391799
> qlnorm(1-temp$percent, -0.55, .4)
[1] 1.4630732 1.1139825 0.9633106 0.7556295 0.5769498 0.4405216
0.3455491 0.2988118 0.2275150
> qlnorm(1-temp$percent, -0.55, .35)
[1] 1.3024170 1.0260318 0.9035201 0.7305712 0.5769498 0.4556313
0.3684158 0.3244257 0.2555795
> qlnorm(1-temp$percent, -0.55, .45)
[1] 1.6435467 1.2094723 1.0270578 0.7815473 0.5769498 0.4259129
0.3241016 0.2752201 0.2025322
> qlnorm(1-temp$percent, -0.53, .45)
[1] 1.6767486 1.2339052 1.0478057 0.7973356 0.5886050 0.4345169
0.3306489 0.2807799 0.2066236
> qlnorm(1-temp$percent, -0.57, .45)
[1] 1.6110023 1.1855231 1.0067207 0.7660716 0.5655254 0.4174793
0.3176840 0.2697704 0.1985218
That seems like it might be an acceptable fit, modulo the underlying
data-gathering situation, which really should be considered.
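To put a number on "acceptable", one could tabulate the last candidate
against the table. This is only a small sketch; the column names fitted
and rel_err are my own labels, not anything from the original post:

```r
# Quantile table from the original post
size    <- c(1.6, 0.9466, 0.8062, 0.6477, 0.5069, 0.3781, 0.3047, 0.2681, 0.1907)
percent <- c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)

# Last candidate tried above: meanlog = -0.57, sdlog = 0.45
fitted <- qlnorm(1 - percent, meanlog = -0.57, sdlog = 0.45)

# Side-by-side comparison with the relative error of each fitted quantile
cmp <- data.frame(percent, size,
                  fitted  = round(fitted, 4),
                  rel_err = round(fitted / size - 1, 3))
print(cmp)
```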
You can fiddle with that result. My statistical hat (not of PhD-level
certification) says that the middle quantiles in this sequence probably
have the lowest sampling error for a lognormal, but I'm rather unsure
about that. A counter-argument might be that since there is a hard lower
bound of 0 for the 0-th quantile, you should be more worried about
matching the 0.1907 value to the 0.01 quantile, since 99% of the
data is known to be above it. Matching the 0.50 quantile to 0.5069 to
estimate the logmean parameter, and matching the 0.01 quantile to 0.1907
to estimate the log-scale standard deviation (sdlog), might be
preferred to worrying too much about the 1.6 value, which is in the
right tail (and far away from your region of extrapolation).
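That two-point matching can be done in closed form rather than by trial
and error. This is a sketch under the assumption that the data really are
lognormal and that percent is the fraction of the data above each size:
the median gives meanlog directly, and the 0.01 quantile then fixes sdlog.

```r
# Quantile table from the original post
size    <- c(1.6, 0.9466, 0.8062, 0.6477, 0.5069, 0.3781, 0.3047, 0.2681, 0.1907)
percent <- c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)

# Median of a lognormal is exp(meanlog), so meanlog = log(median)
meanlog <- log(size[percent == 0.5])

# percent = 0.99 means 99% lies above 0.1907, i.e. it is the 0.01 quantile:
# log(q) = meanlog + sdlog * qnorm(0.01), solved for sdlog
sdlog <- (log(size[percent == 0.99]) - meanlog) / qnorm(0.01)

c(meanlog = meanlog, sdlog = sdlog)
```

This gives roughly -0.68 and 0.42, which lines up with where the trial
and error ends up.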
Further trial and error:
> qlnorm(1-temp$percent, -0.58, .47)
[1] 1.6709353 1.2129813 1.0225804 0.7687497 0.5598984 0.4077870
0.3065638 0.2584427 0.1876112
> qlnorm(1-temp$percent, -0.65, .47)
[1] 1.5579697 1.1309763 0.9534476 0.7167775 0.5220458 0.3802181
0.2858382 0.2409704 0.1749275
> qlnorm(1-temp$percent, -0.65, .5)
[1] 1.6705851 1.1881849 0.9908182 0.7314290 0.5220458 0.3726018
0.2750573 0.2293682 0.1631355
> qlnorm(1-temp$percent, -0.65, .4)
[1] 1.3238434 1.0079731 0.8716395 0.6837218 0.5220458 0.3986004
0.3126657 0.2703761 0.2058641
> qlnorm(1-temp$percent, -0.68, .4)
[1] 1.2847179 0.9781830 0.8458786 0.6635148 0.5066170 0.3868200
0.3034251 0.2623852 0.1997799
> qlnorm(1-temp$percent, -0.65, .39)
[1] 1.2934016 0.9915290 0.8605402 0.6791257 0.5220458 0.4012980
0.3166985 0.2748601 0.2107093
>
> qlnorm(1-temp$percent, -0.65, .42)
[1] 1.3868932 1.0416839 0.8942693 0.6930076 0.5220458 0.3932595
0.3047536 0.2616262 0.1965053
> qlnorm(1-temp$percent, -0.68, .42)
[1] 1.3459043 1.0108975 0.8678396 0.6725261 0.5066170 0.3816369
0.2957468 0.2538940 0.1906976
(I did make an effort at searching for quantile matching as a method for
distribution fitting, but came up empty.)
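One way to automate the trial and error, offered as a sketch rather than
an established recipe: under a lognormal, log(size) is linear in
qnorm(1 - percent), with intercept meanlog and slope sdlog, so an
ordinary least-squares fit with lm() over all nine points gives candidate
parameters. Weighting every quantile equally is my assumption here and
ignores the sampling-error question raised above.

```r
# Quantile table from the original post
size    <- c(1.6, 0.9466, 0.8062, 0.6477, 0.5069, 0.3781, 0.3047, 0.2681, 0.1907)
percent <- c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)

# Standard-normal scores for the flipped probabilities
z <- qnorm(1 - percent)

# log(size) = meanlog + sdlog * z, fitted by least squares
fit     <- lm(log(size) ~ z)
meanlog <- unname(coef(fit)[1])
sdlog   <- unname(coef(fit)[2])
c(meanlog = meanlog, sdlog = sdlog)

# "percent" for size 0.1 in the table's convention (fraction above that size)
p_at_0.1 <- plnorm(0.1, meanlog, sdlog, lower.tail = FALSE)
p_at_0.1
```

Since 0.1 lies below the smallest tabulated size, the last value is an
extrapolation and depends entirely on the lognormal assumption holding in
the lower tail.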
--
David.
>
> Please, give me some hint where to look.
>
>
> Best regards
>
> Petr