[R] single-pass algorithm for quantile calculation
Roger Koenker
roger at ysidro.econ.uiuc.edu
Tue Apr 3 22:08:00 CEST 2001
As it happens, Gib Bassett recently mentioned to me a rather extreme version of
a procedure like this. Consider the following estimator of the median:
Given iid observations arranged as a matrix {X_ij: i=1,...r, j=1,...,c}
with rc=n, let
betahat_n = min_j {max_i {X_ij}}.
If you let r = r_n = [log_2 n] you can show that that betahat is a median
unbiased estimator of the median and you need only make one pass through
the data, with two storage locations (one for the current column max
and the other for the current row min and you need only n + [log_2 n]
comparisons. Of course from a statistical standpoint the procedure
is awful, converging in distribution at rate 1/log_2 n. A somewhat
less extreme version of this was suggested in Pearl, J. (1981) A
space efficient on-line method of computing quantile estimates, J. of
Algs, 2, 164-177.
url: http://www.econ.uiuc.edu Roger Koenker
email roger at ysidro.econ.uiuc.edu Department of Economics
vox: 217-333-4558 University of Illinois
fax: 217-244-6678 Champaign, IL 61820
On Tue, 3 Apr 2001, Vadim Ogranovich wrote:
> Dear R users, I am looking for a reference to an algorithm for estimation of
> sample quantiles which does not require bringing the whole data into memory
> (more precisely its memory complexity should be much less than linear,
> ideally constant). I realize that such an algorithm can only be approximate
> and actually quite wrong for some samples, but that's fine with me.
>
> Thank you,
> Vadim
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
More information about the R-help
mailing list