[R] Median of streaming data
Rolf Turner
r.turner at auckland.ac.nz
Thu Sep 25 01:44:38 CEST 2014
On 24/09/14 20:16, Martin Maechler wrote:
<SNIP>
> 1) has your proposal ever been provided in R?
> I'd be happy to add it to the robustX
> (http://cran.ch.r-project.org/web/packages/robustX) or even
> robustbase (http://cran.ch.r-project.org/web/packages/robustbase) package.
<SNIP>
I have coded up the algorithm from the Cameron and Turner paper. Dunno
if it gives exactly the same results as my (Splus?) code from lo these
many years ago (the code that is lost in the mists of time), but it
*seems* to work.
It is not designed to work with actual "streaming" data --- I don't know
how to do that. It takes a complete data vector as input. Someone who
knows about streaming data should be able to adapt it pretty easily.
Said he, the proverbial optimist.
The function code and a help file are attached. These files have had
their names changed to end in ".txt" so that they will get through the
mailing list processor without being stripped. With a bit of luck.
If they *don't* get through, anyone who is interested should contact me
and I will send them to you "privately".
cheers,
Rolf
--
Rolf Turner
Technical Editor ANZJS
-------------- next part --------------
rlas <- function(y,b=0.2,mfn=function(n){0.1*n^(-0.25)},
nstart=30,scon=NULL) {
# Initialize:
y0 <- y[1:nstart]
alpha <- median(y0)
s <- if(is.null(scon)) mean(abs(y0-alpha)) else scon
mu <- mfn(nstart)/s
eps <- s/nstart^b
kount <- sum(abs(alpha-y0) < eps)
g <- kount/(eps*nstart)
ny <- length(y)
n <- nstart+1
locn <- numeric(ny)
locn[1:(nstart-1)] <- NA
locn[nstart] <- alpha
scale <- numeric(ny)
scale[1:(nstart-1)] <- NA
scale[nstart] <- s
# Calculate recursively:
while(n <= ny) {
s <- if(is.null(scon)) ((n-1)*s + abs(y[n]-alpha))/n else scon
mu <- mfn(n)/s
eye <- if(abs(alpha - y[n]) < s/n^b) 1 else 0
g <- (1-1/n)*g + n^(b-1)*eye/s
a <- max(mu,g)
alpha <- alpha + sign(y[n]-alpha)/(a*n)
locn[n] <- alpha
scale[n] <- s
n <- n+1
}
list(locn=locn,scale=scale)
}
-------------- next part --------------
\name{rlas}
\alias{rlas}
\title{
Recursive location and scale.
}
\description{
Calculate a recursive estimate of location, asymptotically
equivalent to the median, and a recursive estimate of scale
equal to the \bold{MEAN} absolute deviation.
}
\usage{
rlas(y, b = 0.2, mfn = function(n){0.1 * n^(-0.25)},
nstart = 30, scon = NULL)
}
\arguments{
\item{y}{
A numeric vector of i.i.d. data whose location and scale
parameters are to be estimated.
}
\item{b}{
A tuning parameter (default value equal to that used by
Holst, 1987).
}
\item{mfn}{
Function of the index of the data which must be positive and
and tend to 0 as the index tends to infinity. The default
function is that used by Holst, 1987.
}
\item{nstart}{
Starting values for the algorithm are formed from the first
\code{nstart} values of \code{y}.
}
\item{scon}{
A constant value for the scale parameter \code{s}. If this
is provided then the algorithm becomes equivalent to the
algorithm of Holst, 1987.
}
}
\value{
A list with entries
\item{locn}{The successive recursive estimates of location. The
first \code{nstart - 1} of these are \code{NA}.}
\item{scale}{The successive recursive estimates of scale. If
\code{scon} is specified these are all equal to \code{scon}.}
}
\references{
Cameron, Murray A. and Turner, T. Rolf (1993). Recursive location and
scale estimators. \emph{Commun. Statist. --- Theory Meth.} \bold{22}
(9) 2503--2515.
Holst, U. (1987). Recursive estimators of location.
\emph{Commun. Statist. --- Theory Meth.} \bold{16} (8) 2201--2226.
}
\author{
\email{r.turner at auckland.ac.nz}
\url{http://www.stat.auckland.ac.nz/~rolf}
}
\examples{
set.seed(42)
y <- rnorm(10000)
z1 <- rlas(y)
z2 <- rlas(y,scon=1)
z3 <- rlas(y,scon=100)
OP <- par(mfrow=c(3,1))
plot(z1$locn,type="l")
abline(h=median(y),col="red")
plot(z2$locn,type="l")
abline(h=median(y),col="red")
plot(z3$locn,type="l")
abline(h=median(y),col="red")
par(OP)
}
}
\keyword{ univar }
\keyword{ robust }
More information about the R-help
mailing list