Running Median and Mean

Warnes, Gregory R gregory_r_warnes@groton.pfizer.com
Mon, 16 Sep 2002 12:13:22 -0400


I am willing to add these functions into the gregmisc package to complement
the already-existing 'running' function.

-Greg

> -----Original Message-----
> From: David Brahm [mailto:brahm@alum.mit.edu]
> Sent: Monday, September 16, 2002 11:10 AM
> To: r-devel@stat.math.ethz.ch
> Cc: maechler@stat.math.ethz.ch; Ray.Brownrigg@mcs.vuw.ac.nz
> Subject: Running Median and Mean
> 
> 
> R gurus,
> 
> On Aug 20, 2002, I asked in R-help about calculating a 
> running 5-day median on
> a large matrix.  Thanks to Martin Maechler 
> <maechler@stat.math.ethz.ch> and Ray
> Brownrigg <Ray.Brownrigg@mcs.vuw.ac.nz> for responding.
> 
> I ended up writing C code (and an R interface) to do it, 
> which is about 1000x
> faster than the naive method!  (72s became .09s on a 223 x 
> 520 matrix).  I
> added a running mean function too, which is very simple and 
> is about 15x
> faster than apply(..., filter, ...).  I've packaged the code as
>   <http://cran.r-project.org/src/contrib/Devel/runStat_1.1.tar.gz>
> 
> The key insight was that it *is* in fact a good idea to 
> calculate the ranks of
> the 5 data points, because then it is easy to calculate how 
> those ranks
> *change* as you step through the column.  For example, 
> suppose the last few
> points in a column look like:
>    data=  20 25 18 14 78 55 29
>    rank=      3  2  1  5  4
> where I have ranked the 5 points prior to the last one.  The 
> median is 25 (the
> point with rank 3).  Stepping to the left:
>    data=  20 25 18 14 78 55 29
>    rank=   3  4  2  1  5
> For the 4 points that overlap, their ranks remain the same, 
> except they are
> incremented by one for points >=20, and decremented by one 
> for points >55.  The
> rank of the new point (20) is just 5 minus the number of 
> increments done.
> 
> The code is copied below for your amusement.  Suggestions 
> welcome.  Note I have
> not dealt with NA's, even values of N, row-wise medians, or 
> non-lagged medians
> (i.e. I put the median of points 1:5 into position 6, not 
> position 5).  It
> would be great if somebody (Martin?) wanted to incorporate 
> these functions into
> an existing package or even "base"; otherwise I'll post the 
> final product on
> CRAN/src/contrib.
>                               -- David Brahm (brahm@alum.mit.edu)
> 
> ############### C Code: ##################
> 
> #include <R.h>
> 
> void runMedian(double *m, int *d, int *N) {
>   int i, j, k, a, R=(*N+1)/2, *r=(int*) R_alloc(*N, sizeof(int));
>   double old, new, *x;
> 
>   for (j=1; j <= d[1]; j++) {	                        /* Loop 
> over columns */
>     x = m + j*d[0] - 1 - *N;                         /* x[*N] 
> = m[nrow(m),j] */
> 
>     for (i=0; i < *N; i++) {	              /* Calculate 
> initial ranks "r" */
>       r[i] = 1;	                              /* r 
> higher if larger or later */
>       for (k=0;   k < i;  k++) if (x[i] >= x[k]) r[i]++;
>       for (k=i+1; k < *N; k++) if (x[i] >  x[k]) r[i]++;
>       if (r[i]==R) x[*N] = x[i];            /* If rank=R, 
> this is the median */
>     }
> 
>     for (i=d[0]-1; i > *N; i--) {                      /* Now 
> x[*N] = m[i,j] */
>       x--;  old=x[*N];  new=x[0];  a=*N;              /* a = 
> rank of new guy */
>       for (k=*N-1; k > 0; k--) {	  /* Recalculate each 
> element's rank */
> 	r[k] = r[k-1];		         /* Shift previous 
> iteration's ranks */
> 	if (x[k] >= new) {r[k]++; a--;}	    /* Are we adding a 
> lower number? */
> 	if (x[k] >  old) {r[k]--;}       /* Are we dropping a 
> higher number? */
> 	if (r[k]==R) x[*N] = x[k];          /* If rank=R, this 
> is the median */
>       }
>       r[0] = a;
>       if (a==R) x[*N] = new;	               /* Is the new 
> guy the median? */
>     }
>   }
> }
> 
> 
> void runMean(double *m, int *d, int *N) {
>   int i, j, k;
>   double new, *x;
> 
>   for (j=1; j <= d[1]; j++) {	                        /* Loop 
> over columns */
>     x = m + j*d[0] - 1 - *N;
>     for (i=d[0]; i > *N; i--) {                            /* 
> x[*N] = m[i,j] */
>       for (k=0, new=0; k < *N; k++) new += x[k];
>       x[*N] = new / *N;  x--;
>     }
>   }
> }
> 
> ############### R Code: ##################
> 
> runMedian <- function(m, N=5) {
>   if (!(N %% 2)) stop("N should be odd.")
>   d <- od <- dim(m)
>   if (!(nd <- length(d))) d <- c(length(m), 1)               
> # Vector -> matrix
>   if (nd > 2) d <- c(d[1], prod(d[-1]))                      
> # Array  -> matrix
>   if (d[1] <= N) {m[] <- NA; return(m)}
>   m <- array(.C("runMedian", m=as.real(m), as.integer(d), 
> as.integer(N))$m, d)
>   m[1:N, ] <- NA
>   if (nd==2) m else if (!nd) m[ ,1] else array(m, od)     # 
> Matrix/Vector/Array
> }
> 
> runMean <- function(m, N=21) {
>   d <- od <- dim(m)
>   if (!(nd <- length(d))) d <- c(length(m), 1)               
> # Vector -> matrix
>   if (nd > 2) d <- c(d[1], prod(d[-1]))                      
> # Array  -> matrix
>   if (d[1] <= N) {m[] <- NA; return(m)}
>   m <- array(.C("runMean", m=as.real(m), as.integer(d), 
> as.integer(N))$m, d)
>   m[1:N, ] <- NA
>   if (nd==2) m else if (!nd) m[ ,1] else array(m, od)     # 
> Matrix/Vector/Array
> }
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.
> -.-.-.-.-.-.-.-.-
> r-devel mailing list -- Read 
> http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: 
> r-devel-request@stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._.
> _._._._._._._._._
> 


LEGAL NOTICE
Unless expressly stated otherwise, this message is confidential and may be privileged. It is intended for the addressee(s) only. Access to this E-mail by anyone else is unauthorized. If you are not an addressee, any disclosure or copying of the contents of this E-mail or any action taken (or not taken) in reliance on it is unauthorized and may be unlawful. If you are not an addressee, please inform the sender immediately.
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._