[R] finding peaks in a simple dataset with R

Dylan Beaudette dylan.beaudette at gmail.com
Sun Nov 27 05:48:11 CET 2005


On Nov 25, 2005, at 10:27 AM, Martin Maechler wrote:

> Let me try to summarize my view on this:
>
> - I still it would make sense to have a *simple* peaks() function
>   in R which provides the same (or more) functionality as the
>   corresponding S-plus one.From
>   For a proper data analysis situation, I think one would have to
>   do something more sophisticated, based on a model (with a random
>   component), such as nonparametric regression, time-series,....
>   Hence peaks() should be kept as simple as reasonable.
>
> - Of course I know that  which() or %in% can be used to deal
>   with logicals containing NAs {As a matter of fact, I've had
>   my fingers in both implementations for R!}.
>   Still, the main use of logical vectors in S often is for
>   situations where NAs only appear because of missing data:
>
>   Indexing ([]), all(), any(), sum()  are all very nice and
>   useful for logical vectors particularly when there are no NAs.
>
> - I agree that a different more flexible function returning
>   values from {-1,0,1} would be desirable, "for symmetry reasons".
>   ===> added a peaksign() function
>
> Here's code that implements the above {and other concerns
> mentioned in this thread}, including some ``consistency
> checking'' :
>
> peaks <- function(series, span = 3, do.pad = TRUE) {
>     if((span <- as.integer(span)) %% 2 != 1) stop("'span' must be odd")
>     s1 <- 1:1 + (s <- span %/% 2)
>     if(span == 1) return(rep.int(TRUE, length(series)))
>     z <- embed(series, span)
>     v <- apply(z[,s1] > z[, -s1, drop=FALSE], 1, all)
>     if(do.pad) {
>         pad <- rep.int(FALSE, s)
>         c(pad, v, pad)
>     } else v
> }
>
> peaksign <- function(series, span = 3, do.pad = TRUE)
> {
>     ## Purpose: return (-1 / 0 / 1) if series[i] is ( trough / 
> "normal" / peak )
>     ## 
> ----------------------------------------------------------------------
>     ## Author: Martin Maechler, Date: 25 Nov 2005
>
>     if((span <- as.integer(span)) %% 2 != 1 || span == 1)
>         stop("'span' must be odd and >= 3")
>     s1 <- 1:1 + (s <- span %/% 2)
>     z <- embed(series, span)
>     d <- z[,s1] - z[, -s1, drop=FALSE]
>     ans <- rep.int(0:0, nrow(d))
>     ans[apply(d > 0, 1, all)] <- as.integer(1)
>     ans[apply(d < 0, 1, all)] <- as.integer(-1)
>     if(do.pad) {
>         pad <- rep.int(0:0, s)
>         c(pad, ans, pad)
>     } else ans
> }
>
>
> check.pks <- function(y, span = 3)
>     stopifnot(identical(peaks( y, span), peaksign(y, span) ==  1),
>               identical(peaks(-y, span), peaksign(y, span) == -1))
>
> for(y in list(1:10, rep(1,10), c(11,2,2,3,4,4,6,6,6))) {
>     for(sp in c(3,5,7))
>         check.pks(y, span = sp)
>     stopifnot(peaksign(y) == 0)
> }
>
> y <- c(1,4,1,1,6,1,5,1,1) ; (ii <- which(peaks(y))); y[ii]
> ##- [1] 2 5 7
> ##- [1] 4 6 5
> check.pks(y)
>
> set.seed(7)
> y <- rpois(100, lambda = 7)
> check.pks(y)
> py <- peaks(y)
> plot(y, type="o", cex = 1/4, main = "y and peaks(y,3)")
> points(seq(y)[py], y[py], col = 2, cex = 1.5)
>
> p7 <- peaks(y,7)
> points(seq(y)[p7], y[p7], col = 3, cex = 2)
> mtext("peaks(y,7)", col=3)
>
> set.seed(2)
> x <- round(rnorm(500), 2)
> y <- cumsum(x)
> check.pks(y)
>
> plot(y, type="o", cex = 1/4)
> p15 <- peaks(y,15)
> points(seq(y)[p15], y[p15], col = 3, cex = 2)
> mtext("peaks(y,15)", col=3)
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>
>

Wow! This was the exact sort of simple peak finding algorithm I was 
looking for. Would it be ok for me to post this to our dept. webpage so 
that others may use it?

Cheers,

--
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341




More information about the R-help mailing list