[R] finding peaks in a simple dataset with R
Dylan Beaudette
dylan.beaudette at gmail.com
Sun Nov 27 05:48:11 CET 2005
On Nov 25, 2005, at 10:27 AM, Martin Maechler wrote:
> Let me try to summarize my view on this:
>
> - I still it would make sense to have a *simple* peaks() function
> in R which provides the same (or more) functionality as the
> corresponding S-plus one.From
> For a proper data analysis situation, I think one would have to
> do something more sophisticated, based on a model (with a random
> component), such as nonparametric regression, time-series,....
> Hence peaks() should be kept as simple as reasonable.
>
> - Of course I know that which() or %in% can be used to deal
> with logicals containing NAs {As a matter of fact, I've had
> my fingers in both implementations for R!}.
> Still, the main use of logical vectors in S often is for
> situations where NAs only appear because of missing data:
>
> Indexing ([]), all(), any(), sum() are all very nice and
> useful for logical vectors particularly when there are no NAs.
>
> - I agree that a different more flexible function returning
> values from {-1,0,1} would be desirable, "for symmetry reasons".
> ===> added a peaksign() function
>
> Here's code that implements the above {and other concerns
> mentioned in this thread}, including some ``consistency
> checking'' :
>
> peaks <- function(series, span = 3, do.pad = TRUE) {
> if((span <- as.integer(span)) %% 2 != 1) stop("'span' must be odd")
> s1 <- 1:1 + (s <- span %/% 2)
> if(span == 1) return(rep.int(TRUE, length(series)))
> z <- embed(series, span)
> v <- apply(z[,s1] > z[, -s1, drop=FALSE], 1, all)
> if(do.pad) {
> pad <- rep.int(FALSE, s)
> c(pad, v, pad)
> } else v
> }
>
> peaksign <- function(series, span = 3, do.pad = TRUE)
> {
> ## Purpose: return (-1 / 0 / 1) if series[i] is ( trough /
> "normal" / peak )
> ##
> ----------------------------------------------------------------------
> ## Author: Martin Maechler, Date: 25 Nov 2005
>
> if((span <- as.integer(span)) %% 2 != 1 || span == 1)
> stop("'span' must be odd and >= 3")
> s1 <- 1:1 + (s <- span %/% 2)
> z <- embed(series, span)
> d <- z[,s1] - z[, -s1, drop=FALSE]
> ans <- rep.int(0:0, nrow(d))
> ans[apply(d > 0, 1, all)] <- as.integer(1)
> ans[apply(d < 0, 1, all)] <- as.integer(-1)
> if(do.pad) {
> pad <- rep.int(0:0, s)
> c(pad, ans, pad)
> } else ans
> }
>
>
> check.pks <- function(y, span = 3)
> stopifnot(identical(peaks( y, span), peaksign(y, span) == 1),
> identical(peaks(-y, span), peaksign(y, span) == -1))
>
> for(y in list(1:10, rep(1,10), c(11,2,2,3,4,4,6,6,6))) {
> for(sp in c(3,5,7))
> check.pks(y, span = sp)
> stopifnot(peaksign(y) == 0)
> }
>
> y <- c(1,4,1,1,6,1,5,1,1) ; (ii <- which(peaks(y))); y[ii]
> ##- [1] 2 5 7
> ##- [1] 4 6 5
> check.pks(y)
>
> set.seed(7)
> y <- rpois(100, lambda = 7)
> check.pks(y)
> py <- peaks(y)
> plot(y, type="o", cex = 1/4, main = "y and peaks(y,3)")
> points(seq(y)[py], y[py], col = 2, cex = 1.5)
>
> p7 <- peaks(y,7)
> points(seq(y)[p7], y[p7], col = 3, cex = 2)
> mtext("peaks(y,7)", col=3)
>
> set.seed(2)
> x <- round(rnorm(500), 2)
> y <- cumsum(x)
> check.pks(y)
>
> plot(y, type="o", cex = 1/4)
> p15 <- peaks(y,15)
> points(seq(y)[p15], y[p15], col = 3, cex = 2)
> mtext("peaks(y,15)", col=3)
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>
>
Wow! This was the exact sort of simple peak finding algorithm I was
looking for. Would it be ok for me to post this to our dept. webpage so
that others may use it?
Cheers,
--
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341
More information about the R-help
mailing list