[R] Suggestion about "R equivalent of Splus peaks() function"
Earl F. Glynn
efg at stowers-institute.org
Thu Feb 8 23:34:31 CET 2007
In 2004 there was this R-Help posting from Jan 2004:
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/33097.html
R equivalent of Splus peaks() function?
The peaks function there has worked well for me on a couple of projects, but
some code using "peaks" failed today, which had worked fine in the past.
I was looking for a peak in a test case that was a sine curve over one
cycle, so there should have been only one peak. My unexpected surprise was
to sometimes get one peak, or two adjoining peaks (a tie), but the no peaks
case cause subsequent code to fail. I wanted to eliminate this "no peak"
case when there was an obvious peak.
I thought it was odd that the peak failure could be controlled by the random
number seed.
# R equivalent of Splus peaks() function
# http://finzi.psych.upenn.edu/R/Rhelp02a/archive/33097.html
peaks <- function(series,span=3)
{
z <- embed(series, span)
s <- span%/%2
v <- max.col(z) == 1 + s
result <- c(rep(FALSE,s),v)
result <- result[1:(length(result)-s)]
result
}
> set.seed(19)
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE TRUE FALSE FALSE TRUE FALSE TRUE
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE TRUE FALSE FALSE TRUE FALSE TRUE
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE TRUE TRUE FALSE TRUE FALSE TRUE
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE FALSE FALSE FALSE TRUE FALSE TRUE
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE FALSE TRUE FALSE TRUE FALSE TRUE
Above, the "4" peak at positions 2 and 3 is shown by the TRUE and FALSE in
positions 2 and 3 above. Case 4 of FALSE, FALSE was most unexpected -- no
peak.
I studied the peaks code and found the problem seems to be in max.col:
> z
[,1] [,2] [,3]
[1,] 4 4 1
[2,] 1 4 4
[3,] 6 1 4
[4,] 1 6 1
[5,] 5 1 6
[6,] 1 5 1
[7,] 1 1 5
> max.col(z)
[1] 2 3 1 2 3 2 3
> max.col(z)
[1] 2 2 1 2 3 2 3
> max.col(z)
[1] 1 2 1 2 3 2 3
> max.col(z)
[1] 2 2 1 2 3 2 3
> max.col(z)
[1] 1 3 1 2 3 2 3
> max.col(z)
[1] 2 2 1 2 3 2 3
The ?max.col help shows that it has a ties.method that defaults to "random".
I want a peak, any peak if there is a tie, but I don't want the case that a
tie is treated as "no peak". For now, I added a "first" parameter to
max.col in peaks:
# Break ties by using "first"
peaks <- function(series,span=3)
{
z <- embed(series, span)
s <- span%/%2
v <- max.col(z, "first") == 1 + s
result <- c(rep(FALSE,s),v)
result <- result[1:(length(result)-s)]
result
}
A better solution might be a ties.method parameter to peaks, which can be
passed to max.col.
I did all of this in R 2.4.1, but the problem seems to be in earlier
versions too.
Just in case anyone else is using this "peaks" function.
efg
Earl F. Glynn
Stowers Institute for Medical Research
More information about the R-help
mailing list