[R] Identifying breakpoints/inflection points?

Ravi Varadhan rvaradhan at jhmi.edu
Sat Jun 19 20:49:46 CEST 2010


Hi Charlotte,

This may be a bit too late, but I just remembered your question.  I have written some functions to extract various features of a time-series, uusing functional data analytic methods.  These would be part of an R package that will be soon released.  This package can analyze a large collection of time-series.  

Here is how you can use that to solve your problem:

source("features.txt")

year <- c(1967,1968,1969,1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982,1983,1984,1985,1986,1987,1988,1989,1990,1991,1992,1993,1994,1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008,2009)

piproute<-c(0.733333333,0.945945946,1.886363636,1.607843137,4.245614035,3.175675676,2.169014085,2,2.136363636,2.65625,2.080645161,2.114754098,2.090909091,3.012195122,2.935897436,2.592105263,1.075757576,1.210526316,1,1.1875,1.903614458,1.385542169,1.788990826,1.163793103,1.558558559,1.595238095,1.758333333,1.858267717,2.169117647,1.403225806,2.859375,3.236220472,2.054263566,3.854166667,1.812080537,2.708029197,2.75862069,2.625954198,4.540740741,3.686567164,2.8,2.968253968,3.517730496)


ans <- features.mat(year, piproute, smoother="glk", plot.it=TRUE)

ans 

ans$cptmat # critical points of the function (minima/maxima)


# The answers depend on how you smooth the data.  Here is a result showing smoothing using a pemalized spline smoother.

ans <- features.mat(year, piproute, smoother="spm", plot.it=TRUE)

ans 

ans$cptmat # critical points of the function (minima/maxima)


Hope this is helpful,
Ravi.
____________________________________________________________________

Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvaradhan at jhmi.edu


----- Original Message -----
From: Charlotte Chang <c.h.w.chang at gmail.com>
Date: Tuesday, April 27, 2010 2:25 am
Subject: Re: [R] Identifying breakpoints/inflection points?
To: Clint Bowman <clint at ecy.wa.gov>
Cc: r-help at r-project.org


> Hi Clint,
>  
>  Thank you for your help with the code. The span recommendation really
>  improved the fit of my LOESS curve. I appreciate your thoughtful
>  assistance!
>  
>  My remaining question is how could I go about identifying the
>  inflection points for the LOESS curve? I was thinking about trying to
>  find the 2nd derivative and then using the uniroot function.
>  
>  My code is here (but it's buggy and doesn't work):
>  
>  birds.lo<-loess.smooth(x,y,span=0.45)
>  d2 <- function(x) {
>  	predict(birds.lo, x, deriv=2)$y
>  }
>  x<-year
>  y<-piproute
>  
>  > d2(x)
>  Error in predict(birds.lo, x, deriv = 2)$y :
>    $ operator is invalid for atomic vectors
>  
>  #Desired next step:
>  uniroot(d2,c(7,10))
>  
>  Any ideas about this would be profoundly appreciated! I'm hitting a 
> dead end.
>  
>  Yours,
>  
>  Charlotte
>  
>  On Mon, Apr 26, 2010 at 3:32 PM, Clint Bowman <clint at ecy.wa.gov> wrote:
>  > Charlotte,
>  >
>  > Try:
>  >
>  > birds.lo <- loess(piproute~year,span=.25)
>  > # play with span to see your desired pattern
>  > birds.pr<-predict(birds.lo, data.frame(year = seq(1967, 2009, 1)), 
> se =
>  > FALSE)
>  > #
>  > plot($year,birds.pr$fit,ylim=c(0,5))
>  > par(new=T)
>  > plot(year,birds.pr$fit,pch="+",col=2,ylim=c(0,5))
>  >
>  >
>  > --
>  > Clint Bowman                    INTERNET:       clint at ecy.wa.gov
>  > Air Quality Modeler             INTERNET:       clint at math.utah.edu
>  > Department of Ecology           VOICE:          (360) 407-6815
>  > PO Box 47600                    FAX:            (360) 407-7534
>  > Olympia, WA 98504-7600
>  >
>  > On Mon, 26 Apr 2010, Charlotte Chang wrote:
>  >
>  >> Hello!
>  >> I have a dataset with the following two vectors:
>  >>
>  >>
>  >> year<-c(1967,1968,1969,1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982,1983,1984,1985,1986,1987,1988,1989,1990,1991,1992,1993,1994,1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008,2009)
>  >>
>  >>
>  >> piproute<-c(0.733333333,0.945945946,1.886363636,1.607843137,4.245614035,3.175675676,2.169014085,2,2.136363636,2.65625,2.080645161,2.114754098,2.090909091,3.012195122,2.935897436,2.592105263,1.075757576,1.210526316,1,1.1875,1.903614458,1.385542169,1.788990826,1.163793103,1.558558559,1.595238095,1.758333333,1.858267717,2.169117647,1.403225806,2.859375,3.236220472,2.054263566,3.854166667,1.812080537,2.708029197,2.75862069,2.625954198,4.540740741,3.686567164,2.8,2.968253968,3.517730496)
>  >>
>  >> Pipits is the response variable (it is the number of birds counted 
> at
>  >> each survey site in each year) and year is the independent variable.
>  >> If you plot it in R (plot(year,piproute,pch=19)), you'll see that 
> the
>  >> relationship looks like a quintic polynomial.
>  >>
>  >> Initially I was trying to fit this curve using an iterative equation,
>  >> but it's not working. I suspect that the curve-fitting equation itself
>  >> is inappropriate (it's a modified version of the logistic growth
>  >> equation). Now what I'd like to do is identify the 3 break/inflection
>  >> points in the population trend. That way, I can make an argument that
>  >> the break points corresponded to shifts in government policy with
>  >> respect to land use management. I've been looking at the segmented
>  >> package, and initially I looked at change.pt test in the circ.stats
>  >> package (which is inappropriate b/c my data is not amenable to
>  >> circular statistical analysis). Any ideas on what I could do would 
> be
>  >> appreciated!
>  >>
>  >> Thank you!
>  >>
>  >> -Charlotte
>  >>
>  >> ______________________________________________
>  >> R-help at r-project.org mailing list
>  >> 
>  >> PLEASE do read the posting guide
>  >> 
>  >> and provide commented, minimal, self-contained, reproducible code.
>  >>
>  >
>  
>  ______________________________________________
>  R-help at r-project.org mailing list
>  
>  PLEASE do read the posting guide 
>  and provide commented, minimal, self-contained, reproducible code. 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: features.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100619/aade035e/attachment.txt>


More information about the R-help mailing list