[R] Changepoint analysis: is it possible to attribute changepoints to explanatory variables?

Jones, Kristopher@DWR Kristopher.Jones at water.ca.gov
Tue Sep 16 22:17:00 CEST 2014


Hello, 

I would like to evaluate the relationship between flows and phytoplankton abundance (or Chlorophyll a concentrations) using a changepoint analysis.  Specifically, I have two study questions:

Study Question 1: Are there certain flow thresholds that result in spikes in phytoplankton abundance?
Study Question 2: Is the duration of certain flows important for phytoplankton abundance (e.g., would a certain flow value need to be sustained for 1 day, 1 week, etc. to create a spike in phytoplankton abundance)?

Many of the examples I've seen online only look for changepoints within a single time series.  However, I have not seen any examples that look at whether changes in the mean or variance can be attributed to a particular factor (e.g., changes in abundance relative to an environmental factor).

Question #1: Is it possible to attribute changes in the mean or variance of a time series (e.g., of phytoplankton abundance) to a particular environmental variable (e.g., flows)?  If so, can you provide guidance for how to do that in R (or refer me to a good example)?

Question #2: Is it possible to take Question #1 a step further by adding a time component (as described in Study Question 2, above)? If so, can you provide guidance for how to do that in R (or refer me to a good example)?

One resource on changepoint analyses (using the changepoint package) that I have been trying to model my work after (at least the R code) is by Killick and Eckley (Lancaster University):
http://www.lancs.ac.uk/~killick/Pub/KillickEckley2011.pdf

Their descriptions and the accompanying code were really helpful, although their questions were not similar to mine, as described above.  In reviewing this document and other descriptions online, I've noticed that data for changepoint analyses need to be in the form of a time series.  My data are set up with columns for sampling date, Chlorophyll a concentration, and stage (a surrogate for flow).  In reviewing the online help for changepoint, I realized that my data would likely not be considered a 'time series', because the sampling did not occur at uniform time intervals.
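
To make the layout concrete, here is a minimal sketch in base R of how data like mine are structured and how the uneven sampling shows up. The column names (date, chla, stage) and all of the values are made up for illustration; they are not my real measurements.

```r
# Hypothetical example of the data layout; names and values are invented.
dat <- data.frame(
  date  = as.Date(c("2013-04-02", "2013-04-09", "2013-04-23",
                    "2013-05-01", "2013-05-28", "2013-06-04")),
  chla  = c(2.1, 2.4, 3.0, 7.8, 8.5, 3.2),  # Chlorophyll a concentration
  stage = c(1.8, 1.9, 2.6, 3.4, 3.1, 2.0)   # stage, a surrogate for flow
)

# Number of days between consecutive samples
gaps <- as.numeric(diff(dat$date))
gaps

# TRUE when the sampling interval is not constant,
# i.e. the observations do not form a regularly spaced series
irregular <- length(unique(gaps)) > 1
irregular
```

With dates like these the gaps between samples range from about a week to almost a month, which is the situation behind Question #3 below.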

Question #3: Do data for changepoint analyses in R need to be at uniform time intervals?  If so, is there an appropriate way to transform my data (which were not collected at uniform time intervals) so that they can be used with changepoint?

Question #4: Do the variables in the time series (e.g., Chlorophyll a and stage) need to be transformed before running the analysis?

Hopefully, I've laid out my questions in a way that makes sense.  Any help you can provide would be much appreciated.  I've been reading up on this for a while, and have tried to narrow my questions down to the ones I am still struggling with.

Thanks in advance for your help.

Kris





  


