[R] Classifying time series by shape over time
Gabor Grothendieck
ggrothendieck at gmail.com
Tue Mar 21 17:41:50 CET 2006
If it's good enough just to examine the number of strictly positive runs, then
sum(rle(sign(id1$hits))$values == 1)
will give 1 in the good case (one run) and > 1 in the bad case (multiple runs).
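
As a minimal sketch of how this check could be applied to many series at once, the code below wraps the expression in a helper and runs it on the two short example series quoted further down. The names count_positive_runs and series are hypothetical, chosen for illustration only, and the list of vectors stands in for however the real data is stored.

```r
# Short example series copied from the quoted question
good <- c(0, 0, 0, 2, 5, 8, 20, 42, 30, 19, 6, 1, 0, 0, 0)
bad  <- c(0, 3, 8, 9, 20, 6, 0, 3, 25, 67, 7, 1, 0, 4, 60, 20, 10, 0, 4)

# Hypothetical helper: number of runs of strictly positive values.
# sign() maps each value to -1, 0 or 1; rle() collapses consecutive
# equal values into runs; we count the runs whose value is 1.
count_positive_runs <- function(x) sum(rle(sign(x))$values == 1)

# Hypothetical container: a named list with one hits vector per series
series <- list(good = good, bad = bad)

# One count per series
n_runs <- sapply(series, count_positive_runs)

# Keep the ids whose hits form exactly one positive run
candidates <- names(n_runs)[n_runs == 1]
print(n_runs)
print(candidates)
```

Note that this only counts positive runs separated by zeros; it does not look at rises and falls inside a run, so a series that never drops to zero always counts as one run.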
On 3/21/06, Andreas Neumann <Andreas.Neumann at em.uni-karlsruhe.de> wrote:
> Dear all,
>
> I have hundreds of thousands of univariate time series of the form:
> character "seriesid", vector of Date, vector of integer
> (some exemplary data is at the end of the mail)
>
> I am trying to find the ones which somehow "have a shape" over time that
> looks like the histogram of a (skewed) normal distribution:
> > hist(rnorm(200,10,2))
> The "mean" is not interesting, i.e. it does not matter if the first
> nonzero observation happens in the 2nd or the 40th month of observation.
> So all that matters is: They should start sometime, the hits per month
> increase, at some point they decrease and then they more or less
> disappear.
>
> Short Example (hits at consecutive months (Dates omitted)):
> 1. series: 0 0 0 2 5 8 20 42 30 19 6 1 0 0 0 -> Good
> 2. series: 0 3 8 9 20 6 0 3 25 67 7 1 0 4 60 20 10 0 4 -> Bad
>
> Series 1 would be an ideal case of what I am looking for.
>
> Graphical inspection would be easy but is not an option due to the huge
> number of series.
>
> Questions:
>
> 1. Which (if at all) of the many packages that handle time series is
> appropriate for my problem?
>
> 2. Which general approach seems to be the most straightforward and best
> supported by R?
> - Is there a way to test the time series directly (preferably)?
> - Or do I need to "type-cast" them as some kind of histogram
> data and then test against the pdf of e.g. a normal distribution (but
> how)?
> - Or something totally different?
>
>
> Thank you for your time,
>
> Andreas Neumann
>
>
>
>
> Data Examples (id1 is good, id2 is bad):
>
> > id1
>         dates hits
> 1  2004-12-01    3
> 2  2005-01-01    4
> 3  2005-02-01   10
> 4  2005-03-01    6
> 5  2005-04-01   35
> 6  2005-05-01   14
> 7  2005-06-01   33
> 8  2005-07-01   13
> 9  2005-08-01    3
> 10 2005-09-01    9
> 11 2005-10-01    8
> 12 2005-11-01    4
> 13 2005-12-01    3
>
>
> > id2
>         dates hits
> 1  2001-01-01    6
> 2  2001-02-01    5
> 3  2001-03-01    5
> 4  2001-04-01    6
> 5  2001-05-01    2
> 6  2001-06-01    5
> 7  2001-07-01    1
> 8  2001-08-01    6
> 9  2001-09-01    4
> 10 2001-10-01   10
> 11 2001-11-01    0
> 12 2001-12-01    3
> 13 2002-01-01    6
> 14 2002-02-01    5
> 15 2002-03-01    1
> 16 2002-04-01    2
> 17 2002-05-01    4
> 18 2002-06-01    4
> 19 2002-07-01    0
> 20 2002-08-01    1
> 21 2002-09-01    0
> 22 2002-10-01    2
> 23 2002-11-01    2
> 24 2002-12-01    2
> 25 2003-01-01    2
> 26 2003-02-01    3
> 27 2003-03-01    7
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>