[R] Classifying time series by shape over time

Gabor Grothendieck ggrothendieck at gmail.com
Tue Mar 21 17:41:50 CET 2006


If it's good enough just to examine the number of runs of strictly positive values, then

sum(rle(sign(id1$hits))$values == 1)

will give 1 in the good case (one run) and > 1 in the bad case (multiple runs).
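
As a rough sketch of how this could be applied to many series at once, here is the same expression wrapped in a function and run on the two short example series from the question. The names count_positive_runs, series1, series2 and all_series are made up for illustration, and the code assumes the hits contain no missing values. Note that this only detects zeros that interrupt the positive values; it does not check that the hits rise to a single peak and then fall.

```r
# Count maximal runs of strictly positive values in a numeric vector.
# (Hypothetical helper name; it wraps the one-liner above.)
count_positive_runs <- function(x) {
  sum(rle(sign(x))$values == 1)
}

# The two short example series from the original question.
series1 <- c(0, 0, 0, 2, 5, 8, 20, 42, 30, 19, 6, 1, 0, 0, 0)
series2 <- c(0, 3, 8, 9, 20, 6, 0, 3, 25, 67, 7, 1, 0, 4, 60, 20, 10, 0, 4)

count_positive_runs(series1)  # 1 (one uninterrupted block of hits)
count_positive_runs(series2)  # 4 (hits interrupted by zeros three times)

# Applying it to a named list of series and keeping the single-run ones.
all_series <- list(s1 = series1, s2 = series2)
runs <- sapply(all_series, count_positive_runs)
names(runs)[runs == 1]        # "s1"
```

With hundreds of thousands of series the same sapply call should still work, provided the hits vectors are held in a list keyed by series id.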

On 3/21/06, Andreas Neumann <Andreas.Neumann at em.uni-karlsruhe.de> wrote:
> Dear all,
>
> I have hundreds of thousands of univariate time series of the form:
> character "seriesid", vector of Date, vector of integer
> (some exemplary data is at the end of the mail)
>
> I am trying to find the ones which somehow "have a shape" over time that
> looks like the histogram of a (skewed) normal distribution:
> >  hist(rnorm(200,10,2))
> The "mean" is not interesting, i.e. it does not matter if the first
> nonzero observation happens in the 2nd or the 40th month of observation.
> So all that matters is: They should start sometime, the hits per month
> increase, at some point they decrease and then they more or less
> disappear.
>
> Short Example (hits at consecutive months (Dates omitted)):
> 1. series: 0 0 0 2 5 8 20 42 30 19 6 1 0 0 0                -> Good
> 2. series: 0 3 8 9 20 6 0 3 25 67 7 1 0 4 60 20 10 0 4      -> Bad
>
> Series 1 would be an ideal case of what I am looking for.
>
> Graphical inspection would be easy but is not an option due to the huge
> amount of series.
>
> Questions:
>
> 1. Which (if at all) of the many packages that handle time series is
> appropriate for my problem?
>
> 2. Which general approach seems to be the most straightforward and best
> supported by R?
> - Is there a way to test the time series directly (preferably)?
> - Or do I need to "type-cast" them as some kind of histogram
>  data and then test against the pdf of e.g. a normal distribution (but
>  how)?
> - Or something totally different?
>
>
> Thank you for your time,
>
>     Andreas Neumann
>
>
>
>
> Data Examples (id1 is good, id2 is bad):
>
> > id1
>        dates       hits
> 1  2004-12-01         3
> 2  2005-01-01         4
> 3  2005-02-01        10
> 4  2005-03-01         6
> 5  2005-04-01        35
> 6  2005-05-01        14
> 7  2005-06-01        33
> 8  2005-07-01        13
> 9  2005-08-01         3
> 10 2005-09-01         9
> 11 2005-10-01         8
> 12 2005-11-01         4
> 13 2005-12-01         3
>
>
> > id2
>        dates       hits
> 1  2001-01-01         6
> 2  2001-02-01         5
> 3  2001-03-01         5
> 4  2001-04-01         6
> 5  2001-05-01         2
> 6  2001-06-01         5
> 7  2001-07-01         1
> 8  2001-08-01         6
> 9  2001-09-01         4
> 10 2001-10-01        10
> 11 2001-11-01         0
> 12 2001-12-01         3
> 13 2002-01-01         6
> 14 2002-02-01         5
> 15 2002-03-01         1
> 16 2002-04-01         2
> 17 2002-05-01         4
> 18 2002-06-01         4
> 19 2002-07-01         0
> 20 2002-08-01         1
> 21 2002-09-01         0
> 22 2002-10-01         2
> 23 2002-11-01         2
> 24 2002-12-01         2
> 25 2003-01-01         2
> 26 2003-02-01         3
> 27 2003-03-01         7
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>



