[R] Novice question : Classification of time series

Christian Hennig hennig at stat.math.ethz.ch
Fri Feb 1 14:08:34 CET 2002


Some remarks on your classification issue. It seems to me that you do not
want to assume any known class memberships, so this is a cluster analysis
problem, also called unsupervised classification.

I assume that you have read your data (read.table or scan) and that you can
compute pairwise correlations between your series (cor). The background
information you provided does not suffice for me to say how to tackle the
varying time spans.
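
A minimal sketch of one possible way to get pairwise correlations from series
of different lengths. It assumes (this is not in the original question) that
every observation carries a time index, so the series can be aligned and each
correlation computed over the shared time points only. The toy data frames
s1, s2, s3 are hypothetical; in practice each would come from read.table on
one of your files.

```r
# Hypothetical toy data: three series observed over different time spans.
set.seed(1)
base <- cumsum(rnorm(100))
s1 <- data.frame(time = 1:100,  value = base + rnorm(100, sd = 0.1))
s2 <- data.frame(time = 21:100, value = base[21:100] + rnorm(80, sd = 0.1))
s3 <- data.frame(time = 1:80,   value = rnorm(80))
series <- list(s1 = s1, s2 = s2, s3 = s3)

# Align all series on the union of time points; gaps become NA.
times <- sort(unique(unlist(lapply(series, function(s) s$time))))
X <- sapply(series, function(s) s$value[match(times, s$time)])

# Each correlation uses only the time points that both series share.
R <- cor(X, use = "pairwise.complete.obs")
print(round(R, 2))
```

Whether overlapping time points are the right basis for comparison depends on
the data; series that barely overlap give unreliable correlations.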

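My suggestion is as follows. Compute a similarity matrix between the
series, with similarity defined as correlation+1 or abs(correlation),
depending on how negative correlation is interpreted in your setup: the
first treats negatively correlated series as dissimilar, the second treats
them as similar.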
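
The two similarity definitions can be sketched as follows. The matrix X is
made-up toy data (columns are series), chosen so that series c moves against
series a.

```r
# Hypothetical toy data: b follows a, c moves against a, d is unrelated.
set.seed(2)
x <- rnorm(50)
X <- cbind(a = x,
           b =  x + rnorm(50, sd = 0.2),
           c = -x + rnorm(50, sd = 0.2),
           d = rnorm(50))
R <- cor(X)

S_signed <- R + 1   # range 0..2: negative correlation counts as dissimilar
S_abs    <- abs(R)  # range 0..1: strong negative correlation counts as similar

print(round(S_signed, 2))
print(round(S_abs, 2))
```

With these data the pair (a, c) gets a similarity near 0 under the first
definition and near 1 under the second.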

Apply a distance-based cluster analysis method to the matrix. Note that
such methods usually expect dissimilarities, so the similarities have to be
converted first, e.g. 1-abs(correlation). The method of choice depends on
your application and data: How many classes do you want (and how exactly do
you know that)? What "shapes" of clusters do you expect or want? It may be
a good idea to look at the results of a multidimensional scaling of the
similarity matrix.
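
A rough sketch of such a look at the data, using classical multidimensional
scaling (cmdscale). The conversion 1 - correlation is only one possible
choice of dissimilarity, and the toy series are hypothetical.

```r
# Hypothetical toy data: two groups of three series each.
set.seed(3)
g1 <- rnorm(60)
g2 <- rnorm(60)
X <- cbind(a1 = g1 + rnorm(60, sd = 0.3),
           a2 = g1 + rnorm(60, sd = 0.3),
           a3 = g1 + rnorm(60, sd = 0.3),
           b1 = g2 + rnorm(60, sd = 0.3),
           b2 = g2 + rnorm(60, sd = 0.3),
           b3 = g2 + rnorm(60, sd = 0.3))

# Turn correlations into dissimilarities (assumed choice: 1 - correlation).
D <- as.dist(1 - cor(X))

# Classical MDS: one point per series in two dimensions.
xy <- cmdscale(D, k = 2)
print(round(xy, 2))

# Draw the configuration when working interactively.
if (interactive()) {
  plot(xy, type = "n", xlab = "MDS axis 1", ylab = "MDS axis 2")
  text(xy, labels = rownames(xy))
}
```

Series that lie close together in the plot are candidates for the same class;
the picture can also hint at how many classes are plausible.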

Possible methods are e.g. pam (in library cluster) and the various
hierarchical methods provided by hclust and hierclust (they seem to do
almost the same and I do not know which one is better). Take a look at the
help pages to learn more about them.
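
A minimal sketch of both routes, pam and hclust, followed by writing the
class labels to a text file. The number of classes (k = 2), the average
linkage, the 1 - correlation dissimilarity, and the output file name are all
assumptions made for the toy data, not recommendations.

```r
library(cluster)  # provides pam

# Hypothetical toy data: two groups of three series each.
set.seed(4)
g1 <- rnorm(60)
g2 <- rnorm(60)
X <- cbind(a1 = g1 + rnorm(60, sd = 0.3),
           a2 = g1 + rnorm(60, sd = 0.3),
           a3 = g1 + rnorm(60, sd = 0.3),
           b1 = g2 + rnorm(60, sd = 0.3),
           b2 = g2 + rnorm(60, sd = 0.3),
           b3 = g2 + rnorm(60, sd = 0.3))
D <- as.dist(1 - cor(X))

# Partitioning around medoids on the dissimilarities.
cl_pam <- pam(D, k = 2)$clustering

# Hierarchical clustering, cut into the same number of classes.
hc    <- hclust(D, method = "average")
cl_hc <- cutree(hc, k = 2)

# One row per series with both classifications, written as plain text.
out <- data.frame(series = names(cl_pam), pam = cl_pam, hclust = cl_hc)
out_file <- file.path(tempdir(), "classes.txt")  # hypothetical file name
write.table(out, file = out_file, quote = FALSE, row.names = FALSE)
print(out)
```

Comparing the two columns gives a quick check of whether the classification
depends on the method chosen.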


On Fri, 1 Feb 2002, Neil Osborne wrote:

> Hello all,
> I know this may not be the right forum for this. But I'm relatively new to R
> (and it's been a while since I did any serious statistical research). I need
> some help in using R for my project. This is what I need to do :
> 1. Read time series data (of varying time spans) from text files into R
> arrays
> 2. Segregating the time series into "sets" of "classes" that have members
> that are highly correlated to fellow members of the same set, but have a low
> correlation with members from another set. In other words, members of a set
> would "tend" to move together.
> 3. Print the resulting classifications to a text file.
> I'm not sure what would be the most appropriate methodology to use, and even
> less sure about which commands to use. I will therefore be extremely
> grateful to anyone who can offer some general guidance or at least, point me
> in the right direction.
> Many thanks in advance
> Neil
> PS (I'm running R v1.30 on Win2k)
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Christian Hennig
Seminar fuer Statistik, ETH-Zentrum (LEO), CH-8092 Zuerich (current)
and Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
hennig at stat.math.ethz.ch, http://stat.ethz.ch/~hennig/
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
I recommend www.boag.de
