[R-SIG-Finance] http://www.market-topology.com/

Spencer Graves spencer.graves at pdf.com
Sun Mar 2 18:49:55 CET 2008


      The text on 'www.market-topology.com' includes the following:  
"The engine developed by Market Topology SPRL emphasizes the relevant 
correlations which connect the equities. ... In addition, a procedure 
selects the highest coefficients for placing them in a connected graph 
which is a tree." 

      This suggests they are doing some sort of cluster analysis.  
RSiteSearch('cluster analysis', 'fun') just produced 314 hits for me.  I 
rarely use cluster analysis, but I would guess that the most popular 
methods would include hclust{stats}, agnes{cluster}, and Mclust{mclust}. 
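
      For what it is worth, the tree described in the quoted text can 
be approximated with standard tools.  The following is only a minimal 
sketch under my own assumptions:  the returns are simulated, the 
distance sqrt(2 * (1 - rho)) is a common choice but is not confirmed by 
the site, and the helper name mst_prim is hypothetical.  It builds a 
minimum spanning tree with Prim's algorithm and runs single-linkage 
hclust on the same distances.

```r
set.seed(1)
n_obs <- 250
n_series <- 8

# Simulated log returns: a common component plus noise (illustrative only)
common <- rnorm(n_obs)
returns <- sapply(seq_len(n_series), function(i) 0.5 * common + rnorm(n_obs))
colnames(returns) <- paste0("S", seq_len(n_series))

rho <- cor(returns)
d <- sqrt(2 * pmax(0, 1 - rho))   # assumed correlation-based distance

# Prim's algorithm: repeatedly add the shortest edge leaving the tree
mst_prim <- function(d) {
  p <- nrow(d)
  in_tree <- c(TRUE, rep(FALSE, p - 1))
  edges <- matrix(NA_real_, nrow = p - 1, ncol = 3,
                  dimnames = list(NULL, c("from", "to", "length")))
  for (step in seq_len(p - 1)) {
    sub <- d[in_tree, !in_tree, drop = FALSE]
    k <- which(sub == min(sub), arr.ind = TRUE)[1, ]
    from <- which(in_tree)[k[1]]
    to <- which(!in_tree)[k[2]]
    edges[step, ] <- c(from, to, d[from, to])
    in_tree[to] <- TRUE
  }
  edges
}

tree <- mst_prim(d)                          # n_series - 1 edges
hc <- hclust(as.dist(d), method = "single")  # single-linkage clustering
```

      Single-linkage merge heights correspond to the edge lengths of 
the minimum spanning tree, so sort(tree[, "length"]) and hc$height 
should agree here.  Packaged alternatives include mst{igraph} and 
spantree{vegan}.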

      Note, however, that many cluster analysis methods are similar to 
reading tea leaves (http://en.wikipedia.org/wiki/Tasseography) in the 
sense that they will find relationships, independent of whether there 
are relationships to find.  Mclust is model based, which suggests to me 
that it may be less subject to "false positives" than other methods.  
However, you should not take my word for this;  perhaps someone with 
more experience with these methods will enlighten us. 
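
      A small sketch of the tea-leaves point, assuming nothing more 
than simulated independent noise (the linkage method and the number of 
clusters are arbitrary choices of mine, not a recommendation):

```r
set.seed(42)

# 40 independent series with 250 observations each: no true structure
noise <- matrix(rnorm(250 * 40), nrow = 250, ncol = 40)

d <- as.dist(1 - cor(noise))       # correlation-based dissimilarity
hc <- hclust(d, method = "average")

groups <- cutree(hc, k = 4)        # ask for 4 clusters ...
table(groups)                      # ... and 4 clusters are reported
```

      The procedure returns as many groups as requested even though 
the series are unrelated by construction, so the output alone does not 
show that any real relationship exists.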

      A major problem is that you are looking for relationships among 
5,000 or so financial time series, possibly with fewer than 5,000 
observations.  To obtain a full rank estimate of a correlation matrix of 
that size, you need more observations than series;  with daily log 
returns, for example, that is roughly 20 years of data (about 250 
trading days per year).  Even with that many observations, much of the 
subtle structure on which this analysis depends would typically be 
poorly estimated.  Moreover, just computing that matrix assumes that the 
relationships are all constant over that period of time. 
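
      The rank problem is easy to see on a smaller scale.  This is 
only a sketch with simulated data;  p = 200 and n = 50 are stand-ins I 
chose for the roughly 5,000 series and a shorter history, and 252 
trading days per year is an assumption:

```r
set.seed(7)
p <- 200    # number of series (stand-in for about 5,000)
n <- 50     # number of observations, fewer than p

x <- matrix(rnorm(n * p), nrow = n, ncol = p)
R <- cor(x)                                   # p x p correlation matrix

# Numerical rank: count eigenvalues clearly above zero
ev <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
rank_R <- sum(ev > 1e-8 * max(ev))
rank_R      # cannot exceed n - 1, because the column means are removed

# Years of daily data needed to have about as many observations as series
years_needed <- 5000 / 252
years_needed
```

      With fewer observations than series, the estimated matrix is 
singular no matter how the data were generated.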

      It may still be useful, but one would need to understand its 
limitations. 

      Hope this helps. 
      Spencer

Yuri Volchik wrote:
> Hi,
>
> was wondering what algorithm they are using and if it is available in R.
> Using my own judgment it looks like minimum spanning tree algorithm and in R
> it's available in the packages igraph and vegan, is it correct?
>
> Thanks 
>


