[R-SIG-Finance] Scaling and Clustering of Financial Data

stefano iacus stefano.iacus at unimi.it
Sat May 10 06:11:45 CEST 2014


Hi Adam,
if the interest is in the time series per se, you could have a look at the MOdist function in the sde package and the related paper. MOdist does not need any normalization, because it clusters on the Markov operator of the process as a whole.
Dynamic time warping (the dtw package) is another option, but there you may want to standardize first.
As usual it depends on the purpose of your study.
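
For the cross-sectional case discussed further down the thread (fundamentals, k-means), here is a minimal sketch of what standardizing does. It uses only base R (scale(), kmeans(), mahalanobis()); the six stocks, the column names and all the numbers are made up for illustration and are not from Adam's data.

```r
# Hypothetical fundamentals for six stocks; names and numbers are made up.
fundamentals <- data.frame(
  market_cap = c(0.5, 0.6, 0.7, 200, 250, 300),  # billions of dollars
  pe         = c(30, 10, 32, 11, 29, 12),
  row.names  = c("A", "B", "C", "D", "E", "F")
)

set.seed(1)  # kmeans() picks random starting centres

# Raw data: Euclidean distances are dominated by market cap.
km_raw <- kmeans(fundamentals, centers = 2, nstart = 20)

# z-scores: subtract each column mean, divide by its standard deviation.
z <- scale(fundamentals)
km_z <- kmeans(z, centers = 2, nstart = 20)

# Cross-tabulate the two partitions to see whether scaling changed them.
table(raw = km_raw$cluster, scaled = km_z$cluster)

# Squared Euclidean distance between two z-scored rows equals the squared
# Mahalanobis distance with a diagonal covariance matrix (variances only,
# correlations ignored). mahalanobis() returns the squared distance.
d_z   <- sum((z["A", ] - z["D", ])^2)
d_mah <- mahalanobis(
  x      = as.numeric(fundamentals["A", ]),
  center = as.numeric(fundamentals["D", ]),
  cov    = diag(sapply(fundamentals, var))
)
all.equal(d_z, d_mah)
```

On the raw data the partition is driven almost entirely by market cap; after scaling, both variables contribute on an equal footing. The same scale() call applied to a matrix with one series per column is one simple way to standardize before dtw, if that suits the purpose.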

Stefano

On 25 Apr 2014, at 06:21, Adam Ginensky <adamno227 at gmail.com> wrote:

> All,
> 
> Thank you for your comments.  It definitely reinforced my feeling that
> scaling is absolutely necessary and gave me some pointers on where to look
> for further thoughts.
> 
> Adam
> 
> 
> On Wed, Apr 23, 2014 at 2:40 PM, Dominykas Grigonis <
> dominykasgrigonis at gmail.com> wrote:
> 
>> The standard way to standardise is to subtract the mean and divide by the
>> standard deviation, i.e. you would be clustering on a variance-scaled
>> Euclidean distance (Mahalanobis distance with a diagonal covariance matrix).
>> 
>> 
>> Kind regards,
>> --
>> Dominykas Grigonis
>> 
>> On Wednesday, 23 April 2014 at 15:53, Adam Ginensky wrote:
>> 
>> I'm looking at clustering of stocks based on their fundamental financial
>> data. I have about 80 variables per stock. I have the standard k-means
>> package. Firstly, I am wondering if there are any other R packages that
>> may be more useful for clustering of financial data.
>> My second, and more important (to me), question is: should one scale the
>> data before clustering? I'm particularly worried because certain
>> variables can be orders of magnitude larger than other equally interesting
>> variables (think market cap and P/E). I realize this is not an R question
>> per se, but I feel I am more likely to get a good answer out of this forum
>> than any other because of the concentration of financial practitioners. Of
>> course, I apologize in advance, if it is too 'off-topic' and then simply
>> ask for a better place to post. Thanks.
>> 
>> Adam Ginensky
>> 
>> 
>> _______________________________________________
>> R-SIG-Finance at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
>> -- Subscriber-posting only. If you want to post, subscribe first.
>> -- Also note that this is not the r-help list where general R questions
>> should go.
>> 
>> 
>> 
> 

