[R] correlation as distance/dissimilarity
Martin Maechler
maechler at stat.math.ethz.ch
Wed Sep 14 17:49:10 CEST 2005
I've been asked (privately)
>>>>> "CarlosJ" == jaramilloc <jaramilloc at si.edu>
>>>>> on Wed, 14 Sep 2005 09:40:22 -0400 writes:
..........
CarlosJ> In Kaufman & Rousseeuw 2000 book on Cluster Analysis, it says that
CarlosJ> Daisy can compute Pearson correlation between variables and then
CarlosJ> transform these to dissimilarities.
I don't think it says this. But it does talk about doing it
"yourself", e.g., on pages 17--19.
CarlosJ> Has this capability been
CarlosJ> implemented in the Cluster package for R? It seems that it is not
CarlosJ> there. How could I do that using R?
CarlosJ> I would appreciate your help.
It has never been explicitly in R, because in the past 'everyone'
has thought this was obvious and trivial. The "past" here was
when S was used by statisticians, mathematicians or engineers...
Anyway, here is an example of how to do this:
> dd <- as.dist((1 - cor(USJudgeRatings))/2)
> plot(hclust(dd))
> round(1000 * dd)
     CONT INTG DMNR DILG CFMG DECI PREP FAMI ORAL WRIT PHYS
INTG  567
DMNR  577   18
DILG  494   64   82
CFMG  432   93   93   21
DECI  457   99   98   22    9
PREP  494   61   72   11   21   21
FAMI  513   66   79   21   32   29    5
ORAL  506   44   47   23   25   26    8    9
WRIT  522   46   53   20   29   27    7    5    3
PHYS  473  129  106   94   60   64   76   78   54   72
RTEN  517   31   28   35   36   38   25   29    9   16   47
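
To spell out what the transformation above does: (1 - r)/2 maps a correlation of 1 to dissimilarity 0 and a correlation of -1 to dissimilarity 1. The sketch below checks this on the same data. It also shows 1 - |r|, which is a common alternative and is my addition here, not part of the example above; it treats strongly negatively correlated variables as close. The names d_signed and d_abs are made up for illustration.

```r
# Sketch only: two ways to turn correlations between variables
# into dissimilarities. Object names are illustrative.
r <- cor(USJudgeRatings)            # 12 x 12 Pearson correlation matrix

# (1 - r)/2: r = 1 gives 0, r = -1 gives 1, so negatively
# correlated variables end up far apart.
d_signed <- as.dist((1 - r) / 2)

# 1 - |r| (an assumed alternative, not from the example above):
# strong negative correlation counts as "similar".
d_abs <- as.dist(1 - abs(r))

range(d_signed)                     # always within [0, 1]
range(d_abs)                        # always within [0, 1]

# CONT is negatively correlated with INTG, so the two choices disagree
as.matrix(d_signed)["CONT", "INTG"]
as.matrix(d_abs)["CONT", "INTG"]
```

Which of the two is appropriate depends on whether the sign of the correlation matters for the grouping you are after.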
I'm going to add the example to the help page for 'dist' in R-2.2.0.
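
Regarding the cluster package itself: its functions accept a "dist" object, so the same dissimilarities can be passed on directly. The following is only a minimal sketch under my own assumptions; the choice of average linkage and of k = 2 is arbitrary and just for illustration.

```r
# Sketch only: feeding correlation-based dissimilarities between
# variables to functions from the 'cluster' package.
library(cluster)

dd <- as.dist((1 - cor(USJudgeRatings)) / 2)

# Agglomerative nesting of the 12 variables (average linkage chosen
# arbitrarily for this illustration)
ag <- agnes(dd, method = "average")
plot(ag, which.plots = 2)           # dendrogram only

# Partitioning around medoids; k = 2 is an arbitrary illustration value
pm <- pam(dd, k = 2)
pm$clustering                       # cluster number for each variable
```

Because 'dd' is already a dissimilarity object, no 'metric' needs to be specified in these calls.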
Martin Maechler