[Rd] question about similarities cluster using hierclust

Martin Maechler maechler at stat.math.ethz.ch
Thu Jun 10 09:30:37 CEST 2004


Hmm,

why on earth are you using hierclust() from the ORPHANED package
'multiv',  when there's  hclust() in the core 'stats' package
and 'agnes' in the recommended 'cluster' package ?

To your question  "similarities -> dissimilarities"
the textbooks all deal with this.

Assuming similarities s_ij in [0,1]  {which you can get by scaling},
the transformations usually mentioned are, e.g.,
       d_ij := 1 - s_ij
       d_ij := sqrt(1 - (s_ij)^2)
also   d_ij := sqrt(1 -   s_ij)
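
As a rough sketch of how these transformations feed into hclust()
(the similarity matrix 's' below is made-up toy data, and the choice of
average linkage is only an assumption for illustration):

```r
## Toy symmetric similarity matrix with entries in [0, 1] (made-up values)
s <- matrix(c(1.0, 0.9, 0.2, 0.1,
              0.9, 1.0, 0.3, 0.2,
              0.2, 0.3, 1.0, 0.8,
              0.1, 0.2, 0.8, 1.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))

## The three transformations listed above
d1 <- 1 - s
d2 <- sqrt(1 - s^2)
d3 <- sqrt(1 - s)

## hclust() expects a "dist" object; as.dist() uses the lower triangle
hc <- hclust(as.dist(d1), method = "average")

## Cut the tree into two groups
groups <- cutree(hc, k = 2)
```

Swapping d1 for d2 or d3 in the hclust() call is all it takes to compare
the three choices on your own data.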

but really, in your situation where you're defining your
similarities yourself, you probably should rather think about
defining your dissimilarities yourself *directly* {i.e. not via
the above formulae}.
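
A minimal sketch of what "directly" could look like, assuming your
observations are binary feature profiles; the data 'x' and the function
'my_diss' are hypothetical and only stand in for whatever measure fits
your problem:

```r
## Hypothetical observations: rows are items, columns are binary features
x <- rbind(g1 = c(1, 1, 0, 0, 1),
           g2 = c(1, 1, 0, 0, 0),
           g3 = c(0, 0, 1, 1, 0),
           g4 = c(0, 0, 1, 1, 1))

## Illustrative dissimilarity: fraction of positions where two profiles differ
my_diss <- function(u, v) mean(u != v)

## Fill the full n x n dissimilarity matrix
n <- nrow(x)
d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
for (i in seq_len(n))
    for (j in seq_len(n))
        d[i, j] <- my_diss(x[i, ], x[j, ])

## Convert to a "dist" object and cluster
hc <- hclust(as.dist(d), method = "complete")
groups <- cutree(hc, k = 2)
```

The same "dist" object can also be passed to agnes() from the 'cluster'
package if you prefer that implementation.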

Martin Maechler

>>>>> "Xinan" == Xinan Yang <xinan at molgen.mpg.de>
>>>>>     on Thu, 10 Jun 2004 09:04:05 +0200 writes:

    Xinan> my major is bioinformatics, and i'm trying to cluster ( agglomerate
    Xinan> the closest pair of observations ) in R.


    Xinan> i have already got my own similarity metric, but do not know how to
    Xinan> cluster based on similarities instead of dissimilarities.


    Xinan> since the help document of hierclust mentions the parameter "sim",
    Xinan> which seems good to me, but it doesn't appear in the code of
    Xinan> hierclust() function again? and no sample about it.  so could anybody
    Xinan> please help me as author?

    Xinan> thanks in advance

    Xinan> xinan yang
    Xinan> xinan at molgen.mpg.de


