[R] [cluster package question] What is the "sum of the dissimilarities" in the pam command ?
maechler at stat.math.ethz.ch
Mon Mar 30 11:31:46 CEST 2009
>>>>> "TG" == Tal Galili <tal.galili at gmail.com>
>>>>> on Sun, 29 Mar 2009 03:09:17 +0300 writes:
TG> Hello Martin Maechler and All,
TG> A simple question (I hope):
TG> How can I compute the "sum of the dissimilarities" that appears in the pam
TG> command (from the cluster package) ?
TG> Is it the "manhattan" distance (such as the one implemented by "dist") ?
Well, it first depends on whether 'x' in pam(x, k, diss, metric, ...)
is *itself* a dissimilarity object or not.
--> help(daisy) and help(dist)
If it is *not* --- which I assume from your question ---
then the answer depends on the 'metric' argument of pam().
As you did not mention that, I assume you left 'metric' at its
default which is "euclidean", i.e.,
TG> I am asking since I am running clustering on a dataset. I found 7 medoids
TG> with the pam command, and from it I have the medoid to which each
TG> observation belongs. But when I check it, I find that only (about) 90% of
TG> observations have the minimum manhattan distance to the medoids that pam
TG> If this is the manhattan distance that is used, I will create some toy data
TG> to see if I can reproduce this.
Yes, specifying some reproducible toy data and specific R code
is almost always useful and typically more productive when
asking such questions by e-mail.
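
A minimal sketch of what such a toy example could look like, assuming
'x' is a plain data matrix and 'metric' is left at its default. The data
and the object names (x, fit, d_euc, d_man, ...) are made up for
illustration only and are not taken from the original question:

```r
library(cluster)

set.seed(1)
# Made-up toy data: three groups of 20 points in two dimensions
x <- rbind(matrix(rnorm(40, mean = 0),  ncol = 2),
           matrix(rnorm(40, mean = 5),  ncol = 2),
           matrix(rnorm(40, mean = 10), ncol = 2))

fit <- pam(x, k = 3)   # 'metric' left at its default, "euclidean"

# Distances from every observation to each of the k medoids (n x k),
# once with the metric pam() used and once with "manhattan"
d_euc <- as.matrix(dist(x, method = "euclidean"))[, fit$id.med, drop = FALSE]
d_man <- as.matrix(dist(x, method = "manhattan"))[, fit$id.med, drop = FALSE]

# Index of the closest medoid for each observation under either metric
nearest_euc <- unname(apply(d_euc, 1, which.min))
nearest_man <- unname(apply(d_man, 1, which.min))

# Sum of the euclidean dissimilarities of the observations
# to their closest medoid
sum_diss <- sum(apply(d_euc, 1, min))

all(nearest_euc == fit$clustering)   # agrees under the metric pam() used
mean(nearest_man == fit$clustering)  # share agreeing under manhattan; need not be 1
sum_diss
```

If the check is done with a different metric than the one pam() used
(e.g. manhattan instead of euclidean), some observations can appear to
be closer to another medoid, which is one possible explanation for
seeing less than 100% agreement.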
Martin Maechler, ETH Zurich