[R] Specifying medoids in PAM?
Martin Maechler
maechler at stat.math.ethz.ch
Thu Jun 9 01:08:50 CEST 2005
>>>>> "MM" == Martin Maechler <maechler at stat.math.ethz.ch>
>>>>> on Wed, 8 Jun 2005 18:57:55 +0200 writes:
>>>>> "David" == David Finlayson <david.p.finlayson at gmail.com>
>>>>> on Wed, 8 Jun 2005 09:24:54 -0700 writes:
David> Sorry, I wasn't trying to submit a bug report just yet.
MM> the posting guide asks you to provide reproducible examples, in
MM> any case, not just for bug reports ...
MM> {and strictly speaking, you still haven't provided one, since
MM> it's a bit painful to read in your table below -- because of the
MM> extra row names ... but here I'm nit picking a bit }
David> I wanted to see if I was using the command correctly.
MM> Yes, you were.
>>> pam(stats.table, metric="euclidean", stand=TRUE, medoids=c(1,3,20,2,5), k=5)
David> This command crashes RGUI.exe and windows sends an error report to
David> Microsoft. It also crashes if I first subtract the NA rows from
David> stats.table.
MM> I can confirm that I get segmentation faults using this example data
MM> with k=5, so it seems you have indeed uncovered a bug in pam().
MM> I will investigate and patch eventually.
I found and fixed the bug:
Some part of the C code was assuming that the indices in
'medoids' were sorted in increasing order.
I.e., for the moment you can easily work around the problem by
using
pam(stats.table, ...., medoids=c(1,2,3,5,20), k=5)
instead of
pam(stats.table, ...., medoids=c(1,3,20,2,5), k=5)
The next version of the cluster package, which also allows
specifying the "fuzziness exponent" in fanny(), will have this
problem fixed.
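
As a minimal sketch of the workaround above, the initial medoid
indices can be sorted before they are passed to pam(). The matrix 'x'
and the vector 'init' are hypothetical stand-ins (the original
stats.table is not reproduced here); only pam() and its 'medoids',
'metric', 'stand' and 'k' arguments come from the thread.

```r
library(cluster)

# Hypothetical stand-in for stats.table: 30 observations, 2 variables.
set.seed(1)
x <- matrix(rnorm(60), ncol = 2)

# Initial medoids in the unsorted order that triggered the crash.
init <- c(1, 3, 20, 2, 5)

# Workaround: pass the indices in increasing order.
fit <- pam(x, k = length(init), medoids = sort(init),
           metric = "euclidean", stand = TRUE)

# Row indices of the final medoids chosen by pam().
fit$id.med
```

Sorting does not change which observations are used as starting
medoids, only the order in which they are listed, so the intended
initialisation is the same.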
Martin Maechler,
ETH Zurich