[R] PAM clustering (using triangular matrix)
Friedrich Leisch
Friedrich.Leisch at ci.tuwien.ac.at
Wed Jan 10 10:30:48 CET 2001
>>>>> On Tue, 09 Jan 2001 15:42:30 -0700,
>>>>> Jose Quesada (JQ) wrote:
> Hi,
> I'm trying to use a similarity matrix (triangular) as input for pam() or
> fanny() clustering algorithms.
> The problem is that these algorithms only accept a dissimilarity
> matrix, normally generated by daisy().
> However, daisy() only accepts a 'data matrix or dataframe. Dissimilarities
> will be computed between the rows of x'.
> Is there any way to tell it that your data are already a similarity
> matrix (triangular)?
> In Kaufman and Rousseeuw's FORTRAN implementation (1990), they showed an
> option like this one:
> "Maybe you already have correlation coefficients between variables.
> Your input data consist of a lower triangular matrix of pairwise
> correlations. You wish to calculate dissimilarities between the
> variables."
> But I couldn't find this alternative in the R implementation.
> I cannot use foo <- as.dist(foo) or daisy(foo...), because
> "Dissimilarities will be computed between the rows of x", and this is
> not what I mean.
> You can easily transform your similarities into dissimilarities like
> this (also recommended in Kaufman and Rousseeuw, 1990):
> foo <- (1 - abs(foo)) # where foo are similarities
> But then pam() will complain like this:
> " x is not of class dissimilarity and can not be converted to this
> class."
> Can anyone help me? I would also appreciate any advice about other
> clustering algorithms that can accept this type of input.
Hmm, I don't understand your problem, because proceeding as the docs
describe works for me ...
If foo is a similarity matrix (with 1 meaning identical objects), then
bar <- as.dist(1 - abs(foo))
fanny(bar, ...)
works for me:
## create a random 12x12 similarity matrix, make it symmetric (averaging
## keeps the values in [0, 1]) and set the diagonal to 1
> x <- matrix(runif(144), ncol = 12)
> x <- (x + t(x)) / 2
> diag(x) <- 1
## now proceed as described in the docs
> y <- as.dist(1-x)
> fanny(y, 3)
iterations  objective
 42.000000   3.303235
Membership coefficients:
       [,1]      [,2]      [,3]
1 0.3333333 0.3333333 0.3333333
2 0.3333333 0.3333333 0.3333333
3 0.3333334 0.3333333 0.3333333
4 0.3333333 0.3333333 0.3333333
...
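
The same recipe should carry over to pam() and to input where only the
lower triangle is filled in, as in the question. What follows is only a
minimal sketch, assuming the cluster package is installed; the correlation
matrix r is made up for illustration, and the names r, d and fit as well as
the choice k = 2 are not from the original question.

```r
library(cluster)

set.seed(1)
## hypothetical data: pairwise correlations between 6 variables
r <- cor(matrix(rnorm(120), ncol = 6))
## blank out the upper triangle to mimic lower-triangular input
r[upper.tri(r)] <- 0

## as.dist() reads only the lower triangle of a square matrix,
## so whatever sits in the upper half is ignored
d <- as.dist(1 - abs(r))

## pam() treats an object of class "dist" as dissimilarities;
## k = 2 is an arbitrary choice for this sketch
fit <- pam(d, k = 2)
fit$clustering
```

Checking class(d) before calling pam() or fanny() is a quick way to confirm
that the conversion happened: it should report "dist" rather than "matrix".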
--
-------------------------------------------------------------------
Friedrich Leisch
Institut für Statistik Tel: (+43 1) 58801 10715
Technische Universität Wien Fax: (+43 1) 58801 10798
Wiedner Hauptstraße 8-10/1071 Friedrich.Leisch at ci.tuwien.ac.at
A-1040 Wien, Austria http://www.ci.tuwien.ac.at/~leisch
-------------------------------------------------------------------