[R] PAM clustering (using triangular matrix)

Friedrich Leisch Friedrich.Leisch at ci.tuwien.ac.at
Wed Jan 10 10:30:48 CET 2001


>>>>> On Tue, 09 Jan 2001 15:42:30 -0700,
>>>>> Jose Quesada (JQ) wrote:

  > Hi,
  > I'm trying to use a similarity matrix (triangular) as input for pam() or
  > fanny() clustering algorithms.
  > The problem is that these algorithms can only accept a dissimilarity
  > matrix, normally generated by daisy().

  > However, daisy only accepts a 'data matrix or dataframe. Dissimilarities
  > will be computed between the rows of x'.
  > Is there any way to tell it that your data are already a similarity
  > matrix (triangular)?
  > In Kaufman and Rousseeuw's FORTRAN implementation (1990), they showed an
  > option like this one:

  > "Maybe you already have correlation coefficients between variables.
  > Your input data consist of a lower triangular matrix of pairwise
  > correlations. You wish to calculate dissimilarities between the
  > variables."

  > But I couldn't find this alternative in the R implementation.

  > I cannot use foo <- as.dist(foo), nor daisy(foo...), because
  > "Dissimilarities will be computed between the rows of x", and this is
  > not what I mean.

  > You can easily transform your similarities into dissimilarities like
  > this (also recommended in Kaufman and Rousseeuw, 1990):

  > foo <- (1 - abs(foo)) # where foo are similarities

  > But then pam() will complain like this:

  > " x is not of class dissimilarity and can not be converted to this
  > class."

  > Can anyone help me? I would also appreciate any advice about other
  > clustering algorithms that can accept this type of input.

Hmm, I don't understand your problem, because proceeding as the docs
describe works for me ...

If foo is a similarity matrix (with 1 meaning identical objects), then

bar <- as.dist(1 - abs(foo))
fanny(bar, ...)

works for me:

## create a random 12x12 similarity matrix with entries in [0, 1]: make it
## symmetric by averaging with its transpose and set the diagonal to 1
> x <- matrix(runif(144), ncol=12)
> x <- (x+t(x))/2
> diag(x) <- 1

## now proceed as described in the docs
> y <- as.dist(1-x)
> fanny(y, 3)
iterations  objective 
 42.000000   3.303235 
Membership coefficients:
        [,1]      [,2]      [,3]
1  0.3333333 0.3333333 0.3333333
2  0.3333333 0.3333333 0.3333333
3  0.3333334 0.3333333 0.3333333
4  0.3333333 0.3333333 0.3333333
...
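
The same approach should also cover the triangular case from the
question, since as.dist() reads only the lower triangle of its argument.
A minimal sketch, assuming similarities in [0, 1] stored below the
diagonal; the names sim, d and fit, the seed, and the choice of k = 3
are made up for illustration:

```r
library(cluster)  # provides pam() and fanny()

set.seed(1)  # arbitrary seed, only so that this sketch is reproducible
n <- 12

# Hypothetical lower-triangular similarity matrix: similarities in [0, 1]
# below the diagonal, 1 on the diagonal, 0 above it.
sim <- matrix(0, n, n)
sim[lower.tri(sim)] <- runif(n * (n - 1) / 2)
diag(sim) <- 1

# as.dist() takes only the lower triangle, so the entries above the
# diagonal are ignored.
d <- as.dist(1 - abs(sim))

# pam() treats an object of class "dist" as dissimilarities.
fit <- pam(d, k = 3)
fit$clustering   # cluster label for each of the 12 objects
fit$id.med       # indices of the 3 medoid objects
```

fanny(d, 3) can be called on the same object d in place of pam().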

-- 
-------------------------------------------------------------------
                        Friedrich  Leisch 
Institut für Statistik                     Tel: (+43 1) 58801 10715
Technische Universität Wien                Fax: (+43 1) 58801 10798
Wiedner Hauptstraße 8-10/1071      Friedrich.Leisch at ci.tuwien.ac.at
A-1040 Wien, Austria             http://www.ci.tuwien.ac.at/~leisch
-------------------------------------------------------------------
