[R] question on silhouette colours

Martin Maechler maechler at stat.math.ethz.ch
Mon Aug 29 11:33:45 CEST 2011


>>>>> Gordon Robertson <grobertson at bcgsc.ca>
>>>>>     on Wed, 24 Aug 2011 22:21:22 -0700 writes:

    > I'm fairly new to the silhouette functionality in the
    > cluster package, so apologize if I'm asking something
    > naive.  If I run the 'agnes(ruspini)' example from the
    > silhouette section of the cluster package vignette, and
    > assign colours to clusters, two clusters have what appear
    > to be incorrect colours in the silhouette plot.

    > library(cluster)
    > data(ruspini)
    > ar<- agnes(ruspini)
    > si3<- silhouette(cutree(ar, k = 5), daisy(ruspini))

Thank you, Gordon, for the simple reproducible example.

    > # 1. This gives a mid-gray silhouette plot, which does not show the problem
    > plot(si3, nmax = 80, cex.names = 0.5) 
    > # 2. This gives a multicolour silhouette plot, but there are three black lines/bars in the yellow cluster, and the cluster that should be black is actually yellow?
    > plot(si3, nmax = 80, cex.names = 0.5, col=c("red","blue","yellow","black","green"))

    > # 3. Check sorting by writing out sorted results to a file, then plotting from the file

    > si3.sorted <- sortSilhouette(si3)
    > write.table(si3.sorted,"/...myPath.../si3.sorted.txt",sep="\t")

Well, just
    > sortSilhouette(si3) # printing to the console
is sufficient to inspect ...
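
The same inspection can also be done programmatically rather than by eye.
What follows is only a minimal sketch, assuming the cluster package is
installed; the variable names (ss, clusters_ordered, widths_descending) are
made up for illustration.

```r
library(cluster)
data(ruspini)

ar  <- agnes(ruspini)
si3 <- silhouette(cutree(ar, k = 5), daisy(ruspini))

# Sorted copy: by cluster, then by silhouette width within each cluster
ss <- sortSilhouette(si3)
cl <- ss[, "cluster"]
sw <- ss[, "sil_width"]

# Cluster ids should come out as 1's, then 2's, then ...
clusters_ordered <- !is.unsorted(cl)

# Within each cluster the widths should be descending,
# i.e. the reversed widths should be non-decreasing
widths_descending <- all(tapply(sw, cl, function(w) !is.unsorted(rev(w))))

print(c(clusters_ordered = clusters_ordered,
        widths_descending = widths_descending))
```

If both flags are TRUE, the sorting itself is not the problem and any
mismatch must come from how the colours are assigned to the bars.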

    > Inspecting the si3.sorted.txt file, cluster numbers are
    > ordered as expected (1's then 2's then...), and
    > sil_width's within each cluster appear correctly sorted
    > (descending). Given this, if I load the file into say
    > Mathematica, and plot it with colours, I easily generate
    > a graphic that is like the one from R, but in which all
    > cluster colours are as expected, i.e. there are no black
    > bars in the yellow region, and the cluster that should be
    > black -is- black.

    > Again, I apologize if I'm missing something simple.
    > Thanks for your help in understanding this behaviour.

As a matter of fact, I'm pretty sure you found a bug.
Note that in such cases (a suspected bug in a function from an R package)
it is better to contact the package maintainer first, in this case

   > maintainer("cluster")
  [1] "Martin Maechler <maechler at stat........>"

but I did see your message on R-help "by luck" and so have been able to
act on it.

The next version of cluster, 1.14.1, will have this buglet
fixed.
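
In the meantime, the expected picture can be cross-checked independently of
plot.silhouette() by drawing the sorted bars with base barplot(). This is
only a sketch of such a cross-check, not the fix that goes into the package;
the names pal and bar_col are made up for illustration, and it assumes that
colour i is meant for cluster i.

```r
library(cluster)
data(ruspini)

ar  <- agnes(ruspini)
si3 <- silhouette(cutree(ar, k = 5), daisy(ruspini))

# Assumption: the i-th colour belongs to cluster i
pal <- c("red", "blue", "yellow", "black", "green")

ss      <- sortSilhouette(si3)
bar_col <- pal[ss[, "cluster"]]  # one colour per observation, in sorted order

# Each cluster should map to exactly one colour
print(table(cluster = ss[, "cluster"], colour = bar_col))

# With horiz = TRUE, barplot() draws the first element at the bottom,
# so heights and colours are reversed together to put cluster 1 on top
barplot(rev(ss[, "sil_width"]), col = rev(bar_col), horiz = TRUE,
        border = NA, axisnames = FALSE, xlab = "Silhouette width")
```

Because heights and colours are reversed together, every bar keeps the
colour of its own cluster, which is what the Mathematica plot described
above shows.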

Thank you for your "question"!
Best regards,
Martin Maechler, ETH Zurich