[R] estimate the number of clusters

Martin Maechler maechler at stat.math.ethz.ch
Tue Jun 10 19:03:28 CEST 2003


>>>>> "MM" == Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>     on Tue, 10 Jun 2003 18:12:36 +0200 writes:

    MM> Ping, you found another bug in silhouette.default() --
    MM> which can happen when there's one cluster with exactly
    MM> one observation.

    MM> I'll let you know more, once I have a complete fix.

The patch for this bug (against an *installed* version of cluster)
is:
---------------------------

--- ........cluster-version-1.7-2..../library/cluster/R/cluster	Thu Jun  5 04:00:15 2003
+++ ........fixed............................/cluster/R/cluster	Tue Jun 10 18:56:17 2003
@@ -2019,11 +2019,11 @@
         wds[iC, "cluster"] <- j
         a.i <- if(Nj > 1) colSums(dmatrix[iC, iC])/(Nj - 1) else 0 # length(a.i)= Nj
         ## minimal distances to points in all other clusters:
-        diC <- rbind(apply(dmatrix[!iC, iC], 2,
+        diC <- rbind(apply(dmatrix[!iC, iC, drop = FALSE], 2,
                            function(r) tapply(r, x[!iC], mean)))# (k-1) x Nj
         minC <- max.col(-t(diC))
         wds[iC,"neighbor"] <- clid[-j][minC]
-        b.i <- diC[cbind(minC, seq(minC))]
+        b.i <- diC[cbind(minC, seq(along = minC))]
         s.i <- (b.i - a.i) / pmax(b.i, a.i)
         wds[iC,"sil_width"] <- s.i
     }

---------------------------

i.e. you add  ", drop = FALSE" in line 2022
     and      "along = "       in line 2026
in the appropriate places.
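
Why these two changes matter can be seen on toy data. The following
is only a minimal sketch, not part of the patch: the four-point
dissimilarity matrix and the membership vector are made up, with
cluster 2 holding exactly one observation; only the names dmatrix, x,
iC and minC are borrowed from the code above.

```r
## Toy dissimilarity matrix for 4 observations (hypothetical data);
## cluster 2 is a singleton.
dmatrix <- as.matrix(dist(c(0, 1, 2, 10)))
x  <- c(1, 1, 1, 2)   # cluster memberships
iC <- x == 2          # logical index of the singleton cluster

## Without drop = FALSE the one-column sub-matrix collapses to a plain
## vector, and apply() needs an object with a dim attribute.
is.null(dim(dmatrix[!iC, iC]))          # TRUE: dimensions are lost
dim(dmatrix[!iC, iC, drop = FALSE])     # 3 1: still a matrix

## seq() on a length-one vector counts from 1 up to its value ...
minC <- 3L
seq(minC)             # 1 2 3
## ... whereas seq(along = minC) yields one index per element.
seq(along = minC)     # 1
```

With a singleton cluster, minC has length one, so seq(minC) can
produce more indices than there are columns in diC, whereas
seq(along = minC) always matches the length of minC.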

A fixed version of cluster should appear soon, and will also be
released together with R 1.7.1.

Martin Maechler <maechler at stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><



