[BioC] semi-supervised clustering
friedrich.leisch at stat.uni-muenchen.de
Fri Oct 26 09:33:38 CEST 2007
>>>>> On Thu, 25 Oct 2007 09:12:32 -0700 (PDT),
>>>>> Tim Smith (TS) wrote:
> Hi,
> Is there any package that implements semi-supervised clustering
> through 'must-link' and 'cannot-link' constraints?
Package flexclust on CRAN can do constrained clustering. The feature
is not well documented in the current release version, but
    myfam <- kccaFamily("kmeans", groupFun = "minSumClusters")
    clres <- kcca(x, k, myfam, group = mygroups)
will assign all points which belong to one group to the same
cluster using kmeans (but flexclust can use other distances than
Euclidean, too).
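For anyone who wants to try this, here is a minimal sketch of the two
calls above on toy data. The data matrix, the group labels and the
value of k are made up for illustration; only kccaFamily(), kcca() and
clusters() come from flexclust.

```r
library(flexclust)

set.seed(1)
# Hypothetical toy data: 60 points in 2 dimensions
x <- matrix(rnorm(120), ncol = 2)
# Hypothetical 'must-link' groups: every 3 consecutive rows share a label
mygroups <- factor(rep(1:20, each = 3))
k <- 3

myfam <- kccaFamily("kmeans", groupFun = "minSumClusters")
clres <- kcca(x, k, myfam, group = mygroups)

# Count how many distinct clusters each group was spread over;
# if the constraint holds, every group spans exactly one cluster
per_group <- tapply(clusters(clres), mygroups,
                    function(cl) length(unique(cl)))
table(per_group)
```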
groupFun = "minSumClusters" assigns each group to the cluster whose
center has the minimal average distance to all group members.
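To make the rule concrete, here is a small base R sketch of a "minimal
average distance" assignment. It is not flexclust's own code: the
function name assign_min_mean and the toy distance matrix are
hypothetical, and it assumes you already have the distances from every
point to every center.

```r
# distmat: n x k matrix, distmat[i, j] = distance of point i to center j
# group:   length-n vector of group labels
assign_min_mean <- function(distmat, group) {
  group <- droplevels(as.factor(group))
  # rowsum() adds up the distances within each group (rows in level order);
  # dividing by the group sizes turns the sums into averages
  means <- rowsum(distmat, group) / as.vector(table(group))
  # Cluster with the smallest average distance, one entry per group
  best <- max.col(-means, ties.method = "first")
  names(best) <- rownames(means)
  # Expand back to one cluster label per point
  unname(best[as.character(group)])
}

# Hypothetical distances of 4 points to 2 centers
distmat <- rbind(c(1, 4), c(4, 2), c(4, 1), c(3, 1.5))
group <- c("a", "a", "b", "b")
assign_min_mean(distmat, group)
# 1 1 2 2
```

On its own, the second point would go to cluster 2, but the average
distances for group "a" are 2.5 and 3, so the whole group is assigned
to cluster 1.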
groupFun = "majorityClusters" assigns all group members to the
cluster that the majority of them belongs to.
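A majority vote of this kind can be sketched in base R as follows.
Again this is only an illustration, not flexclust's implementation:
assign_majority is a hypothetical name, the input is assumed to be the
cluster each point would pick on its own, and breaking ties in favour
of the smallest cluster number is my own choice.

```r
# nearest: length-n vector, the cluster each point would choose by itself
# group:   length-n vector of group labels
assign_majority <- function(nearest, group) {
  group <- droplevels(as.factor(group))
  # Most frequent cluster within each group; which.max() picks the first
  # maximum, so ties go to the smallest cluster number
  winner <- vapply(split(nearest, group), function(cl) {
    counts <- table(cl)
    as.integer(names(counts)[which.max(counts)])
  }, integer(1))
  # Expand back to one cluster label per point
  unname(winner[as.character(group)])
}

nearest <- c(1, 1, 2, 3, 3, 3)
group <- c("a", "a", "a", "b", "b", "b")
assign_majority(nearest, group)
# 1 1 1 3 3 3
```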
groupFun = "differentClusters" implements a 'cannot-link'
constraint; obviously, the group sizes must be smaller
than the number of clusters in this case.
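One simple way to picture a 'cannot-link' assignment is the greedy
sketch below. It is an assumption for illustration only and may well
differ from what flexclust does internally: within each group it
repeatedly takes the smallest remaining point-to-center distance and
then blocks that point and that cluster. The name assign_different and
the toy distances are hypothetical, and the sketch stops when a group
has more members than there are clusters, since it could not give each
member its own cluster then.

```r
# distmat: n x k matrix of finite distances from points to centers
# group:   length-n vector of group labels
assign_different <- function(distmat, group) {
  k <- ncol(distmat)
  out <- integer(nrow(distmat))
  for (idx in split(seq_len(nrow(distmat)), group)) {
    if (length(idx) > k)
      stop("a group has more members than there are clusters")
    d <- distmat[idx, , drop = FALSE]
    for (step in seq_along(idx)) {
      # Smallest distance among unassigned points and unused clusters
      pos <- which(d == min(d), arr.ind = TRUE)[1, ]
      out[idx[pos[1]]] <- pos[2]
      # Block this point (row) and this cluster (column) for the group
      d[pos[1], ] <- Inf
      d[, pos[2]] <- Inf
    }
  }
  out
}

# Hypothetical distances of 3 points (one group) to 3 centers
distmat <- rbind(c(1, 2, 9), c(1.5, 8, 9), c(7, 3, 4))
group <- c("g", "g", "g")
assign_different(distmat, group)
# 1 3 2
```

Without the constraint, the first two points would both go to cluster
1; here the second point is pushed to cluster 3. A greedy pass like
this does not necessarily minimize the total distance.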
Some details on the algorithms used can be found in
http://www.ci.tuwien.ac.at/papers/Leisch+Gruen-2006.pdf
Hope this helps,
Fritz