[R] kmeans Clustering
Mark Hempelmann
neo27 at t-online.de
Thu Mar 23 21:25:02 CET 2006
Dear WizaRds,
I am trying to program the VS-KM algorithm by Brusco and Cradit (2001) and have
come to a complete stop. Maybe somebody is willing to follow my
thoughts and offer some help.
As a first step, I want to partition the objects using a single variable at a time.
As the center matrix I use the means of the objects that belong to each cluster
found with the hierarchical Ward algorithm. Then I have to take all possible pairs
of variables and apply kmeans again, which is quite confusing to me. Here is
what I do:
## 0. data
mat <- matrix( c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2,
                 15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1),
               ncol=9, byrow=TRUE )
rownames(mat) <- paste("v", 1:4, sep="" )
tmat <- t(mat)
## 1. Provide clusters via Ward:
ward <- hclust(d=dist(tmat), method = "ward", members=NULL)
## 2. Compute cluster centers and create center-matrix for kmeans:
groups <- cutree(ward, k = 3, h = NULL)
centroids <- vector(mode="numeric", length=3)
obj <- vector(mode="list", length=3)
for (i in 1:3){
  where <- which(groups==i)              # objects that belong to group i
  centroids[i] <- mean( tmat[ where, ] ) # one number: mean over all variables
  obj[[i]] <- tmat[where,]               # rows of tmat in group i
}
P <- vector(mode="numeric", dim(mat)[2] )   # placeholder first row, dropped below
pj <- vector(mode="list", length=dim(mat)[1])
for (i in 1:dim(mat)[1]){
  pj[[i]] <- kmeans( tmat[,i], centers=centroids, iter.max=10,
                     algorithm="MacQueen")
  P <- rbind(P, pj[[i]]$cluster)
}
P <- P[-1,]
## gives a matrix of partitions using each single variable
## (I'm sure, P can be programmed much easier)
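A possibly simpler way to build P, given only as a sketch: it assumes that the intended starting centers for variable i are the Ward cluster means of that variable alone, which is not what the loop above does. The name `centers` is introduced here for illustration, and "ward.D" is the name newer R versions use for the method called "ward" above.

```r
# Data as in the post: 4 variables (rows) by 9 objects (columns).
mat <- matrix(c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2,
                15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1),
              ncol = 9, byrow = TRUE)
rownames(mat) <- paste("v", 1:4, sep = "")
tmat <- t(mat)                       # objects in rows, variables in columns

ward   <- hclust(dist(tmat), method = "ward.D")
groups <- cutree(ward, k = 3)

# Assumption: per-cluster means of every variable serve as starting centers.
# rowsum() adds up the rows of each group; dividing by the group sizes
# turns the sums into means (one row per cluster, one column per variable).
centers <- rowsum(tmat, groups) / as.vector(table(groups))

# One kmeans run per variable; row i of P is the partition from variable i.
P <- t(sapply(seq_len(ncol(tmat)), function(i) {
  kmeans(tmat[, i], centers = centers[, i, drop = FALSE],
         iter.max = 10, algorithm = "MacQueen")$cluster
}))
rownames(P) <- colnames(tmat)
```

The `drop = FALSE` keeps the single column of centers as a 3 x 1 matrix, so kmeans sees as many columns in the centers as in the data.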
## 3. kmeans using all possible pairs of variables,
##    here just e.g. variables 1 and 3:
wjk <- kmeans(tmat[,c(1,3)], centers=centroids, iter.max=10,
              algorithm="MacQueen")
###
This, of course, gives an error message, since "centroids" is not a matrix of
cluster centers. How on earth do I correctly construct a matrix of centers
corresponding to a pair of variables? Is it always the same matrix no matter
which pair of variables I choose?
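One possible reading, sketched under assumptions rather than taken from the Brusco and Cradit paper: compute a single matrix of cluster means with one row per Ward cluster and one column per variable, then pass only the two columns that belong to the pair at hand. Under that reading the full matrix stays the same and only the selected columns change. The names `centers`, `pairs` and the list layout of `wjk` are illustrative.

```r
# Data as in the post: 4 variables (rows) by 9 objects (columns).
mat <- matrix(c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2,
                15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1),
              ncol = 9, byrow = TRUE)
rownames(mat) <- paste("v", 1:4, sep = "")
tmat <- t(mat)                       # objects in rows, variables in columns

# "ward.D" is the name newer R versions use for the old "ward" method.
groups <- cutree(hclust(dist(tmat), method = "ward.D"), k = 3)

# Assumption: starting centers are the Ward cluster means (3 x 4 matrix).
centers <- rowsum(tmat, groups) / as.vector(table(groups))

# All pairs of variables: a 2 x 6 matrix, one pair per column.
pairs <- combn(ncol(tmat), 2)

# One kmeans run per pair, started from the matching two columns of centers.
wjk <- lapply(seq_len(ncol(pairs)), function(p) {
  vars <- pairs[, p]
  kmeans(tmat[, vars], centers = centers[, vars],
         iter.max = 10, algorithm = "MacQueen")
})
names(wjk) <- apply(pairs, 2, function(v)
  paste(colnames(tmat)[v], collapse = "-"))
```

With this layout, `wjk[["v1-v3"]]$cluster` would hold the partition for variables 1 and 3.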
I apologize for my lack of clustering knowledge and expertise - any help is
welcome. Thank you very much.
Many greetings
mark